EDK : FSL macros defined by Xilinx are wrong

Jon Beniston wrote:
Jedi <me@aol.com> wrote in message news:<hgMMd.1754$zk.836@read3.inet.fi>...

Does anybody have an idea why the NIOSII 1.1 toolchain build fails
on Linux/BSD systems with:

*** ld does not support target nios2-elf
*** see ld/configure.tgt for supported targets
make: *** [configure-ld] Error 1

Somehow it loses "nios2-unknown-elf"...


It builds fine on Win2k under Cygwin...


Have you used the same value for $prefix for both binutils and gcc? Is
$prefix/bin in your path?
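For example, something like this for both tools (paths invented; the point is that the target triplet and prefix must match, and ld only accepts targets listed in ld/configure.tgt):

```
# binutils and gcc must agree on --target and --prefix
TARGET=nios2-elf
PREFIX=/opt/nios2

(cd binutils-build && ../binutils-src/configure --target=$TARGET --prefix=$PREFIX && make && make install)
(cd gcc-build && ../gcc-src/configure --target=$TARGET --prefix=$PREFIX && make && make install)
```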
Of course (o;

And building binutils doesn't require $prefix/bin at all at this stage (o;


rick
 
Marc Randolph wrote:

newman5382 wrote:

There is a school of thought that all off chip IO should be
inferred/instantiated at the top level, and not in sub-modules.
...

Is there a good reason for this school of thought?
Mainly ASICs are the reason. Many ASIC tools expect that the
topmost level is used for the IO ring, test structures, PLLs, etc.,
and that that level instantiates the real functional logic.

Also, in that style it's easier to do technology-independent
designs. For example, if the IO pads need to be instantiated manually,
they are all in the same place, and only that code needs changes
if the FPGA vendor is changed. As far as I know, DDR-style IO pads
can't easily be inferred, for example.

And, for example, in hierarchical ASIC design you can't easily move cells
from an already laid-out hard macro to the toplevel, etc. So they should
be in the topmost level already, which is the last one to be completed.

Using that concept, when I go to take that top level and create a 4x
version of it, I can't just create a new top level with a generate
statement. Now I have to go edit a completely working design and
convert all the inouts to separate in's and out's. And if that
original block is still being used in the original design, I now have
two different versions of the exact same thing that I have to maintain.
You generate a new toplevel that instantiates 4 versions of the core logic,
which still has the in and out ports separated, and then just recreate the
IO drivers in the new 4x toplevel. So you have only one functional logic
block to maintain, and two different toplevel designs.
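A rough Verilog sketch of what I mean (all names invented; IOBUF is the Xilinx bidirectional pad primitive):

```verilog
// Hypothetical 4x toplevel: the core logic keeps separate in/out/tristate
// ports, and the vendor-specific pad cells live only at this level.
module top4x (
    inout  wire [3:0] data_pad,
    input  wire       clk
);
  genvar i;
  generate
    for (i = 0; i < 4; i = i + 1) begin : ch
      wire d_in, d_out, d_tri;

      // Xilinx pad primitive; this single instantiation is all that
      // changes if the FPGA vendor changes
      IOBUF pad (.I(d_out), .T(d_tri), .O(d_in), .IO(data_pad[i]));

      // Unmodified core block with separated in/out/tristate ports
      core_logic u_core (.clk(clk), .d_in(d_in),
                         .d_out(d_out), .d_tri(d_tri));
    end
  endgenerate
endmodule
```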

--Kim
 
"Marc Randolph" <mrand@my-deja.com> wrote in message
news:1108993065.915981.200990@o13g2000cwo.googlegroups.com...
newman5382 wrote:

This is probably the price to pay for such a cheap tool, so I
should not
really complain. Synplify will allow you to use inouts in sub
modules, but
it costs much more than XST.

There is a school of thought that all off chip IO should be
inferred/instantiated at the top level, and not in sub-modules.

-Newman

Is there a good reason for this school of thought?

Using that concept, when I go to take that top level and create a 4x
version of it, I can't just create a new top level with a generate
statement. Now I have to go edit a completely working design and
convert all the inouts to separate in's and out's. And if that
original block is still being used in the original design, I now have
two different versions of the exact same thing that I have to maintain.

Have fun,

Marc
I/O cells tend to be vendor specific. Grouping vendor-specific items into
specific areas, rather than having them scattered throughout a design, would
seem to be a good thing. One reasonable place to collect the external I/O is
at the top level. One project that was prototyped as a Xilinx FPGA, but
slated for ASIC conversion, instantiated I/O at the top level so the I/O
cells could be easily swapped out with the vendor's ASIC I/O with a minimum
chance of error. I have found that there are issues when you selectively
flatten different sub-modules when I/O are inferred within sub-modules. It
used to be that some ASIC designs had test structures linking all the IO
cells together; if these cells are scattered through the hierarchy, it is
more difficult to chain them together. In addition, I guess there are
designs that may or may not have external bidirect busses, and some with
internal tristate busses. I personally do not like to see inout in the sub
hierarchy if there are no (evil) bidirect busses.

Do what you want to do. If it makes sense to you, and you can justify it,
go for it. Skill is knowing how to do it. Leadership is knowing what to do
and why.

-Newman
 
We discovered the source of our 'can't wake up' problems late Friday....
unfortunately, I have not determined how, or even if, the problem can or
should be circumvented....

Here's the story of what was/is:


1)We put uP to sleep


2)An ext int happens.


3)The uP wakes up:

3a)Stores the context, all regs, etc. on its ISR context stack--including
the MSR register.

3b)We mod the TCR to re-enable timer interrupts

3c)The ISR is serviced

3d)The original context is restored.

4)Now, we're back asleep since orig context is restored.


So we're continuously being put back to sleep and our mods to the TCR/MSR
are promptly overwritten by the ISR context restore at the interrupt exit.
We tried modifying SRR1 in the ISR to clear the WE bit--but that doesn't
work. Apparently the exit code restores SRR1 from the ISR stack context
copy, and SRR1 is then copied to the MSR upon rfi.

I'm looking at how to circumvent it in asm. Your thoughts?
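One idea I'm toying with (untested sketch; the saved-MSR offset in the context frame is made up and depends on the ISR prologue) is to patch the saved MSR image on the ISR stack instead of SRR1, so WE stays cleared when the context is restored:

```
# MSR[WE] is mask 0x00040000 (big-endian bit 13) on the PPC405
lwz    r3, MSR_SAVE_OFFSET(r31)   # load saved MSR from the ISR context frame
rlwinm r3, r3, 0, 14, 12          # clear bit 13 (WE), leave the rest alone
stw    r3, MSR_SAVE_OFFSET(r31)   # write back; restored to MSR at rfi
```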

I understand some VHDL needs to be written for a full implementation of
power management--but I'm not at all sure how it is all supposed to play
together. I read about the sleep req and other signals--but I don't follow
how this comes into play with the level of management I am currently trying
to get working. The VHDL does make sense to me when you start actually
physically changing clk freq or disabling clk into the 405 core. An example
from Xilinx is apparently too much to hope for... to my and my FAE's
knowledge there are no CPM examples or std. CPM core--not even for their
ML310 development boards, but I digress.

As an aside, I cannot seem to find an asm instruction that allows you to
store the contents of a given register at a desired location (i.e., offset
from the stack context register R31). In other words, what instruction would
I use to store the contents of R20 at the address specified by (R31 +
offset)? And the doc from Xilinx does not have all the instructions that the
compiler is generating (like "lis").

Paul
 
"Bo" <bo@cephus.com> wrote in message
news:Y9nSd.1103$i32.86@fe40.usenetserver.com...
We discovered the source of our 'can't wake up' problems late Friday....
unfortunately, I have not determined how, or even if, the problem can be
or should be circumvented....

Here's the story of what was/is:


1)We put uP to sleep


2)An ext int happens.


3)The uP wakes up:

3a)Stores the context, all regs, etc on his ISR context stack--
including MSR register.

3b)We mod the TCR to re-enable timer interrupts

3c)The ISR is serviced

3d)The original context is restored.

4)Now, we're back asleep since orig context is restored.


So we're continuously being put back to sleep and our mods to the TCR/MSR
are promptly overwritten by the ISR context restore at the interrupt exit.
We tried modifying SRR1 in the ISR to clear the WE bit--but that doesn't
work. It apparently uses the ISR stack context copy and then restores SRR1
from the stack, then SRR1 to MSR upon rfi.

I'm looking at how to circumvent it in asm. Your thoughts?

I understand some VHDL needs to be written for a full implementation of
power management--but I'm not at all sure how it is all supposed to play
together. I read about the sleep req and other signals--but I don't follow
how this comes into play with the level of management I am currently
trying to get working. The VHDL does make sense to me when you start
actually physically changing clk freq or disabling clk into the 405 core.
An example from Xilinx is apparently too much to hope for... to my and my
FAE's knowledge there are no CPM examples or std. CPM core--not even for
their ML310 development boards, but I digress.

As an aside, I cannot seem to find an asm instruction that allows you to
store the contents of a given register at a desired location (ie offset
from stack context register R31). In other words, what instruction would I
use to store contents of R20 at address specified by (R31 + offset)? And
the doc from Xilinx does not have all instructions that the compiler is
generating (like "lis").

Paul
I don't pretend to completely comprehend what you found, but I think a "Nice
Catch" is in order.

I saw in ppc_ref_guide.pdf that lis has an equivalent mnemonic, addis
(page 534). The user is required to skip around this document to look up a
simple thing, and it is quite annoying.
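As for the store question: the plain store-word instruction with base-plus-displacement addressing should do it (the offset value here is arbitrary):

```
stw r20, 16(r31)   # store contents of R20 at address (R31 + 16)
```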

-Newman
 
Have you re-run timing analysis on the 5.3 design, but using the latest
timing analyser and latest speed files?
No, because I don't think there's any timing issue here. The logic is
trivial and runs at low speed. We are using the same "clock generation"
module in several other designs, without any issues. We have products
running 24/7 for two years now without a single issue. As I stated before,
the problem appeared only with the selected chips.

But, I will test our new Virtex-II designs with the latest timing analyzer
and latest speed files as you suggested. It's a good idea for new designs.

With 6.1, have you tried MPPR (multi-pass placement and routing)?
Sometimes modifying the placement (in FPGA editor) of failing paths and
re-running "re-entrant routing" can fix problems, if there are only a
small number of failing paths.
Yes, I have. I tried 6.1, 6.2 and 6.3. It's always the same story.
Placer/Router does a lousy job. Either the constraints can't be met or the
router can't connect all the nets. ISE 5.2 SP3 completes without any errors
and reports 7 logic levels for the constraint. On the other hand ISE 6.x
reports 16 logic levels for the same constraint.

In my experience (for the Virtex-II family) if the design takes less than
~90% of chip resources then the results of ISE 6.x are similar to the ISE
5.2 SP3, sometimes even better, but as soon as design takes more than 95% of
all chip resources then ISE 6.x gives up. Similarly I still use ISE 3.3 for
SpartanXL and Spartan2 designs, because ISE 4.2 or newer don't produce the
desired results. I know a lot depends on the synthesis tool (I'm using
Synplicity)...

Thanks for your suggestions,
Igor Bizjak
 
On Mon, 21 Feb 2005 17:30:42 +0100, "IgI" <igorsath@hotmail.com> wrote:

Have you re-run timing analysis on the 5.3 design, but using the latest
timing analyser and latest speed files?

No, because I don't think there's any timing issue here. The logic is
trivial and runs at low speed.
Maybe not, unless there are hold time or skew issues, not properly
covered by older speed files. I would try the newer speed files on the
"suspect" design just to check.

With 6.1, have you tried MPPR (multi-pass placement and routing)?
Sometimes modifying the placement (in FPGA editor) of failing paths and
re-running "re-entrant routing" can fix problems, if there are only a
small number of failing paths.

Yes, I have. I tried 6.1, 6.2 and 6.3. It's always the same story.
Placer/Router does a lousy job. Either the constraints can't be met or the
router can't connect all the nets. ISE 5.2 SP3 completes without any errors
and reports 7 logic levels for the constraint. On the other hand ISE 6.x
reports 16 logic levels for the same constraint.
That can be illusory, if 9+ of those 16 levels are carry logic. It may
reflect relatively small differences in placement or routing getting
on/off the carry chain.

But I have found (a) a LUT connected to a long carry chain but placed on
the other side of the chip ... and (with a heavily floorplanned design,
where the placer can't do that) (b) a signal taking 3ns to get from one
CLB to its immediate neighbour.

The former (if an isolated incident) can be fixed in FPGA editor,
the latter either reflects severe congestion (whatever happened to
"view/congestion map" in the floorplanner?) or a very lazy router.

In my experience (for the Virtex-II family) if the design takes less than
~90% of chip resources then the results of ISE 6.x are similar to the ISE
5.2 SP3, sometimes even better, but as soon as design takes more than 95% of
all chip resources then ISE 6.x gives up. Similarly I still use ISE 3.3 for
SpartanXL and Spartan2 designs, because ISE 4.2 or newer don't produce the
desired results. I know a lot depends on the synthesis tool (I'm using
synplicity)...
Interesting. I didn't know that about the 5.x-6.x problems, but Ray
Andraka has commented on the relative performance of 3.3 vs later in the
past. (Google may help a little) I'm still using 3.3 in "production"!

My experience so far with 6.x (Webpack) is that it will never meet
reasonable constraints, but radically overconstraining it will improve
results.

For example, if I want 10 ns, and ask for it, I get 10.5 ns. But if I
ask for 9 ns I get 9.8, (or 10.1) and if I ask for 8 ns I get 9.5 ns...
I just made up those numbers but they represent the trend I've seen.
If the resulting design passes timing analysis at 10.0 ns, I can't see
any reason not to use it...
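For the record, the overconstraining itself is just a tighter PERIOD spec in the UCF, e.g. (net and group names invented; the real requirement is 10 ns):

```
NET "clk" TNM_NET = "clk_grp";
TIMESPEC "TS_clk" = PERIOD "clk_grp" 9 ns HIGH 50 %;
```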

Thanks for your suggestions,
Igor Bizjak
Thanks. I've not pushed such high resource usages, so it's interesting
to hear tales from people pushing the chips hard in other respects.

- Brian
 
Any news, Sylvain?

Have you played with the wb_gpio IP core?

regards

Jonathan
"Sylvain Munaut" <tnt_at_246tNt_dot_com@reducespam.com> wrote in message
news: 420fb7ae$0$22479$ba620e4c@news.skynet.be...
Hello,

Since yesterday I've been trying to interconnect the OpenCores MAC to a
microblaze design.
After solving several problems, I'm stuck.

"Generate netlist" now works fine, but when I try to "Generate bitstream",
I get three errors from NgdBuild:


ERROR:NgdBuild:604 - logical block 'wb2opb_0/wb2opb_0' with type 'wb2opb'
could not be resolved. A pin name misspelling can cause this, a missing
edif or ngc file, or the misspelling of a type name. Symbol 'wb2opb' is
not supported in target 'spartan3'.

ERROR:NgdBuild:604 - logical block 'opb2wb_0/opb2wb_0' with type 'opb2wb'
could not be resolved. A pin name misspelling can cause this, a missing
edif or ngc file, or the misspelling of a type name. Symbol 'opb2wb' is
not supported in target 'spartan3'.

ERROR:NgdBuild:604 - logical block 'wb_ethermac_0/wb_ethermac_0/maccore'
with type 'eth_top' could not be resolved. A pin name misspelling can
cause this, a missing edif or ngc file, or the misspelling of a type name.
Symbol 'eth_top' is not supported in target 'spartan3'.


For the wb_ethermac core, I've created a file that includes the eth_top of
the Ethernet MAC core from OpenCores and presents the interface to the
outside world.
I did this as an ISE project, then synthesized it to get a .ngc file
(because I have both VHDL & Verilog there), then created an IP from this
netlist file and my VHDL top file.

Does anyone have a clue what to do? Has anyone made this work? (I'm using
ISE/EDK 6.3)


Thanks,

Sylvain
 
This exists because some IP is very tightly bound to its IO pin type,
such as PCI or DDR SDRAM. We requested this feature from Xilinx a few
years ago specifically so that we could use the PCI logicore netlist
inside of a wrapper in EDK, without hand editing the top level VHDL.

YES and NO.
EDK can work as described, YES.
But the IO buffers are NOT inside the PCI logicore netlist!
The PCI IO buffers for the Xilinx OPB PCI core are instantiated in the
VHDL code, as can be seen above, and NOT inside the PCI logicore netlist!
And as for the DDR SDRAM core, the _I _O _T is used as normal.
I think you have missed the point. I make absolutely no mention of EDK
IP.

First, with respect to the PCI logicore: if purchased from Xilinx it
is provided with an EDF netlist, and a VHDL netlist (or, as some would
say, "structural VHDL") that instantiates this core netlist, the IO
pins, and a user-configurable module. To use this entire package,
without editing any of the IP core from the vendor, which is delivered
as both VHDL and a netlist, one must do as I described. This still
requires some editing of the constraints that are provided by the
vendor, but only for location in the hierarchy.

Secondly, DDR SDRAM was cited as an example because separating out the
tristate buffer from the DDR register is counter to the architecture of
the FPGA. The IO cell in virtex2 contains a tristate buffer and a
DDR register, each of which only exist in this type of cell. Breaking
off the tristate buffer from the DDR register does not provide for any
sort of meaningful improvement in the hierarchy. Rather, it confuses
the issue. Arguments for breaking the hierarchy at this point that
have been presented in this newsgroup are: using chipscope at the top
level of a hierarchy; or the creation of a vendor agnostic design.
The fact that the signal between the DDR register and the OBUF, or between
the IBUF and the other DDR register, is not observable, as the wires simply
do not exist in the architecture, obviates the first argument. The
fact that the DDR register and the IO buffer are both vendor specific,
and vendor unique, obviates the second argument.

As the FPGAs get more encompassing IO structures, as with V4, it makes
more and more sense to bind the IO structures which will include shift
register, differential pins, etc, into the cores themselves. One is
kidding himself if he believes he can do a high performance and cost
effective design without expressly instantiating specific architectural
elements. The IO structures are no exception.

This is not to say that what Xilinx did with the EDK, separating out the
tristates, did not make sense. It did, when specific IO pin types and
IO structures did not need to be called out from the underlying core.
For example, many people new to FPGAs and the coreconnect bus
architecture have started by using the GPIO module to interface to
their own IP inside the chip. This is a good example of where one may
want to use the same IP for on chip or off chip interconnect, and as
such, not binding the IO pins to the core results in one rather than
two cores being created.



Regards,
Erik.

---
Erik Widding
President
Birger Engineering, Inc.

(mail) 100 Boylston St #1070; Boston, MA 02116
(voice) 617.695.9233
(fax) 617.695.9234
(web) http://www.birger.com
 
All,

Igor has his case now submitted, and it was escalated due to its nature
(basically saying "lines down" makes that happen).

As of 8:30 AM PST 2/22/2005 here in San Jose we have folks on it.

Thanks to all who posted. For those interested, I will probably post
the results here (if Igor agrees) as there seems to be some interest in
lot related failures.

Generally speaking, lot related failures are almost always design
related: either the lot silicon is a little faster, or a little slower
(but within spec's) than the previous lot, and an unconstrained timing
path doesn't work. Sometimes IOs are a little stronger, or a little
weaker, and that too is within spec but makes a difference in a design.

The fabric speed was the case with the customer who designed their own
FIFO (and didn't understand synchronization circuits) and had "lot"-related
problems.

In this case, we have just started, so Igor will learn fairly quickly
what the differences are between the lots, and we will help resolve what
the cause of the problem is, and provide solutions.

Thanks again to all who have interest in this sort of posting, as it
gives us a chance to educate folks on the services we offer (the
hotline), the escalation procedures for hot cases (lines down), and the
nature of this particular kind of problem, and the types of likely
resolutions we often find.

In no way am I implying that Igor has a funny path in his design: I am
only suggesting that this is often our experience. Rarely (VERY RARELY)
do we have lot quality problems, test escapes, etc. that all manufacturers
occasionally have when something doesn't go right in the test group. Of
course, each time that happens, it is cause for reviews of quality and
procedures so we never make that mistake again!

So to all of you who think you might have a lot quality problem, again,
that is so rare that I only mention it here to be accurate and honest.

Often mentioning something in the news group is like describing a new
rare illness to a hypochondriac, suddenly everyone thinks they are sick
with the new rare disease!

(In which case it isn't rare anymore....)

Austin
 
John_H wrote:

Gaussian jitter is a statistical number. If the peak-to-peak is specified
at 6-sigma (which it often is) the probability is 0.00034% that either
jitter value is at its peak. The probability that *both* values are at
their peaks is below 0.000000000012%
Hmm. That's every five minutes for a 400MHz clock?
So if I use this for my timing budget the chip might fail every five
minutes?

Kolja Sulimma
 
AL wrote:

When I put a constant number into the register and read it
back, it works, but when I have that number changed depending
on an if else statement, it doesn't work anymore.

For example, in the following code:

always @(posedge CLK_IN) begin
  if (RESET) begin
    num = 20+1;
  end
  else begin
    num = 1+1;
  end
end

It would give me 00010011 or 21 even though the
RESET signal has changed.
This is a synchronous reset, which says to set num to 21 if
RESET is high on a rising edge of CLK_IN. If RESET is low on a
rising edge it sets num to 2.

In logic terms, RESET is the enable for a FF, and CLK_IN is the
clock.

(Also, RESET is usually used to describe setting to zero.)

-- glen
 
Stephen Williams wrote:
Unfortunately, our engineer did exactly that: a DCM (used to be a
chain of DCMs) multiplies a PCI 33MHz clock up to 100MHz and sends
it off the chip. The return 100MHz clock is connected to another
DCM which is used as an internal 100MHz clock phased with the SDRAMS.
Another thing to watch out for in certain cascaded and
external DCM constructs, where absolute delay from input_dcm_1
to output_dcm_2 is important, is the intentionally early
( ~1.5ns in V2 ) DCM output clock arrival time when using the
default SYSTEM_SYNCHRONOUS mode of the DCM.

DCM in->out delays were modeled oddly in older versions of the SW,
but I haven't checked how 6.3 handles it- see the Answer Records
listed at the bottom of this old post:

http://groups-beta.google.com/group/comp.arch.fpga/msg/7f691bfe47996336

Brian
 
Kolja,

Exactly. Often folks do not realize that 12, or even 14 sigma is
required if you wish to be 'error free'.

The great news is that the difference between real jitter in physical
systems, and a theoretical gaussian distribution is that in the real
world, we do not have infinite energy.

Real world systems begin to have real physical limits that prevent you
from having to worry about 14 sigma cases.

But, you should worry out to 12 sigma (or be able to tolerate occasional
errors).

Often you hear people say that "jitter is unbounded" which is not
exactly true. What is true is that the longer you measure, the more
jitter you get - up to a point (the peak to peak grows increasingly
slowly as the number of samples increases, so at some point, it isn't
worth the time to wait, as you will only get 10 more ps p-p by waiting
another day!).

"Tail fitting" fits the tails of a gaussian curve (right and left) to
the measured histogram, which is useful for not waiting around forever,
and getting your 12 to 14 sigma results. Or you can add another 10%
after 2 million samples, and be very very close to the "right" answer
(and a little conservative).

We only measure jitter in this way (tail fitting), because all other
methods have the problem that you detailed in your posting: don't you
have to sample (wait) longer to know the 'truth'?

The other point is that in digital systems, it is not the positive
excursion (the slowest or longest period) that gets you in trouble; it is
the shortest (or fastest).

Take the peak to peak jitter, divide it by two, and take that away from
the clock period to find your worst case min clock period. That is the
constraint you need to have slack for, not the clock period itself.
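As a quick worked example of that budget arithmetic (numbers invented):

```python
# Worst-case minimum clock period: subtract half the peak-to-peak
# jitter from the nominal period, then constrain against that.
clock_period_ns = 10.0   # 100 MHz nominal clock (made-up figure)
jitter_pp_ns = 0.3       # measured p-p jitter (made-up figure)

worst_case_min_period_ns = clock_period_ns - jitter_pp_ns / 2
print(worst_case_min_period_ns)  # 9.85
```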

http://tinyurl.com/3jfq6

Austin


Kolja Sulimma wrote:
John_H wrote:

Gaussian jitter is a statistical number. If the peak-to-peak is
specified
at 6-sigma (which it often is) the probability is 0.00034% that either
jitter value is at its peak. The probability that *both* values are at
their peaks is below 0.000000000012%

Hmm. That's every five minutes for a 400MHz clock?
So if I use this for my timing budget the chip might fail every five
minutes?

Kolja Sulimma
 
Hi Glen, I still don't understand what the problem is though. So when I changed RESET, shouldn't the register be changing as well???? Right now it's not doing that, I kept getting 21 out for some reason. If I change always @(posedge CLK_IN) to always @(CLK_IN), then it works, do you know why???? Thanks, Ann
 
AL wrote:

Hi Glen, I still don't understand what the problem is though.
So when I changed RESET, shouldn't the register be changing
as well???? Right now it's not doing that, I kept getting 21
out for some reason. If I change always @(posedge CLK_IN) to
always @(CLK_IN), then it works, do you know why???? Thanks,
always @(CLK_IN) begin

means to do those operations on either edge of the clock.
As far as I know, real flip-flops don't do that, and the
synthesis programs won't compile it, but it will simulate.

No, it does not sense a change in RESET. An asynchronous
reset would be

always @(posedge CLK_IN or posedge RESET) begin
if(RESET) cnt <= 0;
else cnt <= cnt+1;
end

Well, that is a counter with asynchronous reset, but you can
figure out the rest from there. In this case, if RESET goes
high, or if RESET is high while CLK_IN goes high it sets cnt
to zero, which is the normal property of asynchronous reset.
(Except that it doesn't model metastability, but then simulation
normally doesn't.)

(It should work with either blocking or non-blocking assignment,
but it is a little nicer with non-blocking.)

This is similar to what a 74LS74 flip-flop does, and what most
FPGA's do.

-- glen
 
Hi Zerang Shah, But the FPGA communicates with JTAG somehow, and JTAG is connected to the parallel port of the PC, so can't we use that connection to write code to get the PC to display something after some kind of event? Thanks, AL
 
