EDK : FSL macros defined by Xilinx are wrong

Apr 21, 2006

Gabor wrote:

Browsing through the "about" links I found this brief note
on Usenet posting:

http://groups.google.com/support/bin/static.py?page=basics.html#flamed

Actually, more readers need to learn more about trolls, and just not
feed them.

http://kb.iu.edu/data/afhc.html
http://en.wikipedia.org/wiki/Internet_troll

Apr 21, 2006

fpga_toys@yahoo.com wrote:

Absolutely true ... but not the whole thread, or even very many of the
authors in the thread ... the brash branding of the entire thread, even

For those that missed it, there is a long wonderful thread in the
middle of the junk regarding EE's and PE's by people like Ray that are
always generally respected in this forum. The brash branding of the
entire thread is knee jerk - ignore the trolls, follow the content.

Austin Lesea · Apr 21, 2006

toys,

Our FPGAs do not thermally "run away."

Yes, you can melt solder, with enough power.

But, we also provide the tools (web/spreadsheet/xpower) to predict that
the power might be excessive. That is engineering.

There is no way to 'derate' a design, until we know the entire design
(or at least enough of it to run the web/spreadsheet power tools)

Sorry, there is just no other way to deal with the 200,000+ seats of
software, and the 20,000+ designs happening at any given moment.

It is not an ASIC nor an ASSP that does one thing, one way.

It is a programmable device that can do just about anything from
applications just below 1 watt, to applications just above 25 or 30 watts.

Why don't you try using the power prediction tool, and get a feel for
what you can, and can not do?

How would you put that in a data sheet?

The recommended use of the tools is already in the users guides.

Austin

Peter Alfke · Apr 21, 2006

Ray Andraka wrote:

Yeah, I should have changed the subject in my first reply.

I will do everything I can to keep this newsgroup focused on FPGAs. It
is already a wide span from naive newbie questions and homework
assignments to spicy comments about data sheet content. But that is
what we can and should handle in this newsgroup.

I issued my warning when I saw that mountain of crud descending on us:
off-topic, xenophobic, rambling like a drunken sailor, and often
downright stupid. The fact that our beloved and respected Ray Andraka
inserted some pearls of wisdom does not change the general view. The
thread started with a provocative question, and then degraded into
rambling chaos.
I have watched other newsgroups fall into a similar trap, and I suggest
we band together to avoid this here. The subject is FPGAs, and we
should not get dragged into semi-intelligent (often un-informed and
even stupid) discussions about completely different subjects. There are
thousands of other newsgroups that do ventilate all sorts of ideas.
We have avoided spam, we should also be able to avoid an invasion like
the one I objected to.
Peter Alfke

Georg Acher · Apr 21, 2006

fpga_toys@yahoo.com writes:

And yes, I've since cooked a couple XCV800's, so I have the sense now
to check FPGA's when testing a new application or demo. When RC cards
do not even have the temp diode monitor IC connected to the FPGA, it's
difficult for an RC programmer even to be able to check the chip temp
when it's stuffed in a box, and automatically shut the app down or
throttle back the clocks.

But you can measure the core supply current. It should be easy enough to
do some correlation between current and temperature, at least for a user
warning.
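
A minimal sketch of that correlation, assuming the usual steady-state
model Tj = Ta + theta_JA * P with P taken as core current times core
voltage. The function names, thresholds and example values are
hypothetical and only illustrate the idea; real figures would have to
come from the data sheet and from measurements on the actual board.

```python
# Hypothetical sketch: estimate junction temperature from measured core
# supply current and turn it into a user warning. All numbers are
# placeholders, not data sheet values.

def estimate_junction_temp_c(core_current_a, core_voltage_v,
                             theta_ja_c_per_w, ambient_c):
    """Estimate junction temperature (deg C) from core supply current."""
    power_w = core_current_a * core_voltage_v  # core power only, I/O ignored
    return ambient_c + theta_ja_c_per_w * power_w


def thermal_action(tj_c, warn_c=85.0, shutdown_c=100.0):
    """Map an estimated junction temperature to a simple action."""
    if tj_c >= shutdown_c:
        return "shutdown"
    if tj_c >= warn_c:
        return "throttle"
    return "ok"


if __name__ == "__main__":
    for amps in (4.0, 5.0, 6.0):
        tj = estimate_junction_temp_c(amps, 2.5, 5.0, 25.0)
        print(amps, "A ->", tj, "C ->", thermal_action(tj))
```

This only covers core power and assumes the board has settled
thermally; a short burst of high current would be overestimated.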

--
Georg Acher, acher@in.tum.de
http://www.lrr.in.tum.de/~acher
"Oh no, not again !" The bowl of petunias

John_H · Apr 21, 2006

"Peter Alfke" <peter@xilinx.com> wrote in message
news:1138123958.101066.307150@g43g2000cwa.googlegroups.com...

<snip>

I issued my warning when I saw that mountain of crud descending on us:
off-topic, xenophobic, rambling like a drunken sailor, and often
downright stupid. The fact that our beloved and respected Ray Andraka
inserted some pearls of wisdom does not change the general view. The
thread started with a provocative question, and then degraded into
rambling chaos.
I have watched other newsgroups fall into a similar trap, and I suggest
we band together to avoid this here. The subject is FPGAs, and we
should not get dragged into semi-intelligent (often un-informed and
even stupid) discussions about completely different subjects. There are
thousands of other newsgroups that do ventilate all sorts of ideas.
We have avoided spam, we should also be able to avoid an invasion like
the one I objected to.
Peter Alfke

I started watching sci.electronics.design since analog and RF stuff tends to
show up there, not here. It was sad to see just how many "OT:" posts there
are and how miserable they get. Politics, bigotry, and general downright
ugliness. The only reason we got the inappropriate thread - as far as I can
tell - is because the original poster cross-posted to three groups. The
scum-factor in sci.electronics.design spilled over full-force into our
otherwise helpful usenet forum.

I don't even bother looking for the pearls of wisdom in anything with "OT:"
in the header unless the rest of the subject catches my attention. There
are better places for those rants but some people feel "obliged" to "sling
poo" through the internet where it shouldn't be slung.

- John_H

John_H · Apr 21, 2006

<fpga_toys@yahoo.com> wrote in message
news:1138123483.242619.84800@g49g2000cwa.googlegroups.com...
<snip>

I haven't tried hard yet, and easily got "close" with several large
pipelined demos. I am concerned about what someone really concerned
about performance may fairly easily do. And in particular what does
that translate into regarding design envelopes for those that make PCI
add in FPGA accelerator boards, and what should the customer clearly be
aware of and have disclosed for selection guidelines in Reconfigurable
Computing (RC) platforms.
<snip>

For anyone using a PCI add-in FPGA, the FPGA still needs to be configured.
If they're using Xilinx P&R tools, they're responsible for substantial
engineering aspects of the design. The board manufacturer should have
specific published design limits for the board. Armed with the
junction-to-ambient thermal resistance of the realized design and the
engineering tools provided with the Xilinx toolset, proper engineering
analysis can be done.
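
As a rough illustration of that analysis, the sketch below turns a
junction-to-ambient thermal resistance into a power budget and checks a
predicted power figure against it. It assumes the simple steady-state
relation P_max = (Tj_max - Ta) / theta_JA; the function names and the
example numbers are hypothetical, and the predicted power would come
from the vendor's power tools rather than from this code.

```python
# Hypothetical sketch: power budget from junction-to-ambient thermal
# resistance. Values are placeholders, not board or data sheet limits.

def max_power_w(tj_max_c, ambient_c, theta_ja_c_per_w):
    """Largest dissipation that keeps the junction at or below tj_max_c."""
    return (tj_max_c - ambient_c) / theta_ja_c_per_w


def fits_thermal_budget(predicted_power_w, tj_max_c, ambient_c,
                        theta_ja_c_per_w):
    """True if the predicted design power stays within the budget."""
    return predicted_power_w <= max_power_w(tj_max_c, ambient_c,
                                            theta_ja_c_per_w)


if __name__ == "__main__":
    budget = max_power_w(85.0, 45.0, 4.0)
    print("budget:", budget, "W")
    print("8 W design fits:", fits_thermal_budget(8.0, 85.0, 45.0, 4.0))
    print("12 W design fits:", fits_thermal_budget(12.0, 85.0, 45.0, 4.0))
```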

If the RC is provided through alternative tools, those tools should be able
to deal with the design limits of the device; in this case the person
reconfiguring the PCI add-in board certainly won't be able to draw useful
information from the data sheet and may not know which data sheet values
even apply to them. They're dealing with the board, not the part. The
board data is what guides them.

Austin Lesea · Apr 21, 2006

toys,

A comment about modeling the FPGA and package:

Look up Howard Johnson's presentations. Ball/pad inductance is useless
(last year's solution). I am surprised you bring it up. HJ proves that
it is the loops and their topology that matter. And without a 3D field
solver, you have no hope of learning anything at all.

Except if you use our FPGAs: we do the hard stuff so you don't have to
(e.g ParseChevron (tm), patents pending).

The number of planes, and bumps/balls for Vcc's and ground ensure that
the voltage drops are kept within the required limits, even for your 40
watt design. It is up to you to design the PDS for your board, and we
have applications notes to help you in that regard.

Comments we have received include folks who now tell us that the
SparseChevron packages(tm) are superior to IBM's "cross" design.

I'm happy to just be in the same league.

http://www.xilinx.com/products/silicon_solutions/fpgas/virtex/virtex4/advantages/sipi.htm

http://www.xilinx.com/bvdocs/appnotes/xapp623.pdf

Austin

Austin Lesea · Apr 21, 2006

Oops,

SparseChevron(tm).

Austin

Apr 21, 2006

Ray Andraka wrote:

So my point is, the FPGA vendors give you the information they can about
power dissipation. They can't know your design to provide you better
numbers without you actually simulating the post PAR design with vectors
that accurately reflect the actual usage. For a general purpose board,
the board designer can limit the FPGA dissipation to whatever is safe
for the board cooling environment by using thermal diodes or power
supply current limits to avoid damage to the FPGA or board, and he can
do that without knowing anything about your design. Isn't that a better
tree to bark up?

So, my point is, and nobody seems to disagree, that it's unrealistic to
assume that the devices can be 100% packed, use the marketing numbers
for system design clock speeds, at a modest toggle rate, and not blow
right thru the power the device can handle. Disagree?

If not, then the device HAS TO BE DERATED from marketing numbers for RC
use. Disagree?

Apr 21, 2006

Austin Lesea wrote:

Look up Howard Johnson's presentations. Ball/pad inductance is useless
(last year's solution). I am surprised you bring it up. HJ proves that
it is the loops and their topology that matter. And without a 3D field
solver, you have no hope of learning anything at all.

I've watched his presentation ... and it explained the horrible
problems I had with Virtex and Virtex-II packages when pushing the
power envelope.

The number of planes, and bumps/balls for Vcc's and ground ensure that
the voltage drops are kept within the required limits, even for your 40
watt design. It is up to you to design the PDS for your board, and we
have applications notes to help you in that regard.

The point of the 40 W design is that it's derated. Double the clock
rate, or better, to get the best possible device performance per P&R
timings, with a netlist that is busier than 25%, and you quickly hit the
power limits of even an active cooler. The data sheets imply clock rates
that would easily produce designs well over 100 W in this package ...
and that is where I start to seriously worry about the cross section of
copper/lead in the package.
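
A back-of-the-envelope sketch of that scaling, assuming the power is
dominated by dynamic power and that dynamic power grows linearly with
both clock rate and toggle rate at a fixed supply voltage (the usual
P ~ activity * C * V^2 * f approximation). Static power is ignored, the
25% baseline toggle rate is a reading of the post rather than a stated
figure, and the function name is hypothetical.

```python
# Hypothetical sketch: scale a known design power to a new clock rate and
# toggle rate, assuming purely dynamic power at constant voltage.

def scale_dynamic_power(base_power_w, clock_ratio, base_toggle, new_toggle):
    """Scale power linearly with clock ratio and toggle-rate ratio."""
    return base_power_w * clock_ratio * (new_toggle / base_toggle)


if __name__ == "__main__":
    # 40 W baseline, clock doubled, toggle rate unchanged
    print(scale_dynamic_power(40.0, 2.0, 0.25, 0.25))
    # 40 W baseline, clock doubled, toggle rate doubled as well
    print(scale_dynamic_power(40.0, 2.0, 0.25, 0.5))
```

Under these assumptions, doubling the clock alone already doubles the
40 W figure, and any increase in toggle rate multiplies on top of that.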

The worry is because none of the data needed to know where the limits
are is specified ... i.e. max currents for gnd and each power group, and
what those current profiles look like in rise times.

That provokes the question of whether derating is necessary, if so how
much, and can we easily get numbers to calculate that? Not currently.

John

Peter Alfke · Apr 21, 2006

I suggest we wait for the original poster to clarify.
I think there is no real need for a dual-edge triggered flip-flop.
Definitely not at the low frequencies mentioned.
There is also a simple circuit that XOR differentiates the clock, thus
generating a clock pulse at both the rising and the falling edge. (See,
among others, at "Six Easy Pieces" in TechXclusives)
Peter Alfke

Jerry Coffin · Apr 21, 2006

fpga_toys@yahoo.com wrote:

[ ... ]

So, my point is, and nobody seems to disagree, that it's unrealistic to
assume that the devices can be 100% packed, use the marketing numbers
for system design clock speeds, at a modest toggle rate, and not blow
right thru the power the device can handle. Disagree?

If not, then the device HAS TO BE DERATED from marketing numbers for RC
use. Disagree?

Looking in from the sidelines, it seems to me that quite a bit of this
conversation is taking place more or less at cross-purposes.

First of all, I think "derating" is a poor term -- though I tend to
agree that they might be able to provide more useful numbers. One
possibility might be to more or less directly specify the heat output
from the chip (as a whole) per million (or whatever) transitions per
second. This might give a better idea about trade-offs between faster
clocks vs. more gates. Unfortunately, it has a substantial problem
(that's been alluded to elsethread): it's basically dealing with the
power consumed by logic, not by routing, so in any given design it
might be off by a fairly large factor. I can believe that it could be
reasonably useful for things like product selection though -- if you're
planning to encrypt at a rate of X gigabytes per second (for example)
it's fairly easy to figure a rough idea of the number of bit
transitions involved and see if you're at least in the right ballpark.
This wouldn't tell you that a design _will_ work, but it'd at least let
you separate things that stand a reasonable chance from those that
don't.
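
A rough sketch of how such a figure might be used for product
selection, assuming a hypothetical data sheet number expressed as watts
per million transitions per second. The function names, the number of
internal transitions per data bit and the watts figure are all invented
for illustration; as noted above, routing power is not captured, so the
result is a ballpark at best.

```python
# Hypothetical sketch: ballpark logic power from a "watts per million
# transitions per second" figure. Every number here is a placeholder.

def transitions_per_second(bytes_per_second, internal_transitions_per_bit):
    """Rough count of bit transitions per second for a data stream."""
    return bytes_per_second * 8 * internal_transitions_per_bit


def logic_power_w(watts_per_million_tps, transitions_per_sec):
    """Logic-only power estimate; routing power is not included."""
    return watts_per_million_tps * transitions_per_sec / 1e6


if __name__ == "__main__":
    tps = transitions_per_second(10**9, 100)  # 1 GB/s, 100 toggles per bit
    print("transitions/s:", tps)
    print("estimated logic power (W):", logic_power_w(1e-5, tps))
```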

Second, in terms of providing a general-purpose computing resource, I
don't think anything Xilinx (or anybody else) can provide in a
datasheet is going to mean a whole lot. If you're providing a product
for end users (instead of engineers) you need to make it foolproof.
Nothing in the datasheet is going to

Whether Xilinx should provide a circuit like that on-chip (e.g. like
most CPUs now have) is open to some question -- it would likely add a
more or less fixed amount to the product price. An amount that would
hardly be noticeable in a big Virtex would be utterly prohibitive on a
small Spartan. Perhaps this would be a reasonable feature to add on the
next generation of Virtex chips though...

--
Later,
Jerry.

yyqonline · Apr 21, 2006

Thanks a lot for replying.
\quote
There is also a simple circuit that XOR differentiates the clock, thus
generating a clock pulse at both the rising and the falling edge. (See,
among others, at "Six Easy Pieces" in TechXclusives)
\quote
I have found the circuit; thanks for the information.
I am checking the stability of this circuit.
If this circuit is reliable, I think this may be a good idea.

Philip Freidin · Apr 21, 2006

On Mon, 23 Jan 2006 09:43:02 +0100, "Frank Schreiber" <frankschr@googlemail.com> wrote:

Dear all,
I am using Virtex 4 from Xilinx, and I really missed the clock for LVDS.
So, should I transfer data to LVDS on each posedge of the clock?
The clock should be LVDS clock, LTTL clock or any clock is possible.
Many thanks
Frank

I don't understand you Frank! Multiple times others have
explained to you that if you don't give sufficient information,
it is IMPOSSIBLE to answer your questions.

Give the following information, and maybe you can be helped:

1) Which exact Xilinx part number are you using?

2) What EXACT device (part number) are you connecting it to?

3) How many wires total have you connected between these chips,
(LVDS should be 2 wires per signal, 8 data + clock would total
to 18 wires)

4) Bonus info would be the manufacturer of the board (or if it
is your own design, some more details of the design and the
purpose), the clock rate you are trying to use, whether the
data is single or double data rate.

Philip Freidin
Fliptronics

Peter Alfke · Apr 21, 2006

Unless you compare two related FPGA devices, gate count will be rather
meaningless, since it is interpreted differently by the ASIC and FPGA
communities.
I would elaborate on:
Flip-flop count,
I/O count (including required standards, including bidirectional LVDS),
gigabit serial I/O,
on-chip memory (width and depth),
multipliers/accumulators,
potential need for an on-chip microcontroller.
Those six items should quantify almost any design.
Peter Alfke, Xilinx

Jim Granville · Apr 21, 2006

Peter Alfke wrote:

The circuit is reliable, although the generated pulse width is
determined by gate delays. But it is self-compensating, since the clock
pulse will not end until the flip-flop has toggled.
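
A small behavioral sketch of the self-compensating behaviour described
here. It does not reproduce the "Six Easy Pieces" schematic; it assumes
only what the thread states: the pulse is the XOR of the incoming clock
and the output of a flip-flop that toggles on each pulse, so the pulse
ends once the flip-flop has toggled. Time is in arbitrary steps, and the
function name and the ff_delay parameter are hypothetical.

```python
# Hypothetical behavioral model: pulse = clk XOR q, where q is a toggle
# flip-flop clocked by the pulse. Assumes ff_delay < half_period.

def simulate_xor_clock_doubler(half_period=10, ff_delay=2, cycles=4):
    """Return (rising edge times of the pulse, width of each pulse)."""
    q = 0               # toggle flip-flop output
    prev_pulse = 0
    toggle_at = None    # time at which the pending toggle of q takes effect
    start = None        # start time of the pulse currently in progress
    rising_edges = []
    pulse_widths = []
    for t in range(2 * half_period * cycles):
        clk = (t // half_period) % 2
        if toggle_at is not None and t == toggle_at:
            q ^= 1      # flip-flop output changes after its delay
            toggle_at = None
        pulse = clk ^ q
        if pulse and not prev_pulse:
            rising_edges.append(t)
            toggle_at = t + ff_delay   # flip-flop is clocked by the pulse
            start = t
        if prev_pulse and not pulse:
            pulse_widths.append(t - start)
        prev_pulse = pulse
    return rising_edges, pulse_widths


if __name__ == "__main__":
    edges, widths = simulate_xor_clock_doubler()
    print("pulse rising edges:", edges)
    print("pulse widths:", widths)
```

In this model one pulse appears at every clock edge, rising or falling,
and each pulse lasts exactly as long as the flip-flop takes to toggle.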

It probably needs some care, to ensure CLK_min times are ok ?

eg if you drive a large clock tree, it would be better to not
use a local FF clk, and then buffer, but to buffer, and then
FF.CLK from the CLK tree output, with optional additional
delays, if you want even more margin.

It's kind of clever, if I am allowed to say so...
Yes, I recall a similar [XOR-Q] clock scheme many, many years ago, on
a circuit (from HP?) for Biphase decode.

-jg

Nitesh · Apr 21, 2006

I changed the front end core given by amirix for dma functionality.
For a dma transfer what is the destination address that I should
specify? How can I get this address.
I am sending data from the card to the host memory.
Nitesh

Apr 21, 2006

Ray Andraka wrote:

fpga_toys@yahoo.com wrote:
So, my point is, and nobody seems to disagree, that it's unrealistic to
assume that the devices can be 100% packed, use the marketing numbers
for system design clock speeds, at a modest toggle rate, and not blow
right thru the power the device can handle. Disagree?

You won't get the density and performance without handcrafting to meet
both the density and performance limits. The typical user is going to
run into place and route issues before he even gets close to the high
density, high clock rate corner. I don't care if it is an RC
application or not, you just don't get into that corner unless you do a
considerable amount of handcrafting on the design.

Thanks for making my point. The Xilinx product, chips + ISE, is unable
to route designs which have a high usage level, which I believe is
because it both lacks routing resources and P&R needs improvement. You
are probably one of the top quarter of a percent of engineers who have
the experience to beat P&R regularly ... but for the rest of us mortals
the product isn't usable as you get close to 100% packing.

For RC uses I have a laundry list of things that are wrong with P&R,
some of which are WHY you can not get high density designs routed with
ISE. P&R fails to pack FF's with the LUT that has its input term ...
choosing instead to use another LUT as a pass thru that is several CLB's
away. Given a netlist that is obviously a 6x15 mesh from the routing, it
tends to place the parts in an arc around the center of the chip instead
... and a number of other observations that say its costing algorithms
have a very different goal, and fail because of it for some designs.

The problem is that P&R is not optional, they will not release the doc
for an open source implementation which is turned to other applications,
like RC. So until P&R can automatically route the same dense designs you
pull off, I say the product chips+ISE isn't usable for dense designs.

The reason they get away with it today is that for hardware design
there is a VERY strong incentive to buy up ... purchase a larger device,
just to make sure that in the future changes will fit. So many designs
will always have the headroom, and pressure on Xilinx to improve P&R for
high density routing is relatively low, as few designs will cross above
95% use.

With RC there is a completely different goal, and that is to use the
entire chip, in fact, all of every chip in an FPGA processor array. High
density designs are the norm with RC, and half device designs will be
relatively rare.

Apr 21, 2006

fpga_toys@yahoo.com wrote:

The reason they get away with it today is that for hardware design
there is a VERY strong incentive to buy up ... purchase a larger device,
just to make sure that in the future changes will fit. So many designs
will always have the headroom, and pressure on Xilinx to improve P&R for
high density routing is relatively low, as few designs will cross above
95% use.

With RC there is a completely different goal, and that is to use the
entire chip, in fact, all of every chip in an FPGA processor array. High
density designs are the norm with RC, and half device designs will be
relatively rare.

Worse yet, RC will tend to use the largest chips available, or the
largest chip with a reasonable cost/performance. Buying up is not an
option.

In this arena, we are talking about fitting designs to a half million
or more on-chip resources (LUT's, FF's, MUX's, etc) -- and for multichip
RC platforms, target environments with easily 20M LUT's or more. This is
so far past hand routing, it's beyond even suggesting.

Automated tools are necessary to partition, route, place and optimize
these large multichip projects ... P&R isn't even close to the right
tool. Wanting a vendor to open up their tool chain to allow open source
P&R will at some point not be just a request or an option; it will
become mandatory for implementing dynamically loaded, incrementally
placed and routed designs with libraries. The vendors that recognize
that, and can produce large chips, will in the end own this high end
commodity market.

Being able to compile, load and go with 20M LUT netlists in a few
seconds is what is necessary ... five days of P&R is not an option.
