EDK : FSL macros defined by Xilinx are wrong

Is this PCI card plugged into the same system that you are running the
download program on?

If so... Don't do that.

The card will kill the bus during a program cycle, and take the host
machine down.
 
This conference looks suspicious. How can one manage so many
conferences at once?

Since the automated paper generator was in the news last year, some
conferences are discovered and identified as "fake conferences". Check
out http://fakeconferences.org/. Don't get fooled by conferences that
have no scientific value!

-- S
 
Thanks for that nice tidbit. There has been a dream of reconfigurable
computing, using FPGAs as computing engines, for quite some time, and
it seems to have fostered this newsgroup some time ago. It would be
interesting to dig out the archives and see what the topics for this
group were for the first year. I have to admit that I'm a newcomer, only
taking up the dream some five or so years back, but very hopeful at this
point.

"At first, dreams seem impossible, then improbable, and eventually
inevitable." Christopher Reeve

Writing an example PCI bus interface this week in FpgaC has been
something of an eye opener for me ... feels just like writing device
drivers again :) Diddling (creating) the "hardware registers" for the
PCI bus interface in FpgaC really isn't that different than many other
low level device drivers I've written for 30 years. I'm pretty certain
as we train other device driver and embedded guys in this, they will
feel the same way about C and FPGAs.

My target is 64-bit at 66 MHz for the XCV2000E-8's on the Dini DN2K
boards I have. First try with ISE PAR was pretty horrible, as even with
high effort selected, it was stumbling again with horrible placement
choices with less than 1% of the device in use - not even 33 MHz
performance. Hacking on Mike Dini's fpga-f ucf file to get the pin
assignments to match his board and going through route/place again last
night, missed the 66 MHz timing budget by about 2% on the second try.

Placement for some variables used at the top of the pci process created
some longer combinatorial chains than I had expected. Placing them at
the bottom of the function will remove the deep combinatorials, and
leave that path much shorter from the registered version of the
variables.

I spent some time in FPGA-Editor again last night, just to discover the
pin locking in the ucf file brought the logic down near the bottom of
the chip and near the pads, but ISE 6.1i par failed horribly to do a
best effort set of assignments WITH NOTHING in the way. Clearly there is
excessive routing incurred due to poor placement. You would expect that
it would at least put the LUTs driving the pads in the CLB nearest the
IOB, not a half dozen away, crossing others. And then place the read
LUT and FF for the PAD into the same CLB, along with the associated
access logic, not scattered half way across the chip.

FPGA-Editor really is the schematic design tool of choice for Xilinx
FPGAs, except that it will not let you just use a slice that doesn't
already have a connection. Is there some way around this, other than
creating a dummy netlist to put something in each of the slices you
want to use?

Colin Paul Gloster wrote:
Nial Stewart replied:
This newsgroup and FPGAs were around long before some numpty at Google
decided what their description should be. [..]

This has nothing to do with Google. Check your newsgroups file, the
description of this group is "Field Programmable Gate Array
based computing systems." See
WWW.SLRN.org/manual/slrn-manual-6.html#ss6.117
or
HTTP://Quimby.Gnus.org/gnus/manual/gnus_358.html#SEC358
or something similar.
 
Placement for some variables used at the top of the pci process created
some longer combinatorial chains than I had expected. Placing them at
the bottom of the function will remove the deep combinatorials, and
leave that path much shorter from the registered version of the
variables.

You've had to understand the target architecture and what's causing your
timing constraint to fail, then re-jig your HDL to reduce the number of
levels of logic to achieve timing closure.

I thought that one of the arguments for using a C based HDL was you can avoid
this level of design implementation detail?

(Serious question, I'm not being facetious).



Nial.
 
fpga_toys@yahoo.com wrote:
[snip]
Most of the early computers I worked with in Dr. Dicky's lab with
tubes, were bit serial executing directly off multi-channel read/write
from the drums (head per track, and several concurrent channels). It
would literally be fetching instruction, data, operating on the data,
and writing all at the same time. Track down the schematics for the
Bendix G15 - I saw them on the web somewhere last fall. It's a tube
based drum computer, but was simple, and they made a lot of them.
There is probably some software kicking around for it as well.
http://members.iinet.net.au/~dgreen/
The Australian Computer Museum (WA Branch) is rebuilding one.
[snip]

Being really retro would be taking a handful of floppy drives apart
for their heads, collecting some 1" mag tape and glue to a cylinder,
and building a drum.
Or salvage some hard drives (the older the better: bigger tracks, larger
bit sizes, generally less precision required).
Mount the spin motor on a base, spin it up, then use a crystal
oscillator & dividers to get a 3-phase supply.
Fit up a single platter on the motor, with some additional weight (ie
unused platters) as a flywheel, unless you have a truly sinusoidal
3-phase supply (the torque will flutter otherwise).
Mount the heads on a fixed platform around the periphery, as head per
track. You need a few tens of mA to drive those heads.
You will need some arrangement to lift the heads off the disk when it
stops, else the combined stiction of all those heads will see it doesn't
start again. Best if you do this by flexing the spring mounting (a
fraction of a mm is enough), rather than bodily moving the head mounts;
that will upset registration.
Or maybe machine up a nice drum, and have it
nickel plated, for the real thing :)

It's not too hard (if you have access to a lathe) to make up the drum, &
there's usually a chrome plater around who will nickel it (that's a
prerequisite for chrome plating, anyway). The hard part is making the heads.

(Yes, I have done the above things...)
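
The "crystal oscillator & dividers" step above can be sketched in software. This is a minimal model, assuming a 3-stage Johnson (twisted-ring) counter is used as the final divider; that choice and the names below are assumptions for illustration, not necessarily what was actually built. It divides its clock by six and taps three square waves 120 degrees apart, which is also why the flywheel advice matters: the drive is six-step, not sinusoidal.

```python
def johnson_three_phase(steps):
    """Simulate a 3-stage Johnson counter and tap three phases 120 degrees apart.

    Returns a list of (phase_a, phase_b, phase_c) tuples, one per clock.
    """
    q = [0, 0, 0]  # the three flip-flops, all cleared at start
    out = []
    for _ in range(steps):
        # Shift by one stage, feeding the inverted last stage back to the first.
        q = [1 - q[2], q[0], q[1]]
        # Taps: A = Q0, B = Q2, C = NOT Q1 (each rises two clocks after the previous).
        out.append((q[0], q[2], 1 - q[1]))
    return out


if __name__ == "__main__":
    for a, b, c in johnson_three_phase(6):
        print(a, b, c)
```

One full electrical cycle takes six counter clocks, so for an electrical drive frequency f the counter would need to be clocked at 6 x f, with the rest of the division from the crystal done ahead of it.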
 
Can you post the error, so we can take a look ?

Aurash

nezhate wrote:
Hi all, I want to use a small circuit (written in Verilog and
designed using ISE 3) in another project using ISE 8.1. The problem is
that under ISE 3 the circuit worked perfectly, and under ISE 8.1 there is
an error. Why does this occur?
 
If you want to have a tiny and very fast solution you can use the OCM
interface and directly connect logic resources to it.

UltraController-II should give you a good starting point to accomplish
this. See XAPP575 for more information
(http://www.xilinx.com/bvdocs/appnotes/xapp575.pdf).

- Peter


munch wrote:
Hi,
I was wondering which would be the best way to get the PPC on a Virtex-II
Pro to control the 4 inputs on a LUT. So far I have tried using
GPIO/IPIF (using the Xilinx EDK Create/Import Peripheral dialogue) to
control registers over the PLB or OPB bus, but how would I instantiate a
Xilinx primitive element, let's say an SRL16, and connect 4 of the register
values to its inputs and one to its output?

Is this approach even possible? Has anyone tried driving the inputs on
internal FPGA elements?

Any ideas or help would be greatly appreciated.
Thanks,
shane
 
zhangweidai@gmail.com wrote:
I have a Spartan 3 starter kit. I'm going to build an expansion board
that will have some more components on it. Does anyone know where I can
get some examples, or guides to do this? I know I'm going to use a
standard 2x20 right male connector, and I know the pin functions, but an
example board would be most helpful. I also want to attach some SMA
connectors on it to get test points.

The plan might be: build the PCB layout, have an expert review the design,
produce Gerbers, and send them out to be fabricated. Someone recommended 33each.com.

pz
Hmm... 2x20.. Sounds like the connector that Digilent uses on its
boards. They also sell accessory boards, which seemed to be reasonably
priced. They won't have SMA connectors, though. Even if you don't buy
from them, you may get some ideas from looking at their site
(www.digilentinc.com). And no, I don't work for them, but I have
bought some boards from them.

-Dave Pollum
 
Nial Stewart wrote:
Placement for some variables used at the top of the pci process created
some longer combinatorial chains than I had expected. Placing them at
the bottom of the function will remove the deep combinatorials, and
leave that path much shorter from the registered version of the
variables.

You've had to understand the target architecture and what's causing your
timing constraint to fail, then re-jig your HDL to reduce the number of
levels of logic to achieve timing closure.

I thought that one of the arguments for using a C based HDL was you can avoid
this level of design implementation detail?
Not at all. Programmers juggle instruction/statement flow all the time
to reach timing closure in C and asm for device drivers and embedded
applications in many fields .... such as software driven stepper motor
controls. Such low level programmers also have training in
understanding bit level device interfaces, and related timing. They
frequently do not have board level hardware training or experience, to
deal with power, board level signal integrity, and related issues.

The reason for C based HDL/HLL's is to expand the field to include
related systems and embedded programming talent, so that they can
effectively, and comfortably, write "programs" for HDL/HLL FPGA based
designs which instantiate the same hardware that
VHDL/Verilog would with similar syntax statements. We do this by
preserving the sequential semantics of standards based C while executing
statements in parallel, and by removing fine grained access to creating
the multiple fine grained clocking semantics that are standard for pure
HLLs. For those that are doing device level or machine level
interfaces, writing C FSMs to netlists is substantially the same as
coding asm or C for similar low level hardware interfaces. It is remarkably
the same task and skill set as writing drivers or embedded hardware
controls.

Programs are FSMs and data paths.
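
As a rough illustration of "FSMs and data paths", here is a minimal sketch written in Python rather than FpgaC. The state names, the `frame` and `hit` inputs, and the `step` function are all hypothetical and are not taken from the PCI core discussed in this thread. The only point is that one call to `step` plays the role of one clock edge: a next-state function over the registered state, plus a small data path (a word counter) updated alongside it.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    DECODE = 1
    TRANSFER = 2


def step(state, count, frame, hit):
    """One clock edge: compute next (state, count) from registers and inputs."""
    if state is State.IDLE:
        # A new transaction starts when frame is asserted; clear the counter.
        return (State.DECODE, 0) if frame else (State.IDLE, count)
    if state is State.DECODE:
        # Claim the transaction only if the (hypothetical) address decode hit.
        return (State.TRANSFER, 0) if hit else (State.IDLE, count)
    # TRANSFER: count one word per clock; the clock where frame drops is the last word.
    if frame:
        return (State.TRANSFER, count + 1)
    return (State.IDLE, count + 1)


if __name__ == "__main__":
    state, count = State.IDLE, 0
    for frame, hit in [(1, 0), (1, 1), (1, 1), (1, 1), (0, 1)]:
        state, count = step(state, count, frame, hit)
        print(state.name, count)
```

Written this way, every branch reads only the registered `state` and `count` plus the current inputs, which is the shape a device driver programmer would recognise from polling a hardware status register.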
 
Not sure if this is relevant to your problem anymore, but I found the
problem with my design (briefly described above). There is only one
signal that can be assigned in the dsbram_if_cntlr, that is
BRAMDSOCMCLK. We were running our PPCs at 200MHz and our PLB bus at
100MHz using proc_clk_s and sys_clk_s, respectively. It was such a
habit to assign all non-processor clks to sys_clk_s, that we did so for
BRAMDSOCMCLK. After reading the data sheet for dsbram_if_cntlr, we
found that the BRAMDSOCMCLK signal needed to be 1-4X the processor clk.
The slow clock we gave to the BRAM caused our unpredictable behavior.
If anything, I have learned to read those data sheets a little better.
This may or may not be the same as your problem, Jeff, but thought I
would follow up with the fix we came up with for our system. We are
running a producer/consumer type system as well and haven't had any
problem with inconsistencies. The shared BRAM is a circular FIFO in
our system. I could provide details on its operation if you still have
trouble with your system.
 
Nial Stewart wrote:
You've had to understand the target architecture and what's causing your
timing constraint to fail, then re-jig your HDL to reduce the number of
levels of logic to achieve timing closure.
Looking at the problem a little more this afternoon, the C based 66 MHz
PCI core is looking much more viable. The combinatorial length was
actually from unnecessarily including the main references to the pci
bus signals in the else side of the reset conditional. Breaking that
dependency changed the base FSM speed from 63 MHz to better than 73 MHz,
making it likely the other setup and hold times can be met as well. I
suspect the remaining code to be written will not change this, and I
can refine what is there a bit better, possibly improving this so my
other board with -6 parts can be used at 66 MHz as well. Probably by
hacking the ucf file some more to optimize placement for the bus
interface slices to use the local IOB to CLB routing, and get rid of
the longer general routing paths ISE is using, since routing delays
appear to be better than 60% of the timing budget.

Timing summary:
---------------

Timing errors: 0 Score: 0

Constraints cover 1569 paths, 0 nets, and 1215 connections

Design statistics:
Minimum period: 13.556ns (Maximum frequency: 73.768MHz)
Maximum path delay from/to any node: 13.556ns
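
The figures in this report can be cross-checked by hand. A small sketch, assuming a 15 ns clock budget for 66 MHz PCI (treat that number as an assumption and substitute the period from your own timing constraint); the function names are illustrative only:

```python
def max_frequency_mhz(min_period_ns):
    """Convert a minimum clock period in ns to a maximum frequency in MHz."""
    return 1000.0 / min_period_ns


def period_ns(frequency_mhz):
    """Convert a clock frequency in MHz to a period in ns."""
    return 1000.0 / frequency_mhz


def slack_ns(min_period_ns, budget_ns):
    """Positive slack means the design fits inside the clock budget."""
    return budget_ns - min_period_ns


if __name__ == "__main__":
    reported_period_ns = 13.556  # "Minimum period" from the report above
    budget_ns = 15.0             # assumed 66 MHz PCI clock budget

    print(round(max_frequency_mhz(reported_period_ns), 3))    # 73.768
    print(round(slack_ns(reported_period_ns, budget_ns), 3))  # 1.444
    # The earlier 63 MHz result, for comparison: negative slack, so it missed.
    print(round(slack_ns(period_ns(63.0), budget_ns), 3))     # -0.873
```

The first number reproduces the 73.768 MHz in the report; the last shows why the 63 MHz version fell short of the same budget.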
 
Michael Hennebry wrote:
fpga_toys@yahoo.com wrote:


Being really retro would be taking a handful of floppy drives apart
for their heads, collecting some 1" mag tape and glue to a cylinder,
and building a drum. Or maybe machine up a nice drum, and have it
nickel plated, for the real thing :)


Being really retro would be building an ABC computer.

Has anyone since Babbage tried to build a Babbage Engine?
I had a working model of the Babbage computer somewhere around 1972 made
of plastic. As I recall, it took a fair bit of sanding and lubricating
to make it work smoothly. I don't recall who made the kit, or even
where I got it.
 
clarence_grapes@hotmail.com wrote:
This conference looks suspicious. How can one manage so many
conferences at once?
I think that one answer is to accept most, if not all, papers, charge
high registration fees, and hope that the people who attend
don't care about the quality of the papers, as long as the food
is good.

A glance at the proceedings of past conferences suggests that
they are not taken seriously by people at "elite" computer science
departments. Note the almost total absence of authors from top-ranked
schools. They presumably know that their resume would be damaged by
publishing in these conferences.


Since the automated paper generator was in the news last year, some
conferences are discovered and identified as "fake conferences". Check
out http://fakeconferences.org/. Don't get fooled by conferences that
have no scientific value!

-- S
 
Thanks for the reply. Yes, I've looked at them, but they don't share PCB
layout files. I would like examples that I can use for things like
dimensions and component placement. Maybe something that
utilizes different layers.
 
You've had to understand the target architecture and what's causing your
timing constraint to fail, then re-jig your HDL to reduce the number of
levels of logic to achieve timing closure.

Not at all. Programmers juggle instruction/statement flow all the time
to reach timing closure in C and asm for device drivers and embedded
applications in many fields
But you're not just juggling lines of code about so the order of
execution is different (ie to make sure things are picked up
quickly enough in an ISR or whatever).


Looking at the problem a little more this afternoon, the C based 66 MHz
PCI core is looking much more viable. The combinatorial length was
actually from unnecessarily including the main references to the pci
bus signals in the else side of the reset conditional. Breaking that
dependency changed the base FSM speed from 63 MHz to better than 73 MHz,
making it likely the other setup and hold times can be met as well.
I still think this is an accurate observation...

"You've had to understand the target architecture and what's causing your
timing constraint to fail, then re-jig your HDL to reduce the number of
levels of logic to achieve timing closure."


Nial.
 
Symon wrote:
It's a shame that this kind of thing tends not to appear from the FPGA manufacturers.

Yes, one would certainly expect to see some IBIS or SPICE
simulations provided along with XAPP774.

Unlike the current crop of app notes and marketing fluff, the
original Virtex-E LVDS app notes weren't afraid to plot LVDS
waveforms at points other than only the on-die receiver input of
a perfectly back terminated line. ( Speaking here of the general
purpose LVDS I/O app notes; the Rocket I/O stuff is generally
better done. )

For a real chuckle, read :

XAPP756 Transmitting DDR Data Between LVDS and RocketIO CML Devices

Surely you'd see plenty of reflections with a 60 ps rise time
CML driver whacking a 10 pF LVDS input pin, right?

But not when the only waveform plotted is a worst-case loss situation,
showing only the receiver inputs, after driving four feet of FR4...

p.s. And ta for the reference[4]!

I liked that "c{r}appy LVDS" pun.

Brian
 
On 01 Mar 2006 15:37:30 -0800, Eric Smith <eric@brouhaha.com> wrote:

+<"nezhate" <mazouz.nezhate@gmail.com> writes:
+<> Hi all, I want to use a small circuit (written in Verilog and
+<> designed using ISE 3) in another project using ISE 8.1. The problem is
+<> that under ISE 3 the circuit worked perfectly, and under ISE 8.1 there is
+<> an error. Why does this occur?
+
+<Probably because you're not rubbing together a regurgitative purwell and
+<a supramitive wennelsprock.
+
+<You might get better results after reading:
+
+< http://www.catb.org/~esr/faqs/smart-questions.html
*****

Oh no the net police!!!

james
 
Hi Joseph,

Hmmmn, let me try that. Nothing else has seemed to work so far, aside from
stripping out everything in the system except for the processors and OCM
which obviously is no "solution". That just happened to have "magic"
placement and no problems. I have my system set up the same way you had
yours originally (with the dsbram_if_cntlr at 100 MHz and the PowerPC at
300 MHz). I'll let you know what happens next week. We've got a big
project deadline in the next few days that's keeping me busy, and, aside
from this OCM issue, the system is performing great. We optimized our
software enough to not need the second processor at the moment, so I can
safely ignore this for a little while. :)

Thanks,

Jeff


"Joseph" <joeylrios@gmail.com> wrote in message
news:1141335040.252179.271460@t39g2000cwt.googlegroups.com...
Not sure if this is relevant to your problem anymore, but I found the
problem with my design (briefly described above). There is only one
signal that can be assigned in the dsbram_if_cntlr, that is
BRAMDSOCMCLK. We were running our PPCs at 200MHz and our PLB bus at
100MHz using proc_clk_s and sys_clk_s, respectively. It was such a
habit to assign all non-processor clks to sys_clk_s, that we did so for
BRAMDSOCMCLK. After reading the data sheet for dsbram_if_cntlr, we
found that the BRAMDSOCMCLK signal needed to be 1-4X the processor clk.
The slow clock we gave to the BRAM caused our unpredictable behavior.
If anything, I have learned to read those data sheets a little better.
This may or may not be the same as your problem, Jeff, but thought I
would follow up with the fix we came up with for our system. We are
running a producer/consumer type system as well and haven't had any
problem with inconsistencies. The shared BRAM is a circular FIFO in
our system. I could provide details on its operation if you still have
trouble with your system.
 
zhangweidai@gmail.com wrote:
Thanks for the reply. Yes, I've looked at them, but they don't share PCB
layout files. I would like examples that I can use for things like
dimensions and component placement. Maybe something that
utilizes different layers.
You sound vague about what you want the expansion board to do. I know
I have the equivalent of "writer's block" when starting a new design.
I often start with a vague idea of what I want, and it takes me a while
to decide what I really want. I sketch a lot of block diagrams, most
of which I toss. It's tempting to add feature after feature, but start
with simple designs at first. Be prepared to make mistakes. Hopefully
you'll learn from those mistakes. The CAD package I use is EAGLE. The
version I bought has a board size limit of 100mm x 160mm, which is
roughly 4" x 6". 4x5 or 4x6 is also a good board size if you use the
$33/bd company, and other PCB companies. When it comes to placing
parts, look at your block diagram and schematic diagram. Parts that
are nearby on the schematic should be nearby on the board. Connectors
should be near the board's edge, etc. You can make paper cutouts of
the parts, which you place on a piece of paper that represents your
board. You can then shuffle parts around until you are satisfied with
the placement. Again, looking at web sites that sell board-level
products (e.g. Digilent), should give you some ideas on placing parts.


HTH
-Dave Pollum
 
Hi (again!)

As we're still looking at this device on quite a low level (we're trying
to look at implementing a model of neurons in the brain on the device,
and in particular the connectivity) we've come across another problem in
our understanding...

When looking at the 'Hierarchical Routing Resources', the paragraph
states "... a number of resources counted between any two adjacent
switch matrix rows or columns.". Therefore, are we right in thinking
that in our device (which has 56x48 CLBs) for the '40 horizontal double
lines', from each row we can send out 40 connections (giving a total of
40x48 double connections), or is there something we've missed (as
depending on which way we look at it, it's either *loads* of
connections, or *very* few connections!). The same goes for the long
lines and hex lines etc.

Again, any pointers to documentation would be appreciated, or if someone
has the time to type a concise answer all the better (the reason we're
posting is because we can't find anything helpful, so we're hoping to
learn from your experience, rather than just leeching off you!) :)

Thanks again
Chris
 
