EDK : FSL macros defined by Xilinx are wrong

Kolja Sulimma · Apr 21, 2006

Austin Lesea wrote:

Since we wrote this, IEEE owns the copyrights, and we can no longer
distribute the paper.

Not true. The IEEE copyright policy states that

"Upon transferring copyright to IEEE, authors and/or their companies have the right to post their IEEE-copyrighted material on their own servers without
permission, provided that the server displays a prominent notice alerting readers to their obligations with respect to copyrighted material and that the posted
work includes an IEEE copyright notice."

As Xilinx employees are authors and co-authors of quite a number of papers, a collection - maybe on the XUP homepage - would be nice.

Kolja Sulimma

Robin Bruce · Apr 21, 2006

OK, now I'm confused...

If, apropos of nothing, Peter says the FIFO depths in the Virtex-4
devices are exact powers of 2, I'd be inclined to believe him.

If I factored in that he designed the "crucial asynchronous empty
arbitration logic" for said FIFOs, any remaining doubts would surely
vanish.

If however, I woke up in a sweat in the night, suddenly unsure of the
depth of Virtex-4 FIFOs, I'd probably go and read the User Guide before
posting anything here...

Cheers,
Robin

Benjamin Todd · Apr 21, 2006

Ah, I'd be careful probing the configuration signals while the device is
programming. Make sure you have the scope set up so that the impedance of the
probe is not going to unduly affect the quality of the signals on the board.

From what you say, it sounds like the probe termination is interfering with
the signals you're scoping. It can happen if you have slow probes, i.e. old
stuff, or if you have the termination of the scope set to something
untoward (50 ohms would probably kill your config). Anyway, even if it
looks great on the scope, you might be creating standing waves further down
the line.

HTH
Ben

<nithin.pal@gmail.com> wrote in message
news:1129738420.486603.327530@o13g2000cwo.googlegroups.com...

Hello,

I am a newbie to the group and to FPGAs in general. We have a board with
an XC3S1500 FPGA on it. We use master serial mode to configure the
FPGA from a Platform Flash PROM, with the FPGA providing CCLK
(config rate: 50). For debug reasons we also have the config signals
going to a POD header.

We see that sometimes, when we probe CCLK on the POD header, the FPGA
fails to configure. On the scope, we could see the sequence of INIT
going high, CCLK going active (clocking) and DONE being low; after some
time, INIT goes high and CCLK stops clocking, but the DONE pin does
not go high. We have a 330 ohm pull-up on the DONE pin.

If I remove the probe from the CCLK pin on the POD header, the board
starts up with a successful configuration.

I was wondering if anybody in the group has seen such behaviour before,
or maybe the experts in the group could guide me on the possible
issues involved.

Also, I have sometimes seen that as INIT goes low, CCLK goes
high but does not begin to clock. In this case, a configuration
failure is obvious. So, is this a problem with the chip not being
consistent?

Thanks a lot in advance

Regards
Nithin

vssumesh · Apr 21, 2006

Somebody please advise me on this issue, as I am still wondering about
what to do.

Mike Lewis · Apr 21, 2006

"Lionel Damez" <damez@lasmea.univ-bpclermont.fr> wrote in message
news:ee910b9.1@webx.sUN8CHnE...

Mike Treseler wrote :

Try a simpler case first. Maybe four microblazes?

I have already tested simpler designs with fewer MicroBlazes, and place and
route was successful with each of them.

What I want to know is how many processors I can put in a Virtex-4 device.
I hope to put 32 in the largest devices.

Trying with 16 processors, I encountered this place and route problem.

The par tool outputs "CHANGE PLACEMENT or EASE CONSTRAINTS".

Is it possible to do that with the EDK interface, or do I have to export
my design to Project Navigator (ISE)?

Thanks, Lionel Damez

The tool is telling you that 16 won't fit ... I doubt there is much you can
do other than reduce the number of instances of the processor.

Mike

Apr 21, 2006

Michael Schuster wrote:

news.hinet.net wrote:

Yes, it comes with a Linux and Solaris CD too

Not mine, only device files for all platforms (or I did not see it,
anyhow ...)

Michael
--
Remove the sport from my address to obtain email
www.enertex.de - Innovative Systemlösungen der Energie- und Elektrotechnik

For the record, mine sadly came with only win32 binaries. Can anyone
at Xilinx explain why this is? Is it just to save the cost of a few
CDs? Are Linux users still considered second-class users? I have to
stoop to running it on VirtualPC on my Mac. Not exactly speedy.

Other than the above problem, I find the Spartan-3 starter kit to be a
tremendous value. Thanks to Xilinx for providing some really powerful
tools and a decent hardware platform for next to nothing.

cheers,
aaron

fpgabuilder · Apr 21, 2006

It's interesting to learn about disabling the unused inputs - how
are the inputs disabled? I am assuming a FET. In that case, wouldn't
there still be some leakage, especially in a 90nm process?

Austin Lesea · Apr 21, 2006

Kolja,

Interesting. I knew I could distribute it internally, but from what you
say, it appears I may also publicly post it on the Xilinx web site
(external world)? That doesn't seem right to me...

Austin

Kolja Sulimma wrote:

Austin Lesea wrote:

Since we wrote this, IEEE owns the copyrights, and we can no longer
distribute the paper.

Not true. The IEEE copyright policy states that

"Upon transferring copyright to IEEE, authors and/or their companies have the right to post their IEEE-copyrighted material on their own servers without
permission, provided that the server displays a prominent notice alerting readers to their obligations with respect to copyrighted material and that the posted
work includes an IEEE copyright notice."

As Xilinx employees are authors and co-authors of quite a number of papers, a collection - maybe on the XUP homepage - would be nice.

Kolja Sulimma

Kolja Sulimma · Apr 21, 2006

Mike Lewis wrote:

Lionel Damez wrote:
Is it possible to do that with the EDK interface, or do I have to export
my design to Project Navigator (ISE)?
The tool is telling you that 16 won't fit ... I doubt there is much you can
do other than reduce the number of instances of the processor.

Mike

- The logic for 16 MicroBlazes fits.
- A single MicroBlaze is routable.
- The communication between the processors is systolic.

From that information I would say that floorplanning is very
likely to yield a routable design.
This means that you tell the placer beforehand in which
region of the chip it should put each processor and the corresponding memory.

You can not do that from EDK AFAIK.
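
For anyone who wants to try this outside EDK, here is a rough sketch of how the area constraints could be generated with a small Python script and pasted into the UCF. It is only an illustration: the instance names (mb_0 ... mb_15), the 4x4 tiling and the 64x128 slice grid are made-up assumptions, not values for any particular Virtex-4 part, and the block RAMs of each processor would need their own ranges, which are not shown. The tiles are walked in serpentine order so that consecutive processors in the systolic chain end up in neighbouring regions.

```python
def area_groups(instances, cols, rows, grid_w, grid_h):
    """Return UCF lines placing each instance in its own rectangular slice region.

    instances: hypothetical instance names, in systolic (chain) order.
    cols, rows: how the chip is tiled; grid_w, grid_h: assumed slice grid size.
    """
    if len(instances) > cols * rows:
        raise ValueError("more instances than tiles")
    tile_w, tile_h = grid_w // cols, grid_h // rows
    lines = []
    for i, inst in enumerate(instances):
        row, col = divmod(i, cols)
        if row % 2 == 1:
            # Serpentine order: odd rows run right to left, so that
            # neighbours in the chain are also neighbours on the chip.
            col = cols - 1 - col
        x0, y0 = col * tile_w, row * tile_h
        x1, y1 = x0 + tile_w - 1, y0 + tile_h - 1
        group = f"AG_{inst}"
        lines.append(f'INST "{inst}" AREA_GROUP = "{group}";')
        lines.append(
            f'AREA_GROUP "{group}" RANGE = SLICE_X{x0}Y{y0}:SLICE_X{x1}Y{y1};'
        )
    return lines


if __name__ == "__main__":
    names = [f"mb_{i}" for i in range(16)]  # hypothetical instance names
    for line in area_groups(names, cols=4, rows=4, grid_w=64, grid_h=128):
        print(line)
```

The real slice coordinates have to come from the device view in the floorplanner; the script only saves typing once the tiling is decided.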

Kolja Sulimma

David Hand · Apr 21, 2006

My experience is that the Platform USB cable is actually 2 or 3 times
slower than the Parallel Cable IV. Xilinx said it was because they had
not yet optimized the protocols. You might want to try the parallel
cable instead.

Waage wrote:

Thanks. When I posted I was pretty steamed and was probably venting more
than looking for help. But here are some of the details.
I should have been more specific: it is the Virtex-4 LX60 Eval Board.

I don't actually know where the issue(s) is/are. It could be in one of
at least three places.

1. A board level issue.
2. The Platform Cable USB used to interface to JTAG
3. Xilinx's Impact software.

I have been attempting basic communication with the board. Nothing fancy.
I tried using Impact's "Initialize Chain" command to see if the software
could find and recognize the devices (PROM and Virtex-4) on the JTAG chain.

I get the following:

Identifying chain contents ....read count != nBytes, rc = 20000015.
read failed 20000015.
'1': : Manufacturer's ID =Unknown
INFO:iMPACT:501 - '1': Added Device UNKNOWN successfully.
----------------------------------------------------------------------
----------------------------------------------------------------------
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.
'2': : Manufacturer's ID =Unknown
INFO:iMPACT:501 - '1': Added Device UNKNOWN successfully.
----------------------------------------------------------------------
----------------------------------------------------------------------
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.
'3': : Manufacturer's ID =Unknown
INFO:iMPACT:501 - '1': Added Device UNKNOWN successfully.

...
And it goes on, telling me that there's a huge number of UNKNOWN devices
on the board.

I have also tried checking the ID_REGISTER values with Impact and got
the following.

// *** BATCH CMD : ReadIdcode -p 2
read count != nBytes, rc = 20000015.
read failed 20000015.
ERROR:iMPACT:583 - '2': The idcode read from the device does not match
the idcode in the bsdl File.
INFO:iMPACT:1578 - '2': Device IDCODE :
00000000111111000000001110010000
INFO:iMPACT:1579 - '2': Expected IDCODE:
00000001011010110100000010010011
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.
write cmdbuffer failed 20000015.

...

Which looks to me like the communication is just garbage.
I have triple-checked, and the board's jumpers are all positioned
correctly for JTAG.
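
As a sanity check on the two IDCODE values in the log above, the 32-bit word can be split into the standard IEEE 1149.1 fields. The sketch below is my own helper (the function name and dictionary keys are arbitrary, nothing from iMPACT). It assumes the binary strings are printed MSB first, as they appear in the log.

```python
def decode_idcode(bits: str) -> dict:
    """Split a 32-bit JTAG IDCODE (binary string, MSB first) into its fields."""
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError("expected a 32-character binary string")
    value = int(bits, 2)
    return {
        "hex": f"0x{value:08X}",
        "version": (value >> 28) & 0xF,          # bits 31..28
        "part_number": (value >> 12) & 0xFFFF,   # bits 27..12
        "manufacturer": (value >> 1) & 0x7FF,    # bits 11..1, JEDEC code
        "lsb_is_one": bool(value & 1),           # bit 0 must be 1 for an IDCODE
    }


# Values copied from the iMPACT log above.
expected = decode_idcode("00000001011010110100000010010011")
observed = decode_idcode("00000000111111000000001110010000")

if __name__ == "__main__":
    print("expected:", expected)
    print("observed:", observed)
```

The expected value decodes to 0x016B4093 with manufacturer code 0x049, which is the JEDEC code for Xilinx. The value actually read back, 0x00FC0390, has bit 0 cleared, which a valid IDCODE never has, so it is consistent with the data being corrupted somewhere between the cable and the chain rather than with a wrong device being present.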

I have also opened up a Web Case with Xilinx, but am still waiting to
get some more information back.

If anyone has some insight I'd be very thankful to get some feedback.

Thanks!

Dave Pollum · Apr 21, 2006

Jim Granville wrote:

luc wrote:
Jim,

If I'm not mistaken, the datasheet of the 4000Z provides both static
power, and a graph showing the dynamic power requirements.
There is even a complete technical note about the power coefficients.

Yes, but I was referring to the Web hoopla,
[ There are NO relative claims in the Data sheets ]
see
http://www.latticesemi.com/products/cpldspld/ispmach4000z.cfm

and not a squeak on mA/MHz ?

[Aug-27-04 is also getting a tad long in the tooth ? ]

ISTR a quick compare did confirm that mA/MHz number was not so great....

-jg

Take another look at that web page. In the "Family Member Selector
Guide" table the 6th column is labeled "Typical Standby Current (A)".
I'm guessing they meant uA, not Amps!

-Dave

Albert Chang · Apr 21, 2006

Hi Kedar,

To generate a post-synthesis netlist along with an SDO timing file, use
the following commands:

quartus_map <project name> -c <revision>
quartus_tan <project name> -c <revision> --post_map --zero_ic_delays
quartus_eda <project name> -c <revision> --simulation --tool=<toolname> --format=verilog

When you generate the VO/VHO and SDO file using the quartus_eda
executable, you will receive the following warning message:

Warning: Standard Delay Output File (.sdo) contains estimated delays --
run Fitter first to annotate SDF Output File with exact delays

After you perform your post-synthesis simulations, Altera recommends
that you complete a full compilation and regenerate the VO/VHO and SDO
files to include exact delays of your design for gate-level timing
simulations. A full compilation in the Quartus II software includes
synthesis, placement and route.

For more information, please refer to the following solution
http://www.altera.com/support/kdb/2005/10/rd10192005_405.html

If you do not want to wait for a full place-and-route, you can use the
Early Timing Estimator, in the Quartus II software, to perform a
preliminary place-and-route in a fraction of the time of a full
place-and-route. After the Early Timing Estimator, you can generate a
simulation netlist with the early timing estimates.

Albert Chang
Senior Applications Engineer
Altera Corporation

kedarpapte@gmail.com wrote:

Hi Morpheus And Ben

Thanks for your reply, but my confusion is:
1. In the Quartus tool, Full Compilation means complete implementation, and
that generates both .vho and .sdo files for me after fitting and place and
route.

2. But is there any option to generate a back-annotated post-synthesis
.vhd or .vho file with a .sdo file after gate-level synthesis?

So when simulating with the post-synthesis .sdo file, can we get only gate
delays in simulation and not the net or routing (wire) delays?

thanks
regards
Kedar

Waage · Apr 21, 2006

Antti,

Thanks for all your input.

I have done as you suspected, and I have downloaded an evaluation
version of ChipScope.
And now I have one more data point. ChipScope Pro cannot find the JTAG
chain, but it can find the USB box. It reports that a connection has been
made to Platform USB and then reports...

ERROR: Failed detecting JTAG chain.

So, at least at this point, I am VERY suspicious of a communication issue
from USB to JTAG.
Does anyone know if there's some way to snoop the communication inside
the Platform Cable?

Apr 21, 2006

Subhasri krishnan wrote:

Hi all,
I have this XESS board which has a tool to initialise an SDRAM with
data. I have 16-bit numbers that I want to load into the SDRAM. The
tool needs .hex/.mcs format and I read that I need promgen to
convert from .bit to .hex.

My question is how to convert from this .dat file (which is my input
data file) to the .bit file. Does anyone know of such a utility?
Otherwise, can someone explain what the .bit file contains?

Thanks

PROMGEN is only used to convert Xilinx bitstreams into a .hex/.mcs
format that can be loaded into a flash device. Since you have data
(and not a bitstream), it would take less effort to convert it directly
to .hex/.mcs rather than trying to convert it to the .bit format and
then use PROMGEN. Here is the spec for the .hex/.mcs file format:
http://www.xess.com/faq/intelhex.pdf.

If your data file is "large" (i.e. more than 64K), then you will have
to insert page address records into the .hex/.mcs file since its
standard data record is limited to a 16-bit address. It may be easier
to convert your data to the .xes format that is also supported by the
GXSLOAD/XSLOAD tool. Here is a description of the .xes format:

The .xes file formats are simple. Each line is a data record. Each data
record is structured as follows:

- An initial character indicates the length of the starting address for
the data:
'-' indicates a 16-bit address is used
'=' indicates a 24-bit address is used
'+' indicates a 32-bit address is used

- Next, a two-digit hexadecimal number indicates the number of bytes in
the data record, N.

- Next, the starting address for the data is given as a 16, 24 or 32-bit
hexadecimal number.

- The remainder of the record is composed of N two-digit hexadecimal
numbers for the data.

- There is no checksum.

Here are some example data records in the XES-16, XES-24 and XES-32 file
formats:

- 10 0000 83 2C 4F 88 F2 2B B3 39 7E 1F 15 63 46 5E FB 89
= 10 000010 C4 A5 C4 C7 D2 26 A0 50 58 EA 85 66 9B C9 EE DE
+ 10 00000020 DD AC C2 94 63 5B 33 D3 6A 76 FA 20 36 F5 BC 68
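
Since the format is this simple, a few lines of script are enough to produce it. The following Python sketch writes records in the layout shown in the examples above. The function names are my own, and two things are assumptions you should check against your board: that each 16-bit number is stored high byte first, and that the record address counts bytes.

```python
def xes_record(address: int, data: bytes, addr_bits: int = 24) -> str:
    """Format one .xes data record: prefix, byte count, address, data bytes."""
    prefix = {16: "-", 24: "=", 32: "+"}[addr_bits]
    if not 0 < len(data) <= 0xFF:
        raise ValueError("a record holds 1 to 255 bytes (two hex digits)")
    if address < 0 or address >> addr_bits:
        raise ValueError("address does not fit in the chosen width")
    digits = addr_bits // 4
    payload = " ".join(f"{b:02X}" for b in data)
    return f"{prefix} {len(data):02X} {address:0{digits}X} {payload}"


def words_to_xes(words, start=0, addr_bits=24, per_record=16, big_endian=True):
    """Convert 16-bit words to a list of .xes records.

    Byte order and byte addressing are assumptions, not part of the format
    description above.
    """
    order = "big" if big_endian else "little"
    raw = b"".join(w.to_bytes(2, order) for w in words)
    return [
        xes_record(start + off, raw[off:off + per_record], addr_bits)
        for off in range(0, len(raw), per_record)
    ]


if __name__ == "__main__":
    for line in words_to_xes([0x832C, 0x4F88, 0xF22B], addr_bits=16):
        print(line)
```

Reading the .dat file into the list of words is left out, because its layout was not given in the question.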

Marc Randolph · Apr 21, 2006

Pasacco wrote:

dear all

I have a question on how to get the maximum clock frequency of real
hardware. I am using XST and ISE 6.3.

Consider the following data, obtained from RTL synthesis in XST:

-------
Minimum period: 6.608ns (Maximum Frequency: 151.332MHz)
Minimum input arrival time before clock: 4.990ns
Maximum output required time after clock: 3.442ns
Maximum combinational path delay: No path found
-------

The problem is that this information is just an estimate. So I am
trying to get the information after place and route.

What I am doing is putting the following constraint in the UCF file:

-----
TIMESPEC "TS_clk" = PERIOD "clk" 6.608ns HIGH 50 %;
-----

Is this the right way to get the maximum frequency?

Close. Before the above line, you probably need to add

NET "clk" TNM_NET = "clk";
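
For reference, the period and frequency figures in the XST report are just reciprocals of each other, and the two UCF lines go together. A small Python sketch (the helper names are mine; the net name "clk" and the 6.608 ns value are taken from the post above):

```python
def period_ns_to_mhz(period_ns: float) -> float:
    """Convert a clock period in ns to a frequency in MHz (f = 1000 / T)."""
    return 1000.0 / period_ns


def ucf_period_constraint(net: str, period_ns: float) -> list:
    """Return the TNM_NET line followed by the TIMESPEC line for one clock net."""
    return [
        f'NET "{net}" TNM_NET = "{net}";',
        f'TIMESPEC "TS_{net}" = PERIOD "{net}" {period_ns}ns HIGH 50 %;',
    ]


if __name__ == "__main__":
    print(f"{period_ns_to_mhz(6.608):.3f} MHz")
    for line in ucf_period_constraint("clk", 6.608):
        print(line)
```

The frequency that matters is the one the post-place-and-route timing report says was achieved against this constraint, not the synthesis estimate.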

Have fun,

Marc

gallen · Apr 21, 2006

I understand the basic idea. I can see how it solves a lot of problems
because the time between cycles for an individual thread is long enough
that you don't have to deal with forwarding or hazards or branch
prediction or anything like that. Each thread is something of a
multicycle architecture.

Unfortunately it seems that a multi-threaded architecture definitely
needs a new programming paradigm. I don't think your standard C
program would map well onto that. (If you were running 4 C programs,
however, I could see it working quite well). But I suppose that is a
different sort of problem to face.

Thanks for the info. I may very well look into an architecture like
this at some point.

-Arlen

Kolja Sulimma · Apr 21, 2006

fpgabuilder wrote:

It's interesting to learn about disabling the unused inputs - how
are the inputs disabled? I am assuming a FET. In that case, wouldn't
there still be some leakage, especially in a 90nm process?

As far as gate leakage is concerned, you worry about a few million SRAM bits
in the FPGA but not about a few hundred inputs. After all, that's only nA
per transistor. (BTW: pulling the inputs to 0 does not help with gate leakage either.
There are as many input transistors connected to GND as there are connected to VCC.)

But it is important that the input gates are not switching because of the random
input voltage. Disabling the input gates (e.g. by using a NAND instead of an
inverter) reliably fixes that.

Kolja Sulimma

JJ · Apr 21, 2006

gallen wrote:

I understand the basic idea. I can see how it solves a lot of problems
because the time between cycles for an individual thread is long enough
that you don't have to deal with forwarding or hazards or branch
prediction or anything like that. Each thread is something of a
multicycle architecture.

Exactly, we do this all the time in DSP to break dependencies.

Unfortunately it seems that a multi-threaded architecture definitely
needs a new programming paradigm. I don't think your standard C
program would map well onto that. (If you were running 4 C programs,
however, I could see it working quite well). But I suppose that is a
different sort of problem to face.

Typically if a processor is already running some sort of OS with time
sharing of processes, then having to deal with 4 HW threads is not a
big deal except that the threads run at 1/t of clock. But if many of
these PEs are available in each MMU cluster then that is 4N threads. It
gets much more interesting when the MMU introduces its own OS memory
management issues and the language of choice looks like a hybrid of
C/C++ with occam and Verilog.

C gives us structs with data members, usually manipulated by any old
functions, with no special logical structure at all.

C/C++ gives us classes to add member functions to member data for
object oriented programming but no concurrency or liveness.

V++ (in development) gives us a process, which looks just like a class
with an added port list and body code that can instance other process
objects ala Verilog.

// monospace

process pname1 (
    in .., out ...,     // just like Verilog port list, event driven
    ints etc ) {        // data ports not event driven

    data members;       // just like C vars in struct
    function members;   // just like C++ class methods

    wires ...;          // just like Verilog

    process body code;  // just like Verilog module body

    l1: pname2(.. );    // just like Verilog instance of another process/module
    l2: pname3(.. );    // labels are used to name instances in the hierarchy

    assign ...;         // just like Verilog continuous assigns
    always { ;;; }      // just like Verilog event driven parallel logic
}                       // usually endmodule

Now a process hierarchy combines C++ class OO structure with an event
driven HDL like structure with some help from processor to support many
threads or processes etc. 1) Data, 2) Objects, 3) Processes.

Thanks for the info. I may very well look into an architecture like
this at some point.

-Arlen

regards

John

transputer guy

Antti Lukats · Apr 21, 2006

"raph" <raphael.ponsard@ac-grenoble.fr> schrieb im Newsbeitrag
news:435ca17a$0$20868$636a55ce@news.free.fr...

Hi Folks,

For educational purposes (Von Neumann demo), I am looking for a System On
Chip Processor:
- very simple
- gate level design (no VHDL, only gates AND,NAND, and low level boxes as
shifter, ...)
- running system (spartan 3 or other low cost xilinx or altera demo board)

regards
Raph

There is no such thing, not directly.

Some OOOLD design may have been done at the level you wish, but none of
those designs has been ported to any modern FPGA such as the S-3.

Which means you are stuck with using VHDL code - BTW Xilinx PicoBlaze is pretty
low level already, while still being in VHDL.

And... a shifter is not generically a low-level box. Low-level primitives
are LUT and FF; pretty much everything else is higher level already. Xilinx
SRL16 is a low-level shifter, but it is Xilinx-only. There is no vendor-neutral
low-level box such as a shifter (shift register).

So if you need basic gates only, then you need to translate some HDL code
into the primitives...

antti

JJ · Apr 21, 2006

I won't even try to understand that.

Do you know about the Verilog reduction operators?

I believe that for a wire w of any width, ^w gives the total xor over
all bits, same for most bitwise operators. Kind of APLish, can be very
powerful, should synthesize.
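
To make the semantics concrete without a simulator, here is a small Python model of what the unary reduction operators compute on a w-bit value. It only mirrors the behaviour of ^w, &w and |w for plain 0/1 bits (no x or z states), and the function names are mine.

```python
def reduce_xor(w: int, width: int) -> int:
    """Model of Verilog ^w: XOR of all bits, i.e. 1 if an odd number are set."""
    w &= (1 << width) - 1
    result = 0
    for i in range(width):
        result ^= (w >> i) & 1
    return result


def reduce_and(w: int, width: int) -> int:
    """Model of Verilog &w: 1 only if every bit is 1."""
    mask = (1 << width) - 1
    return int((w & mask) == mask)


def reduce_or(w: int, width: int) -> int:
    """Model of Verilog |w: 1 if any bit is 1."""
    return int((w & ((1 << width) - 1)) != 0)


if __name__ == "__main__":
    print(reduce_xor(0b1011, 4), reduce_and(0b1111, 4), reduce_or(0b0000, 4))
```

In Verilog itself the whole loop collapses to a single operator in front of the wire name, which is the point of the suggestion.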

Read the language book again.

And please everyone, I is spelt uppercase, and my English grades were
not that good, even I can manage that.

John
