EDK : FSL macros defined by Xilinx are wrong

Martin Euredjian · Apr 21, 2006

The constraint guide indicates that the TIG constraint can be used in HDL
(Verilog in my current design). However, an attempt to use it produces the
following error:

ERROR:Xst:1582 - The constraint 'tig=' is not supported neither in BEGIN
MODEL/END section in the XCF file, nor in HDL code.

I have not been able to find further information on this error message or
issue on the Xilinx site. Does anyone know if TIG is truly supported in
HDL? I'd hate to place it in the UCF file, to me it feels much more
appropriate to have this constraint move with the HDL source.

The form I'm using is:

// synthesis attribute TIG of <net_name> is "";

Using:

// synthesis attribute TIG of <net_name> is "TRUE";

makes the error go away. Is there a way to verify that the net is being
ignored for timing purposes? The log says:

Set user-defined property "TIG = TRUE" for signal <signal_name>.

Being that the constraints guide does not list "TRUE" as a valid value I'd
like verification that the constraint is truly doing something useful.

Thanks,

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Martin Euredjian

To send private email:
0_0_0_0_@pacbell.net
where
"0_0_0_0_" = "martineu"

Uwe Bonnes · Apr 21, 2006

Peter Alfke <peter@xilinx.com> wrote:
: What do you mean by "context switch"?

Probably he meant the time needed for reprogramming...
--
Uwe Bonnes bon@elektron.ikp.physik.tu-darmstadt.de

Institut fuer Kernphysik Schlossgartenstrasse 9 64289 Darmstadt
--------- Tel. 06151 162516 -------- Fax. 06151 164321 ----------

Jan Panteltje · Apr 21, 2006

On a sunny day (Sun, 24 Aug 2003 18:33:25 GMT) it happened "michele bergo"
<michelebergo@libero.it> wrote in
<Vx72b.262458$Ny5.8069636@twister2.libero.it>:

I want to interface altera cyclone(3.3V) to the pc parallel port(5V). Is it
possible without using further external devices?

Thanks

I have noticed (by measuring a logic 1) that my PC parallel port (and it is
not a laptop) gives about 3.3 V.

Bob Perlman · Apr 21, 2006

On Mon, 25 Aug 2003 21:31:53 +0000 (UTC),
nweaver@ribbit.CS.Berkeley.EDU (Nicholas C. Weaver) wrote:

In article <uepkkvo2rn871sjbdij3ltjsjes72nmfjj@4ax.com>,
Bob Perlman <bobsrefusebin@hotmail.com> wrote:
On Mon, 25 Aug 2003 18:25:30 +0000 (UTC),
nweaver@ribbit.CS.Berkeley.EDU (Nicholas C. Weaver) wrote:

In article <3F4A4A96.18037049@xilinx.com>,
Peter Alfke <peter@xilinx.com> wrote:
Sounds reasonable to me...
Peter Alfke

Just realized. How does a LUT input react to a metastable input (to
do the voting circuit)?

Voting circuits don't work as a metastability cure. Imagine a
2-out-of-3 circuit that's looking at three flip-flops. If one FF is
HIGH, another is LOW, and the third is metastable, what's the output
of the voting circuit?

As long as the voting circuit output is consistently LOW or HIGH, it
doesn't really matter, at least in the context given. If the voting
circuit output is metastable, then that's problematic.

True on both counts. But how do you make the output stable? This
is the point at which folks have tried adding hysteresis, etc. And
we've all been down that road before, I think.
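
To make the 2-out-of-3 question concrete, here is a minimal sketch in Python using three-valued logic. The value "X" for an unresolved (metastable) flip-flop output and the helper names are assumptions made for illustration only; real metastability is an analog effect that this does not model.

```python
# Three-valued logic sketch: each signal is 0, 1, or "X" (unresolved).
# "X" and all function names here are illustrative assumptions.
X = "X"

def and3v(a, b):
    """AND that returns X only when the result really depends on an X input."""
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return X

def or3v(a, b):
    """OR that returns X only when the result really depends on an X input."""
    if a == 1 or b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return X

def majority(a, b, c):
    """2-out-of-3 vote: (a AND b) OR (a AND c) OR (b AND c)."""
    return or3v(or3v(and3v(a, b), and3v(a, c)), and3v(b, c))

# One FF HIGH, one LOW, one unresolved: the vote itself is unresolved.
print(majority(1, 0, X))
# Two FFs agree: the unresolved third one is outvoted.
print(majority(1, 1, X))
```

In this model the vote only masks the unresolved flip-flop when the other two already agree, which is the case the question above is not about.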

Bob Perlman
Cambrian Design Works

Peter Alfke · Apr 21, 2006

I have never seen strange levels or oscillations ( well, 25 years ago we
had TTL oscillations). Metastability just affects the delay on the Q output.
Your suggestion seems to be to do exactly what one "should not do,"
namely use multiple synchronization flip-flops with tiny input routing
delay differences, and then do majority voting on the three outputs.
Seems like a smart idea, because it does not waste extra latency.

Peter Alfke, Xilinx
======================
"Nicholas C. Weaver" wrote:

Just realized. How does a LUT input react to a metastable input (to
do the voting circuit)?
--
Nicholas C. Weaver nweaver@cs.berkeley.edu

Pablo Bleyer Kocik · Apr 21, 2006

"Jim Kearney" <replace.this.with.my.forename@jkearney.com> wrote in message news:<Exq2b.255714$YN5.175023@sccrnsc01>...

pablobleyer@hotmail.com (Pablo Bleyer Kocik) wrote in message
news:<bb2f07d6.0308242212.6707fd90@posting.google.com>...
Hello.

I have a Spartan-II device connected to a small 5V micro which does
the configuration process. I want to multiplex the same micro's pin
for CCLK generation during Spartan-II configuration and as a timer
output to another device after that. The CCLK signal is 5V -- I know

I think what he's asking is whether the FPGA will care if the CCLK line is
toggling after configuration - note the phrase "same micro's pin ... output
to another device after".

AFAIK, This would be no problem - the FPGA ignores CCLK after its DONE goes
active so that a chained serial configuration mode works.

Exactly, this is what I was referring to. Sorry if it was not clear.
I want to share the same CCLK line for another purpose after the FPGA
has been configured. This will be a toggling 5V signal. Will the
(sleeping) CCLK have any concerns with that?

Thanks again. Regards.

Symon · Apr 21, 2006

Hi Peter,
Compared to what this guy's research is about, maybe Xilinx's
present CMOS is slow! Try googling for SiGe FPGA. Very interesting.
Syms.

Peter Alfke <peter@xilinx.com> wrote in message news:<3F4A3115.97553C14@xilinx.com>...

What do you mean by "context switch"?
And, CMOS is not slow. We build 10 gigabit/second serial interfaces; and
the LUT and flip-flop response is well below 1 ns. Do you call that "slow"?

Peter Alfke, Xilinx
===========================
Kuan Zhou wrote:

Hi,
I am wondering how fast the Virtex can do a context switch.
I heard it's slow because CMOS responds very slowly. Is that true?

Thank you very much!

sincerely
-------------
Kuan Zhou
ECSE department

Ray Andraka · Apr 21, 2006

It infers a ripple carry adder using the fast carry chains. The FPGAs
have special logic for implementing fast ripple carry (they actually do a
2 bit carry look-ahead in hardware, but that is invisible to the user).
Other carry schemes are forced to use the much slower general purpose
routing and logic. In most cases, the adder built using the fast carry
chains is not only the minimum area, it is also the maximum performance.
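
As a rough behavioural illustration of a ripple-carry adder (not of what XST actually emits), here is a small Python sketch. The per-bit split into a propagate signal, a sum XOR and a carry mux is an assumption chosen to resemble a LUT followed by carry logic; the function name is made up.

```python
def ripple_carry_add(a, b, width):
    """Add two unsigned integers bit by bit, rippling the carry upward.

    Returns (sum modulo 2**width, carry out of the top bit).
    """
    carry = 0
    total = 0
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        propagate = a_i ^ b_i          # what the per-bit LUT would compute
        sum_i = propagate ^ carry      # sum bit: propagate XOR carry-in
        # Carry mux: pass the incoming carry when the bits differ,
        # otherwise the carry out equals a_i (a_i == b_i in that case).
        carry = carry if propagate else a_i
        total |= sum_i << i
    return total, carry

print(ripple_carry_add(200, 100, 8))
```

The carry for bit i depends on the carry from bit i-1, which is why the speed of that one carry path dominates the adder's delay.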

Nagaraj wrote:

Hi all,
I am using XST (ISE 5.1i sp3) for my logic synthesis. If I write a
piece of VHDL code as in " c <= a + b ", an N-bit adder will be
inferred (assuming a,b,c are N bits).
As there are many types of adder algorithms/implementations
available (like Carry Look Ahead, Carry Save etc.), I want to know
which one does XST infer? Can I have a control over the type of adder
?

Regards,
Nagaraj

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Martin Euredjian · Apr 21, 2006

Sure, I could use multi-cycle in the UCF (or TIG, for that matter). For
maintainability (and reusability) purposes I wanted to include these and
other constraints with the HDL file.

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Martin Euredjian

To send private email:
0_0_0_0_@pacbell.net
where
"0_0_0_0_" = "martineu"

"Anil Khanna" <anil_khanna@mentor.com> wrote in message
news:3f4a72c4$1@solnews.wv.mentorg.com...

Hi Martin,

TIG should work, if not, you could always try a MCP (multicycle path
constraint) if you know how many times your data is sampled.

Anil

"Allan Herriman" <allan.herriman.hates.spam@ctam.com.au.invalid> wrote in
message news:3f49e539@dnews.tpgi.com.au...
Martin Euredjian wrote:
The constraint guide indicates that the TIG constraint can be used in HDL
(Verilog in my current design). However, an attempt to use it produces the
following error:

ERROR:Xst:1582 - The constraint 'tig=' is not supported neither in BEGIN
MODEL/END section in the XCF file, nor in HDL code.

I have not been able to find further information on this error message or
issue on the Xilinx site. Does anyone know if TIG is truly supported in
HDL? I'd hate to place it in the UCF file, to me it feels much more
appropriate to have this constraint move with the HDL source.

The form I'm using is:

// synthesis attribute TIG of <net_name> is "";

As a point of interest, the nets in question are the output of the
registers of a microprocessor interface. The values are only sampled a few
times per second by the receiving module. There is no need to have any of
these nets meet nanosecond level timing constraints as other parts of the
design must. Is there a better approach than "TIG"?

Hi Martin,

I think this old thread answers your question:

http://groups.google.com/groups?threadm=3dbd0daa%241_1%40lon-news.intensive.net

Regards,
Allan.

Jon Elson · Apr 21, 2006

Ray Andraka wrote:

Rickman et al,

This is indeed fairly common. The manufacturer specifies the interface timing
only relative to signals on that interface, not relative to the DSP clock. This
often shows up on memory interfaces intended for static async RAM memory, for
example. In that case, the timing is all relative to the address, write pulse
and data and nearly never in relation to a clock. As Rickman points out, there
is no valid assumption you can make about the relationship to the DSP clock
other than the fact that these signals are synchronous to the clock unless it is
specified in the data sheet. Where it is not, even the manufacturer can't tell
you what the timing is because they don't characterize it.

Well, just because the manufacturer isn't putting the info in the data
sheets, and doesn't test for it, I suspect that you could get a rough
description that could be quite helpful. Knowing, for instance, that the
strobes will always follow a CPU clock by at least 1 ns, and never change
more than 5 ns after the clock, would make designing a synchronous
memory/peripheral controller running from the same clock a lot simpler.

If you can just get the roughest of descriptions of the clock vs. strobes
circuitry, you can improve the system performance greatly, because you
don't have to have ranked FFs on every strobe.

Now, on the high speed stuff, with clock multipliers, etc. it can get quite
complex, and a CPU swap to the next higher speed can throw everything off
due to a change in clock multiplier. But, I gathered from the initial post
that this wasn't a clock multiplied CPU.

Jon

Mike Treseler · Apr 21, 2006

Bob Perlman wrote:

Voting circuits don't work as a metastability cure.

I agree.

The odds that one synchronizer will work without error is always better than
the odds that two synchronizers will work without error.

The voting circuit works without error
only if two of the three synchronizers work without error.

-- Mike Treseler

Jim Kearney · Apr 21, 2006

"Pablo Bleyer Kocik" <pablobleyer@hotmail.com> wrote in message

Exactly, this is what I was referring to. Sorry if it was not clear.
I want to share the same CCLK line for another purpose after the FPGA
has been configured. This will be a toggling 5V signal. Will the
(sleeping) CCLK have any concerns with that?

Here is some official documentation - look at Answer # 10046 in Xilinx's
Answer Database. If this applies to Virtex, I believe it should apply to
the Spartan-II as well, since it is a derivative device.

Phil Hays · Apr 21, 2006

"Nicholas C. Weaver" wrote:

Does this mean that, thanks to routing delay, you could just do a 3
flip-flops in parallel for capturing, voting circuit on the other
side, and not have to worry about it?

What if one FF says '1', one FF says '0' and the last is someplace
in between?

--
Phil Hays

Ray Andraka · Apr 21, 2006

Good luck even getting the structure out of the manufacturer though. Even if you do,
I think you'll find a good part of the delay is in the clock tree. Unless that is
characterized, you probably won't be able to infer much more than what the data sheet
tells you.

Jon Elson wrote:

Ray Andraka wrote:

Rickman et al,

This is indeed fairly common. The manufacturer specifies the interface timing
only relative to signals on that interface, not relative to the DSP clock. This
often shows up on memory interfaces intended for static async RAM memory, for
example. In that case, the timing is all relative to the address, write pulse
and data and nearly never in relation to a clock. As Rickman points out, there
is no valid assumption you can make about the relationship to the DSP clock
other than the fact that these signals are synchronous to the clock unless it is
specified in the data sheet. Where it is not, even the manufacturer can't tell
you what the timing is because they don't characterize it.

Well, just because the manufacturer isn't putting the info in the data
sheets, and doesn't test for it, I suspect that you could get a rough
description that could be quite helpful. Knowing, for instance, that the
strobes will always follow a CPU clock by at least 1 ns, and never change
more than 5 ns after the clock, would make designing a synchronous
memory/peripheral controller running from the same clock a lot simpler.

If you can just get the roughest of descriptions of the clock vs. strobes
circuitry, you can improve the system performance greatly, because you
don't have to have ranked FFs on every strobe.

Now, on the high speed stuff, with clock multipliers, etc. it can get quite
complex, and a CPU swap to the next higher speed can throw everything off
due to a change in clock multiplier. But, I gathered from the initial post
that this wasn't a clock multiplied CPU.

Jon

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Ray Andraka · Apr 21, 2006

The M512 has a much coarser granularity, and is not nearly as localized as the SRL16's in
Xilinx. This becomes very apparent with reloadable DA filters, as the size of the M512
makes for a less than optimum DA lut. In any event, to get the most from any given
architecture, you have to design to the architecture, hence my comment about staying in
whatever you are comfortable with.

Peter Sommerfeld wrote:

I haven't used Xilinx parts so I don't know how an SRL is more
reconfigurable than the Stratix's TriMatrix memories (ie M512, etc).
Using a dual-port M512 in a design means I can use it as LUT and
simultaneously reconfigure in through the other port. Why are you
saying the SRL is the only reloadable LUT out there?

-- Pete

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Ray Andraka · Apr 21, 2006

Or, you can use the fast carry chain for the and. The luts are (i0 xnor i1) and
(i2 xnor i3). Lut outputs feed into the muxcy sel inputs, muxcy di<='0' and ci
comes from previous muxcy co. First muxcy ci<='1', then your equality compare
comes out of the last muxcy. Works out faster once you go over about 20 bit
inputs.
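
A behavioural sketch of that carry-chain compare, in Python, following the wiring described above (two bit pairs per LUT, LUT output on the MUXCY select, DI tied to '0', first CI tied to '1'). The function names are made up for illustration and an even bit width is assumed.

```python
def muxcy(sel, di, ci):
    """Carry mux: pass the carry-in when sel is 1, otherwise output di."""
    return ci if sel else di

def equal_via_carry_chain(a, b, width):
    """Return 1 if a == b over `width` bits (assumed even), else 0."""
    carry = 1                                  # first MUXCY ci <= '1'
    for i in range(0, width, 2):
        a0, b0 = (a >> i) & 1, (b >> i) & 1
        a1, b1 = (a >> (i + 1)) & 1, (b >> (i + 1)) & 1
        # LUT: (i0 xnor i1) and (i2 xnor i3)
        lut = int(a0 == b0 and a1 == b1)
        carry = muxcy(lut, 0, carry)           # di <= '0' kills the chain
    return carry                               # output of the last MUXCY

print(equal_via_carry_chain(0xDEADBEEF, 0xDEADBEEF, 64))
print(equal_via_carry_chain(0xDEADBEEF, 0xDEADBEEE, 64))
```

Any mismatching pair forces a '0' into the chain, and every later stage can only pass it along, so the '1' survives to the end only when all pairs match.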

rickman wrote:

zhengyu wrote:

I've got two quick question. I don't have FPGA yet, but I want someone to
offer me some quick comments

1. I have got to do some 64 bit integer comparison, actually I have to do up
to 64 comparisons at the same time, the output is whether there is any pair
that equals.

This is not a question...

Equality compares are easy. It uses a two input XOR for each bit with
all the results being OR'd together. This will take 32 LUTs for the XOR
and the first OR gate and 11 more LUTs to combine the rest for a total
of 43 LUTs in four levels. If the design uses the "special" features
that most chips have (ORing of LUTs within a CLB), you can use the LUTs
in pairs or even groups of four and reduce the number of levels for
speed.
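
The LUT count above can be checked with a few lines of Python. This is only the arithmetic, assuming 4-input LUTs where each first-level LUT takes two bit pairs and each later LUT combines up to four signals; the function name is made up.

```python
def equality_compare_luts(bits, lut_inputs=4):
    """Count LUTs and logic levels for an equality compare of `bits` bits.

    Returns (first_level_luts, total_luts, levels).
    """
    bits_per_lut = lut_inputs // 2            # each bit needs one input from a and one from b
    first = -(-bits // bits_per_lut)          # ceiling division
    total = first
    levels = 1
    signals = first
    while signals > 1:                        # combine until one result remains
        signals = -(-signals // lut_inputs)
        total += signals
        levels += 1
    return first, total, levels

print(equality_compare_luts(64))
```

For 64 bits this gives 32 LUTs in the first level, then 8 + 2 + 1 = 11 to combine them, for 43 LUTs in four levels.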

2. If I want to create a 16 bit address space, that would translate to 512
k bits, does Virtex-II give enough block RAM so I don't have to use SRAM to
do that? What kind of latency performance should I expect from typical
SRAM, is 5ns read access reasonable? What is the performance of block RAM?

Is that 16 bit address (64k words) of 8 bit words? Because 64k x 8 =
512K.
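
The sizing arithmetic, as a small Python sketch. The 16 Kbit of data per block RAM is an assumption used for illustration (parity bits ignored), so check the data sheet for the part you pick; the function names are made up.

```python
def memory_bits(address_bits, word_bits):
    """Total storage in bits for 2**address_bits words of word_bits each."""
    return (1 << address_bits) * word_bits

def block_rams_needed(total_bits, data_bits_per_block=16 * 1024):
    """Blocks needed, assuming 16 Kbit of data per block (an assumption)."""
    return -(-total_bits // data_bits_per_block)   # ceiling division

bits = memory_bits(16, 8)
print(bits, bits // 1024)          # total bits, and the same in Kbit
print(block_rams_needed(bits))
```

With a 16 bit address and 8 bit words this comes to 524288 bits, i.e. 512 Kbit, or 32 blocks under the stated assumption.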

You can get this much RAM in the VirtexII if you use the XC2V500 part.
Or in the new Spartan3 you could use a XC3S1500. I am not sure which
will be cheaper, but I bet it is the Spartan3.

The speed of the block RAM will be much faster than anything external to
the FPGA. The block ram will be synchronous and lends itself well to
pipelined operations.

A lot of how your design will be implemented will depend on your data
flow, which you have said nothing about. Think about how the storage
will be organized and accessed. Obviously one large block of memory
with one interface will not let you do 64 compares at one time. If your
rate of performing these compares is not fast, you can use one compare
logic block and run the different data through it sequentially. Then
one memory could easily do the job.

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Phil Hays · Apr 21, 2006

Ray Andraka wrote:

Says who? See http://www.andraka.com/gallery.htm for examples of large designs
that use placement constraints extensively. The key is to use hierarchy. In
order to do so with the current tools though, you'll have to do the placement in
the bottom levels of your code, not in the floorplanner.

Ray's impressive designs are mostly regular structures, mostly data
path. Placement in the bottom levels of code works well for such
designs. When the critical paths are in complex control logic, such
placement isn't a realistic alternative. Other techniques are more
useful. FPGA designs are not all the same sorts of things. Different
requirements lead to different usages of the parts and the tools.

--
Phil Hays

sanjay · Apr 21, 2006

Hi,
As Ray said, it actually implements 2-bit Carry Look Ahead per slice.
Definitely, you can implement any other adder which is of interest to you
using LUTs.
Pls. go to Xilinx Project Navigator => Help => Online Documentation =>XST
User Guide => HDL Coding Techniques to find various signed and Unsigned
Adder Implementation.

Regards,
Sanjay

"Ray Andraka" <ray@andraka.com> wrote in message
news:3F4A8912.FCFFC489@andraka.com...

It infers a ripple carry adder using the fast carry chains. The FPGAs
have special logic for implementing fast ripple carry (they actually do a
2 bit carry look-ahead in hardware, but that is invisible to the user).
Other carry schemes are forced to use the much slower general purpose
routing and logic. In most cases, the adder built using the fast carry
chains is not only the minimum area, it is also the maximum performance.

Nagaraj wrote:

Hi all,
I am using XST (ISE 5.1i sp3) for my logic synthesis. If I write a
piece of VHDL code as in " c <= a + b ", an N-bit adder will be
inferred (assuming a,b,c are N bits).
As there are many types of adder algorithms/implementations
available (like Carry Look Ahead, Carry Save etc.), I want to know
which one does XST infer? Can I have a control over the type of adder
?

Regards,
Nagaraj

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Nicholas C. Weaver · Apr 21, 2006

In article <3F4AE286.80E9CAE4@attbi.com>,
Phil Hays <SpamPostmaster@attbi.com> wrote:

Ray's impressive designs are mostly regular structures, mostly data
path. Placement in the bottom levels of code works well for such
designs. When the critical paths are in complex control logic, such
placement isn't a realistic alternative. Other techniques are more
useful. FPGA designs are not all the same sorts of things. Different
requirements lead to different usages of the parts and the tools.

However, the complex-control-logic hairballs are what simulated
annealing and related FPGA placement algorithms are designed to do
best. A common (and effective) technique is to hand-place the regular
datapath, and then let the placement tool arrange the control logic
around the periphery.
--
Nicholas C. Weaver nweaver@cs.berkeley.edu

Hal Murray · Apr 21, 2006

However, the complex-control-logic hairballs are what simulated
annealing and related FPGA placement algorithms are designed to do
best. A common (and effective) technique is to hand-place the regular
datapath, and then let the placement tool arrange the control logic
around the periphery.

How good is the current software?

Many years ago, I could do a manual placement and automatic routing
in far less time than the P&R tools took to do the whole job.

My "complex-control-logic hairballs" were generally pretty simple,
mostly one-hot state machines. Maybe some pipelining to help.
(They probably have to be simple if you are going to co-exist with
a well planned data path.)

--
The suespammers.org mail server is located in California. So are all my
other mailboxes. Please do not send unsolicited bulk e-mail or unsolicited
commercial e-mail to my suespammers.org address or any of my other addresses.
These are my opinions, not necessarily my employer's. I hate spam.
