EDK : FSL macros defined by Xilinx are wrong

Duane wrote:
johnp wrote:
The suggestion to recode the Verilog to look like:
always @(posedge clk)
sig4 <= (sig1 & ~sig2) ? sig3 : sig4;

concerns me since a smart synthesizer would recognize this to be
EXACTLY the same code, just written in an odd way.
That would require the synthesis tool to specifically check whether the
default value on the right is the same signal as the one being assigned.
While I suppose it is possible that a synthesis tool might do that, I
kind of doubt it.
I'd bet it would. I am continually amazed at how good the synthesis
optimizers are getting.

OP can probably force the results he's asking for by using a 'keep'
attribute, if he really wants to. I don't know how to express it in
Verilog, but the VHDL code follows. See 'KEEP' in the constraints guide
for Verilog syntax.

I couldn't resist on this one, had to do the experiment, and Antti is
100% right in this instance. P/R into an XC2V40-5 with CE logic in the
same LUT along with the CE MUX gives _slower_ results than putting only
the CE logic into a LUT and using the CE pin and its built-in CE
MUX:

Using CE pin: under 2 ns.
Using LUT : over 2 ns.

An odd thing is that XST infers 2 FFs for the LUT version.
(Did someone say pushing a rope?)

Nice that the synthesis tools keep getting better, and there are fewer
opportunities to second guess them.

Regards,
John

library ieee;
use ieee.std_logic_1164.all;

entity CE_Inferral is
  Port ( clk   : in  std_logic;
         rst   : in  std_logic;
         a_in  : in  std_logic;
         b_in  : in  std_logic;
         c_in  : in  std_logic;
         q_out : out std_logic);
end CE_Inferral;

architecture Behavioral of CE_Inferral is
  signal q     : std_logic;
  signal a     : std_logic;
  signal b     : std_logic;
  signal c     : std_logic;
  signal d_lut : std_logic;
  attribute keep : string;
  attribute keep of d_lut : signal is "true";
begin
  q_out <= q;
  --d_lut <= c when a = '1' and b = '0' -- Uncomment these two lines
  --         else q;                    -- for 'CE' in LUT logic (slower)
  process( clk )
  begin
    if RISING_EDGE( clk ) then
      a <= a_in; -- sync port inputs
      b <= b_in;
      c <= c_in;
      if rst = '1' then
        q <= '0';
      else
        -- q <= d_lut;               -- Uncomment this 1 line for 'CE' in LUT logic
        if a = '1' and b = '0' then  -- Keep these 3 lines active
          q <= c;                    -- to use CE pin on FF with only
        end if;                      -- 'a and not b' function in a LUT
      end if;
    end if;
  end process;
end Behavioral;
 
JustJohn wrote:
Duane wrote:
johnp wrote:
The suggestion to recode the Verilog to look like:
always @(posedge clk)
sig4 <= (sig1 & ~sig2) ? sig3 : sig4;

concerns me since a smart synthesizer would recognize this to be
EXACTLY the same code, just written in an odd way.
That would require the synthesis tool to specifically check whether the
default value on the right is the same signal as the one being assigned.
While I suppose it is possible that a synthesis tool might do that, I
kind of doubt it.

I'd bet it would. I am continually amazed at how good the synthesis
optimizers are getting.
The OP is using XST 6.2.03. It's not that smart.

Regards,
Allan.
 
Since my 'sig3' vector is four bits wide, the signal from the CE logic
needs to fan out to the 4 flip-flops. Now we get routing delay.

Antti's example may be correct, but for the 4-bit-wide destination, I
think I get a performance penalty.

I love synthesis, but... It sure would be nice to have an easier way
to direct it! In any event, it sure beats schematics.

John Providenza
 
Thank you Austin.
I'll check the documentation and secure the state machines.

Regards,
Nick


On Tue, 22 Nov 2005 15:55:00 -0800, Austin Lesea <austin@xilinx.com>
wrote:

Nick,

I assume that Cyclone works similarly to our own FPGAs in that all flip
flops are initially set to 0 by the house-cleaning (initialization
prior to configuration) at power on.

You can check this by reading their manual on what happens during power
on and configuration.

Then, during configuration, the state of the flip flops for logic may
(or may not) be set, or reset to a state as specified by the bitstream
(depends on the device, and its options when being configured).

If you have designed the state machine with no hidden states, and in
such a way that it will always return to a known state given a set of
good inputs, there is no need for a reset.

In the case of a 1-hot state machine (very popular in FPGAs) this also
means that detection of having more than one state set (more than one
flip flop) must decode and send you back to a known state of having only
one state active!

Austin
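
Austin's point about one-hot machines can be illustrated with a small sketch. This is not code from the thread: the entity name safe_onehot, the ports go and busy, and the three states are all made up for illustration. The idea shown is only that every encoding other than the three legal one-hot codes falls into "when others" and is steered back to a known state. Note that some synthesis tools may optimize such a recovery branch away unless told to build a "safe" state machine, so the tool documentation should be checked.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are illustrative only.
entity safe_onehot is
  port (
    clk  : in  std_logic;
    go   : in  std_logic;   -- starts one pass through the states
    busy : out std_logic    -- high whenever the machine is not idle
  );
end entity safe_onehot;

architecture rtl of safe_onehot is
  -- Manually assigned one-hot codes: exactly one bit set per state.
  constant IDLE : std_logic_vector(2 downto 0) := "001";
  constant RUN  : std_logic_vector(2 downto 0) := "010";
  constant DONE : std_logic_vector(2 downto 0) := "100";
  -- Initial value assumed to be applied at configuration.
  signal state : std_logic_vector(2 downto 0) := IDLE;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      case state is
        when IDLE =>
          if go = '1' then
            state <= RUN;
          end if;
        when RUN =>
          state <= DONE;
        when DONE =>
          state <= IDLE;
        when others =>
          -- Zero or more than one bit set: illegal, recover to IDLE.
          state <= IDLE;
      end case;
    end if;
  end process;

  busy <= '0' when state = IDLE else '1';
end architecture rtl;
```

With this structure an illegal code costs at most one clock before the machine is back in IDLE, at the price of the extra decode logic Austin mentions.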

Nick wrote:

Hello,

I'm in the final phase of a design in VHDL on a Cyclone, and I am
really puzzled by something.
I do not have an external reset pin, so how can I ensure that my
state machines start in the right state, that all values are well
initialized, and so on?

It seems to work as it is now, but I couldn't find any literature on
this subject.

Many thanks
Nick
 
Andy Peters wrote:

Andreas wrote:
Hello,
I just got a new FPGA board (from Avnet, Xilinx Virtex4). The problem is
that I have never programmed an FPGA before. I use VHDL for programming and
Precision from Mentor for synthesis. Xilinx ISE 7.1 is used for
place-and-route. On the FPGA board is a push button. Within my VHDL code
I defined a process, which is sensitive to the rising edge of the signal
associated with that push button (I want to know, when the button is
pushed).

pb_proc: process (push_button) is
begin
if push_button'event and push_button='1' then
....

The problem is that Precision recognizes that signal and the associated
pad as a clock input, and during the place-and-route operation ISE
produces the following error message:

ERROR:place:645 - A clock IOB clock component is not placed at an optimal
clock IOB site. The clock IOB component <push_button> is placed at site
IOB_X2Y112. The clock IO site can use the fast path between the IO and
the Clock buffer/GCLK if the IOB is placed in the master Clock IOB Site.
If this sub optimal condition is acceptable for this design you may set
the environment variable XIL_PLACE_ALLOW_LOCAL_BUFG_ROUTING to demote this
message to a WARNING and allow your design to continue.

You got this error because your VHDL instantiates a flip-flop whose
clock is the signal push_button, and your constraints told the tools to
put the signal push_button onto a pin that's not a global clock pin.

You should spend some quality time with the XST manual, especially the
sections that detail how certain structures are inferred from VHDL.
The instantiation of a flip-flop is exactly what I wanted as well as the
connection between the push_button and the clock input of the flip-flop.
The only thing I don't want is that the synthesis tool treats the
push_button as an external clock input although it is connected to the
clock input of a flip-flop.
I've read the documentation before I wrote any posting, but I couldn't find
the answers I need.



P.S.: another question, off topic: when I set up my design, I have to choose
technology (Virtex-IV), device (4vlx25ff668) and speed grade (-10 or -11).
Which speed grade do I have to choose?
Do not know it.

Choose the speed grade that matches the device installed on your board.

Does the board support -10 and -11?

Dunno, look at the chip, or read the documents that came with your
board.
I did that before posting, but couldn't find any information.


Thanks,
Andreas
 
Nick <nick@no-domain> wrote:

I'm in the final phase of a design in VHDL on a Cyclone, and I am
really puzzled by something.
I do not have an external reset pin, so how can I ensure that my
state machines start in the right state, that all values are well
initialized, and so on?
I've never used Cyclone, but I'd expect that there is an asynchronous
reset and/or set applied as part of configuration. If so, then what
you need to worry about is the first clock. As the reset will be
released at different times across the chip, some FFs may be released
from reset before others, and a statemachine may end up in a non valid
state that will prevent correct functioning. This can be solved by
using safe statemachines, so that all non-valid states map into valid
states in at most a few clocks, or by using statemachines with no
invalid states at all. A binary counter, for example, has only valid
states.

Now, suppose there was a "reset" statemachine that held reset to the
rest of the statemachines until the configuration was released and
all?

This is fairly simple: put in a binary counter or similar safe
statemachine with more than enough counts (or states) to make sure
that the reset is released, have it hold synchronous reset to the
rest of the design until the count completes, then release it. Example in
VHDL follows:


use ieee.numeric_std.all;
entity
....
architecture
....
Signal reset : std_logic := '1';
Signal count : unsigned(3 downto 0) := "0000";
begin
--
-- This counter is used to hold all statemachines in reset for the
-- first 8 or so clocks after the end of configuration.
--
RESET_STATE: process(clk)
begin
if rising_edge(clk) then
reset <= count(3);
if count(3) = '1' then
count <= count + 1;
end if;
end if;
end process;
(rest of code)


Note that not all synthesis tools can correctly handle this. Some of
the old tools would have problems with this. While I've used similar
tricks in the past, I have not verified this exact code.

Note that the number of bits in count needs to be large enough to get
well past the end of asynchronous reset, and not so large as to cause
startup delays.


--
Phil Hays to reply solve: phil_hays at not(coldmail) dot com
If not cold then hot
 
Hi Andreas,
I don't know how to do it in Precision but in XST you can tell the tool
to use no GlobalClockBuffer at all (Xilinx Specific Options Tab). Then
any input can be (ab)used as a clock input using normal routing
resources instead of the global clock net.
For one FF and testing this might be OK. In a large design you will get
into big trouble.

Now, if you are a newbie you probably intend to use the button for
manual clocking, to allow single stepping of your design. Beware!!!

Just imagine a simple counter driving some LEDs. What you expect is that
it increments with every press of the button. But what happens is that
random outputs appear on your LEDs. Why is that? Because your button
bounces several times each time you press it and/or release it. Not very
useful, is it?

To overcome this problem you need two things: One is a clock divider
driven by the onboard clock oscillator. The output can be something
around 100 Hz and only needs to produce a pulse of a single clock's length.

This signal can be used as a clock enable for a debouncing circuit which
is described in the Xilinx synthesis template. Then you can use your
button(s) for Input, and even (ab)use this Output signal as a Clock
Signal. But remember: Only for testing SMALL designs! You also need to
constrain the number of GCLK Buffers to 1.

Have a nice Synthesis

Eilert
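
The two pieces Eilert describes (a slow one-clock-wide tick used as a clock enable, plus a debouncer sampled on that tick) might be sketched roughly as below. This is not the Xilinx synthesis template he refers to, just an untested illustration. The entity name button_debounce, the generic DIVIDE, the port names and the 50 MHz oscillator frequency are all assumptions. Everything runs on the single oscillator clock, and the button is only ever treated as data.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names and the 50 MHz clock are assumptions.
entity button_debounce is
  generic (
    DIVIDE : positive := 500_000  -- 50 MHz / 500_000 = 100 Hz sample tick
  );
  port (
    clk       : in  std_logic;  -- on-board oscillator
    button_in : in  std_logic;  -- raw, bouncing button (active high assumed)
    pressed   : out std_logic   -- one clk-wide pulse per debounced press
  );
end entity button_debounce;

architecture rtl of button_debounce is
  signal div_cnt  : natural range 0 to DIVIDE - 1 := 0;
  signal tick     : std_logic := '0';
  signal sync     : std_logic_vector(1 downto 0) := "00";
  signal history  : std_logic_vector(2 downto 0) := "000";
  signal stable   : std_logic := '0';
  signal stable_d : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Two-stage synchronizer for the asynchronous button input.
      sync <= sync(0) & button_in;

      -- Clock divider: tick is high for exactly one clk period.
      if div_cnt = DIVIDE - 1 then
        div_cnt <= 0;
        tick    <= '1';
      else
        div_cnt <= div_cnt + 1;
        tick    <= '0';
      end if;

      -- Sample the synchronized button only when the tick enables it.
      if tick = '1' then
        history <= history(1 downto 0) & sync(1);
      end if;

      -- Accept a new level only after three identical samples.
      if history = "111" then
        stable <= '1';
      elsif history = "000" then
        stable <= '0';
      end if;

      -- Rising-edge detect on the debounced level.
      stable_d <= stable;
      pressed  <= stable and not stable_d;
    end if;
  end process;
end architecture rtl;
```

The pressed pulse can then serve as a clock enable for the counter driving the LEDs, which avoids routing the button onto a clock net at all.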
 
When Quartus runs, it prints a little message that all flip-flops that have a
reset high will be high after initialisation, or words to that effect.

Simon

"Austin Lesea" <austin@xilinx.com> wrote in message
news:dm0b4s$4g56@xco-news.xilinx.com...
Nick,

I assume that Cyclone works similarly to our own FPGAs in that all flip
flops are initially set to 0 by the house-cleaning (initialization
prior to configuration) at power on.

You can check this by reading their manual on what happens during power
on and configuration.

Then, during configuration, the state of the flip flops for logic may
(or may not) be set, or reset to a state as specified by the bitstream
(depends on the device, and its options when being configured).

If you have designed the state machine with no hidden states, and in
such a way that it will always return to a known state given a set of
good inputs, there is no need for a reset.

In the case of a 1-hot state machine (very popular in FPGAs) this also
means that detection of having more than one state set (more than one
flip flop) must decode and send you back to a known state of having only
one state active!

Austin

Nick wrote:

Hello,

I'm in the final phase of a design in VHDL on a Cyclone, and I am
really puzzled by something.
I do not have an external reset pin, so how can I ensure that my
state machines start in the right state, that all values are well
initialized, and so on?

It seems to work as it is now, but I couldn't find any literature on
this subject.

Many thanks
Nick
 
Just to correct you: just because 125 MHz is the reference, that doesn't
mean it can't be an ungated clock too!
You don't have to multiply the reference up; you just run the entire
device at 125 MHz and "ignore" the other clocks.

However, the suggestion by Vaughn is also good: lots of clocks (if you
can use them).

Simon


"huangjie" <huangjielg@gmail.com> wrote in message
news:1132707932.396888.98280@f14g2000cwb.googlegroups.com...
I have understood your idea, and I know why yours works but mine can't:
your slow clock is slow, and mine is very fast.
How can I deal with 125 MHz clocks as if they were 2 MHz? How fast must my
"reference" be for 125 MHz?
Perhaps I can use a group of phase-shifted clocks to get the clock
enable signals.
Thank you again!
 
Mark wrote:

Does anyone have any experience interfacing an FPGA to patient monitors?
http://www.fda.gov/cdrh/comp/designgd.html

Looks like there might be
a few fussy regulations.

-- Mike Treseler
 
backhus wrote:

Hi Andreas,
I don't know how to do it in Precision but in XST you can tell the tool
to use no GlobalClockBuffer at all (Xilinx Specific Options Tab). Then
any input can be (ab)used as a clock input using normal routing
resources instead of the global clock net.
For one FF and testing this might be OK. In a large design you will get
into big trouble.

Now, if you are a newbie you probably intend to use the button for
manual clocking, to allow single stepping of your design. Beware!!!

Just imagine a simple counter driving some LEDs. What you expect is that
it increments with every press of the button. But what happens is that
random outputs appear on your LEDs. Why is that? Because your button
bounces several times each time you press it and/or release it. Not very
useful, is it?
Thank you for that information. The example you describe is exactly the test
case I wanted to use in my first FPGA test design.


Greetings,
Andreas



To overcome this problem you need two things: One is a clock divider
driven by the onboard clock oscillator. The output can be something
around 100 Hz and only needs to produce a pulse of a single clock's length.

This signal can be used as a clock enable for a debouncing circuit which
is described in the Xilinx synthesis template. Then you can use your
button(s) for Input, and even (ab)use this Output signal as a Clock
Signal. But remember: Only for testing SMALL designs! You also need to
constrain the number of GCLK Buffers to 1.

Have a nice Synthesis

Eilert
 
Nick wrote:
Hello,

I'm in the final phase of a design in VHDL on a Cyclone, and I am
really puzzled by something.
I do not have an external reset pin, so how can I ensure that my
state machines start in the right state, that all values are well
initialized, and so on?

It seems to work as it is now, but I couldn't find any literature on
this subject.

Many thanks
Nick

Nick,

I don't know about Cyclone FPGAs, but I suppose that they have some sort
of digital clock manager (DCM), as the Xilinx FPGAs have. In some designs
I have used the inverse of the DCM lock signal as a global reset signal.
In that way all flip-flops in the design are reset simultaneously when
the clock is stable. (The DCM lock signal is asserted when all outputs
from the component are locked.)

I don't know if this is considered good or bad practice, but it is
working quite well. It hasn't failed yet.
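
As a rough illustration of the lock-based reset described above, the sketch below derives an active-high reset from a lock flag. It goes one step beyond the post: rather than using the inverted lock signal directly, it asserts reset asynchronously while the lock flag is low and releases it synchronously two clock edges after lock, which is a common way to avoid releasing flip-flops at slightly different times. The entity name lock_reset and the port names are hypothetical, and the lock output of the actual DCM or PLL primitive has to be wired to locked by hand.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are illustrative only.
entity lock_reset is
  port (
    clk    : in  std_logic;  -- clock output of the DCM/PLL
    locked : in  std_logic;  -- lock flag of the DCM/PLL, high when stable
    reset  : out std_logic   -- active-high reset for the rest of the design
  );
end entity lock_reset;

architecture rtl of lock_reset is
  -- Starts asserted, assuming initial values are applied at configuration.
  signal sync : std_logic_vector(1 downto 0) := "11";
begin
  process (clk, locked)
  begin
    if locked = '0' then
      -- Clock not yet trustworthy: assert reset immediately.
      sync <= "11";
    elsif rising_edge(clk) then
      -- Shift in zeros: reset drops two clean clock edges after lock.
      sync <= sync(0) & '0';
    end if;
  end process;

  reset <= sync(1);
end architecture rtl;
```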

--
-----------------------------------------------
Johan Bernspång, xjohbex@xfoix.se
Research engineer

Swedish Defence Research Agency - FOI
Division of Command & Control Systems
Department of Electronic Warfare Systems

www.foi.se

Please remove the x's in the email address if
replying to me personally.
-----------------------------------------------
 
Thanks to Betz and Simon.
To Simon: my design has not just one but several clocks at 125 MHz, without
any phase or frequency relationship between them, so which one should be
the reference?
To Betz: my trouble is NOT too many clocks but too many interface
clocks that are not connected to the dedicated clock pins. Some of the
clocks are slow (e.g. 33 MHz PCI), but some are very fast (125 MHz).
I know I can use a global clock, but how do I calculate the delay of the
global clock?
The interface has a valid data window of about 4 ns, so by how many ns
should I shift the global clock?
My problem is the skew between chip-internal and chip-external clocks, not
skew inside the chip.
 
Thanks for the help guys, it's exactly what I was looking for.

And Tim is right, I won't be getting realistic timing results unless I
load my inputs and outputs with registers. But at least I won't have
to worry about 132 I/O pin mappings that won't even be there affecting
my delay.

Thanks again

Jeremy
 
This can be done if you manually encode the state-machine and then in
the testbench apply descriptive ASCII text to the state-machine states
in a separate variable.

For instance, if you describe the following state-machine in your source
code:

parameter START      = 3'b001;
parameter WAIT       = 3'b010;
parameter START_OVER = 3'b100;

(* FSM_ENCODING="USER" *) reg [2:0] state = START;

always @(posedge CLK) begin
  if (RST) begin
    state     <= START;
    state_out <= 1'b0;
  end
  else
    (* FULL_CASE, PARALLEL_CASE *) case (state)
      START : begin
        if (state_input)
          state <= WAIT;
        else
          state <= START;
        state_out <= 1'b0;
      end
      WAIT : begin
        if (!state_input)
          state <= START_OVER;
        else
          state <= WAIT;
        state_out <= 1'b0;
      end
      START_OVER : begin
        state     <= START;
        state_out <= 1'b1;
      end
    endcase
end

And add the following to your testbench:

reg [8*10-1:0] state_decode;  // room for 10 ASCII characters

always @(testbench.uut.state)
  if (testbench.uut.state == 3'b001)
    state_decode = "START";
  else if (testbench.uut.state == 3'b010)
    state_decode = "WAIT";
  else if (testbench.uut.state == 3'b100)
    state_decode = "START_OVER";
  else
    state_decode = "UNKNOWN";


You can then either add state_decode to your waveform and tell the
simulator to display it as ASCII, or you can place a $monitor on
state_decode and have it print this information to the console. This
should work for both functional and place & route simulation if you
control the state mapping, which is done above by explicitly stating the
mapping and using the synthesis option FSM_ENCODING="USER" (for XST in
this case), which tells the synthesis tool not to mess with the
state-machine mapping. I have used this technique on a few designs and
find it useful, but you need to make sure that the mapping you choose
for your state machine is optimal, or else you may not get the best
implementation. This also works best with a hierarchical simulation
method, but that is not necessary if you are willing to slightly adjust
your testbench between behavioral and timing simulation.

-- Brian



Bob Perlman wrote:
On 21 Nov 2005 18:33:09 -0800, ajeetha@gmail.com wrote:


Bob,
I also follow that in old Verilog. Thanks to SV, we have enums.
However the OP asked about Post place-and-route sim, hence this trick
won't help much. One needs to build equivalent signal names and enum
mapping. I believe MTI's virtual bus fits the bill better.

Regards
Ajeetha
www.noveldv.com


My mistake--I missed the part about post-place-and-route.

Bob Perlman
Cambrian Design Works
 
Phil Hays wrote:
Nick <nick@no-domain> wrote:
snipped
use ieee.numeric_std.all;
entity
...
architecture
...
Signal reset : std_logic := '1';
Signal count : unsigned(3 downto 0) := "0000";
begin
--
-- This counter is used to hold all statemachines in reset for the
-- first 8 or so clocks after the end of configuration.
--
RESET_STATE: process(clk)
begin
if rising_edge(clk) then
reset <= count(3);
if count(3) = '1' then
count <= count + 1;
end if;
end if;
end process;
(rest of code)


Note that not all synthesis tools can correctly handle this. Some of
the old tools would have problems with this. While I've used similar
tricks in the past, I have not verified this exact code.

Note that the number of bits in count needs to be large enough to get
well past the end of asynchronous reset, and not so large as to cause
startup delays.


--
Phil Hays to reply solve: phil_hays at not(coldmail) dot com
If not cold then hot
Shouldn't:
if count(3) = '1' then
count <= count + 1;
be:
if count(3) = '0' then
count <= count + 1;

-Dave Pollum
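
With Dave Pollum's correction applied, the power-up reset counter could look roughly like the sketch below. It is untested, like the original. The entity name reset_gen and the internal signal reset_i are made up for illustration. It also inverts count(3) when driving the reset, on the assumption that the intent was an active-high reset that is asserted while the counter is still running. The whole scheme relies on the synthesis tool and the device honoring the signal initial values, which, as Phil notes, not every tool does.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity; names are illustrative only.
entity reset_gen is
  port (
    clk   : in  std_logic;
    reset : out std_logic   -- active-high synchronous reset
  );
end entity reset_gen;

architecture rtl of reset_gen is
  -- Initial values are assumed to be loaded at configuration.
  signal count   : unsigned(3 downto 0) := "0000";
  signal reset_i : std_logic := '1';
begin
  reset <= reset_i;

  RESET_STATE: process (clk)
  begin
    if rising_edge(clk) then
      -- Reset stays asserted until the counter's top bit is set.
      reset_i <= not count(3);
      -- Count up from 0 and stop once count(3) becomes '1'.
      if count(3) = '0' then
        count <= count + 1;
      end if;
    end if;
  end process;
end architecture rtl;
```

The counter halts at "1000", so reset is held for roughly the first 8 clocks and then stays low. Widening count lengthens the hold time, as the note about the number of bits suggests.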
 
However, more realistic timing can often be obtained by placing your design unit (which presumably will ultimately be located deep within
your design, e.g. a super fast multiplier) between registers, adding timing constraints and using the static timing report....
For this kind of exercise I usually use even two registers between
the pins and the DUT. When the P&R places the first register in the
IO pad the second register avoids a probably long path from the IO
to the DUT (assuming that the second register gets placed near the
DUT).

Another tip: If you have more input and output signals than pins, just
add more registers and use signals from different pipeline stages.
Synthesizers are not (yet) smart enough to optimize this away ;-)

Martin
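
The two-register arrangement Martin describes might be sketched as follows for a multiplier under test. The entity name timing_wrapper, the 16-bit operand width and the signal names are assumptions for illustration only, and the multiplication stands in for whatever unit is actually being characterized. Depending on the tool and device, some of these registers may be absorbed into the multiplier or retimed, so the placement should be checked in the reports.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity; names and widths are illustrative only.
entity timing_wrapper is
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(15 downto 0);
    b   : in  std_logic_vector(15 downto 0);
    p   : out std_logic_vector(31 downto 0)
  );
end entity timing_wrapper;

architecture rtl of timing_wrapper is
  signal a_r1, a_r2 : unsigned(15 downto 0) := (others => '0');
  signal b_r1, b_r2 : unsigned(15 downto 0) := (others => '0');
  signal p_r1, p_r2 : unsigned(31 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- First input stage: may be packed into the IO pads by P&R.
      a_r1 <= unsigned(a);
      b_r1 <= unsigned(b);
      -- Second input stage: free to be placed next to the unit under test.
      a_r2 <= a_r1;
      b_r2 <= b_r1;
      -- Unit under test, measured register to register.
      p_r1 <= a_r2 * b_r2;
      -- Extra output stage isolates the path to the output pads.
      p_r2 <= p_r1;
    end if;
  end process;

  p <= std_logic_vector(p_r2);
end architecture rtl;
```

With a period constraint on clk, the path from a_r2/b_r2 to p_r1 in the static timing report is then the number of interest.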
 
Hi Marco,

Try
when others =>
display7 <= "1111001"; -- use <= for assignment not =>

I know you knew this one really ;-)

Enjoy

Tim



Marco wrote:
Starting from the ISE quick up/down 4bit counter tutorial, I inserted a
case statement to handle a single digit 7-segment display, but keep on
getting a parse error on the last line of the statement and don't know
why.

entity counter is
Port ( clock : in std_logic;
direction : in std_logic;
count_out : out std_logic_vector(3 downto 0);
display7_out: out std_logic_vector(6 downto 0));
end counter;

architecture Behavioral of counter is

signal count_int : std_logic_vector(3 downto 0) := "0000";
signal display7 : std_logic_vector(6 downto 0);

begin

process (clock)
begin

if clock ='1' and clock'event then

if direction = '1' then
count_int <= count_int + 1;
else
count_int <= count_int - 1;
end if;

case count_int is
when "0000" => -- Display "0"
display7 <= "0111111";
when "0001" => -- Display "1"
display7 <= "0000110";
when "0010" => -- Display "2"
display7 <= "1011011";
when "0011" => -- Display "3"
display7 <= "1001111";
when "0100" => -- Display "4"
display7 <= "1100110";
when "0101" => -- Display "5"
display7 <= "1101101";
when "0110" => -- Display "6"
display7 <= "1111100";
when "0111" => -- Display "7"
display7 <= "0000111";
when "1000" => -- Display "8"
display7 <= "1111111";
when "1001" => -- Display "9"
display7 <= "1100111";
when "1010" => -- Display "A"
display7 <= "1110111";
when "1011" => -- Display "b"
display7 <= "1111100";
when "1100" => -- Display "C"
display7 <= "0111001";
when "1101" => -- Display "d"
display7 <= "1011110";
when "1110" => -- Display "e"
display7 <= "1111011";
when "1111" => -- Display "F"
display7 <= "1110001";
when others => -- Display "E" -> fault
display7 => "1111001"; -- HERE SHOULD BE THE PARSE ERROR ACCORDING TO XST
end case;

end if;

end process;

display7_out <= display7;
count_out <= count_int;

end Behavioral;


When I check syntax with XST I get "HDLParsers:164 <path> Line 89.
parse error, unexpected ROW, expecting OPENPAR or TICK or LSQBRACK".
It seems I'm wrong with my CASE statement, but where?
Thanks, Marco
 
I would like to code the on-chip memory in vendor neutral VHDL.
I got it running for a dual-port memory with single clock and
same port sizes for the read and write port.

However, I need a memory with a 32-bit write port and an 8-bit
read port. So far I was not able to code it in VHDL in a way
that the synthesizer infers the correct block RAM without
an extra read MUX.
I'll give up on this vendor-independent block RAM project. For
the 32-bit write data, 8-bit read data RAM with registered address and
in data and unregistered out data, coded in VHDL, I got:

On the Altera Cyclone: generates a 32-bit dual port RAM with an
external 4:1 MUX. This MUX hurts fmax (from 94MHz down to 84MHz)!

On the Xilinx Spartan-3: The RAM gets implemented as distributed
RAM! Uses a lot of LCs and the fmax goes from 65MHz down to
50MHz.

So I will bite the bullet and use two vendor-specific VHDL files.
However, there is one open issue: I want the memory size to be
configurable via a generic. This is possible with Altera's
altsyncram.

For Xilinx I only know those RAMB16_S9_S36 components where
the memory size is part of the component name. Is there
a Xilinx block RAM component where I can specify the size?

Thanks,
Martin
 
The RAMB16 elements are the raw RAM macros. The Sx part of the name
indicates the port width. You can build up bigger memories in Coregen which
is a bit like the Altera Megawizard tool or build them up yourself using
generic statements using the raw macros.

If you are looking at switching between vendors, one trick is to hide a RAM
inside a wrapper file. If you use the wrapper level as the RAM component for
instantiation then you will only have to change the technology-based memory
element in one place, i.e. the wrapper file.

Some synthesisers are capable of inferring RAM usually using an indexed
array of something like VHDL's "std_logic_vector". I can't tell you much
about the results as it isn't my own preferred method but a non-vendor
synthesiser may do better than one offered by the silicon vendors.

John Adair
Enterpoint Ltd. - Home of FPGA PCI Development Boards.
http://www.enterpoint.co.uk
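
The wrapper idea and the "indexed array of std_logic_vector" style of inference can be combined in a sketch like the one below. The entity name ram_wrapper, its generics and its port names are hypothetical. Note that this only covers the simple case of equal read and write widths with a registered read. It does not solve the 32-bit write / 8-bit read case from the question, and whether a given synthesizer maps it to block RAM or to distributed RAM is tool-dependent. The point is only that the rest of the design instantiates ram_wrapper, so a vendor primitive could later replace this architecture without touching anything else.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity; names and defaults are illustrative only.
entity ram_wrapper is
  generic (
    ADDR_WIDTH : positive := 9;  -- memory depth is 2**ADDR_WIDTH words
    DATA_WIDTH : positive := 8
  );
  port (
    clk     : in  std_logic;
    we      : in  std_logic;
    wr_addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    wr_data : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    rd_addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    rd_data : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity ram_wrapper;

-- Generic, inferred version. A vendor-specific architecture built from
-- RAMB16 or altsyncram instances could sit behind the same entity.
architecture inferred of ram_wrapper is
  type ram_t is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Synchronous write port.
      if we = '1' then
        ram(to_integer(unsigned(wr_addr))) <= wr_data;
      end if;
      -- Synchronous read port (returns old data on a same-address write).
      rd_data <= ram(to_integer(unsigned(rd_addr)));
    end if;
  end process;
end architecture inferred;
```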


"Martin Schoeberl" <mschoebe@mail.tuwien.ac.at> wrote in message
news:4385cd12$0$8024$3b214f66@tunews.univie.ac.at...
I would like to code the on-chip memory in vendor neutral VHDL.
I got it running for a dual-port memory with single clock and
same port sizes for the read and write port.

However, I need a memory with a 32-bit write port and an 8-bit
read port. So far I was not able to code it in VHDL in a way
that the synthesizer infers the correct block RAM without
an extra read MUX.


I'll give up on this vendor-independent block RAM project. For
the 32-bit write data, 8-bit read data RAM with registered address and
in data and unregistered out data, coded in VHDL, I got:

On the Altera Cyclone: generates a 32-bit dual port RAM with an
external 4:1 MUX. This MUX hurts fmax (from 94MHz down to 84MHz)!

On the Xilinx Spartan-3: The RAM gets implemented as distributed
RAM! Uses a lot of LCs and the fmax goes from 65MHz down to
50MHz.

So I will bite the bullet and use two vendor-specific VHDL files.
However, there is one open issue: I want the memory size to be
configurable via a generic. This is possible with Altera's
altsyncram.

For Xilinx I only know those RAMB16_S9_S36 components where
the memory size is part of the component name. Is there
a Xilinx block RAM component where I can specify the size?

Thanks,
Martin
 
