Post-map simulation: timing violation and delays

sdaau
Hi all,

I am trying to implement a custom counter (with clock and enable inputs);
synthesis and behavioral & post-translate simulation pass just fine (using
ISE WebPack 13.2). On post-map simulation, I get this:

at 271179 ps(5), Instance /my_counter_test/UUT/c_0/ : Warning: /X_FF SETUP
High VIOLATION ON CE WITH RESPECT TO CLK;
Expected := 0.428 ns; Observed := 0.144 ns; At : 271.179 ns

... as well as X values in my output. I am already aware that I can avoid
the X's by doing `INST "c_0" ASYNC_REG = TRUE;` in the constraints .ucf
file; but that simply gets rid of the X's (in which case I do get correct
values) - however, I'd like to tackle the timing violation.

I was looking into this for a while, and I interpret it like so: the
variable c that I have in my counter code has been converted by the synthesis
process into (at least) one flip-flop for each 'bit', c_0 being the FF
corresponding to bit 0 (of variable c). After some searching, I found that
this FF has its own clk and CE inputs - the relationship between these
signals and the 'master' clock and enable is shown in the screenshot below
from isim:

http://sdaaubckp.sf.net/post/img/cntr_timing_violation.png

So, I can see that:

* wclk and wenbl are the 'master' signals, and they are synchronous (they
both rise at exactly the same time)
* The delay between wclk(wenbl) and c_0.ce is some 1.035 ns
* The delay between wclk(wenbl) and c_0.clk is some 1.179 ns
* (Thus, the delay between c_0.ce and c_0.clk is 0.144 ns)

... and so, I gather, the violation tells us that c_0.ce must be high for
at least 0.428 ns before c_0.clk goes high (i.e. the setup time); however,
that state lasted for only 0.144 ns in the simulation, so the simulator
complains.

Now, the most obvious thing would be to insert a delay of at least
0.428-0.144= 0.284 ns between c_0.clk and c_0.ce (or between c_0.clk and
wclk), and I guess then the timing violation would be gone, is that
correct?

However, the problem is that I would not want to move the first clk after
enable in the next period using the state machine - and I have no idea how
to otherwise implement such a delay of ~ 0.3 ns.

I was thinking that timing constraints in the .ucf file would help, and I
was experimenting with some `OFFSET = IN 8 ns VALID 6 ns BEFORE "clk"
RISING;` - and while this helps with Timing Analysis report errors, there
is no change in the simulator. Then again, here I want to *increase* the
(minimum) delay - and as far as I can tell, timing constraints in the .ucf
file serve to limit ("decrease") the (maximum) delay. If that is so, then
.ucf file constraints cannot help much with introducing delay, I guess...

So I was wondering - what would be the appropriate method to handle these
timing violations? And have I understood the above situation correctly?

Thanks in advance for any answers,
Cheers!


---------------------------------------
Posted through http://www.FPGARelated.com
 
Hi again,

* wclk and wenbl are the 'master' signals, and they are synchronous (they
both rise at exactly the same time)
* The delay between wclk(wenbl) and c_0.ce is some 1.035 ns
* The delay between wclk(wenbl) and c_0.clk is some 1.179 ns
* (Thus, the delay between is c_0.ce and c_0.clk is 0.144 ns)

....

Now, the most obvious thing would be to insert a delay of at least
0.428-0.144= 0.284 ns between c_0.clk and c_0.ce ...

However, the problem is that I would not want to move the first clk after
enable in the next period using the state machine - and I have no idea how
to otherwise implement such a delay of ~ 0.3 ns.
Well, I remembered the old "two inverters in cascade = delay"; so I decided
to try that:

...
-- simulate delay for enable with two inverters
ATTRIBUTE keep : STRING;
SIGNAL wi1_enbl, wi2_enbl: STD_LOGIC := 'Z';
ATTRIBUTE keep of wi1_enbl, wi2_enbl: SIGNAL IS "true" ;
begin
wi1_enbl <= not(enbl);
wi2_enbl <= not(wi1_enbl);
...

... and got the stats: wclk to .clk: 1.179 ns; wclk to .ce: 2.158 ns; .clk
to .ce: 0.979 ns! So definitely more than the 0.428 ns required (note the
delay per inverter gate would be some (0.979-0.144)/2 = 0.4175 ns; on its own
it probably wouldn't be enough - but with the 0.144 built-in from before,
if it still exists as a wire delay with these changes, it would have done
it). And, needless to say, no more X's in the output, nor violation
complaints by ISIM (at least those related to the enbl signal)...

Well, that is at least something, I guess - but is there a more appropriate
way to solve something like this?

Thanks,
Cheers!

 
sdaau <sd@n_o_s_p_a_m.n_o_s_p_a_m.imi.aau.dk> wrote:

(snip)
Now, the most obvious thing would be to insert a delay of at least
0.428-0.144= 0.284 ns between c_0.clk and c_0.ce ...
(snip)
Well, I remembered the old "two inverters in cascade = delay";
so I decided to try that:
I presume you forced it to keep the inverters, otherwise they
will usually optimize away. You might try with only one forced,
in which case it will optimize the other by inverting the signal
somewhere else. Or with a forced non-inverting gate.
(snip)

Well, that is at least something, I guess - but is there a
more appropriate way to solve something like this?
-- glen
 
On Jul 22, 11:53 am, "sdaau" <sd@n_o_s_p_a_m.n_o_s_p_a_m.imi.aau.dk>
wrote:
I am trying to implement a custom counter (with clock and enable inputs);
synthesis and behavioral & post-translate simulation pass just fine (using
ISE WebPack 13.2). On post-map simulation, I get this:

at 271179 ps(5), Instance /my_counter_test/UUT/c_0/ : Warning: /X_FF SETUP
High VIOLATION ON CE WITH RESPECT TO CLK;
  Expected := 0.428 ns; Observed := 0.144 ns; At : 271.179 ns

snip

Now, the most obvious thing would be to insert a delay of at least
0.428-0.144= 0.284 ns between c_0.clk and c_0.ce (or between c_0.clk and
wclk),
No, the most obvious thing would be to check your testbench and
validate that your inputs meet the timing requirements because that's
where the problem likely lies.

and I guess then the timing violation would be gone, is that
correct?
No it is not correct...unless you're only interested in covering up
the problem and pushing it down the road to be fixed later.


However, the problem is that I would not want to move the first clk after
enable in the next period using the state machine - and I have no idea how
to otherwise implement such a delay of ~ 0.3 ns.
In FPGAs, you can't implement controlled time delays. Delay lines are
not a primitive element in the device.

I was thinking that timing constraints in the .ucf file would help
Timing constraints should have already been specified, but if you
haven't done so yet, then yes you should specify them.

So I was wandering - what would be the appropriate method to handle these
timing violations? And have I understood the above situation correctly?
I'm guessing based on what you described from the error message to
signals in your design that you may understand the failing path, but
what you're not understanding is what really needs to be fixed. The
problem could very likely be in your testbench rather than the design
but below I've listed the basic steps you need to follow:

1. Did you enter setup time constraints for all inputs? Did you
setup clock to output delay time constraints for all outputs? (Note:
For your particular problem, the cause is likely on the 'input' side)
2. What is the basis for the time constraints in #1? The correct
answer to this question is the datasheet(s) of any device(s) that are
connected to the FPGA.
3. Are you sure you used the datasheet(s) timing constraints
properly? Setup time (Tsu) for the FPGA will be clock to output (Tco)
of the external device less any clock skew (Tskew) of the clock
(period T). In other words, the UCF file needs to specify a setup
time constraint of Tsu = T - Tskew - Tco. Repeat for each input. Do
a similar procedure for FPGA outputs.
4. Did the FPGA's timing report state that it meets all timing
constraints? The correct answer here is 'yes'. If not, iterate #1-4
until you have the correct answers to each question.
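
As a rough illustration of the Tsu = T - Tskew - Tco bookkeeping from step 3, here is a minimal VHDL sketch. The 20 ns period matches the 50 MHz clock mentioned in this thread; the skew and Tco figures are made-up placeholders, not values from any datasheet, and the entity name is hypothetical.

```vhdl
-- Sketch only: T_SKEW and T_CO are assumed placeholder values.
-- Real numbers must come from the external device's datasheet.
entity tsu_budget is
end entity;

architecture sketch of tsu_budget is
  constant T      : time := 20 ns;  -- clock period (50 MHz)
  constant T_SKEW : time := 1 ns;   -- assumed clock skew between device and FPGA
  constant T_CO   : time := 7 ns;   -- assumed clock-to-output of the external device
  -- Setup time left over for the FPGA input: Tsu = T - Tskew - Tco
  constant T_SU   : time := T - T_SKEW - T_CO;
begin
  process
  begin
    report "Setup budget for the FPGA input: " & time'image(T_SU);
    assert T_SU > 0 ns
      report "External device leaves no setup margin at this clock period"
      severity failure;
    wait;  -- run once, then suspend forever
  end process;
end architecture;
```

With these placeholder numbers the budget works out to 20 - 1 - 7 = 12 ns, which is the kind of figure that would then go into the input offset constraint in the UCF file.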

On the assumption that you've properly made it through #1-4 (and
assuming that there are no clock domain crossings), then your design
is OK. Since the design is OK, this implies the result of a timing
failure must be the testbench. The basic triage here is:
1. Verify that the inputs to the FPGA meet the requirements listed in
the FPGA's timing report output. As an example, if you have some
input that is generated synchronously, like this...
Some_Inp <= Blah_Blah_Blah when rising_edge(clock);
Then 'Some_Inp' will be transitioning 1 delta cycle (i.e. 0 ns) after
the rising edge of 'clock'. That will never meet any non-zero hold
time requirement that the FPGA timing report specified. Maybe the
testbench delays the clock like this...
Some_Inp <= Blah_Blah_Blah when rising_edge(clock);
Clock_To_Fpga <= clock;
Now the FPGA will see 'Some_Inp' and 'clock' transition at the exact
same time. Think that will meet either a setup or a hold time
requirement?
2. Although not relevant to your current problem, one would also want
to verify that you're sampling outputs at the appropriate time as
well. Usually though this is not a problem...if you did have a
mistake here though it would show up as a functional failure reported
by the testbench not a timing error reported by the post-route FPGA
design.

Since you didn't mention anything about multiple clocks in your
design, I've assumed that the design is a single clock design.
However, if there are multiple clocks then the error you reported
could be because the clock enable input is generated in one clock
domain and used to enable your counter which counts in another clock
domain. If that's the case, then your design will fail, the solution
is to resynchronize with a single flip flop the output from the source
domain into the counter's clock domain. That resynchronized signal
will be used to enable the counter.
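
A minimal sketch of the single flip-flop resynchronization described above could look like the following. The entity and port names are hypothetical and not taken from the poster's files.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch only: names are illustrative assumptions.
entity enable_resync is
  port (
    clk_cnt   : in  std_logic;  -- clock of the counter's domain
    enbl_src  : in  std_logic;  -- enable generated in the other clock domain
    enbl_sync : out std_logic   -- resynchronized enable for the counter's CE
  );
end entity;

architecture rtl of enable_resync is
  signal enbl_r : std_logic := '0';
begin
  process (clk_cnt)
  begin
    if rising_edge(clk_cnt) then
      enbl_r <= enbl_src;  -- capture the foreign-domain enable in the counter's domain
    end if;
  end process;

  enbl_sync <= enbl_r;
end architecture;
```

Many designs chain a second flip-flop after the first to reduce the chance of a metastable value reaching the counter; that is a common variation, not something stated in this post.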

Kevin Jennings
 
Hi all,


* wclk and wenbl are the 'master' signals, and they are synchronous (they
both rise at exactly the same time)

Nothing in post route rises at exactly the same time. Are these input
signals driven from your testbench? If so you need to spec a hold time from
wclk->wenable and change your testbench to add this.

Clock enables are derived from the clock so they will have a clk->Q delay
that gives them hold time. The easiest way to model this is to resync the
wenable to the falling edge of wclk.
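
A minimal testbench sketch of that idea, assuming the 50 MHz clock from this thread: the enable only ever changes on falling edges of wclk, so it is stable for half a period on either side of each rising edge. The entity name, the `done` flag and the five-cycle enable window are illustrative assumptions, and the UUT instantiation is left out.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tb_negedge_stim is
end entity;

architecture sketch of tb_negedge_stim is
  constant PERIOD : time := 20 ns;       -- 50 MHz clock
  signal wclk  : std_logic := '0';
  signal wenbl : std_logic := '0';
  signal done  : boolean   := false;     -- stops the clock so the simulation can end
begin
  -- Free-running clock until the stimulus process is finished
  wclk <= not wclk after PERIOD / 2 when not done else '0';

  stim : process
  begin
    wait until falling_edge(wclk);
    wenbl <= '1';                        -- changes half a period before the next rising edge
    for i in 1 to 5 loop
      wait until falling_edge(wclk);     -- keep enable high for five clock cycles
    end loop;
    wenbl <= '0';
    done  <= true;
    wait;
  end process;

  -- The counter under test would be instantiated here, driven by wclk and wenbl.
end architecture;
```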


The scary thing is that I think your simulation is catching the enable on
the same wclk that creates the wenable. If that's so then everything is
happening one cycle before it should. In real life if a clk creates an
enable then the enabled act occurs on the next clock.

John




 
Hi all,

First of all, thank you all for the very prompt responses, and sorry I
couldn't respond earlier. I think the crux of the matter is summed up in
@jt_eaton's comment:

Nothing in post route rises at exactly the same time.
... but I believe I should try to explain a bit what it is I'm looking
for. A bit of a mammoth post follows - apologies in advance.


For one, I have only partial knowledge of HDL, but so far I manage somehow.
My biggest problem is, basically, that when I start coding, I usually end
up confused by the "things happening in the next clock cycle" thing.

From my sequential programming background, say when I see "a=2;" in C, I
read that as: "_after the program counter passes this statement, a holds
value 2_" ... I try to relate that to HDL as in "_after the simulator
passes this posedge, a holds value 2_" - so when I code stuff with this
expectation, and I see 'action on next cycle' in the simulator, I get
thoroughly confused. Then I do my best to defeat that in behavioral simulation,
and usually I manage; then I come to post-map sim, and I realize most of
that does NOT really work.


So, I decided to study this a bit on a simpler example; for instance, for a
chip interface, I'll need a clocked counter with enable and reset. The
concept would be simple: when enable is high, increase count on clock
posedge; on reset high, do not increase count and set count to 0. For
instance, that is exactly the kind of device which is given here:

http://www.asic-world.com/vhdl/first1.html#Counter_Design_Block

I modified that code a bit (counter_aw.vhd), and used my own testbench
(test_twb.vhd), which I put here (along with some screenshots I'll refer
to):

http://sdaaubckp.sourceforge.net/post/vhd_counter_aw/

Clock is 50 MHz (period 20 ns). The "Counter_Design_Block" is architecture
'behav' in the 'counter_aw.vhd' file (uncommented). This one works under
behavioral simulation as I expect it to (aw_orig_beh_sim.png); that is,
reset of counter to 0 and its increase happen at the posedges I expect.
Results are the same for post-translate simulation (aw_orig_post-trans_sim.png)
- however, post place and route sim (post-par_sim_delayed.png) is 'delayed'
- e.g. from posedge of enable to when cout becomes high is like 30 ns
(3 clock semiperiods); however that is not the same delay throughout the sim
run!

Since I encountered this before, I tried to code "my own" counter
(architecture my_starting_point, commented), and I immediately made some
mistakes - first, the final assignment to the output port was within an
'IF', so even behavioral simulation showed everything delayed to next clock
cycle (aw_startp_beh_sim_delayed.png); after fixing that, this counter
behaves more or less the same as the previous example
(aw_startp_beh_sim_ok.png) - but the problem with it is that it is not
synthesizable (as far as I can see, the problem is using rising_edge twice
on different signals in the same process).

So, after solving that, I basically ended up with the problem described in
the original post - unfortunately, I cannot reconstruct the conditions with
the X's (that appear approx 4 ns after rise of wclk) that I got in the
original post (then again, that day my PC did crash a couple of times, so
maybe that had something to do with problems with memory for ise or isim?).
Then I got to the inverter thing, removed some of the timing violations
with it; and found that to avoid the final timing violations, 'reset'
internally would have to be effectuated 'first', 'enable' second and the
'clk' last - so I delayed the clk twice (four inverters), and enable once,
and I got to architecture my_ending_point (commented).

With my_ending_point code, the behavioral simulation
(aw_endp_beh_sim_delayed_no-ucf.png) seems fine, except that the very first
count after enable happens in "next" clock cycle -- however, post-par sim
(aw_endp_post-par_sim_delayed_ucf.png) shows that, in addition, there are
glitches - and there is almost 10 ns delay (the 'effectuation' of the count
happens almost on clk negedge)!! For the post-map sim
(post-map_sim_delayed_ucf.png) this delay seems to be less (though still 5
< x < 10 ns), but glitches are still there.

While I'm at the glitches, "Xilinx Synthesis and Simulation Design Guide"
notes:

Glitches in Your Design
When a glitch (small pulse) occurs ..., the glitch
may be passed along ..., or it may
be swallowed .. . .. To produce more
accurate simulation of how signals are propagated
within the silicon, Xilinx models this
behavior in the timing simulation netlist.
When it says "Xilinx *models*", does it mean that the glitches will be
there present "by design" of the HDL code circuit - or is it something the
simulator introduces? Meaning, should I try to eliminate them through
design, or should I just be careful if they "propagate"? Then again - I
wasn't really aware of this until now - I was reading a bit more on this,
and turns out from basics, that minimal configuration of synchronous (as in
combinatorial/unclocked) circuits (Mealy/Moore ?!) are *by default*
glitchy, and one is advised to "buffer" the result with a (clocked) FF -
which results with the actual 'effectuation' occurring on next clock cycle;
so maybe the glitches in the sim just try to illustrate this effect?

Anyways - I'm sure in my initial code I used to get somewhat less than 5 ns
delay for post-map (which is why I'm surprised slightly at the above
results), but I can't reconstruct that anymore. Which, of course, means I
haven't done something right :) I guess my question would be down to - what
am I missing, so that I can get somewhat like the aw_orig_beh_sim.png
results in post-par sim, but delayed by no more than quarter period? That,
for me, would be a confirmation that the engine should more or less work
reliably on the chip as well - but is that a correct assumption? (if not,
then I probably shouldn't bother getting so "ideal" post-map/par results,
ideal as in "results almost like behavioral sim").

I've tried putting in some timing constraints (aw_endp_counter.ucf), while
trying to get rid of static timing and ise warnings as well (synthesizer
doesn't like outputs of combinatorial logic [due to use of inverters] to be
used as clock) - but I'm not really sure what I'm doing; since as far as I
can remember, changing the constraint values didn't really result with much
difference in post-map/par simulation.

Well, I guess this is as detailed as I can formulate my problem for now ...



I presume you forced it to keep the inverters, otherwise they
will usually optimize away. You might try with only one forced,
in which case it will optimize the other by inverting the signal
somewhere else. Or with a forced non-inverting gate.
Interesting trick about keeping only one forced - I just used "attribute
KEEP" on all of the involved signals, that seems to have worked..

Now, the most obvious thing would be to insert a delay ...

No, the most obvious thing would be to check your testbench and
validate that your inputs meet the timing requirements because that's
where the problem likely lies.
...

I was thinking that timing constraints in the .ucf file would help

Timing constraints should have already been specified, but if you
haven't done so yet, then yes you should specify them.
Got it - thanks to this comment, I started looking into timing constraints
as ISE understands them (in .ucf file), but I still cannot get a proper
understanding of those..


In FPGAs, you can't implement controlled time delays. Delay lines are
not a primitive element in the device.
Got that too - but could one consider two inverters to behave as a somewhat
controlled delay (as in, the actual delay obtained by them is dependent on
how they end up being routed - but we can still know they'll insert, say,
approx 0.4 ns?)


I'm guessing based on what you described from the error message to
signals in your design that you may understand the failing path, but
what you're not understanding is what really needs to be fixed.
Exactly - this is 100% correct :)


The problem could very likely be in your testbench rather than the
design
That could indeed be the problem - @jt_eaton seems to agree ...


below I've listed the basic steps you need to follow:
Thanks for taking the time to write those up, @KJ, much appreciated!


1. Did you enter setup time constraints for all inputs? Did you
setup clock to output delay time constraints for all outputs? (Note:
For your particular problem, the cause is likely on the 'input' side)
I didn't at first; then I tried, but as I said, I'm not sure I understand
it. For instance, I have:

OFFSET = IN 6 ns VALID 8 ns BEFORE "clk" RISING;

ISE draws a sort of a diagram, and the way I interpret the diagram, the
above sentence should mean "do not allow that a data signal synchronous
with rising edge of CLK, propagates outside of 2 < x < 4 ns range"; which
is likely not correct, since I couldn't perceive anything to that effect in
simulation results.


2. What is the basis for the time constraints in #1? The correct
answer to this question is the datasheet(s) of any device(s) that are
connected to the FPGA.
Well, I have the wrong answer, unfortunately :/ Essentially, I saw the
above timing violations, and simply tried to 'translate' them to timing
constraints (as I understood them above) - that probably was not the right
way to do it. Other than that, I'm running clock @50 MHz, so I tried to
make the testbench for that - and to make the timing constraints relate to
100 MHz clock (as in - "if it works @100, it will work for 50 MHz too");
the device I'm intending to use this counter with, however, may
require a much slower counter (kHz).


3. Are you sure you used the datasheet(s) timing constraints
properly? Setup time (Tsu) for the FPGA will be clock to output (Tco)
of the external device less any clock skew (Tskew) of the clock
(period T). In other words, the UCF file needs to specify a setup
time constraint of Tsu = T - Tskew - Tco. Repeat for each input. Do
a similar procedure for FPGA outputs.
Thanks for this - I'll need to chew on this a bit more, I wasn't aware of
the "setup time constraint".


4. Did the FPGA's timing report state that it meets all timing
constraints? The correct answer here is 'yes'. If not, iterate #1-4
until you have the correct answers to each question.
Thanks for this too - I found the Implement Design/Map/"Analyze Post-Map
Static Timing"; at first it was complaining (showed red X's), then I got it
to stop (but for the most part, I was just trying different numbers around
based on the messages I got, not sure what I actually did there :) )

Actually, now that I come back to it, I can see a fail:

Timing constraint: TIMEGRP "couts" OFFSET = OUT 5 ns AFTER COMP "clk";
...
Minimum allowable offset is 6.106ns.
-------------------------------------------------------------------------------

Paths for end point cout<11> (IOB.PAD), 1 path
-------------------------------------------------------------------------------

Slack (slowest paths): -1.106ns
I guess from this, if I put OFFSET = OUT 6.2 ns, it will pass? Or is there
another way to force the synthesizer to conform to 5 ns?


On the assumption that you've properly made it through #1-4 (and
assuming that there are no clock domain crossings), then your design
is OK.
Talking about clock domain crossings - would inverting a clock four times
and "declaring" that signal as clock as well, constitute clock domain
crossing?


Since the design is OK, this implies the result of a timing
failure must be the testbench. The basic triage here is:
Many thanks for writing this up as well :)


1. Verify that the inputs to the FPGA meet the requirements listed in
the FPGA's timing report output. As an example, if you have some
input that is generated synchronously, like this...
Some_Inp <= Blah_Blah_Blah when rising_edge(clock);
Then 'Some_Inp' will be transitioning 1 delta cycle (i.e. 0 ns) after
the rising edge of 'clock'. That will *NEVER* meet any non-zero hold
time requirement that the FPGA timing report specified.
Thanks for this (emphasis mine) - as it can be seen in test_twb.vhd (from
link above), what I do is simply:

...
wenbl <= '0';
wreset <= '0';
...

... which, I guess, means "effectuate these signals in parallel/at the same
time" - and thus the 0 ns transition you're speaking of?


Maybe the testbench delays the clock like this...
Some_Inp <= Blah_Blah_Blah when rising_edge(clock);
Clock_To_Fpga <= clock;
Now the FPGA will see 'Some_Inp' and 'clock' transition at the exact
same time. Think that will meet either a setup or a hold time
requirement?
I have not used the "when" syntax so much - but I'd answer (from my
somewhat sequential programming perspective, and after the tips so far)
like this:
* Some_Inp part will "block" until rising_edge of clock; when posedge clock
occurs, it will effectuate the next statement - however after a delta of 0
ns;
** that is, Clock_To_Fpga will be effectuated "now"/"in parallel" with the
previous statement - that is on posedge of 'clk';
* FPGA will see both Some_Inp and Clock_To_Fpga change at the "same time";

* since FPGA expects that a setup time and hold time of minimum X ns has
transpired from the moment Some_Inp changes, to the moment 'Clock_To_Fpga'
changes (and, I assume, activates sampling of Some_Inp)
... hence, there will be a setup or hold timing violation - i.e. time
requirement will not be met. (?)


2. Although not relevant to your current problem, one would also want
to verify that you're sampling outputs at the appropriate time as
well. Usually though this is not a problem...if you did have a
mistake here though it would show up as a functional failure reported
by the testbench not a timing error reported by the post-route FPGA
design.
Would this be related to glitches too? I.e. if glitches occur close to
posedge sampling clock transition, I may want to 'buffer' the output, until
the next negedge for instance?


Since you didn't mention anything about multiple clocks in your
design, I've assumed that the design is a single clock design.
However, if there are multiple clocks then the error you reported
could be because the clock enable input is generated in one clock
domain and used to enable your counter which counts in another clock
domain.
Could it be, that the synthesizer recognizes the "twice inverted" clock
signal as a clock from a second domain?


If that's the case, then your design will fail, the solution
is to resynchronize with a single flip flop the output from the source
domain into the counter's clock domain. That resynchronized signal
will be used to enable the counter.
Would that resynchronization be like the 'buffering' for the minimal
Moore/Mealy glitching mentioned above? If so, then it would 'delay' the
'effectuation' of values until next clock cycle, right?


* wclk and wenbl are the 'master' signals, and they are synchronous
(they
both rise at exactly the same time)

Nothing in post route rises at exactly the same time.
Thanks for that - I guess now, I'm better aware of that; but when the
thread started I wasn't. Can this also be interpreted as: "Nothing in post
route should rise at exactly the same time" (as far as signals from the
testbench are concerned)?


Are these input
signals driven from your testbench?
Yup.


If so you need to spec a hold time from
wclk->wenable and change your testbench to add this.
Many thanks for that - see, *that* I wasn't aware of ... Will have to look
that up.


Clock enables are derived from the clock so they will have a clk->Q
delay
that gives them hold time.
Ok, that makes sense - much appreciated :)


The easiest way to model this is to resync the
wenable to the falling edge of wclk.
Makes a lot of sense now - will give it a shot. I know the answer is
probably yes - but in that case, do I again have to worry about timing
constraints?


The scary thing is that I think your simulation is catching the enable
on
the same wclk that creates the wenable.
I think that is correct - actually, it seems it does perceive some delay
between the wenable and the wclk, but (I guess) not enough.


If thats so then everything is
happening one cycle before it should. In real life if a clk creates an
enable then the enabled act occurs on the next clock.
Thanks for that - the occurring on "next clock" was exactly what I wanted
to avoid; and it seems, with all the "inverter delays" and such, what I
managed to do is move everything to happen "one cycle before it should" :)



In any case, to sum up - while I'm starting to see why "update on next
clock" is so important - is it still possible (or smart) to aim for updates
occurring at least earlier than a semiperiod *before* the 'next' clock (and
this is simply for my own perceptual ease in reading simulation results:
then it would be easier for me to read, if I get the value I expect in
*this* cycle)?


Thanks again for the awesome guidance,
Cheers!



 
Hi all,

Just a followup to the previous post, as it seems I got some kind of
closure:

So, I decided to study this a bit on a simpler example; ...

http://sdaaubckp.sourceforge.net/post/vhd_counter_aw/
I have now uploaded counter_aw2.vhd, test_twb2.vhd and aw2.ucf on the same
location (no changes in previous files).


First of all, I got a hint at "Buffer_type BUFG ignored? #8"
http://forums.xilinx.com/t5/Implementation/Buffer-type-BUFG-ignored/m-p/167766#M350
about synthesis of asynchronous vs synchronous reset counter. So I decided
to take a look at the synchronous version from the start (counter_aw2.vhd,
uncommented part).

Again behavioral sim was as I expected it to be, and post-map sim showed X's
and timing violation. Now, the thing is that I wrote the testbench
more or less randomly, just tossing arbitrary signal changes and WAIT delays,
just to see 'in general' how the resulting circuit would behave. So, I had
ended up enforcing 'clk' and 'enable' testbench signals to change at the
*same* moment in time. That may be good enough for behavioral sim, but not
for post-map - and to confirm the previous posts, this is the essence of
the problem.

So I just inserted a 'WAIT for 1 ns' delay in the testbench
(test_twb2.vhd), and looped the rest of the signals - and there were no
more timing violations in post-map. (Note that if there is no such
explicit loop, the whole process will loop, moving the phase by 1 ns each
iteration, and eventually causing the phase between 'enable' and 'clk' to
be zero again, thus causing periodic timing violations.) Then I tried
delaying for PERIOD-1ns (just to have the enable rise just before clk),
and that worked fine as well.
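
For readers without access to test_twb2.vhd, the offset idea can be sketched roughly as below. This is not the actual testbench file: the entity name, the `T_OFFSET` constant, the `done` flag and the four-toggle loop are assumptions for illustration, and re-aligning to the clock edge on every iteration is one possible way to keep the phase from drifting.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tb_offset_stim is
end entity;

architecture sketch of tb_offset_stim is
  constant PERIOD   : time := 20 ns;     -- 50 MHz clock
  constant T_OFFSET : time := 1 ns;      -- stimulus delay after the rising edge
  signal wclk  : std_logic := '0';
  signal wenbl : std_logic := '0';
  signal done  : boolean   := false;     -- stops the clock so the simulation can end
begin
  -- Free-running clock until the stimulus process is finished
  wclk <= not wclk after PERIOD / 2 when not done else '0';

  stim : process
  begin
    for i in 1 to 4 loop
      wait until rising_edge(wclk);      -- re-align to the clock every iteration
      wait for T_OFFSET;                 -- so the 1 ns phase does not accumulate
      wenbl <= not wenbl;                -- enable never changes together with wclk
    end loop;
    done <= true;
    wait;
  end process;

  -- The counter under test would be instantiated here, driven by wclk and wenbl.
end architecture;
```

Setting `T_OFFSET` to `PERIOD - 1 ns` would give the second variant mentioned above, with the enable changing just before the next rising edge.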

Now, my guess is that, if I was working with external clk and enable, I
couldn't just delay the signals for as many nanoseconds as I please, just
to avoid timing violations - so I'd have to work according to some spec.
However, in my case, the only external signal is the clock, and
enable and reset would be calculated from it - hence there will be some
inherent delay between clk and enable; and that would further limit the
usage of the counter. While talking of 'limitations': expanding the data
out in bit lines and zooming in (in isim) will reveal that the glitches are
due to different propagation times of individual bit changes (so they occur
only between particular value transitions) - so, in fact, nothing strange
there (as I first thought) :)

Measuring (in isim) the time between the clk posedge and change of data
(cout) will reveal about 5.5 ns delay. Just for the heck of it, I tried to
limit that with a timing constraint in the .ucf file:

...
TIMEGRP "couts" OFFSET = OUT 4 ns AFTER COMP "clk";

... and immediately after, post-map static analysis failed:

-- Timing constraint: TIMEGRP "couts" OFFSET = OUT 4 ns AFTER COMP "clk";

-- 16 paths analyzed, 16 endpoints analyzed, 16 failing endpoints
-- 16 timing errors detected.
-- Minimum allowable offset is 5.878ns.

So, I guess this tells me: if I sample the cout 6 ns after clk posedge, I
should have safe cout data; so for the testbench clock @50 MHz, I could
initiate count at clk posedge, and consider to have the right data @ next
negedge, 10 ns after. However, even with a separate process:

process(clk)
begin
  if falling_edge(clk) then
    cout <= std_logic_vector(pre_count);
  end if;
end process;

... glitches will still be there in post-map. And I guess, even if
'buffered' one more time, they will still occur: my guess is, synthesizer
would instantiate FF for the bit buffer anyways, so there will be still
different length routes to them -> different delays -> glitch. So the min
6ns wait would be needed in respect to 'when does the rest of the engine
read this data' (rather than, 'when to read for a buffer' to avoid seeing
glitches altogether).


In the end, even if I could somehow mask the glitches and avoid seeing
them, metastability is still inherent
(http://www.asic-world.com/tidbits/metastablity.html) in reality; so I
guess this is as good as it gets in post-map sim (given that my testbench
is 'arbitrarily written'; and the only constraint I have in the aw2.ucf is
clock @100 MHz).

And, of course: getting rid of the X's and timing violations in post-map
sim, doesn't mean that post-route sim will be just as well behaving :) But
at least I have some sort of understanding from a simple example to go
along with, when tackling that - thank you all for the help!

Cheers!

