Clock dividers and #1 delays.

Vikas Mishra

Hello Folks,

I am having some trouble understanding the operation of Verilog
simulators w.r.t. the design I want to implement below.

(I have drawn the diagram such that this should look ok with a fixed
width font - can't verify it looks correct using google group - but
basically it is a set of two flops where the first flop has a counter
at the input and flop A is clocked by clk, and flop B is clocked by
clk_by_2.)

     +-----------+       +---------+                   +---------+
     |           |       |         |                   |         |
-----+  Counter  +-------+         +-------------------+         |
     |           |       |         |                   |         |
     |           |       | Flop A  |                   | Flop B  |
     +-----------+       |         |                   |         |
                         |    /\   |                   |    /\   |
                         +----+----+                   +----+----+
                              |                             |
                              |                             |
                              |                             |
                              |                             |
                              |                             |
                              |    +----------------+       |
                              |    |                |       |
                --------------+----+  Divide by 2   +-------+
                                   |                |
                                   +----------------+

In this design, Flop A is clocked by clk and Flop B is clocked by
clk_by_2. I modeled this in Verilog without using any #1 delays. My
observation with ModelSim, VCS and NCSim is that Flop B latches the
data launched by Flop A in the same cycle. The simulator AEs justify
this by saying that the Divide by 2 delays the clock by a delta, and
because of this Flop B latches the data in the same clock.
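
For readers following along, here is a minimal sketch of the kind of model that shows this behaviour. It is a hypothetical reconstruction (the actual code was never posted), and the module name, signal names and 4-bit width are all assumptions. The divider uses a nonblocking assignment, so clk_by_2 changes in the same update step as the output of Flop A, and Flop B is only triggered after that update.

```verilog
`timescale 1ns/1ps
// Hypothetical reconstruction of the design in the diagram.
// All names and widths are illustrative, not the poster's code.
module race_demo;
  reg       clk      = 1'b0;
  reg       clk_by_2 = 1'b0;
  reg [3:0] count    = 4'd0;   // counter feeding Flop A
  reg [3:0] a        = 4'd0;   // Flop A output
  reg [3:0] b        = 4'd0;   // Flop B output

  always #5 clk = ~clk;        // source clock, 10 ns period

  // Divide by 2 with a nonblocking assignment: clk_by_2 is updated in the
  // nonblocking update step, together with count and a.
  always @(posedge clk) clk_by_2 <= ~clk_by_2;

  always @(posedge clk)      count <= count + 4'd1;  // counter
  always @(posedge clk)      a     <= count;         // Flop A
  always @(posedge clk_by_2) b     <= a;             // Flop B, triggered a delta later

  initial begin
    $monitor("t=%0t clk_by_2=%b count=%0d a=%0d b=%0d",
             $time, clk_by_2, count, a, b);
    #100 $finish;
  end
endmodule
```

If b follows a within the same clk edge in the printed trace, that is the same-cycle capture described above.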

I find this rather strange, since this is one of the cases in which
not adding a #1 can result in a simulation/synthesis mismatch
(the simulator being wrong, obviously).

Is this expected? What guidelines do you folks follow to make sure
that this simulates and synthesizes correctly? Do you always add an
incremental delay to the data to make sure that it simulates as you
would expect?

I can post the Verilog if someone would like to look at it.

Thanks for the help in advance.

Regards,
Vikas
 
Vikas Mishra <vikas.mishra@gmail.com> wrote:

> I am having some trouble understanding the operation of Verilog
> simulators w.r.t. the design I want to implement below.

Most logic now has zero hold time FF's...

> (I have drawn the diagram such that this should look ok with a fixed
> width font - can't verify it looks correct using google group - but
> basically it is a set of two flops where the first flop has a counter
> at the input and flop A is clocked by clk, and flop B is clocked by
> clk_by_2.)
>
>      +-----------+       +---------+                   +---------+
>      |           |       |         |                   |         |
> -----+  Counter  +-------+         +-------------------+         |
>      |           |       | Flop A  |                   | Flop B  |
>      +-----------+       |         |                   |         |
>                          |    /\   |                   |    /\   |
>                          +----+----+                   +----+----+
>                               |                             |
>                               |    +----------------+       |
>                               |    |                |       |
>                 --------------+----+  Divide by 2   +-------+
>                                    |                |
>                                    +----------------+
>
> In this design, Flop A is clocked by clk and Flop B is clocked by
> clk_by_2. I modeled this in Verilog without using any #1 delays. My
> observation with ModelSim, VCS and NCSim is that Flop B latches the
> data launched by Flop A in the same cycle. The simulator AEs justify
> this by saying that the Divide by 2 delays the clock by a delta, and
> because of this Flop B latches the data in the same clock.
>
> I find this rather strange, since this is one of the cases in which
> not adding a #1 can result in a simulation/synthesis mismatch
> (the simulator being wrong, obviously).

For synthesis you are responsible for meeting the setup/hold time
for the logic family in use. It looks like your logic might fail
to meet the setup time in actual hardware.

In older verilog, using #1 was common when using blocking
assignment to generate the appropriate result. (I believe that
was true for a long time after non-blocking assignment was available.)
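
As an illustration of the #1 idiom, here is a minimal sketch. It uses a nonblocking assignment with an intra-assignment delay on Flop A, which is one common form of the idiom and an assumption of this sketch rather than something taken from the thread; all names and widths are hypothetical. Because a only changes 1 ns after the clk edge, Flop B, which is triggered a delta after that edge, still reads the previous value of a. Synthesis tools generally ignore such delays.

```verilog
`timescale 1ns/1ps
// Hypothetical sketch: same structure as the diagram, with a #1 on Flop A.
module delay_demo;
  reg       clk      = 1'b0;
  reg       clk_by_2 = 1'b0;
  reg [3:0] count    = 4'd0;
  reg [3:0] a        = 4'd0;   // Flop A output
  reg [3:0] b        = 4'd0;   // Flop B output

  always #5 clk = ~clk;                              // 10 ns period

  always @(posedge clk)      clk_by_2 <= ~clk_by_2;  // divide by 2
  always @(posedge clk)      count    <= count + 4'd1;
  always @(posedge clk)      a        <= #1 count;   // a changes 1 ns after the edge
  always @(posedge clk_by_2) b        <= a;          // reads a before it changes

  initial begin
    $monitor("t=%0t clk_by_2=%b count=%0d a=%0d b=%0d",
             $time, clk_by_2, count, a, b);
    #100 $finish;
  end
endmodule
```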

> Is this expected? What guidelines do you folks follow to make sure
> that this simulates and synthesizes correctly? Do you always add an
> incremental delay to the data to make sure that it simulates as you
> would expect?

Best is to write synchronous logic, which doesn't clock FF's from
the output of other FF's or gated clocks.

-- glen
 
Hello Glen,

On Nov 4, 2:08 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
<snipped>
> For synthesis you are responsible for meeting the setup/hold time
> for the logic family in use.  It looks like your logic might fail
> to meet the setup time in actual hardware.

I am not sure if I explained this clearly - I don't have a problem in
synthesis. In synthesis, I will close all timing parameters to ensure
that this can meet timing. This is purely a simulation problem for
me.

> In older verilog, using #1 was common when using blocking
> assignment to generate the appropriate result.  (I believe that
> was true for a long time after non-blocking assignment was available.)
>
>> Is this expected? What guidelines do you folks follow to make sure
>> that this simulates and synthesizes correctly? Do you always add an
>> incremental delay to the data to make sure that it simulates as you
>> would expect?
>
> Best is to write synchronous logic, which doesn't clock FF's from
> the output of other FF's or gated clocks.

I agree with your comment - however this is an example I cooked up to
simplify the behavior that I was trying to understand and that is why
this looks so simplistic. The actual design is a wee bit more involved
than this. Also this happens to be the spec that I need to latch a
signal in the divided clock domain.

Regards,
Vikas
 
Vikas Mishra <vikas.mishra@gmail.com> wrote:

> On Nov 4, 2:08 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> <snipped>
>> For synthesis you are responsible for meeting the setup/hold time
>> for the logic family in use. It looks like your logic might fail
>> to meet the setup time in actual hardware.

> I am not sure if I explained this clearly - I don't have a problem in
> synthesis. In synthesis, I will close all timing parameters to ensure
> that this can meet timing. This is purely a simulation problem for
> me.

Yes, but it is your responsibility to get that right. In the case
given, it is not obvious whether the clock will arrive before, after,
or during the input transition. (It depends on the exact routing.)

(snip)

>> Best is to write synchronous logic, which doesn't clock FF's from
>> the output of other FF's or gated clocks.

> I agree with your comment - however this is an example I cooked up to
> simplify the behavior that I was trying to understand and that is why
> this looks so simplistic. The actual design is a wee bit more involved
> than this. Also this happens to be the spec that I need to latch a
> signal in the divided clock domain.

More usual in synchronous design is a FF with an enable input.
From the logic shown, I believe that you could use the divided
clock as an enable input to the second FF. You would still have
to watch setup/hold on the enable, though.
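
A minimal sketch of the enable approach, assuming a single clock domain and a toggling enable in place of clk_by_2. The module name, port names, 4-bit width and the synchronous reset are all hypothetical choices made for the sketch. Every flop sees the same clk, so all right-hand sides are sampled before any nonblocking update and Flop B always loads the previous value of Flop A.

```verilog
// Hypothetical sketch of the clock-enable alternative.
module enable_demo (
  input  wire       clk,
  input  wire       rst,   // synchronous reset (an assumption of this sketch)
  input  wire [3:0] d,     // stand-in for the counter output
  output reg  [3:0] b      // Flop B output
);
  reg       en;            // toggles every clk cycle, takes the place of clk_by_2
  reg [3:0] a;             // Flop A output

  always @(posedge clk) begin
    if (rst) begin
      en <= 1'b0;
      a  <= 4'd0;
      b  <= 4'd0;
    end else begin
      en <= ~en;
      a  <= d;             // Flop A loads every cycle
      if (en) b <= a;      // Flop B loads every other cycle, from the old a
    end
  end
endmodule
```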

Otherwise, you use post route timing data. I believe that means
that the appropriate #1 (or more) delays are a reasonable solution.

-- glen
 
On Nov 4, 5:16 am, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> Vikas Mishra <vikas.mis...@gmail.com> wrote:
>
>> On Nov 4, 2:08 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
>> <snipped>
>>> For synthesis you are responsible for meeting the setup/hold time
>>> for the logic family in use. It looks like your logic might fail
>>> to meet the setup time in actual hardware.
>> I am not sure if I explained this clearly - I don't have a problem in
>> synthesis. In synthesis, I will close all timing parameters to ensure
>> that this can meet timing. This is purely a simulation problem for
>> me.
>
> Yes, but it is your responsibility to get that right. In the case
> given, it is not obvious whether the clock will arrive before, after,
> or during the input transition. (It depends on the exact routing.)
>
> (snip)
>
>>> Best is to write synchronous logic, which doesn't clock FF's from
>>> the output of other FF's or gated clocks.
>> I agree with your comment - however this is an example I cooked up to
>> simplify the behavior that I was trying to understand and that is why
>> this looks so simplistic. The actual design is a wee bit more involved
>> than this. Also this happens to be the spec that I need to latch a
>> signal in the divided clock domain.
>
> More usual in synchronous design is a FF with an enable input.
> From the logic shown, I believe that you could use the divided
> clock as an enable input to the second FF.  You would still have
> to watch setup/hold on the enable, though.
>
> Otherwise, you use post route timing data.  I believe that means
> that the appropriate #1 (or more) delays are a reasonable solution.
>
> -- glen

If you're trying to model two clocks whose rising edges are
coincident (for real hardware this means within the required skew
to meet hold time) then you should probably do something like:

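initial begin clk_2x = 0; clk_1x = 0; end // known start values, otherwise both stay X
always #5 clk_2x = !clk_2x; // presuming 100 MHz (1 ns time unit)
always @(posedge clk_2x) clk_1x = !clk_1x; // use blocking assign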

The blocking assign then switches clk_1x before the non-blocking
assignments are made and should fix the mismatch you're seeing.
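
To make the suggestion concrete, here is a minimal self-contained sketch that puts the blocking divider next to the two flops from the original diagram. The module name, the flop and counter names and the 4-bit width are assumptions added for illustration. Since clk_1x toggles immediately when the divider block runs, Flop B is triggered and reads a before any nonblocking update from the same clk_2x edge has been applied, so it captures the previous value of Flop A.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench sketch; only the two clock names come from the post.
module blocking_div_demo;
  reg       clk_2x = 1'b0;
  reg       clk_1x = 1'b0;
  reg [3:0] count  = 4'd0;
  reg [3:0] a      = 4'd0;   // Flop A output
  reg [3:0] b      = 4'd0;   // Flop B output

  always #5 clk_2x = ~clk_2x;                      // 100 MHz with a 1 ns time unit

  // Blocking assignment: clk_1x changes as soon as this block executes,
  // before the nonblocking updates of count and a take effect.
  always @(posedge clk_2x) clk_1x = ~clk_1x;

  always @(posedge clk_2x) count <= count + 4'd1;  // counter
  always @(posedge clk_2x) a     <= count;         // Flop A
  always @(posedge clk_1x) b     <= a;             // Flop B samples the old a

  initial begin
    $monitor("t=%0t clk_1x=%b count=%0d a=%0d b=%0d",
             $time, clk_1x, count, a, b);
    #100 $finish;
  end
endmodule
```

The blocking assignment here is confined to clock generation; the data flops still use nonblocking assignments.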

HTH,
Gabor
 
