Uncovering potential clock-domain related issues in design

googler
I am working on a design where I am dealing with two clock domains and
a large number of signals are crossing domains. I have followed the
general principles for multi-clock design (like using synchronizers,
using Gray pointers for FIFOs etc). However, I want to make sure that
I didn't leave any hole that might be a potential issue later on.
Currently I am running RTL simulation, but I think clock domain
related issues cannot be found through RTL simulation - is that right?
Is there a way I can still verify if clock domain related logic in my
code is fine, particularly at this stage of running RTL simulation? I
know most such issues are usually caught during gate-level simulation
(especially SDF GLS), but I don't want to wait that long.

I am interested to know how experienced designers uncover potential
issues related to multiple clock domains. Any special verification
technique or maybe some tools (like 0-in probably)? Thanks for any
advice.
 
googler wrote:

> Currently I am running RTL simulation, but I think clock domain
> related issues cannot be found through RTL simulation - is that right?

True, but I can find the *logical* clock frequency
margins of my design with RTL simulation.

For example, I might like to know how low my system clock
frequency can go before I miss a bit on the serial receive channel.

If I can't force such a failure, my testbench has a problem.

> Is there a way I can still verify if clock domain related logic in my
> code is fine, particularly at this stage of running RTL simulation?

Lots of long-term bench tests at maximum bandwidth can help,
but this must really be guaranteed by design rules.

I use "known good" synchronization blocks between
synchronous islands.

When a single clock design is impossible,
my design process includes the following steps:
1. Partition the design into one top module or entity per clock.
2. All module designs synchronize and strobify non-native inputs.
3. Fast module designs stretch and handshake outputs to slow modules.
4. Run static timing for each module to verify Fmax for each clock.
5. Run a functional simulation for each top module.
6. Run an overall functional simulation.

> I know most such issues are usually caught during gate-level simulation
> (especially SDF GLS), but I don't want to wait that long.

I find it very difficult to catch *any* synchronization issues this way.

-- Mike Treseler
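
The idea of forcing a failure by lowering the clock frequency, and the reason fast modules stretch their outputs (step 3), can be sketched in a few lines of Python. This is an idealized model and not Mike's testbench: the function names are made up for the illustration, times are plain integers, and setup/hold windows and metastability are ignored. A pulse of a given width is sampled by a clock of a given period, and the sweep raises the period until some clock phase misses the pulse.

```python
def pulse_captured(width, period, phase, start=0):
    """Return True if a sampling edge lands inside the pulse [start, start + width).

    Sampling edges occur at phase + k * period. All times are integers
    (for example picoseconds); this is an assumption of the sketch.
    """
    # Ceiling division: index of the first edge at or after the pulse start.
    k = -(-(start - phase) // period)
    first_edge = phase + k * period
    return first_edge < start + width


def always_captured(width, period):
    """True if the pulse is seen for every integer phase of the sampling clock."""
    return all(pulse_captured(width, period, phase) for phase in range(period))


def longest_safe_period(width, max_period):
    """Sweep the sampling period upward until some phase misses the pulse."""
    safe = 0
    for period in range(1, max_period + 1):
        if not always_captured(width, period):
            break
        safe = period
    return safe


if __name__ == "__main__":
    print(longest_safe_period(width=10, max_period=30))
```

In this model the sweep stops once the sampling period exceeds the pulse width, which is why a pulse leaving a fast domain has to be stretched to at least one period of the slower clock (plus margin in real hardware).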
 
On Jun 5, 11:43 pm, googler <pinaki_...@yahoo.com> wrote:
> I am working on a design where I am dealing with two clock domains and
> a large number of signals are crossing domains. I have followed the
> general principles for multi-clock design (like using synchronizers,
> using Gray pointers for FIFOs etc). However, I want to make sure that
> I didn't leave any hole that might be a potential issue later on.
> Currently I am running RTL simulation, but I think clock domain
> related issues cannot be found through RTL simulation - is that right?
> Is there a way I can still verify if clock domain related logic in my
> code is fine, particularly at this stage of running RTL simulation? I
> know most such issues are usually caught during gate-level simulation
> (especially SDF GLS), but I don't want to wait that long.
>
> I am interested to know how experienced designers uncover potential
> issues related to multiple clock domains. Any special verification
> technique or maybe some tools (like 0-in probably)? Thanks for any
> advice.

Try a static design rule checker. LEDA (Synopsys) and SpyGlass-CDC
(Atrenta) are a couple that immediately come to mind. These tools
identify clock domain crossings by statically examining the topology
of your design (no simulation). They then look at the crossings and
check whether or not you have synchronization logic between the
domains.

I think formal tools like 0-in often have CDC checking capabilities
as well, but you will probably be under-utilizing the tool (or
overpaying, depending on how you look at it). That is not to say they
don't work, but formal tools can do so much more than CDC checking
that if that's all you use them for, you won't be getting your money's
worth.

-Ryan
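
A toy version of such a topological check, sketched in Python, may make the idea concrete. It is not how LEDA or SpyGlass-CDC work internally; the `Flop` record, the netlist format, and the "two back-to-back flops with a single fan-in" rule are all assumptions made for this illustration. Each flop lists its clock and the flops that drive its D input, and a crossing is flagged unless it lands on a simple two-flop synchronizer.

```python
from dataclasses import dataclass, field


@dataclass
class Flop:
    clock: str
    fanin: list = field(default_factory=list)  # names of flops driving the D input


def find_crossings(flops):
    """List (source, destination) pairs whose clocks differ."""
    return [(src, dst)
            for dst, flop in flops.items()
            for src in flop.fanin
            if flops[src].clock != flop.clock]


def is_two_flop_sync(flops, dst):
    """Assumed rule: dst has one driver and feeds exactly one same-clock flop."""
    first = flops[dst]
    if len(first.fanin) != 1:
        return False  # combinational merging in front of the first stage
    loads = [name for name, flop in flops.items() if dst in flop.fanin]
    if len(loads) != 1:
        return False  # first stage output is used before it has settled
    second = flops[loads[0]]
    return second.clock == first.clock and second.fanin == [dst]


def unsynchronized(flops):
    """Crossings that do not land on a two-flop synchronizer."""
    return [(s, d) for s, d in find_crossings(flops) if not is_two_flop_sync(flops, d)]


# Hypothetical netlist: a_req is synchronized, a_data is not.
NETLIST = {
    "a_req": Flop("clk_a"),
    "a_data": Flop("clk_a"),
    "sync1": Flop("clk_b", ["a_req"]),
    "sync2": Flop("clk_b", ["sync1"]),
    "b_state": Flop("clk_b", ["sync2", "a_data"]),
}

if __name__ == "__main__":
    for src, dst in unsynchronized(NETLIST):
        print(f"unsynchronized crossing: {src} -> {dst}")
```

No simulation is involved: the check only walks the connectivity, which is why this class of tool can run as soon as the RTL elaborates.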
 
On Jun 8, 1:03 pm, Mike Treseler <mtrese...@gmail.com> wrote:
> googler wrote:
>> Currently I am running RTL simulation, but I think clock domain
>> related issues cannot be found through RTL simulation - is that right?
>
> True, but I can find the *logical* clock frequency
> margins of my design with RTL simulation.
>
> For example, I might like to know how low my system clock
> frequency can go before I miss a bit on the serial receive channel.
>
> If I can't force such a failure, my testbench has a problem.
>
>> Is there a way I can still verify if clock domain related logic in my
>> code is fine, particularly at this stage of running RTL simulation?
>
> Lots of long-term bench tests at maximum bandwidth can help,
> but this must really be guaranteed by design rules.
>
> I use "known good" synchronization blocks between
> synchronous islands.
>
> When a single clock design is impossible,
> my design process includes the following steps:
> 1. Partition the design into one top module or entity per clock.
> 2. All module designs synchronize and strobify non-native inputs.
> 3. Fast module designs stretch and handshake outputs to slow modules.
> 4. Run static timing for each module to verify Fmax for each clock.
> 5. Run a functional simulation for each top module.
> 6. Run an overall functional simulation.
>
>> I know most such issues are usually caught during gate-level simulation
>> (especially SDF GLS), but I don't want to wait that long.
>
> I find it very difficult to catch *any* synchronization issues this way.
>
> -- Mike Treseler

Gate level timing simulations find two problems that are not found in
RTL simulation and static timing analysis:

1) Tool errors where the synthesis/p&r tool did not generate hardware
that behaves like the RTL.

2) Constraint errors such as inaccurate multi-cycle and false path
constraints.

The former is a risk factor: how much do you trust your tool, and what
is the trade-off between time spent in gate level simulations and time
spent finding the problem in the lab (this almost always tilts toward
gate level sims with ASICs, but often toward the lab with FPGAs).

The latter is a must-do. Either avoid those constraints and force the
s/p/r tools to treat every path as single clock cycle, or run gate
level sims designed to demonstrate the correctness of those
constraints (i.e. focus the gate level simulations on those areas
containing the constraints). The only alternative is to use formal
verification tools that can prove/disprove multi-cycle and false path
constraints.

Andy
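
One way to picture what "demonstrating the correctness" of a multi-cycle constraint means is a trace check like the Python sketch below. It is only an illustration under stated assumptions: the function name and the trace format (lists of clock-cycle numbers at which the source register changes and at which the destination register captures) are hypothetical and not taken from any tool. An N-cycle constraint is only honest if the destination never captures fewer than N cycles after the source last changed.

```python
def multicycle_violations(src_change_cycles, dst_capture_cycles, n):
    """Return (launch, capture) pairs that break an n-cycle multi-cycle assumption.

    src_change_cycles: cycles at which the source register changes value.
    dst_capture_cycles: cycles at which the destination register is enabled.
    Both lists are hypothetical trace data extracted from a simulation.
    """
    violations = []
    for capture in dst_capture_cycles:
        launches = [c for c in src_change_cycles if c < capture]
        if not launches:
            continue  # nothing has been launched yet, nothing to check
        last_launch = max(launches)
        if capture - last_launch < n:
            violations.append((last_launch, capture))
    return violations


if __name__ == "__main__":
    # Source changes at cycles 0 and 8; destination captures at cycles 4 and 9.
    print(multicycle_violations([0, 8], [4, 9], n=2))
```

In the example the capture at cycle 9 comes only one cycle after the change at cycle 8, so a two-cycle constraint on that path would be wrong even though static timing would report it as met.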
 
Andy wrote:

> Gate level timing simulations find two problems that are not found in
> RTL simulation and static timing analysis:
>
> 1) Tool errors where the synthesis/p&r tool did not generate hardware
> that behaves like the RTL.

Yes. A good check-off test because while the odds
of a synthesis error are small, the odds of
finding such an error with a gate sim,
if it happens, are very good.

> 2) Constraint errors such as inaccurate multi-cycle and false path
> constraints.

A gate sim is less effective in this case
unless the testbench covers static timing.

> Either avoid those constraints and force the
> s/p/r tools to treat every path as single clock cycle, or run gate
> level sims designed to demonstrate the correctness of those
> constraints

I prefer to avoid these constraints.

They are always confusing, never portable,
difficult to specify, and hard to verify.

I have yet to see an fpga case where such a constraint
worked better, or consumed fewer resources
than an equivalent pipelined version of the same design.

-- Mike Treseler
 
CDC checks include both topological exploration and functional
verification.
With the topological exploration, it is important to identify all
cross-domain paths.
With functional verification, it is important to verify that cross-
domain effects will not cause functional failures. For this purpose,
it is important to model cross-domain effects as well as develop cross-
domain-specific checks for each one of the cross-domain paths.

-Alex
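
As a rough illustration of "modeling cross-domain effects", the Python sketch below enumerates what a destination clock could sample while a multi-bit bus is changing. The model is an assumption made for this example: every bit that differs between the old and new value may independently be seen as old or new. The function names are hypothetical. The cross-domain check is then simply that the sampled value must be either the old or the new value.

```python
from itertools import product


def possible_samples(old, new, width):
    """All values the receiver might see while the bus moves from old to new.

    Assumed model: each changing bit independently resolves to its old or
    its new value because of skew between the bits.
    """
    changing = [bit for bit in range(width) if ((old ^ new) >> bit) & 1]
    results = set()
    for choice in product((0, 1), repeat=len(changing)):
        value = old
        for bit, take_new in zip(changing, choice):
            if take_new:
                value ^= 1 << bit
        results.add(value)
    return results


def crossing_is_safe(old, new, width):
    """Cross-domain check: the receiver may only ever see old or new."""
    return possible_samples(old, new, width) <= {old, new}


def to_gray(n):
    """Standard binary-to-Gray conversion."""
    return n ^ (n >> 1)


if __name__ == "__main__":
    # Binary pointer stepping from 7 to 8: four bits change at once.
    print(crossing_is_safe(7, 8, width=4))
    # The same step with Gray-coded pointers: only one bit changes.
    print(crossing_is_safe(to_gray(7), to_gray(8), width=4))
```

The binary step fails the check because any of the sixteen 4-bit values could be sampled, while the Gray-coded step passes, which is the usual argument for Gray pointers in asynchronous FIFOs.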



 
googler wrote:
> Currently I am running RTL simulation, but I think clock domain
> related issues cannot be found through RTL simulation - is that right?
> Is there a way I can still verify if clock domain related logic in my
> code is fine, particularly at this stage of running RTL simulation?

You might see some of the problems at the RTL level, but most of the
hard-to-find problems are not visible in RTL simulations
(metastability, bus synchronization, etc.).

> I know most such issues are usually caught during gate-level simulation
> (especially SDF GLS), but I don't want to wait that long.

You might see a few more problems in gate level simulations, but you
will not see all the problems at gate level either. Also, asynchronous
domain crossings usually create a bunch of warnings in simulation that
are OK in reality. Filtering out all the false positives correctly is
a tough job.

> I am interested to know how experienced designers uncover potential
> issues related to multiple clock domains. Any special verification
> technique or maybe some tools (like 0-in probably)? Thanks for any
> advice.

I have used 0-in CDC; it quite quickly found a few very hard-to-find
problems, and the tool was quite easy to use. I used only the static
features, though, and only briefly tried the simulation features. There
are also other static tools for this from other vendors, but the good
tools are not cheap. I would recommend static tools for CDC checking.

--Kim
 
Mike Treseler wrote:
> Andy wrote:
>
>> 1) Tool errors where the synthesis/p&r tool did not generate hardware
>> that behaves like the RTL.
>
> Yes. A good check-off test because while the odds
> of a synthesis error are small, the odds of
> finding such an error with a gate sim,
> if it happens, are very good.

I would say that the odds of a synthesis error in FPGA tools are quite
high. I have seen at least one such error in each bigger design.
Fortunately, for FPGAs the easiest way to test for such errors is in
the lab, since gate level simulations are such a pain to set up. Gate
simulations are more critical for ASIC flows.

>> 2) Constraint errors such as inaccurate multi-cycle and false path
>> constraints.
>
> A gate sim is less effective in this case
> unless the testbench covers static timing.

Usually constraint errors are easy to catch in gate simulations. They
are most commonly in the internal logic of the design, where each cell
has timing checks and the paths have delays, so errors in the false
paths and multicycle paths usually show up. Even some missing rules
might become visible. Things like duty cycle checks of a small clock
tree in an ASIC can easily be forgotten from STA scripts, but the
error is quickly visible in gate sims.

>> Either avoid those constraints and force the
>> s/p/r tools to treat every path as single clock cycle, or run gate
>> level sims designed to demonstrate the correctness of those
>> constraints
>
> I prefer to avoid these constraints.
>
> They are always confusing, never portable,
> difficult to specify, and hard to verify.

There are static tools to verify multicycle constraints, and some
tools will also automatically try to find all possible multicycle and
false paths.

> I have yet to see an fpga case where such a constraint
> worked better, or consumed fewer resources
> than an equivalent pipelined version of the same design.

Pipelining needs more FFs; multicycle paths save some resources. They
are just fine if used very carefully.

--Kim
 
Kim Enkovaara wrote:

> I have used 0-in CDC; it quite quickly found a few very hard-to-find
> problems, and the tool was quite easy to use. I used only the static
> features, though, and only briefly tried the simulation features. There
> are also other static tools for this from other vendors, but the good
> tools are not cheap. I would recommend static tools for CDC checking.

Thanks for the review.
The basic fpga STA tools mostly punt on clock domain crossings.
I am intrigued by the idea of static tools for this.
I intend to evaluate 0-in CDC.

-- Mike Treseler
 
Kim Enkovaara wrote:

> Pipelining needs more FFs; multicycle paths save some resources. They
> are just fine if used very carefully.

Yes, very carefully.
On fpgas, flops come for free with the LUTs.

> --Kim
 
Kim Enkovaara wrote:
(snip)

> I would say that the odds of a synthesis error in FPGA tools are quite
> high. I have seen at least one such error in each bigger design.

I only remember one, because it was so funny to see at the time.

The design tools liked to convert state machines to a one-hot
implementation, even if they were not written that way.

It turned out that I was using the value of the state register
(similar to a counter), but the tools didn't notice. The low
bits of the one-hot register were used in place of the count.

> Fortunately, for FPGAs the easiest way to test for such errors is in
> the lab, since gate level simulations are such a pain to set up. Gate
> simulations are more critical for ASIC flows.

That was for FPGA, but it showed up pretty fast in gate
level simulations.

I like to do my state machines in one always block, setting
the new state variable in one case statement, based on the
previous state. The tool liked two, one to set the new
state and one with case to set a temporary state variable.
(I think that is how it was done, anyway.)

-- glen
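
The failure glen describes can be pictured with a small Python sketch. The details here are assumptions for illustration only (a four-state machine whose binary state value doubles as a 2-bit count, with made-up helper names); the actual design is not given in the post. After one-hot re-encoding, the low bits of the state register no longer equal the count the RTL relied on.

```python
def one_hot(state):
    """One-hot encoding of a state index: state 0 -> 0b0001, state 1 -> 0b0010, ..."""
    return 1 << state


def low_bits(value, width):
    """Keep only the low `width` bits of a register value."""
    return value & ((1 << width) - 1)


def compare_encodings(num_states, count_width):
    """For each state, pair the count the RTL expects with what one-hot provides."""
    return [(state, low_bits(state, count_width), low_bits(one_hot(state), count_width))
            for state in range(num_states)]


if __name__ == "__main__":
    for state, expected, actual in compare_encodings(num_states=4, count_width=2):
        flag = "ok" if expected == actual else "MISMATCH"
        print(f"state {state}: expected count {expected}, one-hot low bits {actual} {flag}")
```

With these assumed sizes every state mismatches, which fits the observation that the problem showed up quickly in gate level simulation.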
 
