Best Async FIFO Implementation

Hi all,

Is there a best implementation of an asynchronous FIFO?

Any suggestions will be appreciated!
Best regards,
Davy
 
I guess it depends on what you're looking for.
At minimum, it should *work*...
Beyond that, it is a trade-off between resources, speed, features (such as
almost-empty/almost-full flags), and perhaps reliability.


Sylvain
 
All members of the Virtex-4 family from Xilinx have a hard-coded
(full-custom) FIFO controller in each of their BlockRAMs. It
accepts different clocks for read and write (called "asynchronous
operation") at any frequency up to 500 MHz. Capacity is 18 Kbits, the
width is 4 to 36 bits, and the depth correspondingly ranges from 4K down
to 512 addresses (depth and width can easily be expanded with additional
BlockRAMs).
There are EMPTY and FULL flags, plus ALMOST EMPTY and ALMOST FULL
flags, both fully programmable (with 1-address granularity).
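
As a rough illustration of how the quoted widths and depths relate, here is a minimal Python sketch. It assumes the 18 Kbit block splits into 16 Kbit of data plus 2 Kbit of parity, that the legal widths are 4, 9, 18 and 36, and that only the widths that are multiples of 9 use the parity bits as data. None of these details are stated in the post; `fifo_depth` is a hypothetical helper, not a Xilinx API.

```python
# Assumed split of one 18 Kbit block: 16 Kbit of data plus 2 Kbit of parity.
DATA_BITS = 16 * 1024
PARITY_BITS = 2 * 1024

def fifo_depth(width: int) -> int:
    """Depth in addresses for a port width (assumed legal widths: 4, 9, 18, 36)."""
    if width not in (4, 9, 18, 36):
        raise ValueError(f"unsupported width: {width}")
    # Widths that are multiples of 9 are assumed to store data in the parity bits too.
    usable = DATA_BITS + PARITY_BITS if width % 9 == 0 else DATA_BITS
    return usable // width

depths = {w: fifo_depth(w) for w in (4, 9, 18, 36)}
print(depths)
```

Under those assumptions the end points match the post: 4K addresses at 4 bits wide and 512 addresses at 36 bits wide.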

I designed the crucial asynchronous empty arbitration logic, and it
works perfectly: We tested it by writing data at ~200 MHz into the
FIFO and reading it out at ~500 MHz, and the asynchronous empty-detect
logic had worked flawlessly for all of the more than 10^14 operations
when we stopped the test after a week.
No real FIFO application will probably ever go empty 200 million times
a second...
The high performance is due to very fast and compact full-custom logic,
and our long experience in analyzing and dealing with the effects of
metastability.
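
The operation count above can be sanity-checked with a few lines of Python. This sketch assumes the test ran for exactly seven days at exactly 200 MHz on the write side, and that the faster read side drained the FIFO after roughly every write, so each write corresponds to one empty-detect event. The post gives only approximate figures.

```python
WRITE_RATE_HZ = 200e6             # "~200 MHz" write clock quoted in the post
SECONDS_PER_WEEK = 7 * 24 * 3600  # assumes a full seven-day run

# One write per write-clock cycle, sustained for the whole week.
writes_in_a_week = WRITE_RATE_HZ * SECONDS_PER_WEEK
print(f"{writes_in_a_week:.3e} writes")
```

This gives roughly 1.2 x 10^14 writes, which is consistent with the "more than 10^14 operations" quoted.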

Peter Alfke, Xilinx Applications (posting from home)
 
For simulation, are the Xilinx FIFO models any faster than before?
Just recently I had to write fully-synchronous FIFO models to
accelerate the simulations and achieved 100X (one hundred times)
improvement.

RAUL
 
Simulating asynchronous clocking must be very difficult and time
consuming (I dare not use the word "impossible" for fear of being
flamed). How do you cover all clock phase relationships, down to the
femtosecond level? Synchronizers operate with that kind of timing
resolution.
Peter Alfke, speaking for himself.
 
Event-based simulation allows you to have very fine resolutions. Just
make sure that all your signals crossing clock domains are flopped and
that there are no Clock-to-Q delays involved in your model. I have run
the fast FIFO models in ModelSim PE 6.1a and Veritak 1.75A and they
have identical behavior to the Xilinx models.
 
Raul, this may just reveal my ignorance, but anyhow:

How do you model metastability, which needs sub-femtosecond resolution?
How do you model that an asynchronous FIFO generates its EMPTY flag in
time, even under the most adverse timing conditions between the two
incoming clocks?
Those have been things that kept me awake at night :-(

Peter Alfke
 
Usually in RTL simulations you don't even want to model things like that.
The most important thing is to get fast simulation times for the whole design.
At least in the past, the Xilinx models were overly complex for pure RTL
simulations, and users usually had to write their own models to get the speed.

The correctness of the async fifos must come from the design, reviews
etc. It's impossible to simulate all the cases.

Of course, netlist simulations need timing-accurate models,
but they are a small part of the simulation effort. They are usually run to
check timing constraints and to catch synthesis bugs (if formal verification
tools are not part of the user's toolset). Asynchronous portions are almost
impossible to simulate. Nowadays there are also formal tools that check
clock-domain-crossing correctness and the like. Those tools can even inject
errors during simulation that could be caused by metastability (the formal
portion finds the places to inject them).
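
To make the idea of injecting metastability effects concrete, here is a hypothetical Python sketch. It is not how any particular tool works. It models a two-flop synchronizer in which a changing input is randomly captured either in the current cycle or one cycle late, so the downstream logic sees an unpredictable latency. The class `JitterSync` and the helper `latency` are names invented for this illustration.

```python
import random

class JitterSync:
    """Two-flop synchronizer model with random resolution of a changing input."""

    def __init__(self, rng: random.Random, init: int = 0):
        self.rng = rng
        self.prev_in = init   # input value seen at the previous clock edge
        self.ff1 = init       # first (possibly metastable) stage
        self.ff2 = init       # second stage, the synchronized output

    def clock(self, d: int) -> int:
        """Advance one destination-clock cycle and return the synchronized output."""
        self.ff2 = self.ff1
        if d != self.prev_in and self.rng.random() < 0.5:
            self.ff1 = self.prev_in   # injected error: the edge is missed this cycle
        else:
            self.ff1 = d
        self.prev_in = d
        return self.ff2

def latency(seed: int) -> int:
    """Number of clock cycles until a 0 -> 1 input change appears at the output."""
    sync = JitterSync(random.Random(seed))
    for cycle in range(1, 10):
        if sync.clock(1):
            return cycle
    raise RuntimeError("output never changed")

print(sorted({latency(seed) for seed in range(50)}))
```

Logic that only works for one specific latency would fail under this kind of injection, which is the class of bug such checks are meant to expose.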

--Kim
 
Kim, thank you for that clarification. That means I was right in
considering any simulation of metastability-causing asynchronous
clocking impossible. There is no substitute for creativity, circuit
analysis, some deep thinking, and experimentation. All of that we have
done to verify the metastable behavior of our flip-flops, and to verify
the behavior of our asynchronous FIFO in Virtex-4.
Obviously, one can always simulate the effect that a given metastable
delay has on the rest of the circuitry, but one cannot simulate the
origin of the metastable delay.
Peter Alfke, Xilinx Applications
 
Hi,

There is no need to simulate metastability. The RTL simulations are
functional. All conditions of empty and full have been verified with
directed and random behavior over long simulations with clocks sliding
past each other. The FIFOs are as asymmetrical as 128 bits in and 16
bits out and with clocks as different as 37.125 MHz and 100 MHz.
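
As a rough illustration of what "clocks sliding past each other" covers, the sketch below counts the distinct phase offsets that the edges of one ideal clock take relative to the other. It assumes ideal, jitter-free clocks at exactly 37.125 MHz and 100 MHz, both starting at time zero. The function `distinct_phases` is a hypothetical helper for this example.

```python
from fractions import Fraction

def distinct_phases(f_a_hz: Fraction, f_b_hz: Fraction, n_edges: int) -> set:
    """Phases of clock A's edges, expressed as fractions of one clock B period."""
    period_ratio = f_b_hz / f_a_hz   # clock A period measured in clock B periods
    return {(k * period_ratio) % 1 for k in range(n_edges)}

f_a = Fraction(37_125_000)    # 37.125 MHz
f_b = Fraction(100_000_000)   # 100 MHz

phases = distinct_phases(f_a, f_b, 10_000)
# Spacing between neighbouring phases, in picoseconds of the clock B period.
step_ps = Fraction(10**12) / f_b / len(phases)
print(len(phases), float(step_ps))
```

Under these assumptions the pattern repeats after 297 cycles of the slower clock, so an ideal-clock simulation visits 297 evenly spaced phases, about 33.7 ps apart. That is thorough for functional checking, but far coarser than the femtosecond-level resolution raised earlier in the thread.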

The simulations have been proven correct in the lab on Virtex-2 Xilinx
FPGAs running for several hours with real data.

ModelSim PE's code profiler said that time was being spent mostly in
the Xilinx FIFOs.

RAUL
 
Hi, Davy -

You may want to browse a number of papers on my web page for coding
guidelines and coding styles related to multi-clock design and
asynchronous FIFO design.

At the web page: www.sunburst-design.com/papers

Look for the San Jose SNUG 2001 paper:
Synthesis and Scripting Techniques for Designing Multi-Asynchronous
Clock Designs

Look for the San Jose SNUG 2002 paper:
Simulation and Synthesis Techniques for Asynchronous FIFO Design

Look for the second San Jose SNUG 2002 paper (co-authored with Peter
Alfke of Xilinx):
Simulation and Synthesis Techniques for Asynchronous FIFO Design with
Asynchronous Pointer Comparisons

Peter likes the second FIFO style better, but the asynchronous nature of
the design does not lend itself well to timing analysis and DFT.

I prefer the more synchronous style of the first FIFO paper.

I hope to have another FIFO paper on my web page soon that uses Peter's
clever quadrant-based full-empty detection with a more synchronous
coding style.

We spend hours covering multi-clock and Async FIFO design in my
Advanced Verilog Class. These are non-trivial topics that are poorly
covered in undergraduate training. I have had engineers email me to
tell me that their manager told them to run all clock-crossing signals
through a pair of flip-flops and everything should work! WRONG!
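
The post does not spell out why a pair of flip-flops is not enough. One commonly cited reason applies to multi-bit values such as FIFO pointers: if several bits change at once, each bit's synchronizer can resolve independently, so the receiving domain may sample a value that was never sent. The Python sketch below illustrates this under the simplifying assumption that every changing bit independently resolves to its old or new value. It compares a binary count with the Gray-code equivalent; `possible_samples` is a hypothetical helper for this example.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary count to its reflected Gray-code equivalent."""
    return n ^ (n >> 1)

def possible_samples(old: int, new: int, width: int) -> set:
    """Values a sampler could see if each changing bit resolves to old or new."""
    diff = old ^ new
    changing = [i for i in range(width) if (diff >> i) & 1]
    seen = set()
    for choice in range(1 << len(changing)):
        value = old
        for j, bit in enumerate(changing):
            if (choice >> j) & 1:
                value ^= 1 << bit    # this bit has already taken its new value
        seen.add(value)
    return seen

# Binary 7 -> 8 (0111 -> 1000): all four bits change at once.
binary_seen = possible_samples(7, 8, 4)
# Gray 7 -> 8 (0100 -> 1100): only one bit changes.
gray_seen = possible_samples(bin_to_gray(7), bin_to_gray(8), 4)

print(len(binary_seen), sorted(gray_seen))
```

With the binary count, any of the 16 four-bit values could be sampled during the 7 to 8 transition. With Gray code, only the old or the new value can appear.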

Regards - Cliff Cummings
Verilog & SystemVerilog Guru
www.sunburst-design.com
 
