Distributed RAM timing query

kaz

Guest
I have several FIFOs that are small and implemented as distributed
RAM. There are some timing violations reported on paths between the source
register of FIFO control signals (e.g. the read signal) and the FIFO data output.

This raised some questions in my head as to how timing is assessed for such
FIFOs (or SRLs, for that matter). A FIFO or SRL chain uses LUTs plus an output
register. Wouldn't that mean there is inherently a long path to the output
register, or should we say it is long but not combinatorial? Is there
any way to improve timing in designs like FIFOs or SRL chains?

Regards

Kaz

--------------------------------------
Posted through http://www.FPGARelated.com
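
For readers following the question, here is a minimal behavioural sketch in Python of a small single-clock FIFO of the kind described. It assumes the usual distributed-RAM arrangement of synchronous write and asynchronous (combinational) read, followed by an output register. The class name `DistRamFifo` and its `clock` method are made up for illustration and do not come from any vendor core. The point is only to show where the path sits: read-pointer register and read enable, through the memory lookup, into the data output register. That is an ordinary register-to-register path with combinational logic in between.

```python
class DistRamFifo:
    """Hypothetical model: synchronous write, asynchronous read, registered output."""

    def __init__(self, depth=16):
        self.depth = depth
        self.mem = [0] * depth   # stands in for the LUT-based memory
        self.wr_ptr = 0          # write pointer register
        self.rd_ptr = 0          # read pointer register
        self.count = 0           # occupancy register
        self.dout = None         # data output register

    def clock(self, wr_en=False, din=None, rd_en=False):
        """Model one rising clock edge; all decisions use pre-edge state."""
        do_wr = wr_en and self.count < self.depth
        do_rd = rd_en and self.count > 0

        # Combinational read: rd_ptr register -> memory data.
        # This lookup, gated by the read enable, is the logic that sits
        # between the source registers and the dout register.
        mem_q = self.mem[self.rd_ptr]
        next_dout = mem_q if do_rd else self.dout

        if do_wr:
            self.mem[self.wr_ptr] = din
            self.wr_ptr = (self.wr_ptr + 1) % self.depth
        if do_rd:
            self.rd_ptr = (self.rd_ptr + 1) % self.depth
        self.count += int(do_wr) - int(do_rd)

        self.dout = next_dout    # output register updates on the edge
        return self.dout


fifo = DistRamFifo(depth=16)
fifo.clock(wr_en=True, din=0xA)
fifo.clock(wr_en=True, din=0xB)
first = fifo.clock(rd_en=True)
second = fifo.clock(rd_en=True)
print(first, second)
```

In this model the value captured by `dout` depends on `rd_ptr`, `rd_en` and `count` within the same cycle, which mirrors the reported violation from the read control register to the FIFO data output.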
 
kaz wrote:
I have several FIFOs that are small and implemented as distributed
RAM. There are some timing violations reported on paths between the source
register of FIFO control signals (e.g. the read signal) and the FIFO data output.

This raised some questions in my head as to how timing is assessed for such
FIFOs (or SRLs, for that matter). A FIFO or SRL chain uses LUTs plus an output
register. Wouldn't that mean there is inherently a long path to the output
register, or should we say it is long but not combinatorial? Is there
any way to improve timing in designs like FIFOs or SRL chains?

Regards

Kaz


What you're asking is device-related. What FPGA family are you using?
Generally speaking, Xilinx devices can implement distributed memory
FIFO pretty well, but you need to limit the depth of the FIFO to
get it to run at high clock rates. Also there's a big difference
between common-clock FIFO and independent-clock FIFO. The first
kind can be implemented in SRL, the second cannot.

Also it would be good to see a full failing path from your timing
report (.twr file) to see if this is a logic level issue or
a routing length issue.

--
Gabor
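
A rough way to act on the logic-versus-routing question is to take the logic and route delay totals from the failing path and compare them against the clock period. The Python sketch below does that arithmetic. The function name `analyze_path` and all delay values are hypothetical placeholders, not numbers from any real report, and the model ignores clock skew and uncertainty. The 368 MHz figure is the module-level result mentioned elsewhere in this thread and is used here only as an example clock.

```python
def analyze_path(clock_mhz, logic_ns, route_ns, setup_ns=0.0):
    """Split a register-to-register path into logic and routing delay.

    The delay arguments are values you would copy by hand from a timing
    report; nothing here parses a real report file.
    """
    period_ns = 1000.0 / clock_mhz               # 368 MHz -> about 2.72 ns
    required_ns = logic_ns + route_ns + setup_ns
    slack_ns = period_ns - required_ns           # negative slack = violation
    route_share = route_ns / (logic_ns + route_ns)
    dominant = "routing" if route_share > 0.5 else "logic"
    return {
        "period_ns": period_ns,
        "slack_ns": slack_ns,
        "route_share": route_share,
        "dominant": dominant,
    }


# Placeholder numbers for illustration only.
result = analyze_path(clock_mhz=368.0, logic_ns=0.9, route_ns=1.9, setup_ns=0.1)
print(result)
```

If routing dominates, placement is the more likely place to look; if logic dominates, the number of levels between the registers matters more.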
 
kaz <37480@fpgarelated> wrote:

(snip, and previously snipped)

It is Virtex 6 and I have achieved better than 368MHz at module level.
At integration (into a very large project) it fails marginally. As I said, it
is the path from the read register to the FIFO data output (single clock, 16 words
deep, distributed RAM). I don't expect any logic apart from FIFO stages
implemented in LUTs. I am just asking if there is any way to improve such
paths. I tried block RAM and it failed very badly.

Do you mean that it fails, even when timing satisfies the
post-route timing data?

That isn't good.

-- glen
 
kaz <37480@fpgarelated> wrote:
kaz <37480@fpgarelated> wrote:
(snip, and previously snipped)
(snip)
It is Virtex 6 and I have achieved better than 368MHz at module level.
At integration (into very large project) it fails marginally. As I said

(snip, I wrote)
Do you mean that it fails, even when timing satisfies the
post-route timing data?

(snip)
No, by module level I mean when compiled on its own. The actual
project is not based on any incremental approach or logic lock
but all the lower modules are just added to project to
be fitted freely anywhere it chooses.

I use Spartan, so it might be different.

Some years ago, I was working on a project where the pre-route
timing was so good, maybe twice as fast as it needed to be, and
I didn't even bother to look at the post-route timing until I
tried it, and it didn't work.

At least for Spartan, the routing is optimized over the whole chip.

Well, there are things that you can do to give hints and such,
but it is easily possible that changing one part changes the timing
of some unrelated part.

It might be that you can route some modules, keep them fixed while
you route others. That probably makes more sense in big designs.

But whatever it does, it should meet post-route timing, and
you should run that to be sure that is fast enough.

-- glen
 
It is Virtex 6 and I have achieved better than 368MHz at module level.
At integration (into a very large project) it fails marginally. As I said, it
is the path from the read register to the FIFO data output (single clock, 16 words
deep, distributed RAM). I don't expect any logic apart from FIFO stages
implemented in LUTs. I am just asking if there is any way to improve such
paths. I tried block RAM and it failed very badly.

Kaz


 
kaz <37480@fpgarelated> wrote:

(snip, and previously snipped)

It is Virtex 6 and I have achieved better than 368MHz at module level.
At integration (into very large project) it fails marginally. As I said, it
is the path from read register to fifo data output (single clock 16 words
depth, distributed ram). I don't expect any logic apart from fifo stages
implemented in luts. I am just asking if there is any way to improve such
paths. I tried block ram and it failed very badly.

Do you mean that it fails, even when timing satisfies the
post-route timing data?

That isn't good.

-- glen

No, by module level I mean when compiled on its own. The actual project is
not based on any incremental approach or logic lock; all the lower
modules are just added to the project to be fitted freely anywhere the tool chooses.


Kaz
 
