Manual Partitioning to Multiple FPGAs

tushit

Hi,
I have a design which does not fit on my Altera Stratix device. I need
to split it onto 2 Stratix devices. Is it possible to manually do
this? I can't afford partitioning software. The clock frequency for
the design after fitting will be around 30 MHz and I can run the design
at a speed slower than that achieved after fitting.
So can I safely operate the design at, say, 20 MHz if Quartus were to
ensure a speed of 30 MHz on a single larger FPGA? Slowing the FPGA by
10 MHz would mean I have an extra 100 ns delay which will be used up by
the interconnect delay between the 2 FPGAs (due to rise time/fall time
of IO pins). Assuming this approach works, approx. how much extra
delay should I leave for the interconnect delays? Are there any other
issues I should be aware of?
Thanks
Tushit
 
"tushit" <tushitjain@yahoo.com> wrote in message
news:ec6aab0.0402152209.5f58efb6@posting.google.com...
Hi,
I have a design which does not fit on my Altera Stratix device. I need
to split it onto 2 Stratix devices. Is it possible to manually do
this? I can't afford partitioning software. The clock frequency for
the design after fitting will be around 30 MHz and I can run the design
at a speed slower than that achieved after fitting.
Apart from the obvious organisational problem - how to make the split -
there may be some tricky issues about clock synchronisation. You
need to be sure that setup AND HOLD times are met in both devices.

So can I safely operate the design at, say, 20 MHz if Quartus were to
ensure a speed of 30 MHz on a single larger FPGA? Slowing the FPGA by
10 MHz would mean I have an extra 100 ns delay
Sorry? 30 MHz is a 33 ns period, 20 MHz is a 50 ns period; sounds
like only 17 ns extra, to me.
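
The arithmetic behind that correction can be checked with a few lines of Python. This is only a sketch; the function name `period_ns` is made up for illustration.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


fitted_period = period_ns(30.0)               # about 33.3 ns
slowed_period = period_ns(20.0)               # 50 ns
extra_slack = slowed_period - fitted_period   # about 16.7 ns, not 100 ns

print(f"30 MHz period: {fitted_period:.1f} ns")
print(f"20 MHz period: {slowed_period:.1f} ns")
print(f"Extra time per cycle: {extra_slack:.1f} ns")
```

The 100 ns figure is the period of a 10 MHz clock, not the difference between the 30 MHz and 20 MHz periods.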

which will be used up by
the interconnect delay between the 2 FPGAs (due to rise time/fall time
of IO pins). Assuming this approach works, approx. how much extra
delay should I leave for the interconnect delays?
Remember that the propagation delay of a typical output driver
is dominated by the capacitance it is driving. Propagation delay
across a typical PCB is around 2ns per foot, so that should be OK
unless your FPGAs are a long way apart or you let your PCB
autorouter do silly things. Data sheet specs for FPGA output
drivers usually tell you how the delay increases as a function
of capacitance, so it should all be fairly predictable.

If you are splitting one FPGA into two, it seems likely that
the signals from one FPGA to the other will drive only one
FPGA input in most cases. Therefore the capacitive slowdown
should be modest.
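
As a rough sketch of how those two contributions might be estimated: the 2 ns per foot figure is the rule of thumb quoted above, while the base delay, reference load and derating slope below are placeholder numbers, not values from any Stratix data sheet. Take the real ones from the device's I/O timing tables.

```python
NS_PER_FOOT = 2.0  # PCB propagation rule of thumb from the post above


def trace_delay_ns(length_inches: float) -> float:
    """Flight time along a PCB trace of the given length."""
    return NS_PER_FOOT * length_inches / 12.0


def loaded_output_delay_ns(base_delay_ns: float,
                           base_load_pf: float,
                           derating_ns_per_pf: float,
                           actual_load_pf: float) -> float:
    """Output delay adjusted linearly for load above the data sheet reference load."""
    extra_pf = max(0.0, actual_load_pf - base_load_pf)
    return base_delay_ns + derating_ns_per_pf * extra_pf


# Hypothetical example: 6 inch trace, one receiver pin plus trace, about 20 pF.
flight = trace_delay_ns(6.0)
driver = loaded_output_delay_ns(base_delay_ns=5.0, base_load_pf=10.0,
                                derating_ns_per_pf=0.05, actual_load_pf=20.0)
print(f"trace: {flight:.2f} ns, loaded output: {driver:.2f} ns")
```

With a single receiver per net the load term stays small, which is the point made above.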

Be ready to add pipeline stages in the design, to cope with
the very large propagation delays of FPGA I/O pad structures.
But 30MHz should be easy to achieve across the boundary.

Key suggestion: DON'T supply one FPGA's clock from an output
on the other FPGA. Instead, be sure to supply BOTH FPGAs'
clocks from the same source. The worst-case skew between the
two FPGAs' clock buffer delays should be very much smaller
than the propagation delays of each FPGA's output pads;
if this is true, you will have no problems with hold time.
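
The setup and hold conditions for a register-to-register path crossing the boundary can be sketched as below. All numbers are invented for illustration, and the class and field names are hypothetical; real values come from the data sheet and the board layout. The skew term is treated as a worst-case magnitude that hurts both checks.

```python
from dataclasses import dataclass


@dataclass
class BoundaryTiming:
    period_ns: float     # clock period
    tco_max_ns: float    # slowest clock-to-output of the sending FPGA
    tco_min_ns: float    # fastest clock-to-output of the sending FPGA
    board_max_ns: float  # longest trace delay
    board_min_ns: float  # shortest trace delay
    tsu_ns: float        # input setup time of the receiving FPGA
    th_ns: float         # input hold time of the receiving FPGA
    skew_ns: float       # worst-case clock skew between the two FPGAs

    def setup_slack_ns(self) -> float:
        """Positive when data arrives early enough before the next edge."""
        used = self.tco_max_ns + self.board_max_ns + self.tsu_ns + self.skew_ns
        return self.period_ns - used

    def hold_slack_ns(self) -> float:
        """Positive when data stays stable long enough after the same edge."""
        return (self.tco_min_ns + self.board_min_ns) - (self.th_ns + self.skew_ns)


# Both FPGAs clocked from one source: small skew (assumed 1 ns).
common = BoundaryTiming(50.0, 8.0, 2.0, 1.5, 0.5, 4.0, 0.5, skew_ns=1.0)

# Clock forwarded through the other FPGA's output pad: skew is roughly a
# pad delay (assumed 9 ns), which can exceed the fastest data path.
forwarded = BoundaryTiming(50.0, 8.0, 2.0, 1.5, 0.5, 4.0, 0.5, skew_ns=9.0)

for name, t in (("common clock", common), ("forwarded clock", forwarded)):
    print(f"{name}: setup slack {t.setup_slack_ns():.1f} ns, "
          f"hold slack {t.hold_slack_ns():.1f} ns")
```

Note that slowing the clock only improves the setup slack. The hold slack does not depend on the period, which is why the skew matters even at 20 MHz.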
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * Perl * Tcl/Tk * Verification * Project Services

Doulos Ltd. Church Hatch, 22 Market Place, Ringwood, Hampshire, BH24 1AW, UK
Tel: +44 (0)1425 471223 mail: jonathan.bromley@doulos.com
Fax: +44 (0)1425 471573 Web: http://www.doulos.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
 
I'd redesign it for a split, but better yet, get a bigger part (if possible),
even if it means porting it to a different vendor. Cutting up a design into
multiple pieces, even with tools that promise to do it, is dicey. You have
timing issues, added board real estate, and potential problems with scaling
(if applicable). In addition, the design becomes bounded by and dependent on
the partition.
 
tushit wrote:

I have a design which does not fit on my Altera Stratix device.
You may have some unused resources,
like block RAM and multiplier/DSP blocks,
that could be used for logic.

Consider trying other synthesizers.

-- Mike Treseler
 
Jonathan Bromley wrote:

Key suggestion: DON'T supply one FPGA's clock from an output
on the other FPGA. Instead, be sure to supply BOTH FPGAs'
clocks from the same source. The worst-case skew between the
two FPGAs' clock buffer delays should be very much smaller
than the propagation delays of each FPGA's output pads;
if this is true, you will have no problems with hold time.

I wonder about this because I'm looking into moving data across device
boundaries for a project. The approach I am favoring at the moment is to
have a source-synchronous bus + control + clock leave device A and enter
device B. The output clock would be generated via DDR method within the
IOB. It would seem to me that --assuming careful PCB layout-- this method
might be preferable to having an external clock generator feed devices A and
B.
Am I missing something? I can see that with proper DCM configuration it
truly doesn't matter which way you go (or it shouldn't)?


--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Martin Euredjian

To send private email:
0_0_0_0_@pacbell.net
where
"0_0_0_0_" = "martineu"
 
Hi Tushit,

In Quartus II 4.0 try setting the following Logic Options:

a) Auto Packed Registers: set to either Minimize Area or Minimize Area with
chains. This is set in the Assignment Settings->Fitter Settings->More
Settings dialog.

b) Optimization Technique: set to Area. This is set in the Assignment
Settings->Analysis and Synthesis settings.

- Subroto Datta
Altera Corp.


 
I wonder about this because I'm looking into moving data across device
boundaries for a project. The approach I am favoring at the moment is to
have a source-synchronous bus + control + clock leave device A and enter
device B. The output clock would be generated via DDR method within the
IOB. It would seem to me that --assuming careful PCB layout-- this method
might be preferable to having an external clock generator feed devices A and
B.
Am I missing something? I can see that with proper DCM configuration it
truly doesn't matter which way you go (or it shouldn't)?
I think one important idea is to use an approach that you are
comfortable with. What is "best" probably depends upon details
that haven't been specified yet.

What are you going to use for the main clock on device B?
Can you run the whole chip off the source-synchronous clock,
or does it need to run off a normal clock, in sync with
device A? If the latter, then you need a FIFO or something similar
to get across the clock boundary.

How fast are you running? Can you afford pipeline delays?
Will it work if you put a pipeline stage at the output IOBs
and another at the input IOBs? If things are slow enough so
that works, it avoids the clock re-sync tangle.
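
To see what the extra registers cost in latency, here is a toy cycle-by-cycle model. It is a sketch only, with a made-up function name; it models whole clock cycles and says nothing about real pad timing.

```python
def simulate_boundary(samples, stages=2):
    """Clock samples through `stages` registers in series.

    Returns the value held by the last register after each clock edge.
    None means the pipeline has not filled yet.
    """
    regs = [None] * stages
    seen = []
    for sample in samples:
        # On each edge every register takes the value of the stage before it.
        regs = [sample] + regs[:-1]
        seen.append(regs[-1])
    return seen


# One register at the output IOB of device A, one at the input IOB of device B.
print(simulate_boundary([1, 2, 3, 4], stages=2))
```

Each added register delays the data by one clock edge, so logic on both sides has to tolerate that fixed latency. In exchange, the chip-to-chip path has nothing but the pads and the trace in it.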

--
The suespammers.org mail server is located in California. So are all my
other mailboxes. Please do not send unsolicited bulk e-mail or unsolicited
commercial e-mail to my suespammers.org address or any of my other addresses.
These are my opinions, not necessarily my employer's. I hate spam.
 
