Problem with large arrays

Guest
Hi,

I am having trouble with a design that is trying to use a large array: the
elaboration tool (Cadence's NCelab) says that the limit is 2^22 locations (I
am trying to use an array of signals that is just slightly larger) for the
32-bit tool. I tried the 64-bit tool (limit of 2^52) but it says it doesn't
have enough disk space for intermediate files (I only have 100MB available).

Is it possible to manually partition the array into smaller subsections, i.e. a
set of signals that are each smaller arrays, instead of one big array? Any
pointers on how to do this if it is possible?
 
On Thu, 17 Nov 2005 20:01:54 +0000 (UTC), gthorpe@ee.ryerson.ca wrote:

Hi,

I am having trouble with a design that is trying to use a large array: the
elaboration tool (Cadence's NCelab) says that the limit is 2^22 locations (I
am trying to use an array of signals that is just slightly larger) for the
32-bit tool. I tried the 64-bit tool (limit of 2^52) but it says it doesn't
have enough disk space for intermediate files (I only have 100MB available).

Why do you need the gigantic array? Typically you need it to model a
large external memory. No sensible simulation will access *every*
location of this very large memory. So the correct approach is to
model only those locations that are actually used in the simulation.
In VHDL this is fairly straightforward to do, since you can create
dynamically-allocated data structures; but it is a non-trivial
programming problem to make it work really well. Naive approaches
are likely to simulate rather slowly.

Generally this idea is known as "sparse memory". I know that NC
supports sparse memory models in Verilog; I'm not sure about VHDL.
Trawl through the NC help documentation looking for "sparse memory"
and see if there's an out-of-the-box solution. If there isn't,
then my preferred approach for memory modelling is this:

* Split the memory address (as a bit pattern) into two halves,
"row address" and "column address". (For a DRAM these have an
obvious counterpart in the RAM itself, but that's not important.)
* Make a table of pointers (access type) indexed by the row address,
so that each pointer points to one row. However, at startup you
initialise all these pointers to NULL because no rows have yet been
allocated.
* When your simulation makes a write access to the memory, find the
row address pointer. If it's NULL, then this is the very first
write to any location in that row and you must allocate a new
row data structure and make the pointer point to it. A row
data structure is of course just an array of memory locations
indexed by the column address. Now that you have the real row,
you can write to the appropriate location in it.
* On a write access, if the row address pointer is not NULL then
you've already created that row and it's easy to write to the
appropriate location in that row.
* On a read access, if the row address pointer is NULL then you
know that the corresponding row has never been written and you
simply return (others => 'U') as the result. But if the row
exists, you can read the appropriate location in it.

This technique works really well if the pattern of accesses to
your memory shows some locality - in other words, if you access
a location then there's a high probability of accessing other
nearby locations. It's fast - each access costs only one lookup
in the table of row address pointers, a dereference of that pointer
and then a second array access into the row data structure. But
it works very badly if the pattern of accesses into the memory
is random or tends to go in columns, because you end up creating
many row data structures each containing only one or two useful
locations. If you have that kind of problem, a hashing approach
will work better. I'm not going to try to explain that here :)
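
A bare-bones sketch of the idea in VHDL, for what it's worth (the type names,
widths and address split below are arbitrary, and deallocation and error
checking are omitted):

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package sparse_mem_pkg is
  constant ROW_BITS : natural := 11;            -- high half of the address
  constant COL_BITS : natural := 11;            -- low half of the address
  subtype word_t is std_logic_vector(7 downto 0);
  subtype addr_t is unsigned(ROW_BITS + COL_BITS - 1 downto 0);

  type row_t       is array (0 to 2**COL_BITS - 1) of word_t;
  type row_ptr_t   is access row_t;
  -- table of row pointers; access objects default to null, so no
  -- rows exist until the first write touches them
  type row_table_t is array (0 to 2**ROW_BITS - 1) of row_ptr_t;

  procedure mem_write (table : inout row_table_t;
                       addr  : in    addr_t;
                       data  : in    word_t);
  procedure mem_read  (table : inout row_table_t;
                       addr  : in    addr_t;
                       data  : out   word_t);
end package sparse_mem_pkg;

package body sparse_mem_pkg is

  procedure mem_write (table : inout row_table_t;
                       addr  : in    addr_t;
                       data  : in    word_t) is
    constant row : natural := to_integer(addr(addr'high downto COL_BITS));
    constant col : natural := to_integer(addr(COL_BITS - 1 downto 0));
  begin
    if table(row) = null then
      -- first write anywhere in this row: allocate it, filled with 'U'
      table(row) := new row_t'(others => (others => 'U'));
    end if;
    table(row)(col) := data;
  end procedure;

  procedure mem_read (table : inout row_table_t;
                      addr  : in    addr_t;
                      data  : out   word_t) is
    constant row : natural := to_integer(addr(addr'high downto COL_BITS));
    constant col : natural := to_integer(addr(COL_BITS - 1 downto 0));
  begin
    if table(row) = null then
      data := (others => 'U');                  -- row never written
    else
      data := table(row)(col);
    end if;
  end procedure;

end package body sparse_mem_pkg;

Since access types can only live in variables, the table itself would be a
variable of row_table_t inside a testbench process (or wrapped in a protected
type); it starts out with every pointer null.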
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL, Verilog, SystemC, Perl, Tcl/Tk, Verification, Project Services

Doulos Ltd. Church Hatch, 22 Market Place, Ringwood, BH24 1AW, UK
Tel: +44 (0)1425 471223 mail:jonathan.bromley@doulos.com
Fax: +44 (0)1425 471573 Web: http://www.doulos.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
 
Jonathan Bromley <jonathan.bromley@doulos.com> wrote:
On Thu, 17 Nov 2005 20:01:54 +0000 (UTC), gthorpe@ee.ryerson.ca wrote:

Hi,

I am having trouble with a design that is trying to use a large array: the
elaboration tool (Cadence's NCelab) says that the limit is 2^22 locations (I
am trying to use an array of signals that is just slightly larger) for the
32-bit tool. I tried the 64-bit tool (limit of 2^52) but it says it doesn't
have enough disk space for intermediate files (I only have 100MB available).

Why do you need the gigantic array? Typically you need it to model a
large external memory. No sensible simulation will access *every*
location of this very large memory. So the correct approach is to
model only those locations that are actually used in the simulation.

I am not modeling external memory, but I need to store a lookup table in a
design unit and it will actually be part of the design. This is only because I
want to compare the cost-performance of using a large lookup table versus
using an approximation method: I know this will be a horribly large design, but
I have to do it for comparison purposes.

Previously, instead of signals I used variables and the snapshot was not any
bigger than the approximation method, most likely due to the optimizations
you are describing (it only seemed to store those values that were actually
used). However, now I want to separate the part that updates the table from the
part that performs control based on the value in the table, and I think a signal
is the only synthesizable way to accomplish this. The simulation is of course now
much larger, but synthesis results should not change. Would a shared variable be
synthesizable (only one process writes, the other only reads)?
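
To make that concrete, here is roughly the arrangement I have in mind (the
entity, port names and sizes below are only illustrative): one clocked process
is the sole driver of the table signal, and a second process only reads it.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity lut_split is
  port (
    clk     : in  std_logic;
    wr_en   : in  std_logic;
    wr_addr : in  unsigned(9 downto 0);
    wr_data : in  std_logic_vector(7 downto 0);
    rd_addr : in  unsigned(9 downto 0);
    rd_data : out std_logic_vector(7 downto 0)
  );
end entity lut_split;

architecture rtl of lut_split is
  type lut_t is array (0 to 1023) of std_logic_vector(7 downto 0);
  signal lut : lut_t;                     -- single driver, so this is legal
begin

  -- the part updating the table: the only process that drives 'lut'
  update_p : process (clk)
  begin
    if rising_edge(clk) then
      if wr_en = '1' then
        lut(to_integer(wr_addr)) <= wr_data;
      end if;
    end if;
  end process;

  -- the part performing control: reads the table but never drives it
  control_p : process (clk)
  begin
    if rising_edge(clk) then
      rd_data <= lut(to_integer(rd_addr));
    end if;
  end process;

end architecture rtl;

The two halves then communicate only through the signal, which a synthesis
tool can usually implement as a simple dual-port RAM.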

I got the 32-bit tool to work by manually partitioning the array (the 64-bit
tool requires more [2x?] space and this would have been very bad, see below).
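
For the record, the manual partitioning looked roughly like this (the bank
count, widths and address split below are made up for illustration, not my
actual figures): the top address bits select one of several separately
declared signals, each small enough to stay under the elaborator's limit.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity banked_lut is
  port (
    clk     : in  std_logic;
    wr_en   : in  std_logic;
    addr    : in  unsigned(22 downto 0);   -- 2**23 locations in total
    wr_data : in  std_logic_vector(7 downto 0);
    rd_data : out std_logic_vector(7 downto 0)
  );
end entity banked_lut;

architecture rtl of banked_lut is
  -- four separately declared 2**21-entry signals instead of a single
  -- 2**23-entry array, so each object stays below the elaboration limit
  type bank_t is array (0 to 2**21 - 1) of std_logic_vector(7 downto 0);
  signal bank0, bank1, bank2, bank3 : bank_t;
begin

  process (clk)
    variable sel : natural range 0 to 3;
    variable off : natural range 0 to 2**21 - 1;
  begin
    if rising_edge(clk) then
      sel := to_integer(addr(22 downto 21));   -- top bits pick the bank
      off := to_integer(addr(20 downto 0));    -- low bits index within it
      if wr_en = '1' then
        case sel is
          when 0      => bank0(off) <= wr_data;
          when 1      => bank1(off) <= wr_data;
          when 2      => bank2(off) <= wr_data;
          when others => bank3(off) <= wr_data;
        end case;
      end if;
      case sel is
        when 0      => rd_data <= bank0(off);
        when 1      => rd_data <= bank1(off);
        when 2      => rd_data <= bank2(off);
        when others => rd_data <= bank3(off);
      end case;
    end if;
  end process;

end architecture rtl;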

[sparse memory modelling technique snipped -- see Jonathan's post above]

Thanks for outlining this technique though: it will be useful for my testbench
for this project (which just happens to be a memory controller)! However, it
seems that the NC tools already do something like this on their own when the
storage is in variables (perhaps signals cannot be handled the same way because
a value history has to be kept for them).

Also, it turns out simulation speed is not diminished by the big array of
signals; it just eats memory (I used some temporary disk space and it requires
over 300MB -- and this is with the smaller 32-bit tool).


 
