ASIC speed

Yu Jun

Guest
I'm working on a CPU core that we intend to embed in an ASIC for
network processing. The FPGA prototype is now running at 66 MHz
(Xilinx Virtex-II -4). Wondering how fast it could run in an ASIC, we
had our ASIC guys synthesize the code, and the result was shocking: it
reached 400 MHz! That is far beyond our expectation of 150 MHz. The
library we used was a 0.13u one from TI, fairly fast, in which a NAND
gate is around 0.03 ns.

Now my question is: is the ASIC speed result reliable? Since we didn't
do P&R (we don't have the tools or the experience), I suspect the
timing report may be overly optimistic and not reliable. I was told
something about a "wire load model", and ours was automatically
selected by the compiler.

If anybody can give me some hints or direct me to some documents, it
will be much appreciated! Thank you very much.

yu jun

yujun@huawei.com
 
> Now my question is: Is the ASIC speed result reliable?

If it's from DC, then no.

> Since we didn't
> do P&R( we don't have tools and experiences ), I really doubt the
> timing report may be over optimistically estimated and not reliable. I
> was told something about "wire load model" and ours is automatically
> selected by the compiler.

Knock off 20%, and you're likely to have a more realistic figure.

If you're working at .13, you probably want to be using physical
synthesis rather than synthesis based on wire load models.

Jon
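
For readers who have not met wire load models: before placement, the synthesis tool has no real wire lengths, so it estimates each net's length from its fanout using a table in the library and scales that by a per-unit-length capacitance. The sketch below only illustrates that idea. The table values, `SLOPE`, `CAP_PER_UNIT`, and the function names are invented for illustration and are not taken from the TI library discussed here.

```python
# Illustrative wire load model. Every number below is made up, not vendor data.
FANOUT_LENGTH = {1: 10.0, 2: 22.0, 3: 36.0, 4: 52.0}  # fanout -> estimated length
SLOPE = 18.0           # extra length per fanout beyond the last table entry
CAP_PER_UNIT = 0.0002  # pF per unit of length (hypothetical)


def estimated_length(fanout: int) -> float:
    """Estimate a net's length from its fanout alone."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    if fanout in FANOUT_LENGTH:
        return FANOUT_LENGTH[fanout]
    # Beyond the table, extrapolate linearly from the last entry.
    max_fanout = max(FANOUT_LENGTH)
    return FANOUT_LENGTH[max_fanout] + SLOPE * (fanout - max_fanout)


def estimated_wire_cap(fanout: int) -> float:
    """Estimated wire capacitance in pF for a net with the given fanout."""
    return estimated_length(fanout) * CAP_PER_UNIT


for fo in (1, 4, 8):
    print(f"fanout {fo}: length {estimated_length(fo):.1f}, "
          f"cap {estimated_wire_cap(fo):.4f} pF")
```

Two nets with the same fanout get the same estimate no matter where their cells end up on the die, which is why pre-layout timing can be well off at 0.13u.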
 
Yu Jun,

Knock off 20% for .13u going from schematic to RC extracted.

It also depends on what the foundry actually supports: is this based on
low-k dielectric?

If not, that will take you down another 5%.

The IBM PPC405 in the Virtex-II Pro runs at 450 MHz, so I would expect any
well-designed uP with semi-custom layout to be at least that fast in .13u.

Austin

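
Putting numbers on the two replies above: a rough sketch that applies the suggested 20% reduction to the 400 MHz synthesis figure, then the further 5% for a process without low-k dielectric. It assumes the reductions compound (each applies to the already reduced figure); if the 5% is meant to be taken off the original number instead, the second result is slightly lower. The function name `derate` is just for illustration.

```python
def derate(freq_mhz: float, *fractions: float) -> float:
    """Apply successive fractional reductions to a clock frequency."""
    for fraction in fractions:
        freq_mhz *= 1.0 - fraction
    return freq_mhz


synth_mhz = 400.0  # pre-layout synthesis result quoted in the thread

after_layout = derate(synth_mhz, 0.20)       # schematic -> RC extracted
no_low_k = derate(synth_mhz, 0.20, 0.05)     # and no low-k dielectric

print(f"after 20% reduction:        {after_layout:.0f} MHz")
print(f"after a further 5% (no low-k): {no_low_k:.0f} MHz")
```

Either way the estimate stays around 300 MHz, still about twice the 150 MHz that was originally expected.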
 
Hi,

It is not quite as simple as that. If you are using a conservative
wire-load model provided by the silicon vendor, a healthy margin for
clock jitter, scan flip-flop timing overhead and second-order effects,
and conservative settings for the environmental parameters (for
example, a temperature of 100+ deg. Celsius and a voltage 15% below
nominal for the process you are using), then the results could be
quite realistic. If you are running DC with an optimistic setup, then
you could be off by way more than 20%. You need to provide further
info about your setup in order to get a realistic answer to your
question.

Ljubisa Bajic
ATI Technologies
-------------- My opinions do not represent those of my employer --------------


jon@beniston.com (Jon Beniston) wrote in message news:<e87b9ce8.0311070140.5bc4afb@posting.google.com>...
> Knock off 20%, and you're likely to have a more realistic figure.

Austin Lesea <Austin.Lesea@xilinx.com> wrote in message news:<3FABC4A2.E78A6D86@xilinx.com>...
> Knock off 20% for .13u from schematic to RC extracted.
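
The point about setup margins can be sketched with the usual register-to-register timing budget: the clock period has to cover clock-to-Q, logic, and setup delay, plus whatever uncertainty is reserved for jitter and similar effects. The example below is a simplification under stated assumptions: the worst-case corner (high temperature, low voltage) is modeled as a single multiplicative factor on cell delays, and every number is hypothetical, not from any real library or from the design in this thread.

```python
def max_freq_mhz(logic_ns: float, setup_ns: float, clk_to_q_ns: float,
                 uncertainty_ns: float = 0.0, corner_factor: float = 1.0) -> float:
    """Highest clock frequency whose period covers the path plus margins.

    corner_factor scales the cell delays to mimic a slow process/voltage/
    temperature corner; uncertainty_ns is margin for jitter and the like.
    """
    period_ns = (clk_to_q_ns + logic_ns + setup_ns) * corner_factor + uncertainty_ns
    return 1000.0 / period_ns


# Hypothetical critical path, identical in both runs.
path = dict(logic_ns=2.0, setup_ns=0.1, clk_to_q_ns=0.2)

optimistic = max_freq_mhz(**path)  # typical corner, no margin
conservative = max_freq_mhz(**path, uncertainty_ns=0.3, corner_factor=1.4)

print(f"optimistic setup:   {optimistic:.0f} MHz")
print(f"conservative setup: {conservative:.0f} MHz")
```

The same netlist reports very different speeds depending only on the constraints, which is why the setup details matter more than any fixed percentage.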


Your surprise really reflects that your design is not BlockRAM limited
but gate/logic-level limited, where ASICs will stay about 5x faster or
more. If you were not going to ASIC, your design might be considered
slow, since you could push any BlockRAMs to 200 MHz or so, but then it
is very difficult to do much CPU logic with only a few LUT levels per
cycle. MicroBlaze (at 120 MHz) is probably limited by multiplier delay
as well as CPU logic levels long before hitting the BlockRAM limit, and
I am sure it's hand placed where needed, to boot.

For designs that are truly BlockRAM limited, an ASIC memory won't be
much faster than BlockRAMs for the same architecture spec and process;
they are also likely made by the same foundry on a similar process.
Of course ASICs can offer custom compiled SRAMs to get a bit more speed,
and they do allow 5x more logic layers in that cycle.

As for the note of 30 ps NAND gates: compare that to a 3 GHz P4 cycle of
about 330 ps, or about 10 gate delays. I am sure Intel doesn't use many
gates as we know them, though, but various high-speed pass logic schemes,
so they are using much shorter transit times. SRAMs have also had
access times of about 10 gate delays for decades. And the old
supercomputer designers used to clock CPUs in 10 ECL layers of dotted
logic, so I figure 10 LUT levels is a fair enough cycle target. Luckily
the carry chains we need are not done by LUT-level logic or we would
be really ____ed, but then we deal with switched wires instead.


johnjakson_usa_com
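
The gate-delay comparison above is easy to reproduce. This sketch divides the clock period by the 30 ps NAND delay quoted in the thread to get a rough count of gate delays per cycle. It assumes an unloaded gate and ignores wire delay, fanout, and flip-flop overhead, so treat the counts as upper bounds; the function name is only illustrative.

```python
NAND_DELAY_PS = 30.0  # ~0.03 ns per NAND gate, as quoted in the thread


def gate_delays_per_cycle(freq_mhz: float,
                          gate_delay_ps: float = NAND_DELAY_PS) -> float:
    """Clock period divided by a single gate delay."""
    period_ps = 1e6 / freq_mhz  # 1 MHz corresponds to a 1,000,000 ps period
    return period_ps / gate_delay_ps


for label, mhz in [("3 GHz P4", 3000.0),
                   ("400 MHz synthesis result", 400.0),
                   ("66 MHz FPGA prototype", 66.0)]:
    print(f"{label}: about {gate_delays_per_cycle(mhz):.0f} gate delays per cycle")
```

By this measure a 400 MHz cycle still leaves room for dozens of unloaded gate delays, so the synthesis figure is not implausible on gate speed alone; the open question is how much of that the wires take back.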
 
