32x32 fast multiplier

designer

Guest
Hi,
I am trying to design a fast multiplier (300 MHz). The inputs
are 32 bits wide and a truncated (32-bit) result is desired. What
I want to do is calculate the flags (condition codes such as Z, C, N, V)
before the multiplier result is available, i.e. I want at least
the condition codes to be ready at a 300 MHz clock speed. I searched
the Synopsys DesignWare library but couldn't find anything
specific on flag calculation.

Thanks,
Vittal
 
On Mon, 22 Sep 2008 23:05:43 -0700 (PDT), designer
<vittal.patil@gmail.com> wrote:

Hi,
I am trying to design a fast multiplier (300 MHz). The inputs
are 32 bits wide and a truncated (32-bit) result is desired. What
I want to do is calculate the flags (condition codes such as Z, C, N, V)
before the multiplier result is available, i.e. I want at least
the condition codes to be ready at a 300 MHz clock speed. I searched
the Synopsys DesignWare library but couldn't find anything
specific on flag calculation.
Almost all the condition codes can be calculated without needing the
full multiplier output. Zero and Negative are the easiest
ones: the output is zero if either input is zero, and the output is
negative if the signs of the inputs differ (and neither input is
zero). For overflow, assuming you want it active if the result doesn't
fit into 32 bits, you need to check whether log2(a)+log2(b) < 32,
which is again simpler than the multiplier. Carry is calculated
similarly. I'd suggest you design your condition code logic based on
the inputs rather than the output of the multiplier, and estimate how
much hardware you need.
As for a 300 MHz 32x32->32 multiplier, it depends entirely on what
process you're using. With anything at or below 130 nm, you should be
able to do it in a single cycle with no problems, or in at most 2
cycles at 0.25 um, with almost any standard cell library. DesignWare
has pipelined multiplier support, which I suggest you use if your
version of DC doesn't support retiming.
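
As a rough software model of the input-based flags described above, the following Python sketch may help. The function names are made up for illustration, and the overflow part assumes unsigned magnitudes. It reads "log2" as the bit length of each operand, which leaves one borderline case that the bit lengths alone cannot decide.

```python
def zero_and_negative(a, b):
    """Z and N of the full (untruncated) signed product, from the inputs only."""
    z = (a == 0) or (b == 0)
    n = (not z) and ((a < 0) != (b < 0))
    return z, n


def overflow_estimate(a, b, width=32):
    """Classify an unsigned product using only the operand bit lengths.

    The product of an la-bit and an lb-bit number has la+lb-1 or la+lb
    bits, so the sum of the bit lengths decides every case except one.
    """
    total = a.bit_length() + b.bit_length()
    if total <= width:
        return "fits"
    if total >= width + 2:
        return "overflows"
    return "unknown"  # total == width + 1: needs a closer look


print(zero_and_negative(3, -5))
print(overflow_estimate(0x10000, 0xFFFF))
```

In this model the "unknown" band would still need extra logic (or the multiplier output itself), so the bit-length test is a fast filter rather than a complete overflow detector.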
 
designer wrote:

I am trying to design a fast multiplier (300 MHz). The inputs
are 32 bits wide and a truncated (32-bit) result is desired. What
I want to do is calculate the flags (condition codes such as Z, C, N, V)
before the multiplier result is available, i.e. I want at least
the condition codes to be ready at a 300 MHz clock speed. I searched
the Synopsys DesignWare library but couldn't find anything
specific on flag calculation.
I don't know what C is supposed to be; Z and N are easy.

I believe you can find some set of pairs (i,j) such that
i*j does not overflow, but (i+1)*j and i*(j+1) do.
If there aren't too many, and especially if some are powers
of two (making the comparison easy), a bunch of comparators
feeding an AND/OR tree should do it. It will be a lot faster
than a multiply, though I am not sure how much logic it takes.

-- glen
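
A small Python sketch of the boundary-pair idea, restricted to unsigned operands and a narrow width so it can be checked exhaustively. The names `boundary_pairs` and `fits_by_pairs` are hypothetical, and the software loop only stands in for the comparator tree.

```python
def boundary_pairs(width):
    """Pairs (i, j) where i*j fits in `width` bits but (i+1)*j and i*(j+1) do not."""
    limit = (1 << width) - 1
    pairs = []
    for j in range(1, limit + 1):
        i = limit // j              # largest i with i*j <= limit, so (i+1)*j overflows
        if i * (j + 1) > limit:     # keep only pairs where growing j overflows too
            pairs.append((i, j))
    return pairs


def fits_by_pairs(a, b, pairs):
    """No overflow iff some boundary pair dominates (a, b); each test is two comparators."""
    return any(a <= i and b <= j for i, j in pairs)


pairs8 = boundary_pairs(8)
print(len(pairs8), "boundary pairs for 8-bit unsigned operands")
print(fits_by_pairs(15, 17, pairs8), fits_by_pairs(16, 16, pairs8))
```

The number of such pairs grows roughly like twice the square root of the limit, so for full 32-bit operands it would be on the order of 10^5. The approach looks most practical for narrow operands or when combined with a coarser first-stage test.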
 
On Tue, 23 Sep 2008 23:10:31 -0700 (PDT), designer
<vittal.patil@gmail.com> wrote:

Thanks for the reply...
But the negative flag (N) is computed incorrectly (if based on the
input signs) when an overflow occurs.
So what is the normal/generic interpretation here?
I'm not sure about the generic interpretation, but I'd make the use of
the carry and negative flags conditional on there being no overflow,
same as the result of the multiplication. There is also the issue of
what you would do with the result if there is an overflow. If you're
designing part of a datapath (i.e. in a hardwired DSP with no exception
processing), then it may make sense to saturate the output on overflow
with the correct sign. If you're designing an ALU, the exception
processing is done higher up in the hierarchy, so you can just generate
an overflow flag and generate the output without any further
processing. It depends on what you need.
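
The two overflow policies mentioned above could be modelled along these lines. This is only a behavioural sketch in Python for signed 32-bit operands. The function name `mul32` and its `saturate` switch are invented for the example, and the model uses the full product, which real flag logic would try to avoid.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def mul32(a, b, saturate):
    """Return (result, overflow) for a signed 32x32->32 multiply.

    saturate=True  : datapath style, clamp to the limit with the correct sign.
    saturate=False : ALU style, return the truncated bits and just raise the flag.
    """
    full = a * b
    overflow = not (INT32_MIN <= full <= INT32_MAX)
    if overflow and saturate:
        # The true sign comes from the input signs, not from the truncated bits.
        result = INT32_MIN if (a < 0) != (b < 0) else INT32_MAX
    else:
        low = full & 0xFFFFFFFF
        result = low - (1 << 32) if low & 0x80000000 else low
    return result, overflow


print(mul32(0x40000000, 4, saturate=True))
print(mul32(0x40000000, 4, saturate=False))
```

In the saturating case the N flag derived from the input signs agrees with the clamped result, which is one way to read the suggestion of tying the flags to the overflow handling.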
 
On Sep 23, 12:55 pm, Muzaffer Kal <k...@dspia.com> wrote:
Almost all the condition codes can be calculated without needing the
full multiplier output. Zero and Negative are the easiest
ones: the output is zero if either input is zero, and the output is
negative if the signs of the inputs differ (and neither input is
zero). For overflow, assuming you want it active if the result doesn't
fit into 32 bits, you need to check whether log2(a)+log2(b) < 32,
which is again simpler than the multiplier. Carry is calculated
similarly. I'd suggest you design your condition code logic based on
the inputs rather than the output of the multiplier, and estimate how
much hardware you need.
As for a 300 MHz 32x32->32 multiplier, it depends entirely on what
process you're using. With anything at or below 130 nm, you should be
able to do it in a single cycle with no problems, or in at most 2
cycles at 0.25 um, with almost any standard cell library. DesignWare
has pipelined multiplier support, which I suggest you use if your
version of DC doesn't support retiming.
Thanks for the reply...
But the negative flag (N) is computed incorrectly (if based on the
input signs) when an overflow occurs.
So what is the normal/generic interpretation here?
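
To make the objection concrete, here is a small Python illustration with one pair of positive operands whose truncated 32-bit product comes out negative. The helper `to_int32` is an assumed name for reinterpreting the low 32 bits as a signed value.

```python
def to_int32(x):
    """Reinterpret the low 32 bits of x as a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


a, b = 0x10000, 0x8000                  # both positive, product is 2**31
n_from_signs = (a < 0) != (b < 0)       # N predicted from the input signs
n_from_result = to_int32(a * b) < 0     # N read from bit 31 of the truncated result

print(n_from_signs, n_from_result)
```

The two values disagree exactly because the product no longer fits in 32 signed bits, so the question becomes which of the two the N flag is meant to report.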
 
On Sep 24, 12:42 pm, Muzaffer Kal <k...@dspia.com> wrote:
I'm not sure about the generic interpretation, but I'd make the use of
the carry and negative flags conditional on there being no overflow,
same as the result of the multiplication. There is also the issue of
what you would do with the result if there is an overflow. If you're
designing part of a datapath (i.e. in a hardwired DSP with no exception
processing), then it may make sense to saturate the output on overflow
with the correct sign. If you're designing an ALU, the exception
processing is done higher up in the hierarchy, so you can just generate
an overflow flag and generate the output without any further
processing. It depends on what you need.
How do I calculate log(a)?
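
The thread does not answer this. One common reading is that only the integer part of log2(a) is needed, which is the position of the most significant set bit (a leading-one detector or priority encoder in hardware). A minimal Python sketch under that assumption, with a made-up function name, narrows the position in five halving steps, much as a 32-bit encoder tree might:

```python
def floor_log2_32(a):
    """Index of the highest set bit of a 32-bit unsigned value (a must be nonzero)."""
    if not 0 < a < (1 << 32):
        raise ValueError("expected a nonzero 32-bit unsigned value")
    pos = 0
    for shift in (16, 8, 4, 2, 1):
        if a >> shift:          # is there a set bit in the upper part?
            a >>= shift         # if so, continue the search there
            pos += shift
    return pos


print(floor_log2_32(1), floor_log2_32(0x80000000))
```

For signed operands the same search would presumably be applied to the magnitude, or to the position of the first bit that differs from the sign bit.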
 
