30 bit adder performance

ALuPin

Guest
Hi @ newsgroup,

I have the following problem:

I am using a Cypress VHDL template for a 30-bit adder, and to get the
performance I need I have to activate an internal pipeline stage within
the template. With this pipeline stage enabled, the output of the adder
is only valid every two clock cycles.

So my question:

Is it possible to split the addition into two adders and combine the
results so that I get a valid sum every clock cycle?

Any suggestion is highly appreciated.

Thanks in advance.

Rgds
André
 
ALuPin wrote:
Is it possible to split the addition into two adders and to combine
the results that way so that I get a valid sum every clock cycle ?

Of course. Here's an example for two unsigned operands. I haven't checked it for correctness, but it should give you the idea.

SIGNAL stage1_res    : unsigned(15 DOWNTO 0);  -- lower sum; bit 15 is the carry
SIGNAL stage1_store1 : unsigned(14 DOWNTO 0);
SIGNAL stage1_store2 : unsigned(14 DOWNTO 0);

stage_1: PROCESS IS
BEGIN

  WAIT UNTIL clk = '1';

  -- Calculate lower half (widened to 16 bits to keep the carry), and store upper halves
  stage1_res    <= ('0' & data_in1(14 DOWNTO 0)) + data_in2(14 DOWNTO 0);
  stage1_store1 <= data_in1(29 DOWNTO 15);
  stage1_store2 <= data_in2(29 DOWNTO 15);

  -- synchronous reset
  IF reset = '1' THEN
    stage1_res    <= (OTHERS => '0');
    stage1_store1 <= (OTHERS => '0');
    stage1_store2 <= (OTHERS => '0');
  END IF;

END PROCESS stage_1;


stage_2: PROCESS IS
BEGIN

  WAIT UNTIL clk = '1';

  -- Calculate upper half (taking carry into account), and concatenate with lower half
  IF stage1_res(15) = '1' THEN -- carry out of the lower half
    data_out <= (stage1_store1 + stage1_store2 + 1) & stage1_res(14 DOWNTO 0);
  ELSE
    data_out <= (stage1_store1 + stage1_store2) & stage1_res(14 DOWNTO 0);
  END IF;

  -- synchronous reset
  IF reset = '1' THEN
    data_out <= (OTHERS => '0');
  END IF;

END PROCESS stage_2;


Regards,

Pieter Hulshoff
 
ALuPin wrote:
Using this pipeline stage the output of the adder is only valid
every two clock cycles.

So my question:

Is it possible to split the addition into two adders and to combine
the results that way
so that I get a valid sum every clock cycle ?
If your adder only outputs a sum every two clock cycles, then you are
not really pipelining it. I expect it is instead using a clock enable
which only operates every two clocks. This would let you speed up the
clock, but require two clock cycles to give you a result.

A pipeline will let you change the inputs on every clock cycle. There
will be a two clock delay to get your first result, but after that a new
result will appear on every clock.

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX
 
A couple of tips.

In the second stage, you can replace:
IF stage1_res(15) = '1' THEN -- carry
  data_out <= (stage1_store1 + stage1_store2 + 1) & stage1_res(14 DOWNTO 0);
ELSE
  data_out <= (stage1_store1 + stage1_store2) & stage1_res(14 DOWNTO 0);
END IF;
with:

data_out <= (stage1_store1 + stage1_store2 + ("0" & stage1_res(15)))
            & stage1_res(14 DOWNTO 0);

Coding the second adder as in either of the above can
be problematic, as the +1 can be seen either as a
carry in (good: only one resource), or as an additional
adder/incrementer (bad: 2nd resource).

If you run into problems, you can use an algorithm
to force carry-in to be part of the adder:
variable T18 : unsigned(16 downto 0) ;
. . .

T18 := ('0' & stage1_store1 & '1') + ('0' & stage1_store2 & stage1_res(15)) ;
-- T18(0) is discarded. T18(16 downto 1) is 16 bits (it includes the carry out),
-- so use T18(15 downto 1) instead if data_out is only 30 bits wide.
data_out <= T18(16 downto 1) & stage1_res(14 DOWNTO 0);
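
To see why appending a bit to each operand forces the carry into the same adder, here is a short worked derivation. The symbols are shorthand introduced only for this note: U and V stand for stage1_store1 and stage1_store2, c for the carry bit stage1_res(15), and T for the T18 sum. Appending a bit on the right doubles the value and adds that bit.

```latex
\begin{align*}
T &= (2U + 1) + (2V + c) = 2(U + V) + (1 + c), \qquad c \in \{0, 1\} \\
c = 0 &: \quad T = 2(U + V) + 1, \quad \lfloor T/2 \rfloor = U + V \\
c = 1 &: \quad T = 2(U + V + 1), \quad \lfloor T/2 \rfloor = U + V + 1
\end{align*}
```

Dropping bit 0 of T is the division by two, so the remaining bits equal U + V + c, and the synthesis tool only ever sees a single addition.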



One other tip:
If you want tools to do pipelining for you
don't include reset in the pipeline registers.


There are probably some sizing issues you will need to fix.
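
The tip about leaving reset out of the pipeline registers could look roughly like the sketch below. It is untested and rests on assumptions: the entity name add30_retime, the port names and the 30-bit result width are made up for illustration, and whether a given synthesis tool actually moves the extra register back into the adder (retiming) depends on the tool and its options.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: names and widths are illustrative only.
entity add30_retime is
  port (
    clk  : in  std_logic;
    a, b : in  unsigned(29 downto 0);
    sum  : out unsigned(29 downto 0)
  );
end entity add30_retime;

architecture rtl of add30_retime is
  signal sum_r1 : unsigned(29 downto 0);
begin
  -- No reset on either register, so a retiming-capable tool is free
  -- to move the second register back into the adder logic.
  process (clk)
  begin
    if rising_edge(clk) then
      sum_r1 <= a + b;    -- full-width add, first register
      sum    <= sum_r1;   -- extra register available for retiming
    end if;
  end process;
end architecture rtl;
```

Without retiming this is just a 30-bit adder followed by a delay register, so the timing report still has to be checked.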

Cheers,
Jim
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jim Lewis
Director of Training mailto:Jim@SynthWorks.com
SynthWorks Design Inc. http://www.SynthWorks.com
1-503-590-4787

Expert VHDL Training for Hardware Design and Verification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



 
André,
I will sketch out another approach that also works
well in FPGAs and reduces the number of first stage
registers. Since with FPGAs you get logic with the
registers, you might as well use it.

In the following sketch, || is a register stage:

A(14:0)  ||              ||
   +     ||  Y1B(15:0)   ||  Result(14:0)
B(14:0)  ||              ||


A(29:15) ||              ||
   +     ||  Y1A(15:0)   ||  Result(31:15)
B(29:15) ||      +       ||
         ||  Y1B(15)     ||


Basically in stage 1, you add the two 15 bit halves separately.
Compared with storing both upper operands, this reduces your
storage by N-1 bits (N = 15 here; the 1 is the carry bit you still keep).
In stage 2, the upper bits are incremented by 1 if there
was a carry out of the lower half addition.
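
A minimal VHDL sketch of this two-stage scheme follows. It is untested, and the entity name add30_pipe, the port names and the 31-bit result port are assumptions made for illustration. Only y1a and y1b correspond to the Y1A and Y1B registers in the diagram. The diagram labels the upper result Result(31:15), but 16 bits (30 downto 15) are enough to hold the upper sum plus the carry.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: names and widths are illustrative only.
entity add30_pipe is
  port (
    clk    : in  std_logic;
    a, b   : in  unsigned(29 downto 0);
    result : out unsigned(30 downto 0)   -- 30-bit sum plus carry out
  );
end entity add30_pipe;

architecture rtl of add30_pipe is
  signal y1b : unsigned(15 downto 0);  -- lower-half sum, bit 15 is its carry
  signal y1a : unsigned(15 downto 0);  -- upper-half sum, carry not yet added
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Stage 1: two independent 15-bit additions, each widened to 16 bits
      y1b <= ('0' & a(14 downto 0))  + ('0' & b(14 downto 0));
      y1a <= ('0' & a(29 downto 15)) + ('0' & b(29 downto 15));

      -- Stage 2: pass the lower bits through and add the lower-half
      -- carry (a 1-bit slice of y1b) into the upper sum
      result(14 downto 0)  <= y1b(14 downto 0);
      result(30 downto 15) <= y1a + y1b(15 downto 15);
    end if;
  end process;
end architecture rtl;
```

New operands can be applied on every clock. Each result appears two clocks after its inputs, and the stage 1 storage is 32 bits instead of the 46 needed when both upper operands are stored.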

Cheers,
Jim


 
Jim Lewis wrote:
Having the 2nd adder as code in both of the above can
be problematic as the +1 can be seen either as a
carry in (good: only one resource), or as an additional
adder/incrementer (bad: 2nd resource).

I have faith in the compiler. :) Besides, I usually check the resulting gate
logic to see what was generated.

Jim Lewis wrote:
One other tip:
If you want tools to do pipelining for you
don't include reset in the pipeline registers.

I don't trust tool-generated pipelining. I've seen examples where deadlock
situations were created in FSMs where there were none before.

Regards,

Pieter Hulshoff
 
