verilog adders (verilog optimization)

John Providenza · Aug 29, 2008

I had assumed Verilog would collapse constants, but I wonder if that is
always true/allowed?

If I have something like
    reg [7:0] sig1, sig2, sig3;

    always @ *
        sig3 = 1 + 2 + 3 + sig1 + sig2 + 4;

There must be at least 2 adders. Is Verilog required to produce 3
adders?

The code can easily be collapsed to
    sig3 = 6 + sig1 + sig2 + 4;

Verilog evaluates left to right, so this becomes
    sig3 = ( (6 + sig1) + sig2) + 4;

This requires 3 adders. Is it legal for Verilog to compile the
original code to:
    sig3 = 10 + sig1 + sig2;

Thoughts?

Thanks!

John Providenza

Andy · Aug 29, 2008

Most synthesis tools will take advantage of associative and
commutative properties of expressions in order to produce an optimal
implementation. If they don't, I don't use them very long.

Andy

John Providenza · Aug 29, 2008

On Aug 29, 7:03 am, Andy <jonesa...@comcast.net> wrote:

> Most synthesis tools will take advantage of associative and
> commutative properties of expressions in order to produce an optimal
> implementation. If they don't, I don't use them very long.
>
> Andy

For grins, I created a very simple test case and synthesized it
using the Xilinx XST synthesizer. Here's the code:

module test (
    input            clk,
    input      [7:0] a, b,

    output reg [7:0] z
);

    reg [7:0] a1, b1, z1;

    always @(posedge clk)
    begin
        a1 <= a;
        b1 <= b;
        z  <= z1;
    end

    always @ *
    begin
        z1 = 1 + 2 + 3 + a1 + b1 + 4;
    end
endmodule

Guess what? 3 adders. If I reorder the arithmetic to be
    z1 = 1 + 2 + 3 + 4 + a1 + b1;
then I get 2 adders.

John Providenza

Mike Treseler · Aug 29, 2008

jprovidenza@yahoo.com wrote:

> Guess what? 3 adders. If I reorder the arithmetic to be
> z1 = 1 + 2 + 3 + 4 + a1 + b1;
> then I get 2 adders.

Try finishing the synthesis and compare LUTs and Flops.
Some reductions happen on the back-end.

-- Mike Treseler

John_H · Aug 29, 2008

On Aug 29, 7:38 am, jprovide...@yahoo.com wrote:

> For grins, I created a very simple test case and synthesized it
> using the Xilinx XST synthesizer. Here's the code:
>
> snip
> always @ *
> begin
> z1 = 1 + 2 + 3 + a1 + b1 + 4;
> end
> endmodule
>
> Guess what? 3 adders. If I reorder the arithmetic to be
> z1 = 1 + 2 + 3 + 4 + a1 + b1;
> then I get 2 adders.
>
> John Providenza

Different synthesizers will produce different results. 20 years ago,
you might be hard pressed to find a Verilog synthesizer that wasn't
order-dependent but things are better now. The Xilinx XST has become
a respectable synthesizer but I still wouldn't call it a "world class"
synthesis engine. For a free tool, it's great.

A better synthesizer *might* produce repeatable results with minimal
logic. I don't think algebraic optimization is high on any
synthesizer's feature list, though, such that even exceptional logic
synthesizers might stumble on some simple arithmetic.

I'd personally love to see more work on algebraic optimization but I'm
not holding my breath.

- John_H

Kevin Neilson · Aug 29, 2008

> Code:
> z1 = 1 + 2 + 3 + a1 + b1 + 4;
> Cell Usage :
> # BELS : 35
> # GND : 1
> # LUT1 : 1
> # LUT2 : 7
> # LUT3 : 1
> # LUT4 : 7
> # LUT5 : 1
> # LUT6 : 1
> # MUXCY : 7
> # VCC : 1
> # XORCY : 8
> # FlipFlops/Latches : 24
> # FD : 23
> # FDR : 1
>
> LUTS 18
> XORCY 8
> MUXCY 7
>
> Code:
> z1 = 1 + 2 + 3 + 4 + a1 + b1;
> Cell Usage :
> # BELS : 29
> # GND : 1
> # LUT1 : 1
> # LUT2 : 6
> # LUT4 : 6
> # MUXCY : 7
> # XORCY : 8
> # FlipFlops/Latches : 24
> # FD : 24
>
> LUTS 13
> XORCY 8
> MUXCY 7
>
> John Providenza

That's interesting. The fact that each design has 8 XORCYs leads me to
believe that there are only two adders, but I'm not sure why the LUT
count differs. It seems like there are only 8 LUTs needed--an 8-bit
adder (with truncated 8-bit output) should require only 4 LUTs and 4
XORCYs--so I don't know why there would be 13 and 18 LUTs used. Did you
look at the technology schematic?
-Kevin

Mike Treseler · Aug 29, 2008

jprovidenza@yahoo.com wrote:

> Here's the XST synthesis data for the two cases. It sure looks to
> me like there's extra logic.

Sure enough.
Thanks for trying it, and for reporting results.
Consider submitting the case with Xilinx
since you have all the data.

> I agree that XST is not well known as a state-of-the-art, super-duper,
> terrific, knock-your-socks-off synthesizer, but it is a data point.

Well, you have proven your point.
The synthesis front-end should be
smart enough to collect constants.

On the other hand, most designers
would do something like:
parameter sum = 1+2+3+4;

-- Mike Treseler
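
A minimal sketch of the parameter approach suggested above, applied to the test module from earlier in the thread. The module name test_folded and the name SUM are illustrative assumptions, and a localparam is used here in place of a plain parameter so the value cannot be overridden at instantiation. The constants are folded at elaboration time, so the synthesizer only ever sees one constant operand regardless of where it is written in the expression. How many adders a given tool then builds is still up to the tool.

```verilog
// Sketch only: test_folded and SUM are hypothetical names, not from the thread.
module test_folded (
    input            clk,
    input      [7:0] a, b,
    output reg [7:0] z
);
    // Constant expression, evaluated once at elaboration: 1+2+3+4 = 10.
    localparam [7:0] SUM = 1 + 2 + 3 + 4;

    reg [7:0] a1, b1, z1;

    // Same input/output registers as the original test case.
    always @(posedge clk) begin
        a1 <= a;
        b1 <= b;
        z  <= z1;
    end

    // A single constant operand remains, whatever the operand order.
    always @* begin
        z1 = a1 + b1 + SUM;
    end
endmodule
```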

John Providenza · Aug 29, 2008

On Aug 29, 9:54 am, Mike Treseler <mtrese...@gmail.com> wrote:

> jprovide...@yahoo.com wrote:
>> Guess what? 3 adders. If I reorder the arithmetic to be
>> z1 = 1 + 2 + 3 + 4 + a1 + b1;
>> then I get 2 adders.
>
> Try finishing the synthesis and compare LUTs and Flops.
> Some reductions happen on the back-end.
>
> -- Mike Treseler

Here's the XST synthesis data for the two cases. It sure looks to
me like there's extra logic.

I agree that XST is not well known as a state-of-the-art, super-duper,
terrific, knock-your-socks-off synthesizer, but it is a data point.

Code:
    z1 = 1 + 2 + 3 + a1 + b1 + 4;
Cell Usage :
# BELS              : 35
#      GND          : 1
#      LUT1         : 1
#      LUT2         : 7
#      LUT3         : 1
#      LUT4         : 7
#      LUT5         : 1
#      LUT6         : 1
#      MUXCY        : 7
#      VCC          : 1
#      XORCY        : 8
# FlipFlops/Latches : 24
#      FD           : 23
#      FDR          : 1

LUTS  18
XORCY 8
MUXCY 7

Code:
    z1 = 1 + 2 + 3 + 4 + a1 + b1;
Cell Usage :
# BELS              : 29
#      GND          : 1
#      LUT1         : 1
#      LUT2         : 6
#      LUT4         : 6
#      MUXCY        : 7
#      XORCY        : 8
# FlipFlops/Latches : 24
#      FD           : 24

LUTS  13
XORCY 8
MUXCY 7

John Providenza

John Providenza · Aug 29, 2008

On Aug 29, 11:43 am, Kevin Neilson
<kevin_neil...@removethiscomcast.net> wrote:

snip

> That's interesting. The fact that each design has 8 XORCYs leads me to
> believe that there are only two adders, but I'm not sure why the LUT
> count differs. It seems like there are only 8 LUTs needed--an 8-bit
> adder (with truncated 8-bit output) should require only 4 LUTs and 4
> XORCYs--so I don't know why there would be 13 and 18 LUTs used. Did you
> look at the technology schematic?
> -Kevin

I did not look in any detail. At some point, each synthesizer will
"do its own thing", so I don't really care about the very low level
details. The schematics showed 3 adders, 2 of them had constants as
inputs.

I'm curious as to
a) does Verilog somehow require this?
b) what do some other synthesizers do?

John Providenza

Mike Treseler · Aug 30, 2008

jprovidenza@yahoo.com wrote:

> I'm curious as to
> a) does Verilog somehow require this?

I don't see how.
Both cases sim the same.

> b) what do some other synthesizers do?

Quartus does about the same thing.
3 then 2 adders at the RTL level for the same cases
and a few extra LUTs for the 3 adder case.

-- Mike Treseler

Jonathan Bromley · Sep 2, 2008

On Tue, 2 Sep 2008 08:27:10 -0700 (PDT), jprovidenza@yahoo.com wrote:

> So, if A is constant 3 and C is constant 1, does the language *spec*
> prevent Verilog from combining the two constants?

I don't believe so. Context-dependent operands of an
arithmetic expression (as are all operands of +, - etc)
are first widened to the context width, BEFORE any
arithmetic is done. Consequently, the order in which
addition-like operations are performed is unimportant,
because everything is done in the same bit-width.

I'm pretty sure (though I haven't yet proved it to my
own satisfaction) that multiplication can similarly be
rearranged algebraically, along with addition and
subtraction, without any effect on the results. But
division, with its potential loss of LSBs, will surely
exhibit some order dependences that would mess up
algebraic rearrangement.
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
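
The width argument above can be checked by brute force in simulation. The following is a self-checking sketch, with module and signal names chosen for illustration only. It relies on the fact that unsized decimal constants are at least 32 bits wide, so each right-hand side is evaluated at that width and only truncated to 8 bits on assignment. Under that assumption all three forms of the sum should agree for every input pair, while the division example at the end should not.

```verilog
// Sketch only: tb_reorder and its signal names are hypothetical.
module tb_reorder;
    reg [7:0] a1, b1;
    reg [7:0] z_mixed, z_const_first, z_folded;
    reg [7:0] q1, q2;
    integer i, j, errors;

    initial begin
        errors = 0;
        // Exhaustive sweep over all 8-bit operand pairs.
        for (i = 0; i < 256; i = i + 1) begin
            for (j = 0; j < 256; j = j + 1) begin
                a1 = i;
                b1 = j;
                z_mixed       = 1 + 2 + 3 + a1 + b1 + 4;  // order from the thread
                z_const_first = 1 + 2 + 3 + 4 + a1 + b1;  // reordered version
                z_folded      = a1 + b1 + 10;             // constants merged by hand
                if (z_mixed !== z_const_first || z_mixed !== z_folded)
                    errors = errors + 1;
            end
        end
        $display("addition mismatches: %0d", errors);

        // Integer division discards LSBs, so here the order does matter.
        a1 = 8'd3;
        q1 = (a1 / 2) * 2;   // 3/2 truncates to 1 first
        q2 = (a1 * 2) / 2;   // multiply first, nothing is lost
        $display("(a/2)*2 = %0d, (a*2)/2 = %0d", q1, q2);
        $finish;
    end
endmodule
```

A zero mismatch count only shows that the reordering is invisible in simulation. It says nothing about whether a particular synthesizer actually performs it.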

Andy · Sep 2, 2008

On Aug 29, 6:29 pm, Mike Treseler <mtrese...@gmail.com> wrote:

> jprovide...@yahoo.com wrote:
>> I'm curious as to
>> a) does Verilog somehow require this?
>
> I don't see how.
> Both cases sim the same.
>
>> b) what do some other synthesizers do?
>
> Quartus does about the same thing.
> 3 then 2 adders at the RTL level for the same cases
> and a few extra LUTs for the 3 adder case.
>
> -- Mike Treseler

Synplify Pro implements exactly the same thing for both orders (in
vhdl). 12 luts (Xilinx v4). RTL viewer shows three input adder (a1 +
b1 + 10) for both.

As it should be...

Andy

Muzaffer Kal · Sep 2, 2008

On Tue, 02 Sep 2008 17:27:57 +0100, Jonathan Bromley
<jonathan.bromley@MYCOMPANY.com> wrote:

> On Tue, 2 Sep 2008 08:27:10 -0700 (PDT), jprovidenza@yahoo.com wrote:
>
>> So, if A is constant 3 and C is constant 1, does the language *spec*
>> prevent Verilog from combining the two constants?
>
> I'm pretty sure (though I haven't yet proved it to my
> own satisfaction) that multiplication can similarly be
> rearranged algebraically, along with addition and
> subtraction, without any effect on the results.

I don't think this is true when you say "along with addition and
subtraction". Multiplication (& division) has higher precedence than
addition & subtraction so you can't re-arrange it with them. A + B + C
might give the same result whether you calculate (A+B)+C or even
(A+C)+B but this is certainly not true of A + B*C. Because of precedence
rules this has to be implemented as A + (B*C) and can't be done as
(A+B)*C.

Muzaffer Kal

http://www.dspia.com

Jonathan Bromley · Sep 2, 2008

On Tue, 02 Sep 2008 10:01:09 -0700, Muzaffer Kal wrote:

> On Tue, 02 Sep 2008 17:27:57 +0100, Jonathan Bromley
> <jonathan.bromley@MYCOMPANY.com> wrote:
>
>> I'm pretty sure (though I haven't yet proved it to my
>> own satisfaction) that multiplication can similarly be
>> rearranged algebraically, along with addition and
>> subtraction, without any effect on the results.
>
> I don't think this is true when you say "along with addition and
> subtraction". Multiplication (& division) has higher precedence than
> addition & subtraction so you can't re-arrange it with them.

No, for sure; that's why I said "rearranged algebraically".
I was thinking of rearrangements such as

A*B + A*C === A*(B+C)

which can often save hardware - in this case, one
multiplier saved with no other cost.

I don't know how effectively the existing synthesis tools
do that kind of thing, though.
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
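
The factoring identity A*B + A*C === A*(B+C) can be checked the same way. This sketch uses a hypothetical testbench name and a 4-bit width parameter W, chosen so the exhaustive sweep stays small. Every operand and the result share the same width, so both forms are computed modulo 2^W and should always agree.

```verilog
// Sketch only: tb_distribute, W and the signal names are hypothetical.
module tb_distribute;
    parameter W = 4;                 // small width keeps the sweep at 4096 cases
    reg [W-1:0] a, b, c;
    reg [W-1:0] two_mults, one_mult;
    integer i, j, k, errors;

    initial begin
        errors = 0;
        for (i = 0; i < (1 << W); i = i + 1)
            for (j = 0; j < (1 << W); j = j + 1)
                for (k = 0; k < (1 << W); k = k + 1) begin
                    a = i;
                    b = j;
                    c = k;
                    two_mults = a*b + a*c;   // two multipliers, one adder
                    one_mult  = a*(b + c);   // one adder, one multiplier
                    if (two_mults !== one_mult)
                        errors = errors + 1;
                end
        $display("distributive mismatches: %0d", errors);
        $finish;
    end
endmodule
```

The equivalence depends on every operand being extended to one common context width before any arithmetic happens. If b + c were first stored in a W-bit intermediate signal and the product then assigned to a wider result, the truncated sum could make the two forms differ.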

John Providenza · Sep 2, 2008

On Sep 2, 7:48 am, Andy <jonesa...@comcast.net> wrote:

> On Aug 29, 6:29 pm, Mike Treseler <mtrese...@gmail.com> wrote:
>
>> jprovide...@yahoo.com wrote:
>>> I'm curious as to
>>> a) does Verilog somehow require this?
>>
>> I don't see how.
>> Both cases sim the same.
>>
>>> b) what do some other synthesizers do?
>>
>> Quartus does about the same thing.
>> 3 then 2 adders at the RTL level for the same cases
>> and a few extra LUTs for the 3 adder case.
>>
>> -- Mike Treseler
>
> Synplify Pro implements exactly the same thing for both orders (in
> vhdl). 12 luts (Xilinx v4). RTL viewer shows three input adder (a1 +
> b1 + 10) for both.
>
> As it should be...
>
> Andy

Andy -

My question is specific to Verilog, I don't know what the VHDL language
spec requires. From a Verilog LRM:

    All operators shall associate left to right with the exception of the
    conditional operator, which shall associate right to left.
    Associativity refers to the order in which the operators having the
    same precedence are evaluated.

    Thus, in the following example B is added to A and then C is
    subtracted from the result of A+B.
    A + B - C

So, if A is constant 3 and C is constant 1, does the language *spec*
prevent Verilog from combining the two constants?

John Providenza

Mike Treseler · Sep 2, 2008

Jonathan Bromley wrote:

> No, for sure; that's why I said "rearranged algebraically".
> I was thinking of rearrangements such as
>
> A*B + A*C === A*(B+C)
>
> which can often save hardware - in this case, one
> multiplier saved with no other cost.
>
> I don't know how effectively the existing synthesis tools
> do that kind of thing, though.

Since brand A+X can't even collect constants,
I would try this experiment on Synplify Pro or Mentor.

-- Mike Treseler

Mike Treseler · Sep 2, 2008

Andy wrote:

> Synplify Pro implements exactly the same thing for both orders (in
> vhdl). 12 luts (Xilinx v4). RTL viewer shows three input adder (a1 +
> b1 + 10) for both.
>
> As it should be...

Indeed. Thanks for posting the results.

-- Mike Treseler

Mike Treseler · Sep 2, 2008

jprovidenza@yahoo.com wrote:

> Yes, thanks for posting the info, but I believe it is for VHDL.

I synthesized *your* verilog example.

> My original
> question was "does the Verilog language specifically forbid merging
> the constants" as opposed to "how good is your synthesis tool".

The answers were NO, NO, and NO, before
the discussion strayed.
This happens sometimes on usenet.

-- Mike Treseler

Andy · Sep 2, 2008

IINM, VHDL has the same left to right evaluation requirements among
equal-precedence arithmetic operators IN SIMULATION as Verilog has.
But we are not talking about a simulation, we are talking about
synthesis. With equal precedence arithmetic operators that do not have
"side effects," order of execution is unobservable (has no external
effect), but amongst those that do have side effects (most commonly
encountered with function calls and logical operators), order is
important.

For synthesis, the execution order rules for simulation do not always
have any meaning. If the results are always equivalent between the LRM
execution ordering and the implementation ordering, it is by
definition legal synthesis. In this context, equivalence is evaluated
at register/IO boundaries, not intermediate expressions or even
assignments. That's why they call it Register Transfer Logic (or
Level).

Restraining the synthesized implementation to maintain the same order
of operations, for no other reason than matching the execution order
from simulation, would eliminate a whole host of beneficial
optimizations. We would then be forced to write code that explicitly
describes an optimal (our choice, not the tool's) implementation. And
register re-timing would be absolutely forbidden.

Andy

John Providenza · Sep 2, 2008

On Sep 2, 11:24 am, Mike Treseler <mtrese...@gmail.com> wrote:

> Andy wrote:
>> Synplify Pro implements exactly the same thing for both orders (in
>> vhdl). 12 luts (Xilinx v4). RTL viewer shows three input adder (a1 +
>> b1 + 10) for both.
>>
>> As it should be...
>
> Indeed. Thanks for posting the results.
>
> -- Mike Treseler

Yes, thanks for posting the info, but I believe it is for VHDL. My
original question was "does the Verilog language specifically forbid
merging the constants" as opposed to "how good is your synthesis tool".

John Providenza
