Query about optimization

Vinay Deshpande
Hi all,
I am a newbie to VHDL. My question is about optimization.
I created two versions of an N-bit adder. The first was simply C <= A + B;
when I compiled it, it took 8 LEs. In the second design I described the
same adder hierarchically, starting from a full adder and instantiating it
to build the wider one. On compilation it took 20 LEs. Both circuits are
doing essentially the same job, no more, no less, so why is there such a
large difference in LEs? I observed the same thing with a multiplier. Do
compilers have optimal implementations for these operators (+, *)?
Waiting for a reply.
Vinay
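For concreteness, the two versions being compared might look roughly like this (a sketch at a fixed 8-bit width; entity and signal names are illustrative, and the actual LE counts depend on the device and tool):

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Version 1: behavioral. The synthesizer infers an adder from "+"
-- and can map it directly onto the FPGA's dedicated carry chain.
entity adder_rtl is
    port (A, B : in  unsigned(7 downto 0);
          C    : out unsigned(7 downto 0));
end entity adder_rtl;

architecture rtl of adder_rtl is
begin
    C <= A + B;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Version 2: structural. A 1-bit full adder...
entity full_adder is
    port (a, b, cin : in  std_logic;
          s, cout   : out std_logic);
end entity full_adder;

architecture rtl of full_adder is
begin
    s    <= a xor b xor cin;
    cout <= (a and b) or (cin and (a xor b));
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- ...instantiated eight times as a ripple-carry adder. Here the
-- synthesizer sees generic gates rather than an addition, and may
-- build the circuit out of plain LUTs instead of the carry chain.
entity adder_struct is
    port (A, B : in  std_logic_vector(7 downto 0);
          C    : out std_logic_vector(7 downto 0));
end entity adder_struct;

architecture struct of adder_struct is
    signal carry : std_logic_vector(8 downto 0);
begin
    carry(0) <= '0';
    gen : for i in 0 to 7 generate
        fa : entity work.full_adder
            port map (a => A(i), b => B(i), cin => carry(i),
                      s => C(i), cout => carry(i + 1));
    end generate gen;
end architecture struct;
```

Both architectures compute the same sum; the difference discussed in this thread is purely in what the synthesizer can recognize and map to dedicated hardware.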
 
Vinay Deshpande posted:

" I am a newbie to VHDL. My question is about optimization.
I created two versions of an N-bit adder. The first was simply C <= A + B;
when I compiled it, it took 8 LEs. In the second design I described the
same adder hierarchically, starting from a full adder and instantiating it
to build the wider one. On compilation it took 20 LEs. Both circuits are
doing essentially the same job, so why is there such a large difference in
LEs? I observed the same thing with a multiplier. Do compilers have
optimal implementations for these operators (+, *)?"

Hello,

The term "optimization" is very commonly misused, and not just in the
HDL communities. A synthesizer (or, for a language that is not an HDL,
a compiler) that always produced the best possible solution, i.e. that
performed true optimization, would be solving an intractable problem:
it could run for months on a design for which a practical, suboptimal
compromise produces a netlist in mere hours.

Regards,
Colin Paul Gloster
 
On May 22, 12:15 pm, Colin Paul Gloster <Colin_Paul_Glos...@ACM.org>
wrote:
[snip]
But then why does it take less space with the RTL design choice, i.e.
C <= A + B;?
 
Vinay Deshpande <deshpande.vinay@gmail.com> writes:
Do compilers have optimal implementations for
these operators (+, *)?
Yes. For instance, with your structural implementation out of
full adders, the synthesizer may not be able to figure out that
it can use the dedicated carry chain.
 
On May 22, 1:03 pm, Eric Smith <e...@brouhaha.com> wrote:
Vinay Deshpande <deshpande.vi...@gmail.com> writes:
Do compilers have optimal implementations for
these operators (+, *)?

Yes. For instance, with your structural implementation out of
full adders, the synthesizer may not be able to figure out that
it can use the dedicated carry chain.
Thank you very much.
Vinay
 
On May 22, 3:51 am, Vinay Deshpande <deshpande.vi...@gmail.com> wrote:
[snip]
There are two areas where your hierarchical design gives most
synthesizers a harder problem. First, as noted above, synthesizers
have very good implementations of commonly used canned functions, like
"+", but they often do not recognize a gate-level description as
equivalent to a canned function (the assumption is that if the
designer wanted an adder, they would have used "+"). Second, most
synthesizers perform only limited optimization across entity
boundaries, so a description that spreads a combinatorial function
across many entities is less likely to result in a near-optimal
implementation.

Andy
 
On May 22, 5:44 pm, Andy <jonesa...@comcast.net> wrote:
[snip]
Thanks, Andy.
 
Vinay Deshpande wrote:

Same thing I observed
with multiplier also. Do compilers have optimal implementations for
these operators (+, *)?
Yes, and some FPGAs have dedicated hardware multipliers (e.g. 18 x 18 bit
multipliers on some Spartan FPGAs). A "*" can then be optimized to zero
LEs and one hardware multiplier :)

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
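In code, that inference is just the plain operator; the mapping onto a hardware multiplier is the tool's doing. A sketch (widths chosen to fit an 18x18 block; names are illustrative):

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mult_rtl is
    port (A, B : in  signed(17 downto 0);
          P    : out signed(35 downto 0));
end entity mult_rtl;

architecture rtl of mult_rtl is
begin
    -- On devices with 18x18 hardware multipliers this typically maps
    -- to a single dedicated block. On devices without them, the tool
    -- falls back to an LE-based multiplier, which is far larger.
    P <= A * B;
end architecture rtl;
```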
 
Second, most synthesizers have only
limited optimization across entity boundaries, so a description that
spreads a combinatorial function across many entities is less likely
to result in a nearly optimal implementation.

Do you have any code that actually demonstrates this lack of optimization?

KJ
 
On May 23, 4:54 am, "KJ" <kkjenni...@sbcglobal.net> wrote:
Second, most synthesizers have only
limited optimization across entity boundaries, so a description that
spreads a combinatorial function across many entities is less likely
to result in a nearly optimal implementation.

Do you have any code that actually demonstrates this lack of optimization?

KJ
You can try the example I quoted above. Create two implementations of
an N-bit adder, one with a simple statement like C <= A + B; and the
other using a hierarchical design starting from a simple full adder.
You will see a difference in the logic elements consumed by the two
implementations.
 
"Vinay Deshpande" <deshpande.vinay@gmail.com> wrote in message
news:1179892781.689864.135800@g4g2000hsf.googlegroups.com...
[snip]

That's not what I was asking about. The comment of Andy's that I
questioned was directed at logic that spans multiple entities. What he
said was that the entity presents some sort of boundary that makes it
difficult to optimize logic across. I've yet to see any evidence of
such a barrier, and I asked him for an example to back up his claim
(one would also need to know which synthesis tool has this problem,
since it is the tool that has the problem). What I've seen is that the
logic gets flattened into a netlist right at the outset; at that point
the entity 'boundaries' no longer exist and have no such effect as
claimed. It could be, though, that older tools, or poor tools, for
whatever reason did (or do) have this limitation (maybe they optimized
within entities but not globally over a flattened netlist). Hence the
question.

As for your example, there are multiple N-bit adder algorithms, and I
suspect that C <= A + B; is implemented with a different base algorithm
than the cascading of smaller adders. Your view is that both produce
the sum of two numbers and are therefore functionally identical. But
optimizers are not so smart as to discern better algorithms from
arbitrary logic; what they are good at is inferring from the top-level
code which algorithm is best to implement, and then optimizing the
logic for that algorithm. A crude analogy: if you coded a discrete
Fourier transform that produces frequency responses from a set of
input numbers, you would not expect the optimizer to figure out that
an FFT would be a better algorithm to use. Both give the same overall
function but different implementations, one of which is 'better'.
Having said that, I'll admit I don't know which base algorithm your
tool happened to choose, how (or if) it differed from your hand-coded
version, or whether the observed differences were due to a different
algorithm choice at the outset or to an entity-boundary barrier
effect. But you can't assume that, just because you got different
results from different source code, it is an example of an
entity-boundary limitation.

KJ
 
On Tue, 22 May 2007 19:54:59 -0400, "KJ" <kkjennings@sbcglobal.net>
wrote:

Second, most synthesizers have only
limited optimization across entity boundaries, so a description that
spreads a combinatorial function across many entities is less likely
to result in a nearly optimal implementation.

Do you have any code that actually demonstrates this lack of optimization?

KJ
does "keep_hierarchy=true" count?

:)

- Brian
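For reference, the attribute Brian is joking about is spelled slightly differently per tool; in XST-flavored VHDL it looks roughly like this. This is a sketch applied to a hypothetical structural architecture; consult your synthesizer's manual for the exact attribute name and legal values:

```vhdl
-- keep_hierarchy prevents the synthesizer from optimizing across this
-- architecture's boundary, so the hierarchy (and its area cost)
-- survives into the netlist.
architecture struct of my_adder is
    attribute keep_hierarchy : string;
    attribute keep_hierarchy of struct : architecture is "yes";
begin
    -- ... structural full-adder instantiations here ...
end architecture struct;
```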
 
On May 22, 6:54 pm, "KJ" <kkjenni...@sbcglobal.net> wrote:
Second, most synthesizers have only
limited optimization across entity boundaries, so a description that
spreads a combinatorial function across many entities is less likely
to result in a nearly optimal implementation.

Do you have any code that actually demonstrates this lack of optimization?

KJ
Not recently, no. Synplify usually does a pretty good job, but I don't
generally cross hierarchical boundaries with combinatorial logic (like
a recursive, hierarchical adder, etc.).

Andy
 
On May 23, 4:59 am, "KJ" <kkjenni...@sbcglobal.net> wrote:
[snip]
Synplify does not collapse the design to a flat netlist. The netlist
it writes is hierarchical and closely matches the hierarchical
structure of the original code. The differences appear where it has
recognized combinatorial logic that crossed the RTL boundaries and
optimized it, necessitating changes to those boundaries. Synplify
certainly does recognize combinatorial logic across hierarchy (to some
extent) and will optimize it (to some extent). I don't know what those
limits are, but to be on the safe side, I don't push them.

Andy
 
"Andy" <jonesandy@comcast.net> wrote in message
news:1179938495.489925.38130@m36g2000hse.googlegroups.com...
[snip]
Synplify does not collapse to a flat netlist. The netlist is
hierarchical and closely matches the hierarchical structure of the
original code.
It develops a list of combinatorial logic functions and registers for
the design and optimizes those logic functions (along with other
optimizations). There is no 'hierarchy' involved in that optimization
process, just a list of functions and registers.

The differences will be where it has recognized
combinatorial logic that crossed the RTL boundaries, which it
optimized, necessitating the boundary changes. Synplify certainly does
recognize combinatorial logic across hierarchy (to some extent) and
will optimize it (to some extent). I don't know what those limits are,
but to be on the safe side, I don't push them.

In other words, perhaps there really are no real limits, only imagined ones.

KJ
 
