Very silly noob question

On Thu, 7 Apr 2011 10:47:17 +0100, Chris Hinsley wrote:


And I'm having problems getting a parameterized form that produces what I
want, i.e. something that doesn't suck. :)

My macro did produce the correct result, and did it in what I thought
was an efficient way using 'or' gates. If you have a simpler way to do
it I'd be grateful to see it.

Try this in your setup and see how it shapes up. I wrote
the port list in the old Verilog-95 style because it makes
parameterization a little safer (I can protect INPUT_BITS
against accidental change by making it a localparam).

module ONEHOT_TO_CODE(in_onehot, out_code);
  parameter  OUTPUT_BITS = 2;
  localparam INPUT_BITS  = 1 << OUTPUT_BITS;
  input  [INPUT_BITS-1:0]  in_onehot;
  output [OUTPUT_BITS-1:0] out_code;
  reg    [OUTPUT_BITS-1:0] out_code;

  // OR together the index of every set input bit; for a
  // one-hot input this is simply the index of the hot bit.
  always @(in_onehot) begin : onehot_or_tree
    integer bit_pos;
    out_code = 0;
    for (bit_pos = 0; bit_pos < INPUT_BITS; bit_pos = bit_pos + 1)
      if (in_onehot[bit_pos])
        out_code = out_code | bit_pos;
  end

endmodule

Jan's solution is also fine but in fact it's a priority encoder
rather than a one-hot encoder. A sufficiently smart synthesis
tool *might* be able to look upstream and detect that the input
signal is indeed one-hot, and perform the right optimizations;
but the simple change to OR-logic in my example above makes
the optimization certain, even if the onehot code is a primary
input for which the tool has no knowledge.
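
The difference only shows up when more than one input bit is set. Here is a minimal sketch, assuming a priority form that simply overwrites the result inside the loop. PRIORITY_ENCODE is an illustrative stand-in and not Jan's actual code, OR_ENCODE repeats the loop from ONEHOT_TO_CODE above, and the testbench name is made up as well.

```verilog
// Illustrative priority form: a later (higher) set bit overwrites the result.
module PRIORITY_ENCODE(in_bits, out_code);
  parameter  OUTPUT_BITS = 3;
  localparam INPUT_BITS  = 1 << OUTPUT_BITS;
  input  [INPUT_BITS-1:0]  in_bits;
  output [OUTPUT_BITS-1:0] out_code;
  reg    [OUTPUT_BITS-1:0] out_code;

  always @(in_bits) begin : prio_loop
    integer bit_pos;
    out_code = 0;
    for (bit_pos = 0; bit_pos < INPUT_BITS; bit_pos = bit_pos + 1)
      if (in_bits[bit_pos])
        out_code = bit_pos;             // overwrite: highest set bit wins
  end
endmodule

// Same loop with OR accumulation, as in ONEHOT_TO_CODE.
module OR_ENCODE(in_bits, out_code);
  parameter  OUTPUT_BITS = 3;
  localparam INPUT_BITS  = 1 << OUTPUT_BITS;
  input  [INPUT_BITS-1:0]  in_bits;
  output [OUTPUT_BITS-1:0] out_code;
  reg    [OUTPUT_BITS-1:0] out_code;

  always @(in_bits) begin : or_loop
    integer bit_pos;
    out_code = 0;
    for (bit_pos = 0; bit_pos < INPUT_BITS; bit_pos = bit_pos + 1)
      if (in_bits[bit_pos])
        out_code = out_code | bit_pos;  // accumulate: each output bit is a plain OR
  end
endmodule

// Hypothetical testbench comparing the two forms.
module tb_encoders;
  reg  [7:0] in_bits;
  wire [2:0] prio_code, or_code;
  integer i;

  PRIORITY_ENCODE #(3) u_prio (in_bits, prio_code);
  OR_ENCODE       #(3) u_or   (in_bits, or_code);

  initial begin
    // Legal one-hot inputs: both forms must return the index of the hot bit.
    for (i = 0; i < 8; i = i + 1) begin
      in_bits = 8'd1 << i;
      #1;
      if (prio_code !== or_code || or_code !== i)
        $display("MISMATCH at bit %0d: priority=%0d or=%0d", i, prio_code, or_code);
    end
    // Illegal input with bits 2 and 5 set: the two forms disagree.
    in_bits = 8'b0010_0100;
    #1;
    $display("two bits set: priority=%0d or=%0d", prio_code, or_code);
    $finish;
  end
endmodule
```

On legal one-hot inputs the two agree. With bits 2 and 5 both set, the priority form reports 5 while the OR form reports 2|5 = 7; giving up a defined answer for that illegal case is what lets each output bit collapse to a plain OR.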
--
Jonathan Bromley
 
On 2011-04-07 11:45:18 +0100, Jonathan Bromley said:

[ONEHOT_TO_CODE module snipped]

That produces a 32 multiplexer ladder. No or gates. Maybe Quartus is
spotting that it's quicker than using the wide or gates?

Chris
 
On Thu, 7 Apr 2011 12:51:33 +0100, Chris Hinsley
<chris.hinsley@gmail.com> wrote:

That produces a 32 multiplexer ladder. No or gates. Maybe Quartus is
spotting that it's quicker than using the wide or gates?

Interesting. At what stage in the synthesis flow
do you see that? I can see a couple of likely
possibilities:
- you're looking at the schematic from a fairly
early stage in the synth flow, before much
mapping has been done, and those muxes are
the first representation of the if() statement;
- what you're seeing is in fact a visualisation of
the carry chain, configured to make a wide OR gate.
How about timing and area reports vis-a-vis your
OR-constructing macro? Synthesis tool schematics
can be very helpful in understanding what your code
represents, but ultimately it's the final area and
delay numbers that count.

Hope I haven't sent you off on a wild-goose chase.
--
Jonathan Bromley
 
On 2011-04-07 13:49:48 +0100, Jonathan Bromley said:

Interesting. At what stage in the synthesis flow
do you see that? I can see a couple of likely
possibilities:
- you're looking at the schematic from a fairly
early stage in the synth flow, before much
mapping has been done, and those muxes are
the first representation of the if() statement;
- what you're seeing is in fact a visualisation of
the carry chain, configured to make a wide OR gate.
How about timing and area reports vis-a-vis your
OR-constructing macro? Synthesis tool schematics
can be very helpful in understanding what your code
represents, but ultimately it's the final area and
delay numbers that count.

Hope I haven't sent you off on a wild-goose chase.

Drilling down to the RTL viewer level. The timing report is around 280
MHz, so that's no problem at all; the priority encoder comes out at
around 178 MHz. So something is definitely better.

I'll grab my old ENCODER macro from the backup and time it, hold on...

Chris
 
Well I'm happy to say they're both going at 280-ish MHz, so that's a
keeper as the code is a lot clearer.

Chris
 
On Thu, 7 Apr 2011 14:22:34 +0100, Chris Hinsley wrote:

Well I'm happy to say
[two variants of onehot code converter]
both going at 280'ish MHz, so that's a
keeper as the code is a lot clearer.
I would have been pretty surprised if there were
any significant difference between the two.
A few percent speed variation here or there
is to be expected - just an accident of how the
place-and-route software got seeded. For a
64-in 6-out module, both descriptions should
resolve to six 32-input OR gates; the tool should
generate essentially identical implementations.
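
To make the wide-OR claim concrete, here is a hand-unrolled sketch of the small 8-in 3-out case (the module name ONEHOT_OR_EXPLICIT is made up for illustration). Output bit k is the OR of every input whose index has bit k set, so each output is an OR of half the inputs: three 4-input ORs here, six 32-input ORs for the 64-in 6-out module.

```verilog
// Hand-unrolled 8-in 3-out one-hot converter, for illustration only.
module ONEHOT_OR_EXPLICIT(in_onehot, out_code);
  input  [7:0] in_onehot;
  output [2:0] out_code;

  // Indices with bit 0 set: 1, 3, 5, 7
  assign out_code[0] = in_onehot[1] | in_onehot[3] | in_onehot[5] | in_onehot[7];
  // Indices with bit 1 set: 2, 3, 6, 7
  assign out_code[1] = in_onehot[2] | in_onehot[3] | in_onehot[6] | in_onehot[7];
  // Indices with bit 2 set: 4, 5, 6, 7
  assign out_code[2] = in_onehot[4] | in_onehot[5] | in_onehot[6] | in_onehot[7];
endmodule
```

Input 0 feeds no output at all, which is why the hot bit at position 0 and the all-zero input both give code 0.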

If you look at one of the slowest timing paths
in the timing analyzer you should be able to see
how the wide-OR got implemented - through the
carry chain, or as a tree of gates, or some
combination of the two. It's a pretty safe bet
that the tool has well-established strategies
for handling exactly this wide-gate problem.

It may well be that Quartus would make
different decisions for different sizes of
module. For example, a very small one (8-in
3-out) could be done in just one level of
lookup table logic, which probably beats an
implementation that gets on to the carry chain
and off it again. A very big version (256-in?)
is likely to be quite slow even on the carry
chain, and it would be quite interesting to
see what Quartus does - maybe make several
32-input ORs using carry logic, and then
merge the results in LUTs?

Anyway, enough speculation - back to real work.
Thanks
--
Jonathan Bromley
 
Anyway, enough speculation - back to real work.
Thanks
Thanks very much indeed for the tutorials. :)

I'll be after your job soon... ;)

Chris
 
On 4/7/2011 1:15 AM, Jonathan Bromley wrote:
On Wed, 06 Apr 2011 17:18:30 -0700, "Cary R." <no-spam@host.spam>
wrote:

On 4/6/2011 3:28 PM, Jonathan Bromley wrote:

2**N yields a real result

If that's what you are getting then your vendor is being lazy and not
following the standard. They are likely using the C pow routine instead
of taking the time to write the appropriate bit-based power routine. The
power operator should only return a real value if either of the operands
is real.

Not true in Verilog, I'm afraid. The standard says you get
an integer result only when both operands are UNSIGNED
integers, and of course the undecorated integer literal 2
is SIGNED (and the parameter N is likely an integer, and
therefore signed, too). You could use
'd2**$unsigned(N)
and get an integer result, but that's a tad indigestible!
Jonathan, your information is dated. This was the case for 1364-2001 and
I'm guessing 1800-2005 since it was based on 1364-2001, but this was
changed in 1364-2005 and I believe 1800-2009 as well.

Cary
 
On Thu, 07 Apr 2011 09:11:48 -0700, "Cary R." wrote:

Jonathan, your information is dated. This was the case for 1364-2001 and
I'm guessing 1800-2005 since it was based on 1364-2001, but this was
changed in 1364-2005 and I believe 1800-2009 as well.
Whoah, that'll teach me not to rely on memory instead
of the LRM. Thanks for the correction - it's a change
that I had entirely missed.

Actually it's a bit odd, isn't it - either way. The
baby has been thrown out with the bathwater, since
now we have (for example) 2**(-2) = 0, whereas in
Verilog-2001 it would have been (real) 0.25.

Fun, fun, fun.
--
Jonathan Bromley
 
On 4/7/2011 10:49 AM, Jonathan Bromley wrote:

Whoah, that'll teach me not to rely on memory instead
of the LRM. Thanks for the correction - it's a change
that I had entirely missed.

Actually it's a bit odd, isn't it - either way. The
baby has been thrown out with the bathwater, since
now we have (for example) 2**(-2) = 0, whereas in
Verilog-2001 it would have been (real) 0.25.

Fun, fun, fun.

I think of this just like integer division: if you want a real value
then one of the operands must also be real. To me, this change makes the
power operator fit in better with the other arithmetic operators.

And in a bit-based context both 0 and 0.25 are zero, so there is likely
little end-user impact created by this change.
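
A small simulation sketch of the behaviour discussed here, assuming a simulator that follows the 1364-2005 rules (pow_demo is a made-up module name). Under the older 1364-2001 rules the first expression would instead be evaluated as a real.

```verilog
// Illustration only: integer versus real results for ** and /.
module pow_demo;
  integer n;

  initial begin
    n = -2;
    // Both operands are integers, so under the 1364-2005 rules the result
    // is an integer; a base above 1 with a negative exponent gives 0.
    $display("2 ** n   = %0d", 2 ** n);
    // One real operand makes the whole expression real.
    $display("2.0 ** n = %f", 2.0 ** n);
    // Integer division follows the same pattern.
    $display("7 / 2    = %0d", 7 / 2);
    $display("7.0 / 2  = %f", 7.0 / 2);
  end
endmodule
```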

Cary
 
On 04/07/2011 12:45 PM, Jonathan Bromley wrote:

Try this in your setup and see how it shapes up. I wrote
the port list in the old Verilog-95 style because it makes
parameterization a little safer (I can protect INPUT_BITS
against accidental change by making it a localparam).

[ONEHOT_TO_CODE module snipped]

Jan's solution is also fine but in fact it's a priority encoder
rather than a one-hot encoder. A sufficiently smart synthesis
tool *might* be able to look upstream and detect that the input
signal is indeed one-hot, and perform the right optimizations;
but the simple change to OR-logic in my example above makes
the optimization certain, even if the onehot code is a primary
input for which the tool has no knowledge.

Very nice, and certainly a pattern to remember.

I think this is much less obvious than it might seem
at first. One (I) might argue that your code doesn't really
specify don't-cares explicitly - it just sets them to different
values than in my solution. Why would that optimize better?
But it does (I tried it :)), and significantly so for
higher bit widths and tight timing constraints.

The answer is that an OR-based logic structure is inherently
much better implementation-wise - it's probably hard to do better
even with explicit don't-cares. This is good news because
it suggests that there is never a need for a 'case' with explicit
don't-cares (and its lack of parameterizability).

Moreover, this is really Advanced Usage :) Using a variable
both as intermediate variable and as output - wow!
I wonder what die-hard hardware thinkers think about that one :)

Jan

--
Jan Decaluwe - Resources bvba - http://www.jandecaluwe.com
Python as a HDL: http://www.myhdl.org
VHDL development, the modern way: http://www.sigasi.com
Analog design automation: http://www.mephisto-da.com
World-class digital design: http://www.easics.com
 
On Thu, 7 Apr 2011 15:03:01 +0100, Chris Hinsley wrote:

Thanks very much indeed for the tutorials. :)
A fine Socratic dialogue, in which the tutor learns
at least as much as the pupil!

I'll be after your job soon... ;)
Sheesh, not you as well... it's kinda hard trying to
stay one step ahead of (or, at least, no more than
two steps behind) a bunch of people who are faster,
smarter and more energetic than I am. My store of
been-there-done-that experience and wisdom is fast
losing its relevance, even when I can remember it.
Couldn't you youngsters just leave me to rest on
my laurels for a year or two???? <bah, humbug,...>
--
Jonathan Bromley
 
On Apr 7, 12:11 pm, "Cary R." <no-s...@host.spam> wrote:

<regarding the type rules for the ** operator>

Jonathan, your information is dated. This was the case for 1364-2001 and
I'm guessing 1800-2005 since it was based on 1364-2001, but this was
changed in 1364-2005 and I believe 1800-2009 as well.
Yes, I think I wrote the proposal when we changed this for 1364-2005.

1800-2005 was an extension to 1364, which meant 1364-2005 by the time
it was published. So it would follow 1364-2005.

The revised rules are in the 1800-2009 standard as well.

The new rules were more appropriate for a programming language than
the old ones were.
 
On 4/6/2011 3:28 PM, Jonathan Bromley wrote:

One very specific criticism: *don't* use 2**N; instead
favour 1<<N. The reason is that 2**N yields a real rather
than integer result, which can have troublesome consequences
when defining array sizes and subscripts.
This has been percolating in my brain and I realized that 1<<N also has
an unexpected consequence if N is a small signed register (e.g. reg
signed [3:0] N) and N is assigned a negative value. It certainly doesn't
give zero like you would expect from the power operator so they are not
100% compatible without a bit more logic (e.g. (N<0)?0:1<<N).
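
A short sketch of that corner case (shift_demo is a made-up module name). It relies on the rule that the right-hand operand of a shift is always treated as unsigned, so a 4-bit signed -1 becomes a shift distance of 15.

```verilog
// Illustration only: 1<<N with a small signed N that goes negative.
module shift_demo;
  reg signed [3:0] N;

  initial begin
    N = -1;   // stored as 4'b1111
    // The shift amount is taken as unsigned, so this shifts left by 15.
    $display("N=-1: 1 << N             = %0d", 1 << N);
    // Guarded form from the post above: a negative N yields 0.
    $display("N=-1: (N<0) ? 0 : 1 << N = %0d", (N < 0) ? 0 : (1 << N));
    N = 3;
    $display("N=3:  1 << N             = %0d", 1 << N);
  end
endmodule
```

The unguarded shift with N = -1 should print 32768 rather than 0, while the guarded form prints 0 and the N = 3 case prints 8.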

Cary
 
On 4/7/2011 9:50 PM, Steven Sharp wrote:

Yes, I think I wrote the proposal when we changed this for 1364-2005.

1800-2005 was an extension to 1364, which meant 1364-2005 by the time
it was published. So it would follow 1364-2005.
Thanks for the updated power rules and the clarification regarding what
1800-2005 is based on.

Cary
 
On Fri, 08 Apr 2011 09:39:48 -0700, "Cary R." <no-spam@host.spam>
wrote:

On 4/6/2011 3:28 PM, Jonathan Bromley wrote:

One very specific criticism: *don't* use 2**N; instead
favour 1<<N. The reason is that 2**N yields a real rather
than integer result, which can have troublesome consequences
when defining array sizes and subscripts.

This has been percolating in my brain and I realized that 1<<N also has
an unexpected consequence if N is a small signed register (e.g. reg
signed [3:0] N) and N is assigned a negative value. It certainly doesn't
give zero like you would expect from the power operator so they are not
100% compatible without a bit more logic (e.g. (N<0)?0:1<<N).
True. There are quite a few things we do with parameters
(or, at least, I do and I've seen many others do) that
depend on the parameters being "sensible" (typically >0
or some other similar restriction). One of the outstanding
advantages of VHDL over Verilog for RTL design has been
the ability to write assertions over parameter values,
and have them trip at elaboration time, allowing your
parameterized modules to report bad parameterization
in a sensible way before the simulator falls over for
some other reason that's a consequence of the bad
parameters. Now, in SV-2009, we have elaboration-time
assertions allowing us to do those same checks. Now
it's just a matter of waiting for tool support!
--
Jonathan Bromley
 
In article <ldpup6deldup2ppr6r3adhcvna8lrbeo81@4ax.com>,
Jonathan Bromley <spam@oxfordbromley.plus.com> wrote:

True. There are quite a few things we do with parameters
(or, at least, I do and I've seen many others do) that
depend on the parameters being "sensible" (typically >0
or some other similar restriction). One of the outstanding
advantages of VHDL over Verilog for RTL design has been
the ability to write assertions over parameter values,
and have them trip at elaboration time, allowing your
parameterized modules to report bad parameterization
in a sensible way before the simulator falls over for
some other reason that's a consequence of the bad
parameters. Now, in SV-2009, we have elaboration-time
assertions allowing us to do those same checks. Now
it's just a matter of waiting for tool support!

We do "assertion checks" on our parameters all the time.
They ARE at runtime, which our synthesis tool supports, with just
initial blocks:

initial
  if( SOME_PARAM_IS_BAD )
  begin
    $display( "Illegal configuration at %m, blah blah" );
    $finish;
  end

A "hacked" assertion to be sure, but this works fine for us. XST
will exit at the $finish; the sim tool barfs at runtime.
Elaboration time might be better, but as long as it gets checked
before running a multi-hour sim of invalid results....

--Mark
 
On 4/8/2011 4:46 PM, Mark Curry wrote:

We do "assertion checks" on our parameters all the time.
They ARE at runtime, which our synthesis tool supports, with just
initial blocks:

initial
  if( SOME_PARAM_IS_BAD )
  begin
    $display( "Illegal configuration at %m, blah blah" );
    $finish;
  end

A "hacked" assertion to be sure, but this works fine for us. XST
will exit at the $finish; the sim tool barfs at runtime.
Elaboration time might be better, but as long as it gets checked
before running a multi-hour sim of invalid results....

I have similar constructs in my code. For some cases I even print out
the basic configuration information for various blocks so I can visually
verify that they are all configured as expected.

Cary
 
On Fri, 08 Apr 2011 17:23:51 -0700, "Cary R." <no-spam@host.spam>
wrote:

On 4/8/2011 4:46 PM, Mark Curry wrote:
initial
  if( SOME_PARAM_IS_BAD )
  begin
    $display( "Illegal configuration at %m, blah blah" );
    $finish;
  end

A "hacked" assertion to be sure, but this works fine for us. XST
will exit at the $finish; the sim tool barfs at runtime.
Elaboration time might be better, but as long as it gets checked
before running a multi-hour sim of invalid results....

I have similar constructs in my code. For some cases I even print out
the basic configuration information for various blocks so I can visually
verify that they are all configured as expected.
Sure, but the point is that in VHDL the parameter checking can be
done, top down, BEFORE any other processes get to execute and
BEFORE the instance's children are elaborated. That brings two
big benefits:
- your parameter checks are guaranteed to execute before
time-zero processes, rather than racing with them -
probably irrelevant for designs, but important for
testbench code;
- sims that would otherwise fail with incomprehensible
elaboration-time failures can yield helpful diagnostics.

I absolutely agree that a time-zero check on parameter
values is a lot better than nothing, though.
--
Jonathan Bromley
 
On 4/7/2011 4:47 AM, Chris Hinsley wrote:
In my mind, an encoder is a module that, assuming an input
with a single bit set, outputs that bit position.

Exactly what I'm after.

In my quick simulation, EncoderLogic doesn't behave like
that. EncoderLogic2 does not work - it misses the output
assignment. But when I make the "obvious" assignment, it
behaves differently from EncoderLogic, and also not
as an encoder.

Perhaps something else is meant with "encoder", but if
my definition is correct, the solutions so far are way
too complicated.

Jan

And I'm having problems getting a parameterized form that produces what I
want, i.e. something that doesn't suck. :)

My macro did produce the correct result, and did it in what I thought
was an efficient way using 'or' gates. If you have a simpler way to do
it I'd be grateful to see it.

Chris
I've thought of a method you might want to test to determine if it
gives faster or slower results.

Start with a lowest level encoder - same as the MUX but only for
pairs of inputs where all but the lowest bit of the output would
be the same. Should give only the lowest output bit. Use the
input width/2 of them.

Now the next level, for groups of 4 inputs where all but the lowest
two bits of the output are the same. If either of the two highest
of these inputs are 1, select the lowest level output bit from the
lowest bit encoder for the two higher of these inputs; otherwise
select it from the one for the two lower of these inputs. Set the
next lowest bit of the output depending on which lowest level
encoder was selected. Use the input width/4 of them.

Now the next level, for groups of 8 inputs where all but the lowest
three bits of the output are the same. If any of the four highest
of these inputs are 1, select the two lowest level output bits from
the next lower bit encoder for the four higher of these inputs;
otherwise select it from the one for the four lower of these inputs.
Set the next lowest bit of the output depending on which next lower
level encoder was selected. Use the input width/8 of them.

And so on up the chain, until your tree of encoders provides
enough inputs.

Unlikely to be able to use the ripple adder carry chain, so I'd
expect it to be faster under some circumstances but slower under
other circumstances.
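
One possible way to write that tree down is a recursively instantiated module. This is only a sketch under stated assumptions: TREE_ENCODER, in_bits, out_code and any_set are illustrative names that do not appear in the thread, OUTPUT_BITS is assumed to be at least 1, and the tool is assumed to accept recursive instantiation inside a generate block. Nothing here says how it compares for speed or area.

```verilog
// Recursive tree encoder sketch; each level picks between two half-size encoders.
module TREE_ENCODER(in_bits, out_code, any_set);
  parameter  OUTPUT_BITS = 3;
  localparam INPUT_BITS  = 1 << OUTPUT_BITS;
  localparam HALF        = INPUT_BITS / 2;
  input  [INPUT_BITS-1:0]  in_bits;
  output [OUTPUT_BITS-1:0] out_code;
  output                   any_set;   // high if any input in this group is set

  generate
    if (OUTPUT_BITS == 1) begin : leaf
      // Lowest level: a pair of inputs yields only the lowest code bit.
      assign out_code = in_bits[1];
      assign any_set  = in_bits[1] | in_bits[0];
    end else begin : node
      wire [OUTPUT_BITS-2:0] lo_code, hi_code;
      wire                   lo_any,  hi_any;

      TREE_ENCODER #(OUTPUT_BITS-1) lo (in_bits[HALF-1:0],          lo_code, lo_any);
      TREE_ENCODER #(OUTPUT_BITS-1) hi (in_bits[INPUT_BITS-1:HALF], hi_code, hi_any);

      // If anything in the upper half is set, take its code and set the new top bit.
      assign out_code = {hi_any, (hi_any ? hi_code : lo_code)};
      assign any_set  = hi_any | lo_any;
    end
  endgenerate
endmodule

// Hypothetical testbench: walk a single hot bit across an 8-bit input.
module tb_tree_encoder;
  reg  [7:0] in_bits;
  wire [2:0] out_code;
  wire       any_set;
  integer i;

  TREE_ENCODER #(3) dut (in_bits, out_code, any_set);

  initial begin
    for (i = 0; i < 8; i = i + 1) begin
      in_bits = 8'd1 << i;
      #1;
      if (out_code !== i || any_set !== 1'b1)
        $display("MISMATCH: bit %0d gave code %0d", i, out_code);
    end
    $display("tree encoder check done");
    $finish;
  end
endmodule
```

Because the upper half wins whenever it has a bit set, this behaves as a priority encoder on inputs that are not one-hot.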

I do not have a Verilog program installed which can test this
idea.

Robert Miles
 
