comparator fast implementation

salimbaba

Hi,
In my design I have two counters, a write_counter and a read_counter; both
are 11 bits wide. I used a simple compare equation like this:

assign last_byte = odd_number_bytes ? (read_counter + 2 == write_counter)
:(read_counter + 1 == write_counter);

and last_byte triggers the state machine etc.
Now the logic generated for this comparator is 6 logic levels deep, which is
causing a timing failure in my design. I need to optimize this logic but I
can't seem to think of any fast implementation. I tried to come up with
something like a lookup on the first two LSBs and an XOR of the other bits, but
every solution that I come up with contains a lot of corner cases and the
whole thing starts to get messy.
So, can anyone help me optimize this logic? Perhaps a fast
implementation or a way to optimize it. Thanks a lot.

Regards


---------------------------------------
Posted through http://www.FPGARelated.com
 
Something like a prediction logic where I can predict beforehand whether the
coming byte is the last one or not.
 
On Mon, 23 May 2011 14:31:38 -0500, "salimbaba" wrote:

Hi,
In my design i have two counters, a write_counter and a read_counter, both
are 11 bits wide. I used a simple compare equation like this:

assign last_byte = odd_number_bytes ? (read_counter + 2 == write_counter)
:(read_counter + 1 == write_counter);

and last_byte triggers the state machine etc etc.
now the logic designed by this comparator is of 6 logic levels which is
causing a timing failure in my design.

Not knowing the details, it's a bit hard to suggest
what you might be able to change, but here are two
suggestions that might get you somewhere:

1.
Compute (write_counter - read_counter). That
should go in a nice fast adder structure using the
carry chain. Then compare the output of the subtract
with the constants 1 and 2. I think that should go in
4 levels of logic in total, though I'm not certain!

2. (Only if you're desperate.)
Maintain three separate read counters. Initialise
them to 0, 1 and 2 respectively. Increment them
all together on every read operation. Now you can
do the equality comparison without an extra +:

... odd ? (read_2==write) : (read_1==write);
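
Both suggestions could be sketched roughly as below. This is only an illustration under assumptions: the module names (last_byte_diff, last_byte_3cnt) and the clk, rst and rd_en ports are hypothetical, and the resulting number of logic levels depends on the device and tools, so it has not been verified here. Note also that the subtraction wraps modulo 2^11, whereas the original expression with unsized constants is evaluated at 32 bits and does not wrap, so the two can differ near counter rollover.

```verilog
// Suggestion 1: one subtractor (a candidate for the carry chain), then
// compare the difference against the constants 1 and 2.
module last_byte_diff (
  input  wire [10:0] write_counter,
  input  wire [10:0] read_counter,
  input  wire        odd_number_bytes,
  output wire        last_byte
);
  // 11-bit result, so the difference wraps modulo 2^11.
  wire [10:0] diff = write_counter - read_counter;

  assign last_byte = odd_number_bytes ? (diff == 11'd2)
                                      : (diff == 11'd1);
endmodule

// Suggestion 2: three read counters kept in lockstep, offset by 0, 1 and 2,
// so the comparison needs no adder in its path.
// clk, rst and rd_en are assumed names, not taken from the original design.
module last_byte_3cnt (
  input  wire        clk,
  input  wire        rst,              // synchronous reset (assumption)
  input  wire        rd_en,            // one read operation per asserted cycle
  input  wire [10:0] write_counter,
  input  wire        odd_number_bytes,
  output reg  [10:0] read_0,           // the ordinary read counter
  output wire        last_byte
);
  reg [10:0] read_1;                   // always read_0 + 1 (mod 2^11)
  reg [10:0] read_2;                   // always read_0 + 2 (mod 2^11)

  always @(posedge clk) begin
    if (rst) begin
      read_0 <= 11'd0;
      read_1 <= 11'd1;
      read_2 <= 11'd2;
    end else if (rd_en) begin
      read_0 <= read_0 + 11'd1;
      read_1 <= read_1 + 11'd1;
      read_2 <= read_2 + 11'd1;
    end
  end

  assign last_byte = odd_number_bytes ? (read_2 == write_counter)
                                      : (read_1 == write_counter);
endmodule
```

The second version costs two extra 11-bit counters but leaves only an equality compare and a 2:1 select between the registers and last_byte.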

cheers
--
Jonathan Bromley
 
salimbaba <a1234573@n_o_s_p_a_m.n_o_s_p_a_m.owlpic.com> wrote:

In my design i have two counters, a write_counter and a read_counter, both
are 11 bits wide. I used a simple compare equation like this:

assign last_byte = odd_number_bytes ? (read_counter + 2 == write_counter)
:(read_counter + 1 == write_counter);

and last_byte triggers the state machine etc etc.
now the logic designed by this comparator is of 6 logic levels which is
causing a timing failure in my design. I need to optimize this logic but i
can't seem to think of any fast implementation. I tried to come up with
something like a lookup on first two LSBs and XOR the other bits etc, but
every solution that i come up with contains a lot of corner cases and the
whole thing starts to get messy.

The usual way involves pipelining, separating the add from the compare.
That is, in one cycle compute counter+2 and counter+3, and in the
next cycle compare them to the write counter. The additions are one
higher to make up for the one-cycle delay.

Note that it fails in some cases where yours doesn't. If those
can occur, then you have to special case them out. (Specifically,
when read_counter can initialize to write_counter-1 or
write_counter-2, when there is no time for the additional cycle.)
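
A minimal sketch of that pipelining, under assumptions: the module name last_byte_pipelined and the clk port are hypothetical, and read_counter is assumed to advance by exactly one on every clock while reading. If the counter can stall, the registered sums would need an enable matched to the counter's, and the start-up cases mentioned above are not handled here.

```verilog
// Add in one cycle, compare in the next.
// Assumes read_counter increments by one every clock (not from the source).
module last_byte_pipelined (
  input  wire        clk,
  input  wire [10:0] read_counter,
  input  wire [10:0] write_counter,
  input  wire        odd_number_bytes,
  output wire        last_byte
);
  reg [10:0] rd_plus2_q;
  reg [10:0] rd_plus3_q;

  // Stage 1: register the sums, each one higher than in the original
  // expression to make up for the cycle spent in the register.
  always @(posedge clk) begin
    rd_plus2_q <= read_counter + 11'd2;
    rd_plus3_q <= read_counter + 11'd3;
  end

  // Stage 2: by now read_counter has advanced by one, so rd_plus2_q holds
  // the current read_counter + 1 and rd_plus3_q the current read_counter + 2.
  assign last_byte = odd_number_bytes ? (rd_plus3_q == write_counter)
                                      : (rd_plus2_q == write_counter);
endmodule
```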

-- glen
 
Hey Glen,
I have taken care of those failure cases; I forgot to mention that in the post,
though. I will definitely look at it.
 
A few ideas:

- For your app: put regs on your "read_counter+2" and "read_counter+1"
signals. You might have to adjust offsets using look-ahead logic. I
don't know how soon you need "last_byte" relative to when those
situations occur. Pipelining certainly requires you to account for the
latency elsewhere.

- For high performance, split and pipeline comparators according to the
fabric. For example, with 6-input LUTs, compare 3 bits at a time, use
a pipeline reg for each 3-bit compare, then use straight logic for the
last stage (i.e. "if all sub-compares are equal, the entire comparison must be
equal"). You can't get much faster than breaking down your logic into
"single LUT-single REG" combinations.

- An idea for accounting for pipeline latencies is to use look-ahead
version of a counter for one path while using a different version for
the other. In that case, you'd either have to replicate the counter
logic or add pipeline regs to the actual counter to emulate look-ahead
versions. For high-performance, it's always a good idea to 'buffer'
your actual counter from downstream comparators and such anyway, since
your counter typically is nothing but a register that already feeds
back to its internal adder logic.
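
The split comparator from the second idea might look like the sketch below for 11-bit operands. The module name eq11_split and the port names are hypothetical, and whether each partial compare actually lands in a single LUT is up to the synthesis tool. The result is valid one clock after the inputs, so that latency has to be accounted for elsewhere.

```verilog
// 11-bit equality split into 3-bit slices (two 3-bit operands = 6 LUT inputs),
// one pipeline register per slice, then a plain AND as the last stage.
module eq11_split (
  input  wire        clk,
  input  wire [10:0] a,
  input  wire [10:0] b,
  output wire        eq      // (a == b) as sampled on the previous clock edge
);
  reg [3:0] part_eq;

  always @(posedge clk) begin
    part_eq[0] <= (a[2:0]  == b[2:0]);
    part_eq[1] <= (a[5:3]  == b[5:3]);
    part_eq[2] <= (a[8:6]  == b[8:6]);
    part_eq[3] <= (a[10:9] == b[10:9]);  // remaining 2 bits
  end

  // If all sub-compares are equal, the entire comparison is equal.
  assign eq = &part_eq;
endmodule
```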

John
 
