Logic Glitches in Spartan-3?

Rob Gaddi

I've got a 24-input AND gate that I'd like to avoid having to add another
register delay to before I toss it across a clock boundary.

all_done <= and_reduce(done);

If I just do it, AND it all together without a flop on the output, does
anyone know whether I'll get transition glitches (an output of 1 when
not all inputs are 1)?

I seem to remember something about individual LUTs being glitch-free,
and the synthesizer has to compose my giant AND out of either a LUT
tree or a mess o' LUTs "wire-and" driving a carry chain. Offhand, it
seems like neither structure should glitch. "Try it and see" doesn't
work; testing all 2^24 combinations and trying to determine whether I
get a glitch would be a beast of an effort.

Anyone know offhand?

--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order. See above to fix.
 
On May 23, 7:34 pm, Rob Gaddi <rga...@technologyhighland.invalid>
wrote:
I've got a 24-input AND gate that I'd like to avoid having add another
register delay to before I toss it across a clock boundary.

        all_done <= and_reduce(done);

If I just do it, AND it all together without a flop on the output, does
anyone know whether I'll get transition glitches (an output of 1 when
not all inputs are 1)?

I seem to remember something about individual LUTs being glitch-free,
and the synthesizer has to compose my giant AND out of either a LUT
tree or a mess o' LUTs "wire-and" driving a carry chain.  Offhand, it
seems like neither structure should glitch.  "Try it and see" doesn't
work; testing all 2^24 combinations and trying to determine whether I
get a glitch would be a beast of an effort.

Anyone know offhand?

I believe the glitch-free behavior of a LUT applies when a single input
changes and the old and new addresses select two different
configuration FFs in the LUT that hold the same value.

But how would an AND gate glitch? I guess if one input is at a zero
and other inputs change you are concerned that a one might sneak
through? That is what the LUTs (at least according to Xilinx) are
assured to prevent. If you are worried about multiple inputs
switching simultaneously, e.g. a 1 to a 0 and a 0 to a 1, that is a
race condition and there are no guarantees. It just depends on which
gets to the output first.

Rick
 
Rob Gaddi <rgaddi@technologyhighland.invalid> wrote:
I've got a 24-input AND gate that I'd like to avoid having add another
register delay to before I toss it across a clock boundary.

all_done <= and_reduce(done);

If I just do it, AND it all together without a flop on the output, does
anyone know whether I'll get transition glitches (an output of 1 when
not all inputs are 1)?

I seem to remember something about individual LUTs being glitch-free,
That is the idea, though only true when the actual logic is
glitch-free. As you note, if one input is 0, for any combination
of other inputs the output should be zero. Note that does NOT
naturally happen for a SRAM.

and the synthesizer has to compose my giant AND out of either a LUT
tree or a mess o' LUTs "wire-and" driving a carry chain. Offhand, it
seems like neither structure should glitch. "Try it and see" doesn't
work; testing all 2^24 combinations and trying to determine whether I
get a glitch would be a beast of an effort.
It seems to me that a tree of glitch-free LUTs is glitch-free,
but see what others say.

Note that you might have different delays on the routes to different
inputs. In that case, depending on timing, there might be times
when all inputs as seen at the LUT are 1, and the output goes to 1.
That is a race condition, not a glitch.

-- glen
 
In article <jpk3g1$pmp$1@speranza.aioe.org>,
glen herrmannsfeldt <gah@ugcs.caltech.edu> writes:

It seems to me that a tree of glitch-free LUTs is glitch-free,
but see what others say.
I don't think so.

Consider this example:
LUT1 is OR(LUT2, LUT3)
LUT2 is XOR(A, 0)
LUT3 is XOR(A, 1)

In the static case, it doesn't matter what A is.
Now switch A. With appropriate routing delays you can make a glitch.
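
A minimal discrete-time sketch of the example above, assuming ideal gates and pure transport delays on the two routes from A. The delay values and the helper name `simulate_or_of_xors` are made up for illustration and are not measured from any device.

```python
def simulate_or_of_xors(delay_lut2, delay_lut3, steps=12, switch_at=3):
    """Return the LUT1 output over time while A switches 0 -> 1 at `switch_at`.

    LUT2 = A xor 0 (sees A after delay_lut2 time steps)
    LUT3 = A xor 1 (sees A after delay_lut3 time steps)
    LUT1 = LUT2 or LUT3
    """
    def a_at(t):
        # A is 0 before the switch (including "negative" time) and 1 after.
        return 1 if t >= switch_at else 0

    out = []
    for t in range(steps):
        lut2 = a_at(t - delay_lut2) ^ 0
        lut3 = a_at(t - delay_lut3) ^ 1
        out.append(lut2 | lut3)
    return out

# Hypothetical delays: matched routes, then a slow route into LUT2.
equal = simulate_or_of_xors(delay_lut2=2, delay_lut3=2)
skewed = simulate_or_of_xors(delay_lut2=4, delay_lut3=1)
print(equal)
print(skewed)
```

In this model the matched case stays at 1 throughout, while the skewed case drops to 0 for as long as the two routes disagree about A.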

--
These are my opinions. I hate spam.
 
On Thursday, May 24, 2012 11:34:26 AM UTC+12, Rob Gaddi wrote:
I've got a 24-input AND gate that I'd like to avoid having add another
register delay to before I toss it across a clock boundary.
A 24-wide AND is an unusual clock mux, and glitches usually only bother a design if they are on a clock tree. Is this clocking something?

So long as the wide-AND output meets your clock domain's Tsu/Th, why are you worried about glitches?
 
"Rob Gaddi" <rgaddi@technologyhighland.invalid> wrote in message
news:20120523163426.7e77de05@rg.highlandtechnology.com...
seems like neither structure should glitch. "Try it and see" doesn't
work; testing all 2^24 combinations and trying to determine whether I
get a glitch would be a beast of an effort.
Doesn't seem it should be that difficult. 2^24 is only 16 million, not all
that large compared with the 100's of MHz clocks.
 
"Rob Gaddi" <rgaddi@technologyhighland.invalid> wrote in message
news:20120523163426.7e77de05@rg.highlandtechnology.com...
seems like neither structure should glitch. "Try it and see" doesn't
work; testing all 2^24 combinations and trying to determine whether I
get a glitch would be a beast of an effort.

Doesn't seem it should be that difficult. 2^24 is only 16 million, not all
that large compared with the 100's of MHz clocks.


And the most important part of that can be done by a walking-zero test:
Start with all zero.
Then all one.
24-stage walking-zero.
All one.
All zero.
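
The sequence above could be generated along these lines. This is only a sketch of the stimulus and the expected AND result; the function names are hypothetical, and it says nothing about how the patterns would be driven onto real hardware or how a glitch would be observed.

```python
def walking_zero_sequence(width=24):
    """All zero, all one, a zero walked through every bit, all one, all zero."""
    all_ones = (1 << width) - 1
    seq = [0, all_ones]
    seq += [all_ones & ~(1 << i) for i in range(width)]
    seq += [all_ones, 0]
    return seq

def and_reduce(word, width=24):
    """Static (glitch-free) reference result: 1 only when every bit is 1."""
    return int(word == (1 << width) - 1)

seq = walking_zero_sequence()
expected = [and_reduce(w) for w in seq]
print(len(seq), expected)
```

Only the two all-one patterns should give a 1 in the reference result; any 1 seen during the walking-zero steps would be a glitch.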


---------------------------------------
Posted through http://www.FPGARelated.com
 
Hal Murray <hal-usenet@ip-64-139-1-69.sjc.megapath.net> wrote:
In article <jpk3g1$pmp$1@speranza.aioe.org>,
(snip, I wrote)
It seems to me that a tree of glitch-free LUTs is glitch-free,
but see what others say.

I don't think so.

Consider this example:
LUT1 is OR(LUT2, LUT3)
LUT2 is XOR(A, 0)
LUT3 is XOR(A, 1)

In the static case, it doesn't matter what A is.
Now switch A. With appropriate routing delays you can make
a glitch.
But note that I specifically excluded ones that depend on
timing delays, but you snipped out that part.

Note that non-LUT logic can glitch with timing delays, so
that is nothing new.

Depending on design, LUT logic can glitch even in cases where
gates wouldn't, which is why FPGAs use glitch-free LUTs.

-- glen
 
j.m.granville@gmail.com wrote:
On Thursday, May 24, 2012 11:34:26 AM UTC+12, Rob Gaddi wrote:
I've got a 24-input AND gate that I'd like to avoid having add another
register delay to before I toss it across a clock boundary.

A 24 wide and, is an unusual Clock Mux, and Glitches usually only bother a design if they are on a clock tree. Is this clocking something ?

So long as the wide-AND output meets your clock domain Tsu.Th, why are you worried about glitches ?
Maybe you missed the fact that he's crossing clock domains, so there's
no way to ensure setup and hold time.

The walking 0's case is really the worst case for an AND gate. It
guarantees glitches if your inputs aren't set up to "break
before make". i.e. if you go from 1110111 to 1101111 without
going to 1100111 in between, you'll get a glitch if there
are enough routing delay differences to pass through the LUT.
You'll need to walk the zeroes in all directions to ensure
complete timing test coverage. This means 276 transitions
for the 24-bit case (combination of 24 taken 2 at a time).
If your input logic can guarantee that you'll never make this
sort of state transition, then you won't get glitches. I
think it would be easier to add another register after the
AND gate and then you just need to meet Tsu and Th.
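
A small sketch of the point above, assuming an ideal AND gate where each input simply arrives after its own routing delay. The arrival times and the name `and_output_trace` are invented for illustration. It also checks the count of 24 taken 2 at a time.

```python
from itertools import combinations
from math import comb

def and_output_trace(start, end, arrival, steps=8):
    """AND output over time when input i changes from start[i] to end[i]
    at time arrival[i] (a pure per-input routing delay; assumed model)."""
    trace = []
    for t in range(steps):
        seen = [e if t >= d else s for s, e, d in zip(start, end, arrival)]
        trace.append(int(all(seen)))
    return trace

start = [1, 1, 1, 0, 1, 1, 1]   # 1110111
end   = [1, 1, 0, 1, 1, 1, 1]   # 1101111

# "Make before break": the rising bit arrives before the falling bit.
glitchy = and_output_trace(start, end, arrival=[0, 0, 5, 2, 0, 0, 0])
# "Break before make": the falling bit arrives first (passes through 1100111).
safe = and_output_trace(start, end, arrival=[0, 0, 2, 5, 0, 0, 0])

pairs = list(combinations(range(24), 2))
print(glitchy, safe, len(pairs), comb(24, 2))
```

In this model the output pulses high only in the make-before-break ordering, which is why the test has to cover each pair of bit positions.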

- Gabor
 
Gabor <gabor@szakacs.invalid> wrote:
j.m.granville@gmail.com wrote:
On Thursday, May 24, 2012 11:34:26 AM UTC+12, Rob Gaddi wrote:
I've got a 24-input AND gate that I'd like to avoid having add another
register delay to before I toss it across a clock boundary.

A 24 wide and, is an unusual Clock Mux, and Glitches usually
only bother a design if they are on a clock tree. Is this
clocking something ?

Maybe you missed the fact that he's crossing clock domains,
so there's no way to ensure setup and hold time.
That is true, but if there aren't any glitches it will go to 1
on either one cycle or the next.

The walking 0's case is really the worst case for an AND gate.
It guarantees glitches if your inputs aren't set up to "break
before make". i.e. if you go from 1110111 to 1101111 without
going to 1100111 in between, you'll get a glitch if there
are enough routing delay differences to pass through the LUT.
Given that it is an AND of done bits, in the usual case each
one transitions from 0 to 1 until all are 1.

If you use a traditional SRAM, with row select logic, the bits
of each row coming out, and then a MUX to select one of the
rows, even if the inputs change between states that don't
even come close to a cell of a different value, the output
can still glitch. The delays through different parts of
the DEMUX row select, or the MUX column select can be
different. Fill up a SRAM such that only one cell is 1,
hold one input at 0, and go through all combinations of
the other inputs.

For that matter, fill the array with all 0, and go through
all combinations of inputs. An SRAM can still glitch to 1.

The glitchless LUTs used in FPGAs are designed not to do that.

If you hold one input at 0, for all combinations of other inputs,
even with timing differences, the output should stay zero.

You'll need to walk the zeroes in all directions to ensure
complete timing test coverage. This means 276 transitions
for the 24-bit case (combination of 24 taken 2 at a time).
If your input logic can guarantee that you'll never make this
sort of state transition, then you won't get glitches. I
think it would be easier to add another register after the
AND gate and then you just need to meet Tsu and Th.
Since the 24 bit AND will be generated as some number of
LUTs feeding either another LUT or carry chain AND logic,
another possibility is to put a register between the two
levels of AND. It seems to me that isn't necessary, but
could be done.

-- glen
 
On May 23, 6:34 pm, Rob Gaddi <rga...@technologyhighland.invalid>
wrote:
I've got a 24-input AND gate that I'd like to avoid having add another
register delay to before I toss it across a clock boundary.

        all_done <= and_reduce(done);

If I just do it, AND it all together without a flop on the output, does
anyone know whether I'll get transition glitches (an output of 1 when
not all inputs are 1)?

I seem to remember something about individual LUTs being glitch-free,
and the synthesizer has to compose my giant AND out of either a LUT
tree or a mess o' LUTs "wire-and" driving a carry chain.  Offhand, it
seems like neither structure should glitch.  "Try it and see" doesn't
work; testing all 2^24 combinations and trying to determine whether I
get a glitch would be a beast of an effort.

Anyone know offhand?

Unless you can guarantee that no more than one of the done bits
changes at any one time, you WILL get glitches sooner or later, and
sooner or later they will line up with the (async) receiving clock,
and you will have bad data transferred, if you simply leave it up to
the synthesis tool to implement the LUT/Carry structure.

If you force the synthesis tool to implement a specific AND structure,
you may be able to avoid glitches based on how the done bits
collectively behave. For example, if you are only concerned with the
leading edge of all_done, and the done bits monotonically transition
from 0 to 1 (never go back to zero until you don't care), you can
probably get some structure to work reliably.

If you don't have time for an additional clock cycle in the source
domain, do you have time for a half-cycle (register all_done on the
opposite edge of the source clock)?

Depending on the relative clock frequencies and your latency
requirements, you may have time to "filter" the all_done signal in the
destination clock domain, and thus reject glitches.
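
One way to picture such a filter is a behavioral model that only accepts all_done after several consecutive high samples in the destination domain. This is a sketch under assumptions: the samples are taken as already synchronized (metastability is a separate matter), and the name `filter_consecutive` and the threshold of 3 are illustrative choices.

```python
def filter_consecutive(samples, required=3):
    """Model of a destination-domain filter: the output goes high only after
    `required` consecutive high samples and drops on the first low sample."""
    run = 0
    out = []
    for s in samples:
        run = run + 1 if s else 0
        out.append(int(run >= required))
    return out

# A one-sample glitch followed by a genuine assertion (made-up data).
samples = [0, 1, 0, 0, 1, 1, 1, 1, 0]
filtered = filter_consecutive(samples)
print(filtered)
```

The cost is latency: the genuine assertion is seen `required - 1` destination clocks later than the first high sample.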

There are probably many ways to do this, but the conventional approach
of registering all_done in the source domain prior to sampling in the
destination domain is the easiest by far to verify.

Andy
 
glen herrmannsfeldt wrote:


It seems to me that a tree of glitch-free LUTs is glitch-free,
but see what others say.
It seems to me that routing delays will make this not work.
Yes, for exactly one input changing state at a time, your statement
MIGHT still be true, but for multiple inputs changing state at
the same time, such as one input going high while one other
goes low, the propagation of these signals through the routing
will certainly cause glitches.

Now, maybe there is some simplification possible depending on
how this is used. If the normal state is all inputs false,
and occasionally a few inputs are true, then this might never glitch.
If the normal state is 23 inputs true, and the pattern of which ones
are true changes, then I would expect glitches are possible.

Jon
 
On Thu, 24 May 2012 14:24:53 +0000 (UTC)
glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:

Given that it is an AND of done bits, in the usual case each
one transitions from 0 to 1 until all are 1.
That's exactly right, and I feel silly for not having mentioned it out
front. The assertion of all_done, shifted onto a different clock
domain, is what goes back around and asynchronously
clears all of the individual done flags.

All of you stop wincing; I know exactly how bad that sounds and I
swear to god there are reasons it had to be this way.

So the only possible modes of operation are:
all_done is low, and all I want is for it to never erroneously
transition high until all 24 done flags come up. Once it's up it
stays up.

all_done has been high, causing a 20 ns asynchronous clear pulse to
hit all the done flags, dropping them all. In this case, all_done
should drop once and only once as the various path skews from the
pulse to the clear to the AND gate structure all work themselves out.
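
The first mode can be illustrated with an idealized model: 24 done flags that each rise exactly once, fed through a tree of 4-input ANDs with an arbitrary delay on every route. All names, the delay range, and the tree shape are assumptions for the sketch; it does not model a real LUT or carry chain, only the argument that an AND of delayed rising steps is itself a single rising step.

```python
import random

def and_tree_trace(rise_times, group=4, steps=200, seed=0):
    """Output waveform of an AND tree whose inputs each step 0 -> 1 once."""
    rng = random.Random(seed)
    # waves[i][t] is signal i at time t; signals are 0 before they arrive.
    waves = [[int(t >= r) for t in range(steps)] for r in rise_times]
    while len(waves) > 1:
        next_level = []
        for k in range(0, len(waves), group):
            chunk = waves[k:k + group]
            delays = [rng.randint(1, 10) for _ in chunk]  # per-route delay
            out = [
                int(all(w[t - d] if t >= d else 0 for w, d in zip(chunk, delays)))
                for t in range(steps)
            ]
            next_level.append(out)
        waves = next_level
    return waves[0]

def count_edges(trace):
    """Number of transitions (0 -> 1 or 1 -> 0) in a waveform."""
    return sum(1 for a, b in zip(trace, trace[1:]) if a != b)

rng = random.Random(1)
rise_times = [rng.randint(0, 50) for _ in range(24)]  # made-up arrival times
trace = and_tree_trace(rise_times)
print(count_edges(trace), trace[0], trace[-1])
```

Under these assumptions the output has exactly one transition regardless of the delays chosen; whether the synthesized structure behaves like this model is the open question in the thread.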

The reason I'm concerned about glitches is, because all_done is sampled
asynchronously to its originating clock, a glitch could happen to get
captured, with some unknown but small probability. The reason I can't
just test it out is the same: given that if a glitch exists there's
only a small probability of capturing it, I might test a given
sequence 10,000 times without catching a glitch that I would have seen
on the 10,001st.
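
To put rough numbers on that, assume each run of the sequence captures an existing glitch independently with some small probability p. The value of p below is purely hypothetical, as are the function names.

```python
import math

def miss_probability(p_capture, trials):
    """Chance of never capturing the glitch in `trials` independent runs."""
    return (1.0 - p_capture) ** trials

def trials_for_confidence(p_capture, confidence):
    """Runs needed so the glitch is captured at least once with the given
    confidence, under the same independence assumption."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_capture))

p = 1e-4  # hypothetical per-run capture probability
print(miss_probability(p, 10_000))
print(trials_for_confidence(p, 0.99))
```

With a capture probability of that order, a clean run of 10,000 tests still leaves a substantial chance that a real glitch went unseen.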

--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order. See above to fix.
 
Rob Gaddi <rgaddi@technologyhighland.invalid> wrote:

I've got a 24-input AND gate that I'd like to avoid having add another
register delay to before I toss it across a clock boundary.

all_done <= and_reduce(done);

If I just do it, AND it all together without a flop on the output, does
anyone know whether I'll get transition glitches (an output of 1 when
not all inputs are 1)?

I seem to remember something about individual LUTs being glitch-free,
and the synthesizer has to compose my giant AND out of either a LUT
tree or a mess o' LUTs "wire-and" driving a carry chain. Offhand, it
seems like neither structure should glitch. "Try it and see" doesn't
work; testing all 2^24 combinations and trying to determine whether I
get a glitch would be a beast of an effort.

Anyone know offhand?
You'll get glitches for sure due to different routing delays between
the LUTs themselves and the input signals.

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
nico@nctdevpuntnl (punt=.)
--------------------------------------------------------------
 
Rob Gaddi <rgaddi@technologyhighland.invalid> wrote:
On Thu, 24 May 2012 14:24:53 +0000 (UTC)
(snip, I wrote)
Given that it is an AND of done bits, in the usual case each
one transitions from 0 to 1 until all are 1.

That's exactly right, and I feel silly for not having mentioned
it out front. The assertion of all_done, shifted onto a different
clock domain, is what goes back around and asynchronously
clears all of the individual done flags.
You mean a register between all_done and when it goes back around?
I would call it asynchronous if there wasn't a register there.

All of you stop wincing; I know exactly how bad that sounds and I
swear to god there are reasons it had to be this way.
If there is a zero-hold-time register, then that is fine. (And even
if not, the 0 signal won't make it back fast enough to cause
problems.) If there is no register around the loop,
then it takes somewhat more analysis.

So the only possible modes of operation are:
all_done is low, and all I want is for it to never erroneously
transition high until all 24 done flags come up. Once it's up it
stays up.
You don't say anything about possible glitches on the inputs to the AND.

all_done has been high, causing a 20 ns asynchronous clear pulse to
hit all the done flags, dropping them all. In this case, all_done
should drop once and only once as the various path skews from the
pulse to the clear to the AND gate structure all work themselves out.
The only way you could know the pulse is 20ns is if it comes from a FF
clocked at 50MHz, so I will assume that. In that case, you only have
to worry about glitches that last longer than 20ns. That is, after
all_done has been latched, could it glitch low 20ns later. The AND
gate shouldn't do that itself, but if its inputs can, that could
still be a problem. But 20ns is pretty long, so you should be able
to prove that can't happen.

The reason I'm concerned about glitches is, because all_done is sampled
asynchronously to its originating clock, a glitch could happen to get
captured, with some unknown but small probability. The reason I can't
just test it out is the same: given that if a glitch exists there's
only a small probability of capturing it, I might test a given
sequence 10,000 times without catching a glitch that I would have seen
on the 10,001st.
There is another problem not discussed yet: metastability.
Metastability is completely separate from race conditions, but
people sometimes get the two confused. You need to separately
show that metastability isn't a problem.

I am still not sure where the registers are, so I won't try to
say more about metastability.

Also, if there is no register around the loop (which seems
unlikely given the mention of clock domains) then it takes different
treatment.

-- glen
 
On Friday, May 25, 2012 7:13:51 AM UTC+12, Rob Gaddi wrote:
The reason I'm concerned about glitches is, because all_done is sampled
asynchronously to its originating clock, a glitch could happen to get
captured, with some unknown but small probability. The reason I can't
just test it out is the same: given that if a glitch exists there's
only a small probability of capturing it, I might test a given
sequence 10,000 times without catching a glitch that I would have seen
on the 10,001st.
Since this seems to be a slow handshake that you do not expect to spin at 50MHz, but you do want to avoid another register in the 'go' pathway, why not design your state handshake so that single-pulse glitches are tolerated or ignored?
 
On Thu, 24 May 2012 14:24:53 +0000 (UTC)
glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:


Given that it is an AND of done bits, in the usual case each
one transitions from 0 to 1 until all are 1.


That's exactly right, and I feel silly for not having mentioned it out
front. The assertion of all_done, shifted onto a different clock
domain, is what goes back around and asynchronously
clears all of the individual done flags.

All of you stop wincing; I know exactly how bad that sounds and I
swear to god there are reasons it had to be this way.

So the only possible modes of operation are:
all_done is low, and all I want is for it to never erroneously
transition high until all 24 done flags come up. Once it's up it
stays up.

all_done has been high, causing a 20 ns asynchronous clear pulse to
hit all the done flags, dropping them all. In this case, all_done
should drop once and only once as the various path skews from the
pulse to the clear to the AND gate structure all work themselves out.

The reason I'm concerned about glitches is, because all_done is sampled
asynchronously to its originating clock, a glitch could happen to get
captured, with some unknown but small probability. The reason I can't
just test it out is the same: given that if a glitch exists there's
only a small probability of capturing it, I might test a given
sequence 10,000 times without catching a glitch that I would have seen
on the 10,001st.

Perhaps a silly question, but why can you not register 'all_done'?

For extra safety, don't assert 'clear_dones' until 2 or 3 successive
highs.

