DC fifo behaviour at underflow/overflow

We all know that a fifo should operate without getting empty or full. Does
anybody have experience of what sort of output disorder one can expect
when operating in the wrong state (underflow or overflow)?

I am asking because naturally one thinks of some data samples getting
lost when a fifo is in this wrong state, but I am facing another output
pattern at the final system output and trying to find a cause. The
pattern I get is an odd/even offset by some 8 samples in one case, or
every 8th sample duplicated in another case. For case 1, if I realign
that stream it becomes correct, so I am not actually losing samples. The
system is too large and remotely tested, and there is not much room to
do any tests at the time being.

I suspect a dc fifo in the path that may enter the wrong state
(underflow/overflow). It is an altera dc fifo in stratix iv, writing on
a ~368MHz clock and reading on a ~245MHz clock, 32 bits wide and 8 words
deep.

Any thoughts appreciated.

Kaz

---------------------------------------
Posted through http://www.FPGARelated.com
 
"kaz" <3619@embeddedrelated> wrote in message
news:GLidnQK34ou08lHNnZ2dnUVZ_tGdnZ2d@giganews.com...
(snip)

I would never let a FIFO overflow or underflow. You should always stop
writing to the FIFO if the full flag is set and discard the input data
stream. If the empty flag is set you should not read from the FIFO -
instead output known dummy data (invariably I output all zeros).

Following this rule the behaviour of the FIFO is totally predictable.

Andy
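Andy's rule can be sketched as a behavioural model (hypothetical Python, not Altera's implementation; the names are illustrative):

```python
from collections import deque

class GuardedFifo:
    """Behavioural sketch of Andy's rule: never write when full,
    never read when empty - output dummy zeros instead."""

    def __init__(self, depth=8):
        self.depth = depth
        self.buf = deque()

    def write(self, word):
        if len(self.buf) == self.depth:   # full flag set:
            return False                  # discard the input word
        self.buf.append(word)
        return True

    def read(self):
        if not self.buf:                  # empty flag set:
            return 0                      # output known dummy data
        return self.buf.popleft()
```

With the guards in place the FIFO never reorders anything: at worst it drops writes at the input and pads zeros at the output, which is the "totally predictable" behaviour Andy describes.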
 
Thanks Andy. No question that the fifo is meant to be working away from
underflow or overflow. What I am asking is: are there any known patterns
that could emerge - after all - within this unpredictability? Here I am
asking about known symptoms of wrong behaviour, really.

Kaz
 
"kaz" <3619@embeddedrelated> wrote in message
news:AaydnUTdO7FfE1HNnZ2dnUVZ_iydnZ2d@giganews.com...
(snip)
Depends how the FIFO is constructed.

If it is built as a dual port RAM with an incrementable write pointer on
the input port, and an incrementable read pointer on the output port,
then if you fill it to full - stop writing - then keep pulling data from
the read port, it will act as a circular buffer with data that will
repeat over a number of cycles equal to the FIFO length.

You can work out other scenarios for this architecture yourself, for sure.

Andy
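Andy's circular-buffer scenario is easy to check with a toy model (illustrative Python; a real dual-clock FIFO is more subtle):

```python
# A full 8-deep FIFO built on a dual port RAM, with writes stopped and
# the read pointer free-running past empty: the output just wraps,
# repeating the buffer contents with a period equal to the FIFO length.
DEPTH = 8
ram = list(range(100, 100 + DEPTH))   # contents at the moment "full" was hit
rd = 0
out = []
for _ in range(2 * DEPTH):            # keep pulling with no empty guard
    out.append(ram[rd % DEPTH])
    rd += 1
```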
 
"kaz" <3619@embeddedrelated> wrote in message
news:AaydnUTdO7FfE1HNnZ2dnUVZ_iydnZ2d@giganews.com...

(snip)
My crucial point is:
Is there any way this altera fifo will break up the stream into another
stream with the even samples ahead of its odd half by 8 samples?

Kaz
 
Let me rephrase the problem.
It may not be that the presumed fifo problem is a case of
underflow/overflow, but rather that it is a timing problem, or both
mixed up.

dc fifos protect against metastability to some degree, but a failure
could occur. The cross-domain paths are made false by default,
understandably. So isn't it a case of loss of functionality for some
time, intermittently, that has to be accepted? The error stays for
several tens of msec then disappears. Don't we expect fifos to recover
more quickly (its internal sync pipeline is set to 3)?

Kaz
 
On Sat, 15 Dec 2012 09:47:35 -0600, kaz wrote:


(snip)
My crucial point is:
Is there any way this altera fifo will break up the stream into another
stream with the even samples ahead of its odd half by 8 samples?

I saw a DC (dual clock) FIFO do something like that once. It was in a
Xilinx part, but the design error would apply equally well to an Altera
part or an ASIC.

It was part of an IP core that a client had bought. To make a 64 bit
wide FIFO, the IP developer had used two 32 bit wide FIFOs in parallel.
The two FIFOs had independent control circuits.

Of course, as a dual clock FIFO, one can't really make any guarantees
about the depth immediately after an asynchronous reset when the clocks
are running, and indeed the two halves of the FIFO would sometimes start
with different depths. There was no circuit to check for this state and
get them back into sync, and the end result was that, until the next
reset, 32 bit chunks of data were swapped around.

Regards,
Allan
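Allan's failure mode can be mimicked in a few lines (hypothetical Python model; the labels are made up):

```python
from collections import deque

# Two 32-bit FIFOs side by side forming a 64-bit word, with independent
# controls. If the low half comes out of reset one word deeper than the
# high half, every output pairs hi[i] with lo[i+1] - and with no
# resynchronising circuit it stays that way until the next reset.
hi = deque(f"H{i}" for i in range(4))
lo = deque(f"L{i}" for i in range(1, 5))   # started one word ahead
pairs = [(hi.popleft(), lo.popleft()) for _ in range(4)]
```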
 
On Dec 15, 8:14 pm, "kaz" <3619@embeddedrelated> wrote:
(snip)
The symptoms look exactly like underflow/overflow.

dc fifos protect against metastability to some degree but a failure could
occur.
The cross-domain paths are made false by default, understandably. So
isn't it a case of loss of functionality for some time, intermittently,
that has to be accepted? The error stays for several tens of msec then
disappears. Don't we expect fifos to recover more quickly (its internal
sync pipeline is set to 3)?
According to the dcfifo help, a value of 3 is internally translated to
1, which for the very high clock rates that you are using is almost
certainly insufficient. Try 4.


Kaz

Did you pay attention to the DELAY_RDUSEDW/DELAY_WRUSEDW parameters?
Altera's default value (1) is unintuitive and, in my experience, tends
to cause problems. If you rely on exact values of the rdusedw or wrusedw
ports for anything non-trivial, I'd recommend setting the respective
DELAY_xxUSEDW to 0.
I'd also set OVERFLOW_CHECKING/UNDERFLOW_CHECKING to "OFF" and do
underflow/overflow prevention in my own logic.

BTW, personally, I wouldn't use Altera's 8-deep FIFOs; they don't
appear to be as well tested as their deeper relatives. Or maybe it's
just me.
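For reference, the settings mentioned here are generics on the dcfifo megafunction; a hypothetical instantiation following this advice might look like the sketch below (generic names as discussed in this thread - verify against your generated wrapper before use):

```vhdl
fifo_inst : dcfifo
  GENERIC MAP (
    lpm_width          => 32,
    lpm_numwords       => 8,
    delay_rdusedw      => 0,      -- Altera's default of 1 is unintuitive
    delay_wrusedw      => 0,
    overflow_checking  => "OFF",  -- prevent overflow in your own logic
    underflow_checking => "OFF",  -- likewise for underflow
    rdsync_delaypipe   => 4,      -- deeper synchroniser for high clocks
    wrsync_delaypipe   => 4
  )
  PORT MAP (
    -- data/q/clock/request/flag connections as in the excerpts below
  );
```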
 
Many thanks for your contributions.

The fifo I am using is very basic: 32 bits wide, 8 words deep, no reset,
3 stage synchroniser, write and read connected directly (combinatorially)
to the full/empty flags, word count not used, clocks (wr/rd 368/245).

I am trying to put my head deeper into how a fifo might work internally.
Assuming a simplest case, I understand the write pointer is clocked by
the write clock and it increments on write request (counting is binary
or Gray). The read pointer mirrors that on the read side.

The signals that cross the clock domain are the empty/full flags (in my
case, as I am not using the word counts).

What now mystifies me is that if anything went wrong, be it a flow issue
or timing, then wouldn't these counters just increment from where they
might have landed, implying self recovery - excluding the case that the
read pointer is ahead of the write pointer (as assumed in my case,
because samples are each read out correctly but misaligned)?

I mean, to get an 8 sample odd/even misalignment I can only think of
pointers going crazy, or addresses arriving crazy but regular.

Kaz
 
On Dec 16, 12:39 pm, "kaz" <3619@embeddedrelated> wrote:
(snip)

I agree that a resynchronised variant of the write pointer will be used
to generate rdempty and rdusedw in the other domain, but not for the
read pointer itself, i.e. each side has its own pointer.

You wrote above that wrreq is connected directly to wrfull. Does it
mean that wrreq depends *only* on wrfull or does the wrreq logic
equation have some additional terms?

Yes, the input rate is controlled by valid being active in a 2/3 ratio,
regularly.
The read side is always active if the fifo is not empty.

Kaz
 
On Dec 16, 12:39 pm, "kaz" <3619@embeddedrelated> wrote:
(snip)
Assuming a simplest case, I understand the write pointer is clocked by
write
clock and it increments on write request (counting is binary or Gray).
Gray.

The read pointer mirrors that on the read side.

The signals that cross the clock domain are the empty/full flag (in my case
as I am not using the word counts).
No, it does not work like that.
The signals that cross clock domains are:
* write pointer - resynchronized variant of it is used in the rdclk
clock domain to generate rdempty and rdusedw
* read pointer - resynchronized variant of it is used in the wrclk
clock domain to generate wrempty and wrusedw
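The pointers are Gray-counted precisely because a resynchronised copy must never be off by more than one position: a Gray code changes a single bit per increment, so sampling it mid-change yields either the old or the new value, never garbage. A quick illustrative Python check:

```python
def bin_to_gray(n: int) -> int:
    # Adjacent Gray codes differ in exactly one bit.
    return n ^ (n >> 1)

def gray_to_bin(g: int) -> int:
    # Fold the bits back down to invert the transform.
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

# 4-bit pointer (8-deep FIFO plus a wrap bit): check the one-bit property
# across the whole sequence, including the wrap from 15 back to 0.
codes = [bin_to_gray(i % 16) for i in range(17)]
assert all(bin(a ^ b).count("1") == 1 for a, b in zip(codes, codes[1:]))
assert all(gray_to_bin(bin_to_gray(i)) == i for i in range(64))
```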

What now mystifies me is that if anything went wrong be it flow issue or
timing then wouldn't these counters just increment from where they might
have
landed implying self recovery, excluding the case that read pointer is
ahead
of write pointer (as assumed in my case because samples are read out
correctly each but misaligned).
Your write clock is faster than your read clock, so, supposedly, your
wrreq has <100% duty cycle, right?
The exact effect of underflow/overflow will depend on the specific
pattern applied to wrreq.

You wrote above that wrreq is connected directly to wrfull. Does it
mean that wrreq depends *only* on wrfull or does the wrreq logic
equation have some additional terms?

I mean to get 8 samples odd/even misalignment I can only think of pointers

going crazy or address arriving crazy but regular.

Kaz

 
Yes, there is an extra term.
Here is some excerpt:

TX_SRX_FIFO_inst : TX_SRX_FIFO
  PORT MAP (
    data     => TX_SRX_FIFO_DATA,
    rdclk    => iCLK245,
    rdreq    => TX_SRX_FIFO_rdreq,
    wrclk    => iCLK368,
    wrreq    => TX_SRX_FIFO_wrreq,
    q        => TX_SRX_FIFO_q,
    rdempty  => TX_SRX_FIFO_empty,
    wrfull   => TX_SRX_FIFO_full
    );

    -- 2 in 3 clock enables is used
    TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
    TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;

The clock ratio is 368.64 to 245.76, to be exact.

Kaz
 
On Dec 16, 2:53 pm, "kaz" <3619@embeddedrelated> wrote:
(snip)
I agree that a resynchronised variant of write pointer will be used to
generate rdempty and rdusew in other domain but not for read pointer itself

i.e. each side has its own pointer.
Of course.
Each side has its own pointer + a resynchronised copy of the other
side's pointer.

(snip)
You wrote above that wrreq is connected directly to wrfull. Does it
mean that wrreq depends *only* on wrfull or does the wrreq logic
equation have some additional terms?

yes the input rate is controlled by valid being active in 2/3 ratio
regularly.
The read side is always active if fifo is not empty.

Kaz

yes = *only* wrfull, or yes = additional terms?
If the former, where is wrdata coming from?

Can you post here a representative excerpt from your design?
 
On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
(snip)
WOW, I reproduced the behavior that you describe (non-recovery after
overflow) in functional simulation with Altera's internal simulator!
I never imagined that anything like that was possible.
Sounds like a bug in the implementation of dcfifo. Of course Altera will
call this bug a feature and will say that as long as there was an
overflow nothing could be guaranteed. Or similar bullsheet.
I am writing a sequential counter and see a pattern like (64, 57, 66,
59, 68, 61, 70, 63...) on the read side.
To be continued...
 
On Dec 16, 6:00 pm, Michael S <already5cho...@yahoo.com> wrote:
(snip)
Few more observations:
1. The problem is not limited to the 8-deep DCFIFO. A 16-deep DCFIFO
could easily be forced into the same "mad" state.
2. A single write into a full FIFO is not enough to trigger the problem.
You have to write to the full FIFO 3 times in a row. Which, generally,
should never happen even in the presence of poorly prevented
metastability.
3. So, in order to force the FIFO into the "mad" state you have to do a
stupid sequence on the write side. But when the FIFO is already mad, it
is the read side that is keeping it there. Somehow, it stops correctly
detecting the rdempty condition.

What would I do?
1. I'd increase RDSYNC_DELAYPIPE/WRSYNC_DELAYPIPE to 4. It's very
unlikely that the problem is here, but for such high clock frequencies
the value of 3 is still wrong.
2. I'd start looking for a race condition type of bug. Like feeding one
clock domain with a vector generated in the other clock domain. If you
don't know all parts of the design then try to look at the Timequest
"clock transfers" display. It could be helpful.
3. In the longer run, I'd redesign the whole synchronization block.
IMHO, a design that has a maximal FIFO read throughput exactly equal to
the *nominal* write throughput is not sufficiently robust. I'd very much
prefer the maximal read throughput to be at least 1% higher. Then your
FIFO will be close to empty most of the time and the block as a whole
will be "self-curing". As an additional benefit, you will have more
predictable latency through the FIFO. Even if latency is not important
for your main functionality, it's good for easier debugging.
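The third point - make the maximal read throughput slightly higher than the nominal write throughput - can be illustrated with a crude rate model (hypothetical Python; real clock-domain behaviour is not modelled, only average rates):

```python
def simulate(write_rate, read_rate, depth=8, steps=200_000):
    """Crude rate model: writes are dropped when full, reads stall when
    empty.  Returns the final occupancy, starting from a full FIFO."""
    occ, wa, ra = depth, 0.0, 0.0
    for _ in range(steps):
        wa += write_rate
        ra += read_rate
        if wa >= 1.0:
            wa -= 1.0
            if occ < depth:
                occ += 1          # write accepted
        if ra >= 1.0:
            ra -= 1.0
            if occ > 0:
                occ -= 1          # read performed

    return occ

# Equal average rates (2/3 write enable vs. full-rate read, the 245/368
# ratio in this thread): a full FIFO stays full - no self-recovery.
stuck = simulate(2 / 3, 2 / 3)
# Read throughput just 1% higher: the FIFO drains and stays near empty.
cured = simulate(2 / 3, 2 / 3 * 1.01)
```

The design choice here matches the post: once the read side can outpace the writer even slightly, occupancy drifts toward empty and any misalignment has a chance to flush out.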
Thanks so much Michael. It is great that you thought of simulating the
fifo in this mad state. I will try to reproduce that. I assume you are
doing functional simulation.
The interesting thing is that we have many fifos in our system but only
one is misbehaving.

Kaz
 
 
On Dec 16, 8:02 pm, "kaz" <3619@embeddedrelated> wrote:
(snip)
I thought a bit more about it. As a result, I am taking back
everything I said about Altera in post #13.
Altera's dcfifo is o.k. The access pattern is just too troublesome for
overflow recovery; it will cause problems for any reasonable FIFO
implementation.
Sorry, Altera, I was wrong.

Now I'd try to explain the problem:
Immediately after overflow the read pointer and write pointer are very
close to each other - on one cycle the read pointer pulls ahead of the
write pointer and reads the sample from 9 writes ago, on the next cycle
it falls behind the write pointer and reads the very last written
sample, and then again pulls ahead, and so on.
It happens because the read machine sees a delayed version of the write
pointer, trailing the read pointer by one or two, and then thinks that
the FIFO is almost full. And continues to read. The write machine, on
the other hand, sees a delayed version of the read pointer, equal to the
write pointer or trailing it by one, and then thinks that the FIFO is
either empty or almost empty. And continues to write.
Since the average rate of writing is exactly equal to the rate of
reading, recovery from this situation can take a lot of time; in the
case of a common clock source, recovery could never happen.

The solution? Assure that overflow/underflow never happens.
If you can't - at least increase the frequency of the read clock, as
suggested in my previous post. A 1% increase is enough.
If that is too hard, then slightly modify the write pattern. Instead of
"++-++-++-++-" do "+++++++++---+++++++++---". That pattern will
guarantee instant overflow/underflow recovery. If, for some reason,
such modification of the write pattern is impossible, then do a smaller
modification: "++++--++++--". This pattern is not safe, but
probabilistically it should recover from overflow much faster than
yours.

Good luck.
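The mechanism above - each side acting on a stale copy of the other side's pointer - shows up in plain modular arithmetic (a toy Python model; the depth is from this thread, the 2-increment staleness is an assumption):

```python
DEPTH = 8
PTR_MOD = 2 * DEPTH   # pointers carry one extra wrap bit

def occupancy(wptr, rptr):
    # Fill level as either side computes it: write pointer minus read pointer.
    return (wptr - rptr) % PTR_MOD

# After overflow, suppose the read pointer has pulled one ahead of the
# write pointer (true wptr = 5, rptr = 6), and each side sees a copy of
# the other side's pointer that is 2 increments stale:
wptr, rptr = 5, 6
stale_wptr, stale_rptr = 3, 4

read_side_view = occupancy(stale_wptr, rptr)    # (3 - 6) mod 16 = 13
write_side_view = occupancy(wptr, stale_rptr)   # (5 - 4) mod 16 = 1

# The read side believes the FIFO is nearly full and keeps reading;
# the write side believes it is nearly empty and keeps writing - the
# oscillation described above.
```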
 
Michael S <already5chosen@yahoo.com> wrote:

(snip)

WOW, I reproduced the behavior that you describe (non-recovery after
overflow) in functional simulation with Altera's internal simulator!
I never imagined that anything like that was possible.
(snip)

I thought a bit more about it. As a result, I am taking back
everything I said about Altera in the post #13.
Altera's dcfifo is o.k. The access pattern is just too troublesome for
overflow recovery, it will cause problems to any reasonable FIFO
implementation.
Sorry, Altera, I was wrong.

Now I'd try to explain the problem:
Immediately after overflow the read pointer and write pointer are very
close to each other - on one cycle the read pointer pulls ahead of the
write pointer and reads the sample from 9 writes ago, on the next cycle
it falls behind the write pointer and reads the very last written
sample, and then again pulls ahead, and so on.
Last I knew, FIFOs were supposed to have an almost full and almost
empty signal to avoid that problem. Maybe at 7/8 and 1/8.

It happens because the read machine sees a delayed version of the write
pointer, trailing the read pointer by one or two, and so thinks the FIFO
is almost full. And continues to read. The write machine, on the other
hand, sees a delayed version of the read pointer, equal to the write
pointer or trailing it by one, and so thinks the FIFO is either empty or
almost empty. And continues to write.
If you use the almost-full and almost-empty flags, that should leave
plenty of margin for such delays. Even more margin is needed if the
signals are processed in software.
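[Editor's note] A rough sketch of such margined flags, an editor's assumption about the scheme glen describes rather than the dcfifo netlist: each side compares its own fresh pointer with a synchronized, and therefore stale, copy of the other side's pointer, with pointers counted modulo twice the depth so that full and empty remain distinguishable. For a depth of 8, the 7/8 and 1/8 thresholds are simply 7 and 1.

```python
DEPTH = 8  # words, as in the FIFO under discussion

def flags(wr_ptr, rd_ptr_synced, rd_ptr, wr_ptr_synced):
    # Pointers run modulo 2*DEPTH (one extra wrap bit), so an
    # occupancy of 0 and an occupancy of DEPTH look different.
    occ_wr_view = (wr_ptr - rd_ptr_synced) % (2 * DEPTH)  # writer's view
    occ_rd_view = (wr_ptr_synced - rd_ptr) % (2 * DEPTH)  # reader's view
    almost_full  = occ_wr_view >= 7 * DEPTH // 8   # writer backs off here
    almost_empty = occ_rd_view <= DEPTH // 8       # reader backs off here
    return almost_full, almost_empty

# The stale remote pointer can only overestimate the writer's occupancy
# and underestimate the reader's, so both flags err on the safe side.
print(flags(wr_ptr=7, rd_ptr_synced=0, rd_ptr=0, wr_ptr_synced=5))
# → (True, False): the writer stops one word early, the reader keeps going
```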

Since the average rate of writing is exactly equal to the rate of
reading, recovery from this situation can take a very long time; with a
common clock source, recovery may never happen.
Then, only after writing is finished, flush out all the data with the
actual empty flag.

-- glen
 
On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
Yes, there is an extra term.
Here is an excerpt:

TX_SRX_FIFO_inst : TX_SRX_FIFO
  PORT MAP (
    data     => TX_SRX_FIFO_DATA,
    rdclk    => iCLK245,
    rdreq    => TX_SRX_FIFO_rdreq,
    wrclk    => iCLK368,
    wrreq    => TX_SRX_FIFO_wrreq,
    q        => TX_SRX_FIFO_q,
    rdempty  => TX_SRX_FIFO_empty,
    wrfull   => TX_SRX_FIFO_full
    );

    -- a 2-in-3 clock enable is used
    TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
    TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;

the clock ratio is 368.64 to 245.76 to be exact.
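[Editor's note] This ratio makes the average write rate exactly equal to the read rate, which is precisely the condition identified above under which an overflow, once it happens, never heals on its own. A quick arithmetic check:

```python
wr_clk = 368_640_000   # write clock, Hz
rd_clk = 245_760_000   # read clock, Hz
# With a 2-in-3 write enable, the average write rate is wr_clk * 2/3.
# 368.64 MHz * 2/3 = 245.76 MHz exactly: there is zero slack for the
# FIFO to drain back toward its normal operating point.
print(wr_clk * 2 // 3 == rd_clk)  # True
```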

Kaz

---------------------------------------
Posted through http://www.FPGARelated.com
I'd also like to see a definition of TX_SRX_FIFO.
 
On Dec 16, 3:57 pm, "kaz" <3619@embeddedrelated> wrote:
Yes there is extra term.
Here is some excerpt:

TX_SRX_FIFO_inst : TX_SRX_FIFO
  PORT MAP (
    data     => TX_SRX_FIFO_DATA,
    rdclk    => iCLK245,
    rdreq    => TX_SRX_FIFO_rdreq,
    wrclk    => iCLK368,
    wrreq    => TX_SRX_FIFO_wrreq,
    q        => TX_SRX_FIFO_q,
    rdempty  => TX_SRX_FIFO_empty,
    wrfull   => TX_SRX_FIFO_full
    );

    -- a 2-in-3 clock enable is used
    TX_SRX_FIFO_wrreq <= (Sync_23_1b(1) AND (not TX_SRX_FIFO_full));
    TX_SRX_FIFO_rdreq <= not TX_SRX_FIFO_empty;
Hi Michael,

below is the definition of the FIFO.
What troubles me is that writes and reads are gated by full and empty
respectively, so I don't see why flow problems should occur. Moreover,
the writes and reads are protected internally as well.

Could you also please let me know whether it was a timing simulation
that you ran?

Thanks


LIBRARY ieee;
USE ieee.std_logic_1164.all;

LIBRARY altera_mf;
USE altera_mf.all;

ENTITY TX_SRX_FIFO IS
  PORT (
    data    : IN  STD_LOGIC_VECTOR (31 DOWNTO 0);
    rdclk   : IN  STD_LOGIC;
    rdreq   : IN  STD_LOGIC;
    wrclk   : IN  STD_LOGIC;
    wrreq   : IN  STD_LOGIC;
    q       : OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
    rdempty : OUT STD_LOGIC;
    wrfull  : OUT STD_LOGIC
  );
END TX_SRX_FIFO;


ARCHITECTURE SYN OF tx_srx_fifo IS

  SIGNAL sub_wire0 : STD_LOGIC;
  SIGNAL sub_wire1 : STD_LOGIC;
  SIGNAL sub_wire2 : STD_LOGIC_VECTOR (31 DOWNTO 0);

  COMPONENT dcfifo
    GENERIC (
      intended_device_family : STRING;
      lpm_numwords           : NATURAL;
      lpm_showahead          : STRING;
      lpm_type               : STRING;
      lpm_width              : NATURAL;
      lpm_widthu             : NATURAL;
      overflow_checking      : STRING;
      rdsync_delaypipe       : NATURAL;
      underflow_checking     : STRING;
      use_eab                : STRING;
      wrsync_delaypipe       : NATURAL
    );
    PORT (
      wrclk   : IN  STD_LOGIC;
      rdempty : OUT STD_LOGIC;
      rdreq   : IN  STD_LOGIC;
      wrfull  : OUT STD_LOGIC;
      rdclk   : IN  STD_LOGIC;
      q       : OUT STD_LOGIC_VECTOR (31 DOWNTO 0);
      wrreq   : IN  STD_LOGIC;
      data    : IN  STD_LOGIC_VECTOR (31 DOWNTO 0)
    );
  END COMPONENT;

BEGIN
  rdempty <= sub_wire0;
  wrfull  <= sub_wire1;
  q       <= sub_wire2(31 DOWNTO 0);

  dcfifo_component : dcfifo
    GENERIC MAP (
      intended_device_family => "Stratix IV",
      lpm_numwords           => 8,
      lpm_showahead          => "OFF",
      lpm_type               => "dcfifo",
      lpm_width              => 32,
      lpm_widthu             => 3,
      overflow_checking      => "ON",
      rdsync_delaypipe       => 5,
      underflow_checking     => "ON",
      use_eab                => "ON",
      wrsync_delaypipe       => 5
    )
    PORT MAP (
      wrclk   => wrclk,
      rdreq   => rdreq,
      rdclk   => rdclk,
      wrreq   => wrreq,
      data    => data,
      rdempty => sub_wire0,
      wrfull  => sub_wire1,
      q       => sub_wire2
    );

END SYN;
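[Editor's note] On the question of why a protected FIFO can still misbehave: with overflow_checking on, a full FIFO silently drops the offending write, so nothing is corrupted or reordered, but the output stream develops gaps and every later sample lands one slot earlier than downstream logic expects. A toy sequence-number model, an editor's sketch rather than the dcfifo itself, with rates chosen only to force an overflow quickly:

```python
from collections import deque

DEPTH = 8
fifo = deque()
out = []
sample = 0
for t in range(60):
    if len(fifo) < DEPTH:        # protected write: a full FIFO drops it
        fifo.append(sample)
    sample += 1                  # the source keeps producing regardless
    if t % 2 == 1 and fifo:      # protected read, at half the write rate
        out.append(fifo.popleft())

print(out == sorted(out))                            # True: never reordered
print(any(b - a > 1 for a, b in zip(out, out[1:])))  # True: but with gaps
```

So a simple dropped-write overflow explains lost samples and a stream shift, but not by itself the odd/even scrambling; that points at something downstream interpreting the shifted stream, e.g. a deinterleaver whose phase assumption breaks after the drop.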

Kaz



---------------------------------------
Posted through http://www.FPGARelated.com
 

Welcome to EDABoard.com
