serial protocol specs and verification

[snip]
A frame is defined as follows:

- sync :'111'
- header: dtype (4) - n.u.(2) - length (10)
- data : (16) * length

in principle between frames there can be any number of zeros (with bit
stuffing). An 'all zero' pattern in this sense might be of any number of
bits.

[snip]

Unless 'length' is limited, your worst case has header "0000001111111111"
(with an extra bit stuffed) followed by 16 * 1023 = 16368 zeros, which will
have 2728 ones stuffed into them. Total line packet length is 19113
symbols. If the clocks are within 1/19114 of each other, the same number of
symbols will be received as sent, ASSUMING no jitter. You can't assume
that, but if there is 'not much' jitter then perhaps 1/100k will be good
enough for relative drift to not need to be corrected for.

So, for version 1, use the 'sync' to establish the start of frame and the
sampling point, simulate the 'Rx fast' and 'Rx slow' cases in parallel, and
see whether it works.
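The arithmetic above can be checked with a few lines of Python (a sketch; the field sizes follow the frame definition quoted at the top of the thread):

```python
# Worst-case frame per the quoted spec:
# header = dtype(4) + n.u.(2) + length(10) = 16 bits, with 'length' = 1023
# data   = 16 * 1023 zero bits, a '1' stuffed after every 6 zeros

header_bits = 16 + 1            # "0000001111111111" plus one stuffed bit
data_zeros  = 16 * 1023         # 16368 zeros
stuffed     = data_zeros // 6   # 2728 stuffed ones

total = header_bits + data_zeros + stuffed
print(total)                    # 19113 line symbols (sync excluded)

# To receive the same number of symbols as sent (no resync, no jitter),
# the two clocks must match to better than one part in total + 1:
tolerance = 1 / (total + 1)
print(f"{tolerance:.2e}")       # ~5.2e-05, i.e. roughly 52 ppm
```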

BTW, this is off-topic for C.A.F., as it is a system design problem not
related to the implementation method.


---------------------------------------
Posted through http://www.FPGARelated.com
 
On Wednesday, July 31, 2013 1:44:17 PM UTC+2, rickman wrote:
On 7/31/2013 3:36 AM, alb wrote:

On 29/07/2013 22:14, rickman wrote:

[]

Everyone's old favorite asynchronous serial RS232 usually uses a
clock at 16x, though I have seen 64x. From the beginning of the
start bit, it counts half a bit time (in clock cycles), verifies
the start bit (and not random noise) then counts whole bits and
decodes at that point. So, the actual decoding is done with a 1X
clock, but with 16 (or 64) possible phase values. It resynchronizes
at the beginning of each character, so it can't get too far off.
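The 16x scheme described above can be sketched in a few lines (an illustrative model, not any particular UART core; the waveform in the demo is made up):

```python
def uart_rx_bits(samples, osr=16, nbits=8):
    """Recover data bits from a 16x-oversampled RS-232 line.

    samples: list of 0/1 line samples, idle high, start bit low.
    Waits for the falling edge of the start bit, re-checks the line
    half a bit later (to reject noise), then samples once per bit
    time at the middle of each data bit.
    """
    i = 0
    # find the start-bit edge (idle '1' -> '0')
    while i + 1 < len(samples) and not (samples[i] == 1 and samples[i + 1] == 0):
        i += 1
    mid = i + 1 + osr // 2           # middle of the start bit
    if mid >= len(samples) or samples[mid] != 0:
        return None                   # the 'start bit' was just noise
    return [samples[mid + osr * (k + 1)] for k in range(nbits)]

# transmit 0b01011010 LSB-first, 16 samples per bit, idle-high line
bits = [0] + [(0b01011010 >> k) & 1 for k in range(8)] + [1]
line = [1] * 8 + [b for b in bits for _ in range(16)] + [1] * 8
print(uart_rx_bits(line))            # [0, 1, 0, 1, 1, 0, 1, 0]
```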



Yes, that protocol requires a clock matched to the sender's clock to at
least 2.5% IIRC. The protocol the OP describes has much longer char
sequences which implies much tighter clock precision at each end and I'm
expecting it to use a clock recovery circuit... but maybe not. I think
he said they don't use one but get "frequent" errors.



At the physical level the bit stuffing will allow resyncing continuously,
therefore I'm not concerned about whether there's a clock recovery circuit.

We are using 40 MHz (0.5 ppm stability), but after a few seconds you can
already see how many cycles two clocks can drift apart.
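As a back-of-envelope sanity check on that drift (taking the 40 MHz / 0.5 ppm figures above at face value):

```python
f_clk = 40e6      # the 40 MHz reference mentioned above
ppm   = 0.5e-6    # stated stability of each oscillator

# worst case: one clock runs fast by 0.5 ppm while the other runs slow
rel_offset = 2 * ppm
drift_cycles_per_s = f_clk * rel_offset
print(drift_cycles_per_s)        # ~40 cycles of drift per second

# time until the two clocks slip by one full 25 ns cycle
print(1 / drift_cycles_per_s)    # ~25 ms
```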



I've never analyzed an async design with longer data streams so I don't
know how much precision would be required, but I'm sure you can't do
reliable data recovery with a 2x clock (without a PLL). I think this
would contradict the Nyquist criterion.



<nitpick mode on>
Nyquist criterion has nothing to do with being able to sample data. As a
matter of fact your internal clock is perfectly capable of sampling data
flowing in your FPGA without the need to be 2x the data rate.
<nitpick mode off>



I don't know what you are talking about. If you asynchronously sample,
you very much do have to satisfy the Nyquist criterion. A 2x clock,
because it isn't *exactly* 2x, can *not* be used to capture a bitstream
so that you can find the transitions and know which bit is which.
Otherwise there wouldn't be so many errors in the existing circuit.





In my earlier comments when I'm talking about a PLL I am referring to a
digital PLL. I guess I should have said a DPLL.



Why bother? If you have a PLL on your FPGA you can make use of it,
otherwise you need something fancier.



Not sure of your context. You can't use the PLL on the FPGA to recover
the clock from an arbitrary data stream. It is not designed for that
and will not work because of the gaps in data transitions. It is
designed to allow the multiplication of clock frequencies. A DPLL can
be easily designed to recover the clock, but needs to be greater than 3x
the data rate in order to distinguish the fast condition from the slow
condition.



You can use the FPGA PLL to multiply your clock from 2x to 4x to allow
the DPLL to work correctly.

Or do like many USB PHYs: assume the 2x clock is reasonably 50/50 and use
a DDR input flop to sample at 4x.

-Lasse
 
On 7/31/2013 2:30 PM, langwadt@fonz.dk wrote:
On Wednesday, July 31, 2013 1:44:17 PM UTC+2, rickman wrote:

You can use the FPGA PLL to multiply your clock from 2x to 4x to allow
the DPLL to work correctly.


Or do like many USB PHYs: assume the 2x clock is reasonably 50/50 and use
a DDR input flop to sample at 4x.
Yes, that would be interesting to design actually, the logic gets two
bits at the same time rather than one bit, I guess it makes the machine
a bit more complicated in that you have to deal with four states and
four possible input combinations. Still, not a big deal, just a bit of
work on paper to understand the logic needed.

--

Rick
 
On 31/07/2013 13:44, rickman wrote:
[]
<nitpick mode on>
Nyquist criterion has nothing to do with being able to sample data. As a
matter of fact your internal clock is perfectly capable of sampling data
flowing in your FPGA without the need to be 2x the data rate.
<nitpick mode off>

I don't know what you are talking about. If you asynchronously sample,
you very much do have to satisfy the Nyquist criterion. A 2x clock,
because it isn't *exactly* 2x, can *not* be used to capture a bitstream
so that you can find the transitions and know which bit is which.
A data stream which is *exactly* flowing at a frequency f can be
*exactly* sampled with a clock frequency f; it happens continuously in
your synchronous logic. What happened to the Nyquist theorem?

If you have a protocol with data and clock, does it mean that you will
recognize only half of the bits because your clock rate is just equal to
your data rate? I'm confused...

IMO calling a signal 'asynchronous' does not make any difference. Mr.
Nyquist referred to reconstructing an analog signal with discrete
sampling (no quantization error involved). How does that apply to
digital transmission?

Otherwise there wouldn't be so many errors in the existing circuit.
It does not fail because of the Nyquist limit, but because the recovery
of a phase shift cannot be done with just two clocks per bit.

[]
You can use the FPGA PLL to multiply your clock from 2x to 4x to allow
the DPLL to work correctly.
This is what I meant indeed. I believe I confused DPLL with ADPLL...
 
On 31/07/2013 15:36, RCIngham wrote:
A frame is defined as follows:

- sync :'111'
- header: dtype (4) - n.u.(2) - length (10)
- data : (16) * length

in principle between frames there can be any number of zeros (with bit
stuffing). An 'all zero' pattern in this sense might be of any number of
bits.

[snip]

Unless 'length' is limited, your worst case has header "0000001111111111"
(with an extra bit stuffed) followed by 16 * 1023 = 16368 zeros, which will
have 2728 ones stuffed into them. Total line packet length is 19113
symbols.
Why did you exclude the sync symbol?

If the clocks are within 1/19114 of each other, the same number of
symbols will be received as sent, ASSUMING no jitter.
5×10^-5 is a very large difference; we are using 0.5 ppm oscillators.
The number of symbols received has to take into account phase shift,
otherwise bits will be lost or oversampled.

You can't assume
that, but if there is 'not much' jitter then perhaps 1/100k will be good
enough for relative drift to not need to be corrected for.
Still not sure what you are trying to say.

So, for version 1, use the 'sync' to establish the start of frame and the
sampling point, simulate the 'Rx fast' and 'Rx slow' cases in parallel, and
see whether it works.
By saying 'in parallel' do you mean a data stream with some bits slower
and some faster?
I think the main problem lies in the slight difference in clock
frequencies, which leads to an increasing phase shift to the point where
a bit is lost or oversampled.

BTW, this is off-topic for C.A.F., as it is a system design problem not
related to the implementation method.
IMO it is an implementation issue; no spec will tell me how many times I
need to sample the data stream. The system design does not have a
problem IMO, it simply specifies the protocol between two modules. But I
will be more than happy if you could point me to some more appropriate
group.
 
On Wednesday, July 31, 2013 11:37:59 PM UTC+2, alb wrote:
On 31/07/2013 13:44, rickman wrote:

[]

<nitpick mode on>
Nyquist criterion has nothing to do with being able to sample data. As a
matter of fact your internal clock is perfectly capable of sampling data
flowing in your FPGA without the need to be 2x the data rate.
<nitpick mode off>



I don't know what you are talking about. If you asynchronously sample,
you very much do have to satisfy the Nyquist criterion. A 2x clock,
because it isn't *exactly* 2x, can *not* be used to capture a bitstream
so that you can find the transitions and know which bit is which.



A data stream which is *exactly* flowing at a frequency f can be
*exactly* sampled with a clock frequency f; it happens continuously in
your synchronous logic. What happened to the Nyquist theorem?



If you have a protocol with data and clock, does it mean that you will
recognize only half of the bits because your clock rate is just equal to
your data rate? I'm confused...



IMO calling a signal 'asynchronous' does not make any difference. Mr.
Nyquist referred to reconstructing an analog signal with discrete
sampling (no quantization error involved). How does that apply to
digital transmission?



Otherwise there wouldn't be so many errors in the existing circuit.

It does not fail because of the Nyquist limit, but because the recovery
of a phase shift cannot be done with just two clocks per bit.
It may not technically be the Nyquist limit, but like so many things in
nature the same relations are repeated.

and if you take NRZ you'll notice that the highest "frequency" (0101010101..)
is only half of the data rate

-Lasse
 
On 7/31/13 9:36 AM, RCIngham wrote:
Unless 'length' is limited, your worst case has header "0000001111111111"
(with an extra bit stuffed) followed by 16 * 1023 = 16368 zeros, which will
have 2728 ones stuffed into them. Total line packet length is 19113
symbols. If the clocks are within 1/19114 of each other, the same number of
symbols will be received as sent, ASSUMING no jitter. You can't assume
that, but if there is 'not much' jitter then perhaps 1/100k will be good
enough for relative drift to not need to be corrected for.

So, for version 1, use the 'sync' to establish the start of frame and the
sampling point, simulate the 'Rx fast' and 'Rx slow' cases in parallel, and
see whether it works.

BTW, this is off-topic for C.A.F., as it is a system design problem not
related to the implementation method.
Since you can resynchronize your sampling clock on each transition
received, you only need to "hold lock" for the maximum time between
transitions, which is 7 bit times. This would mean that if you have a
nominal 4x clock, some sample points will be only 3 clocks apart (if you
are slow) or some will be 5 clocks apart (if you are fast), while most
will be 4 clocks apart. This is the reason for the one-bit stuffing.
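The resynchronize-on-every-transition scheme described above can be modelled in a few lines (a sketch with an ideal receiver clock; the test pattern is invented):

```python
def oversample_rx(samples, osr=4):
    """Recover bits from an oversampled NRZ stream.

    A modulo-'osr' phase counter free-runs; every input transition snaps
    it back to zero so the next decision point lands mid-bit. A small
    frequency error is tolerated as long as transitions arrive often
    enough (here, at least every 7 bit times thanks to the bit stuffing).
    """
    bits, phase, prev = [], 0, samples[0]
    for s in samples[1:]:
        if s != prev:
            phase = 0                 # edge seen: resynchronize
        else:
            phase = (phase + 1) % osr
        if phase == osr // 2:         # mid-bit decision point
            bits.append(s)
        prev = s
    return bits

# a stuffed-NRZ-like pattern: sync, six zeros, stuffed one, two zeros
tx = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
line = [b for b in tx for _ in range(4)]   # 4 samples per bit
print(oversample_rx(line))                 # recovers tx exactly
```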
 
On 7/31/13 9:36 AM, RCIngham wrote:
[snip]

Unless 'length' is limited, your worst case has header "0000001111111111"
(with an extra bit stuffed) followed by 16 * 1023 = 16368 zeros, which will
have 2728 ones stuffed into them. Total line packet length is 19113
symbols. If the clocks are within 1/19114 of each other, the same number of
symbols will be received as sent, ASSUMING no jitter. You can't assume
that, but if there is 'not much' jitter then perhaps 1/100k will be good
enough for relative drift to not need to be corrected for.

So, for version 1, use the 'sync' to establish the start of frame and the
sampling point, simulate the 'Rx fast' and 'Rx slow' cases in parallel, and
see whether it works.

BTW, this is off-topic for C.A.F., as it is a system design problem not
related to the implementation method.



Since you can resynchronize your sampling clock on each transition
received, you only need to "hold lock" for the maximum time between
transitions, which is 7 bit times. This would mean that if you have a
nominal 4x clock, some sample points will be only 3 clocks apart (if you
are slow) or some will be 5 clocks apart (if you are fast), while most
will be 4 clocks apart. This is the reason for the one-bit stuffing.
The bit-stuffing in long sequences of zeroes is almost certainly there to
facilitate a conventional clock recovery method, which I am proposing not
using PROVIDED THAT the clocks at each end are within a sufficiently tight
tolerance. Detect the ones in the as-sent stream first, then decide which
are due to bit-stuffing, and remove them.

Deciding how tight a tolerance is 'sufficiently tight' is probably
non-trivial, so I won't be doing it for free.


 
On 01/08/2013 11:56, RCIngham wrote:
[]
Since you can resynchronize your sampling clock on each transition
received, you only need to "hold lock" for the maximum time between
transitions, which is 7 bit times. This would mean that if you have a
nominal 4x clock, some sample points will be only 3 clocks apart (if you
are slow) or some will be 5 clocks apart (if you are fast), while most
will be 4 clocks apart. This is the reason for the one-bit stuffing.


The bit-stuffing in long sequences of zeroes is almost certainly there to
facilitate a conventional clock recovery method, which I am proposing not
using PROVIDED THAT the clocks at each end are within a sufficiently tight
tolerance. Detect the ones in the as-sent stream first, then decide which
are due to bit-stuffing, and remove them.
What is the gain of not using 'conventional clock recovery'?
 
On 8/1/2013 8:55 AM, alb wrote:
On 01/08/2013 11:56, RCIngham wrote:
[]
Since you can resynchronize your sampling clock on each transition
received, you only need to "hold lock" for the maximum time between
transitions, which is 7 bit times. This would mean that if you have a
nominal 4x clock, some sample points will be only 3 clocks apart (if you
are slow) or some will be 5 clocks apart (if you are fast), while most
will be 4 clocks apart. This is the reason for the one-bit stuffing.


The bit-stuffing in long sequences of zeroes is almost certainly there to
facilitate a conventional clock recovery method, which I am proposing not
using PROVIDED THAT the clocks at each end are within a sufficiently tight
tolerance. Detect the ones in the as-sent stream first, then decide which
are due to bit-stuffing, and remove them.

What is the gain of not using 'conventional clock recovery'?
I think the point is that if the sequences are short enough that the
available timing tolerance is adequate, then you just don't need to
recover timing from the bit stream.

I've been looking at this, then working on other issues and have lost my
train of thought on this. I believe that a PLL (or DPLL) is not needed
as long as the input can be sampled fast enough and the reference
frequency is matched closely enough. But it is still important to
correct for "phase" as the OP puts it (IIRC) so that you can tell where
the bits are and not sample on transitions, just like a conventional
UART does it. With frequent enough transitions, the phase can be detected
and aligned while the exact frequency does not need to be recovered.

--

Rick
 
On 7/31/2013 5:37 PM, alb wrote:
On 31/07/2013 13:44, rickman wrote:
[]
<nitpick mode on>
Nyquist criterion has nothing to do with being able to sample data. As a
matter of fact your internal clock is perfectly capable of sampling data
flowing in your FPGA without the need to be 2x the data rate.
<nitpick mode off>

I don't know what you are talking about. If you asynchronously sample,
you very much do have to satisfy the Nyquist criterion. A 2x clock,
because it isn't *exactly* 2x, can *not* be used to capture a bitstream
so that you can find the transitions and know which bit is which.

A data stream which is *exactly* flowing at a frequency f can be
*exactly* sampled with a clock frequency f; it happens continuously in
your synchronous logic. What happened to the Nyquist theorem?

If you have a protocol with data and clock, does it mean that you will
recognize only half of the bits because your clock rate is just equal to
your data rate? I'm confused...

IMO calling a signal 'asynchronous' does not make any difference. Mr.
Nyquist referred to reconstructing an analog signal with discrete
sampling (no quantization error involved). How does that apply to
digital transmission?
Yes, you are right about the rates. I was not thinking of this
correctly. The Nyquist theorem looks at *frequency* content which is
not the same as bit rate.


Otherwise there wouldn't be so many errors in the existing circuit.

It does not fail because of the Nyquist limit, but because the recovery
of a phase shift cannot be done with just two clocks per bit.

[]
You can use the FPGA PLL to multiply your clock from 2x to 4x to allow
the DPLL to work correctly.

This is what I meant indeed. I believe I confused DPLL with ADPLL...
I am not familiar with ADPLL. What is that?

--

Rick
 
On 8/1/13 5:56 AM, RCIngham wrote:
On 7/31/13 9:36 AM, RCIngham wrote:
[snip]

Unless 'length' is limited, your worst case has header
"0000001111111111"
(with an extra bit stuffed) followed by 16 * 1023 = 16368 zeros, which
will
have 2728 ones stuffed into them. Total line packet length is 19113
symbols. If the clocks are within 1/19114 of each other, the same number
of
symbols will be received as sent, ASSUMING no jitter. You can't assume
that, but if there is 'not much' jitter then perhaps 1/100k will be
good
enough for relative drift to not need to be corrected for.

So, for version 1, use the 'sync' to establish the start of frame and
the
sampling point, simulate the 'Rx fast' and 'Rx slow' cases in parallel,
and
see whether it works.

BTW, this is off-topic for C.A.F., as it is a system design problem not
related to the implementation method.



Since you can resynchronize your sampling clock on each transition
received, you only need to "hold lock" for the maximum time between
transitions, which is 7 bit times. This would mean that if you have a
nominal 4x clock, some sample points will be only 3 clocks apart (if you
are slow) or some will be 5 clocks apart (if you are fast), while most
will be 4 clocks apart. This is the reason for the one-bit stuffing.


The bit-stuffing in long sequences of zeroes is almost certainly there to
facilitate a conventional clock recovery method, which I am proposing not
using PROVIDED THAT the clocks at each end are within a sufficiently tight
tolerance. Detect the ones in the as-sent stream first, then decide which
are due to bit-stuffing, and remove them.

Deciding how tight a tolerance is 'sufficiently tight' is probably
non-trivial, so I won't be doing it for free.
Since a 4x clock allows for a 25% data period correction, and we will
get an opportunity to do so every 7 data periods, we can tolerate about
a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will
need to know details like jitter and sampling apertures, but this gives
us a good ball-park figure.) Higher sampling rates can about double
this; the key is we need to be able to know which direction the error is
in, so the error must be less than 50% of a data period, including
the variation within a sample clock.

To try to gather the data without resynchronizing VASTLY decreases your
tolerance for clock errors as you need to stay within a clock cycle over
the entire message.

The protocol, with its three-ones preamble, does seem like there may have
been some effort to enable the use of a PLL to generate the data
sampling clock, which may have been the original method. This does have
the advantage that the data clock out of the sampler is more regular (not
having the sudden jumps from the resynchronizing), and getting a
burst of 1s helps the PLL to get a bit more centered on the data. My
experience though is that with FPGAs (as would be on topic for this
group), this sort of PLL synchronism is not normally used, but
oversampling clocks with phase correction is fairly standard.
 
On 02/08/2013 06:19, rickman wrote:
[]
This is what I meant indeed. I believe I confused DPLL with ADPLL...

I am not familiar with ADPLL. What is that?
It is an All Digital PLL:

http://www.aicdesign.org/2003%20PLL%20Slides/L050-ADPLLs-2UP(9_1_03).pdf

all the elements of a PLL are implemented in the digital domain.
 
Hi Lasse,

On 01/08/2013 00:03, langwadt@fonz.dk wrote:
[]
It does not fail because of the Nyquist limit, but because the recovery
of a phase shift cannot be done with just two clocks per bit.


It may not technically be the Nyquist limit, but like so many things in
nature the same relations are repeated.
A signal traveling on a physical channel (be it on a cable, a PCB route,
an FPGA interconnection...) will have sharp transitions at the beginning
of its journey and sloppier ones at the end due to losses, but if you
take a comparator and discriminate a '1' or '0', then you do not 'need'
higher frequencies than half the data rate itself (or symbol rate to be
precise). If you take a sinusoidal waveform and put a threshold at 0,
then you have two symbols per cycle.

Why is sampling at the data rate not sufficient then? Because there are
several other factors. First of all, encoding and decoding are processes
which introduce 'noise' as well as 'limitations'. Having a comparator
to discriminate 0/1 introduces noise in the timing of transitions,
therefore distorting the phase of the signal. The medium itself might be
a source of other jitter since it is sensitive to the environment
(temperature, pressure, humidity, ...).


     TRANSMITTER             MEDIUM              RECEIVER
 +-------------------+                    +-------------------+
 |              +---+|                    |+---+              |
 | '10100101' ->|ENC||----\/\/\/\/\/----->||DEC|-> '10101110' |
 |              +---+|  physical signal   |+---+              |
 +-------------------+                    +-------------------+
           ^                                        ^
           |                                        |
        +-----+                                  +-----+
        | clk |                                  | clk |
        +-----+                                  +-----+

You do not care about reconstructing a physical signal (as in ADC
sampling); you *do* care about reconstructing a data stream.
Another source of trouble is the two clock generators on the TX and
RX. They cannot be assumed to match perfectly, and any difference
will lead to a phase drift which eventually will spoil your data sampling.

and if you take NRZ you'll notice that the highest "frequency"
(0101010101..) is only half of the data rate
That is why a clock frequency equal to the data rate is sufficient to
'sample' the information.

<nitpick mode on>
NRZ is a line code, i.e. a translation of your data stream into an
appropriate physical signal (light, current, sound, ...) for the chosen
physical medium (fiber, cable, air, ...) and has nothing to do with a
toggling bit.
<nitpick mode off>
 
Hi Richard,

On 02/08/2013 06:22, Richard Damon wrote:
[]
The bit-stuffing in long sequences of zeroes is almost certainly there to
facilitate a conventional clock recovery method, which I am proposing not
using PROVIDED THAT the clocks at each end are within a sufficiently tight
tolerance. Detect the ones in the as-sent stream first, then decide which
are due to bit-stuffing, and remove them.

Deciding how tight a tolerance is 'sufficiently tight' is probably
non-trivial, so I won't be doing it for free.



Since a 4x clock allows for a 25% data period correction, and we will
get an opportunity to do so every 7 data periods, we can tolerate about
a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will
need to know details like jitter and sampling apertures, but this gives
us a good ball-park figure.) Higher sampling rates can about double
this; the key is we need to be able to know which direction the error is
in, so the error must be less than 50% of a data period, including
the variation within a sample clock.
According to your math it looks like a 2x clock allows for a 50% data
period correction and therefore a 50/7 ~ 6% error in clock frequency,
which seems to me quite counterintuitive... Am I missing something?

[]
The protocol, with its three-ones preamble, does seem like there may have
been some effort to enable the use of a PLL to generate the data
sampling clock, which may have been the original method. This does have
the advantage that the data clock out of the sampler is more regular (not
having the sudden jumps from the resynchronizing), and getting a
burst of 1s helps the PLL to get a bit more centered on the data. My
experience though is that with FPGAs (as would be on topic for this
group), this sort of PLL synchronism is not normally used, but
oversampling clocks with phase correction is fairly standard.
This is indeed what I'm looking for: oversampling (4x or 8x) and phase
correction.
 
On 8/1/13 5:56 AM, RCIngham wrote:
On 7/31/13 9:36 AM, RCIngham wrote:
[snip]

Unless 'length' is limited, your worst case has header "0000001111111111"
(with an extra bit stuffed) followed by 16 * 1023 = 16368 zeros, which will
have 2728 ones stuffed into them. Total line packet length is 19113
symbols. If the clocks are within 1/19114 of each other, the same number of
symbols will be received as sent, ASSUMING no jitter. You can't assume
that, but if there is 'not much' jitter then perhaps 1/100k will be good
enough for relative drift to not need to be corrected for.

So, for version 1, use the 'sync' to establish the start of frame and the
sampling point, simulate the 'Rx fast' and 'Rx slow' cases in parallel, and
see whether it works.

BTW, this is off-topic for C.A.F., as it is a system design problem not
related to the implementation method.


Since you can resynchronize your sampling clock on each transition
received, you only need to "hold lock" for the maximum time between
transitions, which is 7 bit times. This would mean that if you have a
nominal 4x clock, some sample points will be only 3 clocks apart (if you
are slow) or some will be 5 clocks apart (if you are fast), while most
will be 4 clocks apart. This is the reason for the one-bit stuffing.


The bit-stuffing in long sequences of zeroes is almost certainly there to
facilitate a conventional clock recovery method, which I am proposing not
using PROVIDED THAT the clocks at each end are within a sufficiently tight
tolerance. Detect the ones in the as-sent stream first, then decide which
are due to bit-stuffing, and remove them.

Deciding how tight a tolerance is 'sufficiently tight' is probably
non-trivial, so I won't be doing it for free.



Since a 4x clock allows for a 25% data period correction, and we will
get an opportunity to do so every 7 data periods, we can tolerate about
a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will
need to know details like jitter and sampling apertures, but this gives
us a good ball-park figure.) Higher sampling rates can about double
this; the key is we need to be able to know which direction the error is
in, so the error must be less than 50% of a data period, including
the variation within a sample clock.

To try to gather the data without resynchronizing VASTLY decreases your
tolerance for clock errors as you need to stay within a clock cycle over
the entire message.

The protocol, with its three-ones preamble, does seem like there may have
been some effort to enable the use of a PLL to generate the data
sampling clock, which may have been the original method. This does have
the advantage that the data clock out of the sampler is more regular (not
having the sudden jumps from the resynchronizing), and getting a
burst of 1s helps the PLL to get a bit more centered on the data. My
experience though is that with FPGAs (as would be on topic for this
group), this sort of PLL synchronism is not normally used, but
oversampling clocks with phase correction is fairly standard.
Some form of clock recovery is essential for continuous ('synchronous')
data streams. It is not required for 'sufficiently short' asynchronous data
bursts, the classic example of which is RS-232. What I am suggesting is
that the OP determines - using simulation - whether these frames are too
long given the relative clock tolerances for a system design without clock
recovery.

As I previously noted, this is first a 'system design' problem. Only after
that has been completed does it become an 'FPGA design' problem.


 
On 8/2/2013 3:49 AM, alb wrote:
On 02/08/2013 06:19, rickman wrote:
[]

This is what I meant indeed. I believe I confused DPLL with ADPLL...

I am not familiar with ADPLL. What is that?

It is an All Digital PLL:

http://www.aicdesign.org/2003%20PLL%20Slides/L050-ADPLLs-2UP(9_1_03).pdf

all the elements of a PLL are implemented in the digital domain.
I guess I wasn't aware that a digital PLL wasn't *all* digital. That is
what I have been referring to as digital.

--

Rick
 
On 8/2/2013 6:35 AM, RCIngham wrote:
On 8/1/13 5:56 AM, RCIngham wrote:
On 7/31/13 9:36 AM, RCIngham wrote:
[snip]

Unless 'length' is limited, your worst case has header
"0000001111111111"
(with an extra bit stuffed) followed by 16 * 1023 = 16368 zeros,
which
will
have 2728 ones stuffed into them. Total line packet length is 19113
symbols. If the clocks are within 1/19114 of each other, the same
number
of
symbols will be received as sent, ASSUMING no jitter. You can't
assume
that, but if there is 'not much' jitter then perhaps 1/100k will be
good
enough for relative drift to not need to be corrected for.

So, for version 1, use the 'sync' to establish the start of frame and
the
sampling point, simulate the 'Rx fast' and 'Rx slow' cases in
parallel,
and
see whether it works.

BTW, this is off-topic for C.A.F., as it is a system design problem
not
related to the implementation method.



Since you can resynchronize your sampling clock on each transition
received, you only need to "hold lock" for the maximum time between
transitions, which is 7 bit times. This would mean that if you have a
nominal 4x clock, some sample points will be only 3 clocks apart (if
you
are slow) or some will be 5 clocks apart (if you are fast), while most
will be 4 clocks apart. This is the reason for the one-bit stuffing.


The bit-stuffing in long sequences of zeroes is almost certainly there
to
facilitate a conventional clock recovery method, which I am proposing
not
using PROVIDED THAT the clocks at each end are within a sufficiently
tight
tolerance. Detect the ones in the as-sent stream first, then decide
which
are due to bit-stuffing, and remove them.

Deciding how tight a tolerance is 'sufficiently tight' is probably
non-trivial, so I won't be doing it for free.



Since a 4x clock allows for a 25% data period correction, and we will
get an opportunity to do so every 7 data periods, we can tolerate about
a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will
need to know details like jitter and sampling apertures, but this gives
us a good ball-park figure.) Higher sampling rates can about double
this; the key is we need to be able to know which direction the error is
in, so the error must be less than 50% of a data period, including
the variation within a sample clock.

To try to gather the data without resynchronizing VASTLY decreases your
tolerance for clock errors as you need to stay within a clock cycle over
the entire message.

The protocol, with its three-ones preamble, does seem like there may have
been some effort to enable the use of a PLL to generate the data
sampling clock, which may have been the original method. This does have
the advantage that the data clock out of the sampler is more regular (not
having the sudden jumps from the resynchronizing), and getting a
burst of 1s helps the PLL to get a bit more centered on the data. My
experience though is that with FPGAs (as would be on topic for this
group), this sort of PLL synchronism is not normally used, but
oversampling clocks with phase correction is fairly standard.


Some form of clock recovery is essential for continuous ('synchronous')
data streams. It is not required for 'sufficiently short' asynchronous data
bursts, the classic example of which is RS-232. What I am suggesting is
that the OP determines - using simulation - whether these frames are too
long given the relative clock tolerances for a system design without clock
recovery.

As I previously noted, this is first a 'system design' problem. Only after
that has been completed does it become an 'FPGA design' problem.
I don't think the frame length is the key parameter; rather it is the
six-zeros-then-insert-a-one rule that guarantees a transition every 7 bits.
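The stuffing rule described above (a one inserted after every six zeros, as understood from the OP's description) can be sketched as:

```python
def stuff(bits, run=6):
    """Insert a '1' after every run of 'run' consecutive zeros."""
    out, zeros = [], 0
    for b in bits:
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
        if zeros == run:
            out.append(1)   # stuffed bit: forces a transition
            zeros = 0
    return out

def unstuff(bits, run=6):
    """Drop the stuffed '1' that follows every run of 'run' zeros."""
    out, zeros, skip = [], 0, False
    for b in bits:
        if skip:            # this bit was stuffed by the transmitter
            skip, zeros = False, 0
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
        if zeros == run:
            skip, zeros = True, 0
    return out

# the worst-case frame body from earlier in the thread: 16 * 1023 zeros
payload = [0] * (16 * 1023)
assert len(stuff(payload)) - len(payload) == 2728   # matches the 2728 above
assert unstuff(stuff(payload)) == payload           # lossless round trip
```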

--

Rick
 
On 02/08/2013 16:16, rickman wrote:
[]
It is an All Digital PLL:

http://www.aicdesign.org/2003%20PLL%20Slides/L050-ADPLLs-2UP(9_1_03).pdf

all the elements of a PLL are implemented in the digital domain.

I guess I wasn't aware that a digital PLL wasn't *all* digital. That is
what I have been referring to as digital.
you might find this article interesting:

http://www.silabs.com/Support%20Documents/TechnicalDocs/AN575.pdf
 
On 8/2/13 6:30 AM, alb wrote:
Hi Richard,

On 02/08/2013 06:22, Richard Damon wrote:
[]
The bit-stuffing in long sequences of zeroes is almost certainly there to
facilitate a conventional clock recovery method, which I am proposing not
using PROVIDED THAT the clocks at each end are within a sufficiently tight
tolerance. Detect the ones in the as-sent stream first, then decide which
are due to bit-stuffing, and remove them.

Deciding how tight a tolerance is 'sufficiently tight' is probably
non-trivial, so I won't be doing it for free.



Since a 4x clock allows for a 25% data period correction, and we will
get an opportunity to do so every 7 data periods, we can tolerate about
a 25/7 ~ 3% error in clock frequency. (To get a more exact value we will
need to know details like jitter and sampling apertures, but this gives
us a good ball-park figure.) Higher sampling rates can about double
this; the key is we need to be able to know which direction the error is
in, so the error must be less than 50% of a data period, including
the variation within a sample clock.

According to your math it looks like a 2x clock allows for a 50% data
period correction and therefore a 50/7 ~ 6% error in clock frequency,
which seems to me quite counterintuitive... Am I missing something?
The details are that for an Nx sampling clock, every time you see a
transition you can possibly shift N/2-1 high speed clock cycles per
adjustment. For example, with a 16x clock, you can correct for the edge
being between -7 and +7 sampling clocks from the expected point. If it
is 8 clocks off, you don't know if it should be +8 or -8, so you are in
trouble. If N is odd, you can possibly handle (N-1)/2 cycles. (Note that
this assumes negligible jitter.) So our final allowable shift in data
clocks is (N/2-1)/N, which can also be written as 1/2-1/N, which leads to
my 6% for N large (50% correction) and 3% for N=4. For N=2 this gives us 0%.
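Under these approximations the bound tabulates as follows (a sketch; jitter and sampling-aperture effects ignored, per the caveats above):

```python
# Allowable phase step per edge for an Nx oversampler:
# (N/2 - 1) / N  ==  1/2 - 1/N  of a bit period.
# A transition is guaranteed at least every 7 bits, so the tolerable
# relative frequency error is roughly that fraction divided by 7.
for n in (2, 4, 8, 16):
    frac = 0.5 - 1.0 / n
    print(f"N={n:2d}: {frac:.0%} correction per edge, "
          f"~{frac / 7:.1%} frequency tolerance")
```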

The protocol, with its three-ones preamble, does seem like there may have
been some effort to enable the use of a PLL to generate the data
sampling clock, which may have been the original method. This does have
the advantage that the data clock out of the sampler is more regular (not
having the sudden jumps from the resynchronizing), and getting a
burst of 1s helps the PLL to get a bit more centered on the data. My
experience though is that with FPGAs (as would be on topic for this
group), this sort of PLL synchronism is not normally used, but
oversampling clocks with phase correction is fairly standard.

This is indeed what I'm looking for: oversampling (4x or 8x) and phase
correction.
 
