Pseudorandom Hashing

T

Tim Wescott

Guest
I am having trouble coming up with the right keywords to do a web
search, so help me out here:

There is a technique where, to significantly reduce the probability of
getting a long string of zeros, a message is run through a CRC
generator, and the output bits are taken off. The transmitted message
is thoroughly hashed, yet it is a simple matter of a shift register and
some XOR gates to decode the message on the other end.

I thought I knew how to do this, yet in trying to actually make it work
I find that over half of my brain cells appear to be attending a
management seminar.

So, know where I can find out how to do this right? "pseudorandom" and
"hash" get me tons of cryptography, but not what I'm looking for.

Thanks.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
 
On Thu, 23 Sep 2004 14:40:28 -0700, Tim Wescott
<tim@wescottnospamdesign.com> wrote:

I am having trouble coming up with the right keywords to do a web
search, so help me out here:

There is a technique where, to significantly reduce the probability of
getting a long string of zeros, a message is run through a CRC
generator, and the output bits are taken off. The transmitted message
is thoroughly hashed, yet it is a simple matter of a shift register and
some XOR gates to decode the message on the other end.

I thought I knew how to do this, yet in trying to actually make it work
I find that over half of my brain cells appear to be attending a
management seminar.

So, know where I can find out how to do this right? "pseudorandom" and
"hash" get me tons of cryptography, but not what I'm looking for.

Thanks.

It's called data scrambling. Modems sometimes use it to make sure
there are enough data transitions to keep the receive end in sync.

Try "modem data scrambling" or something.

There are also n/m codes, where each group of n data bits is replaced
with a mapped group of m bits, where m = n+1, often. This is also a
variation on RLL (run length limiting) coding, I think.

John
 
On Thu, 23 Sep 2004 14:40:28 -0700, Tim Wescott
<tim@wescottnospamdesign.com> wrote:

I am having trouble coming up with the right keywords to do a web
search, so help me out here:

There is a technique where, to significantly reduce the probability of
getting a long string of zeros, a message is run through a CRC
generator, and the output bits are taken off. The transmitted message
is thoroughly hashed, yet it is a simple matter of a shift register and
some XOR gates to decode the message on the other end.

I thought I knew how to do this, yet in trying to actually make it work
I find that over half of my brain cells appear to be attending a
management seminar.

So, know where I can find out how to do this right? "pseudorandom" and
"hash" get me tons of cryptography, but not what I'm looking for.

---
Is this what you had in mind?

A-----------+
+------------------------Y EXOR |
| B--+ |
+--A | |
EXOR Y--[D Q]---[D Q]---[D Q]-+-[D Q]--+--->DATA OUT
DATA IN>-- B

Each [D Q] is a stage of a shift register, and as each new data bit is
presented to the first EXOR, it's EXORed with the EXOR of older bits
which have propagated down the chain and been EXORed with each other.

The result is that the DATA OUT stream of bits is scrambled and bears
no resemblance, on the surface, to DATA IN.

In order to recover DATA IN, the DATA OUT stream is presented as DATA
IN to an identical shift register / EXOR circuit and, if they're both
synced up (the circuits) what comes out of the receiver will be
exactly what went into of the transmitter.

Depending on what goes into the transmitter input, there may be long
strings of ones or zeroes which come out of the output. This can be a
bad thing for synchronous data modems which use data transitions to
keep their their 'dot clocks' in sync with incoming data, so what's
done to circumvent that is that a counter is placed in the transmitter
circuitry which counts ones and/or zeroes, and if a certain number of
them occur, in a row, then a one or a zero is stuffed into the data
stream (or a bit is flipped) to break up the sequence.

--
John Fields
 
Tim Wescott wrote:

[...]

There is a technique where, to significantly reduce the probability of
getting a long string of zeros, a message is run through a CRC
generator, and the output bits are taken off. The transmitted message
is thoroughly hashed, yet it is a simple matter of a shift register and
some XOR gates to decode the message on the other end.

I thought I knew how to do this, yet in trying to actually make it work
I find that over half of my brain cells appear to be attending a
management seminar.

So, know where I can find out how to do this right? "pseudorandom" and
"hash" get me tons of cryptography, but not what I'm looking for.

Thanks.

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Tim,

It's called scrambling - but are you sure you want to do it? Recall the
probability of a given bit pattern in a random sequence is 1/2^n, where n
is the length of the pattern.

So in any random group of 8 bits, you have a 1/256 chance of finding the
bit pattern 0000000. If this matches or partially matches a similar
pattern in the data, the output will be a long string of zeros. If this
causes a system failure, it may occur fairly often.

Many encoding schemes have no dc component and can tolerate long strings
of ones or zeros. The penalty is low efficiency. Other methods limit the
length of the runs, for example the RLL codes that were used in hard disk
drives. Optical data transmission uses this technique, and I believe
Ethernet.

One method that achieves high bit efficiency and has no dc component is
the old MFM, or Miller code that was used on hard disk drives prior to
RLL. It is very simple to encode and decode, and has twice the efficiency
of Manchester encoding.

So if the data absolutely has got to get there, any of the RLL encoding
methods may be a better choice than data scrambling.

Mike
 
On Fri, 24 Sep 2004 01:41:38 GMT, Rich Grise <null@example.net> wrote:

And the other question is, how do you sync-up the TX and RX anyway?
If I had to take a WAG, I'd think you'd maybe skip one clock pulse
from time to time until sync characters start coming out, but I'd
be surprised if someone hasn't already come up with something more
sophisticated.
Here are a few ways of obtaining sync:
http://groups.google.com/groups?threadm=37fdb8dd.26812738%40newshost.fujitsu.com.au

Regards,
Allan
 
On Fri, 24 Sep 2004 01:41:38 GMT, Rich Grise <null@example.net> wrote:

And the other question is, how do you sync-up the TX and RX anyway?
If I had to take a WAG, I'd think you'd maybe skip one clock pulse
from time to time until sync characters start coming out, but I'd
be surprised if someone hasn't already come up with something more
sophisticated.
Here are a few ways of obtaining sync:
http://groups.google.com/groups?threadm=37fdb8dd.26812738%40newshost.fujitsu.com.au

Regards,
Allan
 
In article <10l6gmbq4c3ngff@corp.supernews.com>,
Tim Wescott <tim@wescottnospamdesign.com> wrote:
[...]
There is a technique where, to significantly reduce the probability of
getting a long string of zeros, a message is run through a CRC
generator, and the output bits are taken off. The transmitted message
is thoroughly hashed, yet it is a simple matter of a shift register and
some XOR gates to decode the message on the other end.
Another thing to look up is the GCR method that used to be used on tape
decks. It worked to reduce the number of edges per byte and thus limit
the high frequencies. IIRC: the math for developing the codes for GCR is
very like the math for ensuring transitions happen every so often.

--
--
kensmith@rahul.net forging knowledge
 
On Thu, 23 Sep 2004 14:40:28 -0700, Tim Wescott
<tim@wescottnospamdesign.com> wrote:

I am having trouble coming up with the right keywords to do a web
search, so help me out here:

There is a technique where, to significantly reduce the probability of
getting a long string of zeros, a message is run through a CRC
generator, and the output bits are taken off. The transmitted message
is thoroughly hashed, yet it is a simple matter of a shift register and
some XOR gates to decode the message on the other end.

I thought I knew how to do this, yet in trying to actually make it work
I find that over half of my brain cells appear to be attending a
management seminar.

So, know where I can find out how to do this right? "pseudorandom" and
"hash" get me tons of cryptography, but not what I'm looking for.

Thanks.

Tim Wescott
Hi Tim

just a quick guess, but ya' might
search on "linear recursive sequences".

Ya' know. Dr. Mike (Rosing) will know
how to help you.
(Haven't heard much from Dr. Mike recently here.)

Good Luck,
[-Rick-]
 
On Fri, 24 Sep 2004 01:41:38 GMT, Rich Grise <null@example.net> wrote:


And the other question is,
---
What was the first question?
---

how do you sync-up the TX and RX anyway?
If I had to take a WAG, I'd think you'd maybe skip one clock pulse
from time to time until sync characters start coming out, but I'd
be surprised if someone hasn't already come up with something more
sophisticated.
---
If TX and RX clock sync is adequate at the beginning of transmission
of synchronous data, it's only necessary to to have the TX in a known
state when it starts accepting outgoing data and to have the RX in the
same state when it starts accepting incoming data. That can be as
simple as clearing all the stages of both the TX and RX registers
before data starts being accepted, and transmitted, by the TX. In
actuality, there's usually quite an involved procedure which takes
place involving sending a known pattern (a 'preamble') to the
receiver. The preamble precedes actual data, and since the receiver
knows what the contents of the preamble are supposed to be, it uses
that data to set itself up until the incoming data from the preamble
is, bit for bit, what it predicts it will be. Once that happens, the
receiver will be ready to receive real data and will be waiting for
the last edge from the preamble, which will signal that all the
following bits will be data.

--
John Fields
 
Tim Wescott wrote:

I am having trouble coming up with the right keywords to do a web
search, so help me out here:

There is a technique where, to significantly reduce the probability of
getting a long string of zeros, a message is run through a CRC
generator, and the output bits are taken off. The transmitted message
is thoroughly hashed, yet it is a simple matter of a shift register and
some XOR gates to decode the message on the other end.

I thought I knew how to do this, yet in trying to actually make it work
I find that over half of my brain cells appear to be attending a
management seminar.

So, know where I can find out how to do this right? "pseudorandom" and
"hash" get me tons of cryptography, but not what I'm looking for.

Thanks.

Rather than answering all the threads with overlapping information and
advise:

1. Thanks. There's useful information here.
2. I wish that I _could_ avoid doing it, but I'm trying to
reverse-engineer a digital link on a proprietary phone system (to
improve your quality of service this call may be monitored), so I'm stuck.
3. I know it's generated by a 7-bit recursive linear shift register because
a. When the phone's on hook it repeats once every 127 times.
b. If I xor three bytes of message to get a fourth, and xor the
three corresponding following bytes I get the corresponding fourth.
4. I think I'm getting the state of the shift register + one bit, after
the "real" message has been fed into it.
5. I don't know the generator polynomial.
6. I don't know what to do with it if I had the frigging thing (but
your answers are getting me there).

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
 
John Larkin wrote:

On Thu, 23 Sep 2004 14:40:28 -0700, Tim Wescott
tim@wescottnospamdesign.com> wrote:


I am having trouble coming up with the right keywords to do a web
search, so help me out here:

There is a technique where, to significantly reduce the probability of
getting a long string of zeros, a message is run through a CRC
generator, and the output bits are taken off. The transmitted message
is thoroughly hashed, yet it is a simple matter of a shift register and
some XOR gates to decode the message on the other end.

I thought I knew how to do this, yet in trying to actually make it work
I find that over half of my brain cells appear to be attending a
management seminar.

So, know where I can find out how to do this right? "pseudorandom" and
"hash" get me tons of cryptography, but not what I'm looking for.

Thanks.



It's called data scrambling. Modems sometimes use it to make sure
there are enough data transitions to keep the receive end in sync.

Try "modem data scrambling" or something.

There are also n/m codes, where each group of n data bits is replaced
with a mapped group of m bits, where m = n+1, often. This is also a
variation on RLL (run length limiting) coding, I think.

John

Boy, I should have known that -- thanks, I'll look.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
 
On Fri, 24 Sep 2004 14:23:56 -0700, Tim Wescott
<tim@wescottnospamdesign.com> wrote:



1. Thanks. There's useful information here.
2. I wish that I _could_ avoid doing it, but I'm trying to
reverse-engineer a digital link on a proprietary phone system (to
improve your quality of service this call may be monitored), so I'm stuck.
3. I know it's generated by a 7-bit recursive linear shift register because
a. When the phone's on hook it repeats once every 127 times.
b. If I xor three bytes of message to get a fourth, and xor the
three corresponding following bytes I get the corresponding fourth.
4. I think I'm getting the state of the shift register + one bit, after
the "real" message has been fed into it.
5. I don't know the generator polynomial.
6. I don't know what to do with it if I had the frigging thing (but
your answers are getting me there).
---
What is it you're trying to do, exactly?

--
John Fields
 
"Tim Wescott" schrieb
There is a technique where, to significantly reduce the
probability of getting a long string of zeros, a message
is run through a CRC generator,


It's called data scrambling. Modems sometimes use it to
make sure there are enough data transitions to keep the
receive end in sync.


Boy, I should have known that -- thanks, I'll look.
To my beginner's eye,
http://www.analog.com/Analog_Root/static/library/dspManuals/pdf/2100
Chapter_2_part1.pdf
is a good introduction. Of course, geared towards Analog's
2100 DSP family, bu with just enough theoretical background.

HTH
Martin
 
On Friday 24 September 2004 05:22 pm, Tim Wescott did deign to grace us with
the following:

John Fields wrote:
On Fri, 24 Sep 2004 14:23:56 -0700, Tim Wescott
tim@wescottnospamdesign.com> wrote:




1. Thanks. There's useful information here.
2. I wish that I _could_ avoid doing it, but I'm trying to
reverse-engineer a digital link on a proprietary phone system (to
improve your quality of service this call may be monitored), so I'm
stuck.
3. I know it's generated by a 7-bit recursive linear shift register
because
a. When the phone's on hook it repeats once every 127 times.
b. If I xor three bytes of message to get a fourth, and xor the
three corresponding following bytes I get the corresponding fourth.
4. I think I'm getting the state of the shift register + one bit, after
the "real" message has been fed into it.
5. I don't know the generator polynomial.
6. I don't know what to do with it if I had the frigging thing (but
your answers are getting me there).


---
What is it you're trying to do, exactly?

Build a call center recorder for digital phones. It needs to be able to
tap the lines and record the incoming & outgoing audio. It basically
consists of a bit receiver, packet parser, some glue logic and a big
serial interface to a DSP which does mu-law expanding & compressing.
The DSP then sends the audio out on a PCI bus to be recorded on disk.

I'm adding a new brand of PBX interface to the thing and it appears that
the audio data is scrambled -- I don't get a sensible signal when I'm
hearing dial tone, and the data is scrambled in the patterns described
above when there is "silence" on the line.

What is it you mean by "don't get a sensible signal when ... dial tone"?

Is it not in the same format as the pattern described above? Or is it
just a scrambled dialtone? I'd think that'd give you another baseline
point to help zero in on your algorithm.

Another thing I remember, I'm pretty sure it was in Don Lancaster's
TTL Cookbook, was a chart with some repetition cycle "times" for
various feedback patterns, i.e. picking off different bits from
the shift register(s) to xor together will give you different
scrambling patterns.

Sounds like a way fun project! Need a subcontractor? :)

Good Luck!
Rich
 
On Thu, 23 Sep 2004 18:39:10 -0500, John Fields
<jfields@austininstruments.com> wrote:

On Thu, 23 Sep 2004 14:40:28 -0700, Tim Wescott
tim@wescottnospamdesign.com> wrote:

I am having trouble coming up with the right keywords to do a web
search, so help me out here:

There is a technique where, to significantly reduce the probability of
getting a long string of zeros, a message is run through a CRC
generator, and the output bits are taken off. The transmitted message
is thoroughly hashed, yet it is a simple matter of a shift register and
some XOR gates to decode the message on the other end.

I thought I knew how to do this, yet in trying to actually make it work
I find that over half of my brain cells appear to be attending a
management seminar.

So, know where I can find out how to do this right? "pseudorandom" and
"hash" get me tons of cryptography, but not what I'm looking for.


---
Is this what you had in mind?

A-----------+
+------------------------Y EXOR |
| B--+ |
+--A | |
EXOR Y--[D Q]---[D Q]---[D Q]-+-[D Q]--+--->DATA OUT
DATA IN>-- B

Each [D Q] is a stage of a shift register, and as each new data bit is
presented to the first EXOR, it's EXORed with the EXOR of older bits
which have propagated down the chain and been EXORed with each other.

The result is that the DATA OUT stream of bits is scrambled and bears
no resemblance, on the surface, to DATA IN.

In order to recover DATA IN, the DATA OUT stream is presented as DATA
IN to an identical shift register / EXOR circuit and, if they're both
synced up (the circuits) what comes out of the receiver will be
exactly what went into of the transmitter.

Depending on what goes into the transmitter input, there may be long
strings of ones or zeroes which come out of the output. This can be a
bad thing for synchronous data modems which use data transitions to
keep their their 'dot clocks' in sync with incoming data, so what's
done to circumvent that is that a counter is placed in the transmitter
circuitry which counts ones and/or zeroes, and if a certain number of
them occur, in a row, then a one or a zero is stuffed into the data
stream (or a bit is flipped) to break up the sequence.
I posted...

Newsgroups: alt.binaries.schematics.electronic
Subject: Pseudorandom Hashing (S.E.D) - NRZ-AddTransitions.pdf
Message-ID: <9mo6l0tpq9jn3q07fsrfpkd1lbot33330i@4ax.com>

which is INCORRECT. John has it right... that's what I was trying to
remember from eons ago.

...Jim Thompson
--
| James E.Thompson, P.E. | mens |
| Analog Innovations, Inc. | et |
| Analog/Mixed-Signal ASIC's and Discrete Systems | manus |
| Phoenix, Arizona Voice:(480)460-2350 | |
| E-mail Address at Website Fax:(480)460-2142 | Brass Rat |
| http://www.analog-innovations.com | 1962 |

I love to cook with wine. Sometimes I even put it in the food.
 
On Thursday 23 September 2004 04:39 pm, John Fields did deign to grace us
with the following:

On Thu, 23 Sep 2004 14:40:28 -0700, Tim Wescott

So, know where I can find out how to do this right? "pseudorandom" and
"hash" get me tons of cryptography, but not what I'm looking for.
Is this what you had in mind?

A-----------+
+------------------------Y EXOR |
| B--+ |
+--A | |
EXOR Y--[D Q]---[D Q]---[D Q]-+-[D Q]--+--->DATA OUT
DATA IN>-- B

Depending on what goes into the transmitter input, there may be long
strings of ones or zeroes which come out of the output. This can be a
bad thing for synchronous data modems which use data transitions to
keep their their 'dot clocks' in sync with incoming data, so what's
done to circumvent that is that a counter is placed in the transmitter
circuitry which counts ones and/or zeroes, and if a certain number of
them occur, in a row, then a one or a zero is stuffed into the data
stream (or a bit is flipped) to break up the sequence.
I did a bunch of these on protoboard, right out of Don Lancaster's
TTL Cookbook, about a thousand years ago. ;-) Just want to mention
a couple of things ... well, one is moot for a data stream (unless,
as noted, there's a long string of 1s or 0s), but for a free-running
PRNG, you have to account for the "dead" state. I think that if it
somehow accidentally becomes all 0's or all 1's, I think which is
which is determined by how many xors or something, it gets stuck.
So I had like an 8-input AND or something to kick it out of that.

And the other question is, how do you sync-up the TX and RX anyway?
If I had to take a WAG, I'd think you'd maybe skip one clock pulse
from time to time until sync characters start coming out, but I'd
be surprised if someone hasn't already come up with something more
sophisticated.

Thanks!
Rich
 

Welcome to EDABoard.com

Sponsor

Back
Top