A Sorting Circuit in Digital Logic Design

J

Johnathan Coll

Guest
Hello;

I am trying to implement a Linear Array Sort that takes in 10 8-bit
inputs serially and outputs the inputs in increasing order. I figured
out that Bubble Sort is the easiest kind of sort that can be
implemented in hardware.

The first step to implement this Linear Array Sorter was to read in the
inputs. I used a Modulo 10 counter to read in the inputs from the user.
Now how can I continue from here? I thought that using a 1-bit
comparator (1-bit slicing) can work. But I then realized that I need
buffers and other counters for further implementaion. Please, any
ideas? I would really appreciate your help.

Thanks.
 
The author of this answer assumes this is not a homework assignment and
thus added comments between the lines.

The first step to implement this Linear Array Sorter was to read in the
inputs. I used a Modulo 10 counter to read in the inputs from the user.
Why do you use a mod-10 counter?

Now how can I continue from here? I thought that using a 1-bit
comparator (1-bit slicing) can work.
1-bit comparator can do 2 different comparisons in the same time, in
one clock cycle.
But you have 10 different values to be compared. How can you implement
this hardware?

But I then realized that I need buffers and other counters for further implementaion.
Please, any ideas? I would really appreciate your help.
You took Bubble Sort as start point. How is the computation complexity
of this algorithm?
Describe the complexity proportional in size of RAM consumption and
time consumption.
Define the worst case and try to implement the hardware that meets this
worst case.

The questions have been asked knowingly to give you another aspect to
think about.

Utku.
 
I am an unemployed VLSI technician. VLSI is not
common in Florida. I was inspired to come up with such a project while
browsing IEEEXPLORE, that's why I did not give you enough details. The
thing is that I first want to implement this Linear Array Sorter using
Altera i.e. as a digital design project and then generate the VLSI
layouts using magic. I just thought that this is an interesting topic
to start in this google group. If you find it hard to help, it is OK. I

am sure that I will eventually figure it out on my own, but I would
love to hear the suggestions of other professionals.

Going into details, let us say that I need to sort 10 8-bit numbers
that are entered serially. I do not want to use FPGA nor RAM nor VHDL.
I just want to start from scratch. Since it is a personal project, I
decided to read in all the 10 inputs from the user serially and save
them in registers. Then, what do you suggest the second step would be?
I usually use 1-bit slicing in all of my designs. It is like my blue
print. I will use 1-bit comparator to compare the Most Significant bits
and then swap, or move on ... !!???? Do you have any other chemes in
mind?


Thanks guys for your patience.
 
Johnathan Coll wrote:
I am an unemployed VLSI technician. VLSI is not
common in Florida. I was inspired to come up with such a project while
browsing IEEEXPLORE, that's why I did not give you enough details. The
thing is that I first want to implement this Linear Array Sorter using
Altera i.e. as a digital design project and then generate the VLSI
layouts using magic. I just thought that this is an interesting topic
to start in this google group. If you find it hard to help, it is OK. I

am sure that I will eventually figure it out on my own, but I would
love to hear the suggestions of other professionals.

Going into details, let us say that I need to sort 10 8-bit numbers
that are entered serially. I do not want to use FPGA nor RAM nor VHDL.
I just want to start from scratch. Since it is a personal project, I
decided to read in all the 10 inputs from the user serially and save
them in registers. Then, what do you suggest the second step would be?
I usually use 1-bit slicing in all of my designs. It is like my blue
print. I will use 1-bit comparator to compare the Most Significant bits
and then swap, or move on ... !!???? Do you have any other chemes in
mind?


Thanks guys for your patience.
Sort methods depend on your desired performance (area, speed,
pipelining) and most of the sorting information you see is based on
computer algorithms that are explicitly serial in operation.

If you bring in 10 values serially and want 10 outputs serially, the
results can be made available once the last value has been clocked in if
you want to dedicate resources. Using a single comparator, the bubble
sort will give you the first value after the 10 values are shifted in
but the next 9 values take another 45 clock cycles to obtain. The
results show up consecutively in each stage suggesting a 10:1 mux
(pipelined or not) would be needed to bring the values out on
consecutive clocks.

To think about how you could increase resources to get your result,
think of the first stage as finding the greatest of 10 numbers. If the
number isn't the greatest shifted in so far, it goes to a second stage
with the task of finding the greatest of 9 numbers using the same
single-comparator scheme. The one-less staging goes on until you have
just 2 values left, one isn't greater than the other.

Binary tree sorts are fun but may not give you advantage in speed or area.

I think there's an "insertion sort" that would assemble the list of
elements as they come in. When you get in a new number that is greater
than one and not the other of two adjacent numbers in the list, the new
number gets inserted between those two values in the list. You'd need
one comparator per stored value making the resources similar to the
cascaded "greatest out of n" stages.

Bitonic sorts are designed to work on pipelined parallel number sets
providing extreme throughput with better overall efficiency in
results/clock for a given area.

With different sort methods considered, how would one effectively
implement a bubble sort with one comparator? The trouble here is that
the 10 values that got shifted in need to be compared as adjacent pairs.
The dual 9:1 mux is a pretty huge use of resources to come into the
comparator; you might get some resource savings over 9 comparators but
not enormous.

Methods like the bubble sort also have a compare and swap or compare and
shift/load/stay structure. The greater challenge in getting a good sort
together is figuring out the cleanest way to implement the basic
structures to obtain your result.

To many new possibilities? Bounding the problem and implementing well
toward your desired performance metrics is the mark of a good engineer.
 
Johnathan Coll wrote:

Going into details, let us say that I need to sort 10 8-bit numbers
that are entered serially. I do not want to use FPGA nor RAM nor VHDL.
In that case, your crosspost to comp.lang.vhdl is baffling.
see http://en.wikipedia.org/wiki/Crosspost

I just want to start from scratch.
If not FPGA, what is your preferred media?

-- Mike Treseler
 
If not FPGA, what is your preferred media?

-- Mike Treseler
My prefered Media is simple Digital Logic i.e. using NAND, NOR gates,
inverters, comparators, counters, buffers etc.... No RAMs or FPGAs or
ROMs or any external media. It is really challenging, right?
Thanks.
 
My prefered Media is simple Digital Logic i.e. using NAND, NOR gates,
inverters, comparators, counters, buffers etc.... No RAMs or FPGAs or
ROMs or any external media. It is really challenging, right?
Yes it is. If you can find discrete components in your area to design
this circuit, then it should not be a problem.

However, buying an FPGA demo board from a vendor is probably much more
practical, in which you can also design in the future hobby circuits
other than sort logic.

Utku.
 
I do not want to use FPGAs! I want to convert this Linear Array Sorter
into VLSI and place it on a single CHIP! so NO FPGAs should be used.
 
On Tue, 02 Jan 2007 18:17:55 +0000, Jonathan Bromley
<jonathan.bromley@MYCOMPANY.com> wrote:


But it
[insertion sort]
does not work well
if you want a sorted list of the 10 most recent
values.
Sorry, that wasn't very clear. What I meant was that
insertion sort doesn't work very well if you have a
CONTINUOUS input stream, and you would like at
any given moment to get a sorted list of the 10 most
recent values from the input stream.

For batch-mode sorting of small lists it's still
pretty useful. I've also used it to keep track
of the N largest values seen in a long
input stream.
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
 
Looks like, from your specs that you can do this with an "array sort".
If all the numbers are distinct:
1. create an array of 2^8 bits(tags).
2. As you read in each number, set the tag for that number to 1.
3. When ready to output the results of the sort,
Scan the tag array linearly, starting at 0, up to 2^8-1,
and place the index of each tag onto your output bus.

If all the numbers are not distinct, you'll need to modify your tags to
hold the count
of the number of times that tag was hit.

alan

Johnathan Coll wrote:
I do not want to use FPGAs! I want to convert this Linear Array Sorter
into VLSI and place it on a single CHIP! so NO FPGAs should be used.
 
On 27 Dec 2006 15:01:19 -0800, "Johnathan Coll"
<johnathan.coll@gmail.com> wrote:

I am trying to implement a Linear Array Sort that takes in 10 8-bit
inputs serially and outputs the inputs in increasing order. I figured
out that Bubble Sort is the easiest kind of sort that can be
implemented in hardware.

The first step to implement this Linear Array Sorter was to read in the
inputs. I used a Modulo 10 counter to read in the inputs from the user.
Now how can I continue from here? I thought that using a 1-bit
comparator (1-bit slicing) can work. But I then realized that I need
buffers and other counters for further implementaion. Please, any
ideas? I would really appreciate your help.
The FPGA-vs-ASIC thing is not really an issue. Whatever logic you
design is likely to map quite easily to either kind of platform.

The really important question is: what are your timing
considerations? I gather that you expect to get a new
data value on each clock cycle, but when you've received
your ten data values, how long are you prepared to wait
before starting to stream out the sorted array? And while
you are doing the sort, are there more, new values
appearing on the input; or is it a batch-mode sort, where
you get 10 values, sort them, spit out the results, and
then start all over again? I can imagine a system with
an 8-bit input that gets a new value on every clock, and
ten 8-bit outputs that continuously present the most
recent 10 inputs, sorted into order. To do that I'd
probably build a bitonic sorter, which would require
31 compare/swap modules, each of which is an 8-bit
magnitude comparator and two 8-bit multiplexers.

At the other extreme, if you have lots of time to
perform the calculation, it may be conceptually easier
to implement a small soft-core CPU in your ASIC, and write
software to do the sort.

You've heard here about insertion, bitonic and array sort;
there is also radix sort, selection sort... it's a long list,
sincesorting is a key operation. Many of the standard
texts describe sorting primarily from a software
perspective, where you can't exploit hardware
parallelism. Nevertheless, you surely should take
a look at Knuth's superb "Sorting and Searching"
book even though a lot of the implementation
details look quite dated today.

Just about the only thing I can tell you for sure is that
bubble sort is likely to be a bad choice. It's messy to
do in hardware, slow in software, and for only 10
numbers it optimises all the wrong things.

If your numbers are arriving sequentially in a batch,
an insertion sort is almost certainly the easiest to
implement and it represents a good compromise
between size and performance - your array of 10
numbers is sorted as soon as the last number has
been captured from the input. Its size scales
linearly with the number of values you are sorting,
and its speed is essentially independent of the
number of values - each new value is sorted
as soon as it arrives. But it does not work well
if you want a sorted list of the 10 most recent
values.

Here's the insertion sort algorithm in words.
Build an array of 10 sorting registers - I'll
describe their behaviour in a moment. Lay
them out in a linear array; the register at
the left will ultimately hold the biggest
value, the register at the right the smallest.

Here's the behaviour of each register on
each clock tick:

if (incoming number is smaller than me)
hold my current value;
else if (incoming number is smaller
than my neighbour to the left)
update my value from the incoming number;
else
update my value from my neighbour to the left.

Note that the incoming value must be distributed
to all ten registers. I've left it up to you to worry
about how to initialise the whole mess before
the first value appears, and what to do at the
two ends where "my neighbour to the left" may
not be quite self-evident.

The net result of this algorithm is that each
new value squeezes itself into the array at the
right position, and all smaller values shift one
place to the right to make space for the new one.

Enjoy.
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
 

Welcome to EDABoard.com

Sponsor

Back
Top