J
Josh Model
Guest
Hi all,
Just wanted to delve into the groups accumulated knowledge. I've got an app
where I must find the index of the maximum of 40 or so (unsorted, of course)
20-bit numbers. --the specifics aren't that important, just helps me think.
It's pretty time critical, with throughput more important than latency,
unless latency becomes absurd, and not area critical.
The "brute force" solution of cycling through the 40 numbers and doing a
compare-store on each is pretty slow for FPGA implementation, and cascading
comparators is pretty ugly.
I did some googling and literature searching and came up with plenty of
CS-style results, but very few hardware results.
This was one...
http://www.fpgacpu.org/usenet/max.html
which was seemed the FPGA equivalent of a very clever IEE paper entitled
"Efficient Parallel Pipelineable VLSI architechture for finding the maximum
binary number" by F. Daneshgaran and K. Yao. which solved the problem using
tristates.
The bitwise parallel-max approach is pretty attactive and reduces the clock
cycles to the bitwidth, e.g. 20 (though I guess you could look at m-bits at
a time, but it becomes more complex) but I think it will get somewhat hung
up on the "check for no candidates" logic.
I was wondering if there were any other "clever" approaches around to this
sort of thing from an FPGA synthesis viewpoint? I'll be the first to admit
to it being a while since I've looked at an algorithms book, so feel free to
club me over the head with any idiocies.
--Josh Model
Asst. Technical Staff
MIT Lincoln Laboratory
Just wanted to delve into the groups accumulated knowledge. I've got an app
where I must find the index of the maximum of 40 or so (unsorted, of course)
20-bit numbers. --the specifics aren't that important, just helps me think.
It's pretty time critical, with throughput more important than latency,
unless latency becomes absurd, and not area critical.
The "brute force" solution of cycling through the 40 numbers and doing a
compare-store on each is pretty slow for FPGA implementation, and cascading
comparators is pretty ugly.
I did some googling and literature searching and came up with plenty of
CS-style results, but very few hardware results.
This was one...
http://www.fpgacpu.org/usenet/max.html
which was seemed the FPGA equivalent of a very clever IEE paper entitled
"Efficient Parallel Pipelineable VLSI architechture for finding the maximum
binary number" by F. Daneshgaran and K. Yao. which solved the problem using
tristates.
The bitwise parallel-max approach is pretty attactive and reduces the clock
cycles to the bitwidth, e.g. 20 (though I guess you could look at m-bits at
a time, but it becomes more complex) but I think it will get somewhat hung
up on the "check for no candidates" logic.
I was wondering if there were any other "clever" approaches around to this
sort of thing from an FPGA synthesis viewpoint? I'll be the first to admit
to it being a while since I've looked at an algorithms book, so feel free to
club me over the head with any idiocies.
--Josh Model
Asst. Technical Staff
MIT Lincoln Laboratory