Ray Andraka
Guest
SRL16's are fine for muxes that are relatively static. I wouldn't use them in
a barrel shift for floating point normalize or denormalize, though, because it
takes 16 clocks to change the selection. If you have that much time between
samples and are that concerned about area, you should be doing bit or digit
serial logic. As long as you keep the floating point reasonable, the barrel
shifts are not too bad. Reasonable means maybe a 12-16 bit mantissa and a few
bits of exponent. In most cases, you don't need full IEEE floating point.
Another trick I use frequently is to do a series of operations treating the
mantissa as fixed point, renormalizing once after the series instead of after
each operation. This saves quite a bit of the area and latency that would
otherwise be needed for de/re-normalizing. Block floating point works well in
situations where the dynamic range within a block of data is small but you
don't know a priori what the scaling will be. This is often used in FFTs.
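As a rough illustration of the "renormalize once after the series" trick, here is a small behavioral sketch in Python (a software model, not HDL). The 12-bit unsigned mantissa, the function names, and the choice to align everything to the smallest exponent are assumptions made for the example, not details from any particular design.

```python
MANT_BITS = 12  # assumed mantissa width (unsigned, to keep the sketch short)

def normalize(mant, exp):
    """Shift mant until its top set bit sits at bit MANT_BITS-1; adjust exp."""
    if mant == 0:
        return 0, 0
    while mant >= (1 << MANT_BITS):        # too wide: shift right (truncates)
        mant >>= 1
        exp += 1
    while mant < (1 << (MANT_BITS - 1)):   # leading zeros: shift left
        mant <<= 1
        exp -= 1
    return mant, exp

def sum_deferred(values):
    """Sum (mant, exp) pairs: align once, add as fixed point, normalize once."""
    common = min(exp for _, exp in values)           # one shared scale
    acc = sum(mant << (exp - common) for mant, exp in values)
    return normalize(acc, common)                    # the only normalize step

# 2048*2**0 + 3072*2**-1 + 2048*2**-2 = 4096 = 2048 * 2**1
print(sum_deferred([(2048, 0), (3072, -1), (2048, -2)]))  # prints (2048, 1)
```

The accumulator here grows as wide as it needs to, because Python integers are unbounded. In hardware that growth has to be budgeted as extra accumulator bits, which is the cost traded against the normalizers that are skipped.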
Glen Herrmannsfeldt wrote:
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com
"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
"Ray Andraka" <ray@andraka.com> wrote in message
news:3F579ECC.D5C92BF@andraka.com...
Actually, the Xilinx structure can make a very efficient crossbar. One way is
to do a partial reconfiguration to switch the crossbar connections, in which
case it uses mostly just the routing resources, not CLBs. If partial
reconfiguration is not your cup of tea, you can make efficient 4:1 muxes using
SRL16's. These take 16 clocks to reroute and require a simple loader, which
can be shared among many bits, but they are compact and fast.
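A behavioral sketch in Python of how an SRL16 can act as a 4:1 mux may help here. It assumes the usual description of the primitive: a 16-bit shift register whose output is the bit picked by a 4-bit address, with address 0 selecting the most recently shifted-in bit. The four data signals drive the address pins, and the stored 16-bit pattern decides which of them reaches the output. The class and function names are invented for the illustration.

```python
class SRL16:
    """Behavioral model of a 16-bit addressable shift register (assumed)."""
    def __init__(self):
        self.bits = [0] * 16        # bits[0] is the most recently shifted-in bit

    def clock(self, d):
        """One clock with shifting enabled: d enters at position 0."""
        self.bits = [d & 1] + self.bits[:15]

    def read(self, addr):
        """Combinational output: the bit selected by the 4-bit address."""
        return self.bits[addr & 0xF]

def select_pattern(k):
    """Truth table that passes address bit k through: table[a] = bit k of a."""
    return [(a >> k) & 1 for a in range(16)]

def load(srl, table):
    """Shift in table[15] first so that table[a] ends up at position a."""
    for a in reversed(range(16)):
        srl.clock(table[a])
    return 16                       # clocks spent on the reroute

def mux_out(srl, inputs):
    """Put the four data inputs on the address pins and read the output."""
    addr = sum((bit & 1) << i for i, bit in enumerate(inputs))
    return srl.read(addr)

srl = SRL16()
clocks = load(srl, select_pattern(2))        # route data input 2 to the output
print(clocks, mux_out(srl, [0, 1, 1, 0]))    # prints "16 1"
```

Rerouting means calling `load` again with a different pattern, which is where the 16 clocks go. The loader only generates one of four fixed patterns, which is why it can be shared among many bits.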
A subject that comes up reasonably often is doing floating point arithmetic
in FPGA's, for example as a systolic array. The prenormalization and
postnormalization for floating point add/subtract, done with barrel shifters
in CLB's, is so big that it is just about impractical. I was considering the
crossbar switch as an array of muxes, which would also be huge.
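For a sense of where the area goes, one common way to build a barrel shifter is as log2(width) stages of 2:1 muxes, each stage shifting by a power of two. The Python sketch below models a postnormalize step that way and counts the 2:1 muxes. The 16-bit width and the function names are assumptions for the example, and the mux count is a property of this model, not a CLB figure for any device.

```python
def leading_zeros(x, width):
    """Count the zero bits above the most significant 1 of a width-bit value."""
    n = 0
    for i in reversed(range(width)):
        if (x >> i) & 1:
            break
        n += 1
    return n

def barrel_shift_left(x, amount, width):
    """Left shift built from stages that shift by 1, 2, 4, ... bits."""
    mask = (1 << width) - 1
    muxes = 0
    stage = 0
    while (1 << stage) < width:
        if (amount >> stage) & 1:            # this stage's select line
            x = (x << (1 << stage)) & mask
        muxes += width                       # one 2:1 mux per bit per stage
        stage += 1
    return x, muxes

def postnormalize(mant, exp, width=16):
    """Shift out leading zeros and lower the exponent by the same amount."""
    if mant == 0:
        return 0, 0
    shift = leading_zeros(mant, width)
    mant, _ = barrel_shift_left(mant, shift, width)
    return mant, exp - shift

print(barrel_shift_left(1, 15, 16))          # prints (32768, 64)
print(postnormalize(0x0012, 0))              # prints (36864, -11)
```

The count grows as width times log2(width), so trimming the mantissa from IEEE widths down to 12-16 bits shrinks both factors.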
I do believe that reconfiguration is too slow for floating point
normalization, but maybe the SRL16's would work.
There is something called block floating point (I have never used it) where
a whole array shares one characteristic (exponent) but has a different
mantissa for each element. (Apparently very useful for some algorithms.) In
that case, the 16 clocks to load the SRL16's might be fast enough for a whole
array of numbers. Post normalization could still be a problem, though.
-- glen
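A minimal Python sketch of the block floating point idea discussed above: one exponent is chosen for the whole block so that the largest sample just fits the mantissa, and every element is scaled by that same amount. The 12-bit signed mantissa, the integer input samples, and the function names are assumptions for the illustration. It does not handle an empty block.

```python
def to_block_fp(samples, mant_bits=12):
    """Quantize integer samples to signed mantissas sharing one exponent."""
    peak = max(abs(s) for s in samples)
    limit = (1 << (mant_bits - 1)) - 1       # largest positive mantissa
    exp = 0
    while (peak >> exp) > limit:             # scale until the peak fits
        exp += 1
    mants = [s >> exp for s in samples]      # arithmetic shift, rounds down
    return mants, exp

def from_block_fp(mants, exp):
    """Rebuild approximate samples from the mantissas and shared exponent."""
    return [m << exp for m in mants]

mants, exp = to_block_fp([100000, -50000, 3, 0])
print(mants, exp)                  # prints [1562, -782, 0, 0] 6
print(from_block_fp(mants, exp))   # prints [99968, -50048, 0, 0]
```

The small sample (3) vanishes because the shared exponent is set by the largest element, which is why the scheme suits blocks whose dynamic range is small.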