Ray Andraka
Guest
The problem with the multiplier block approach is that the
construction is predicated on the specific coefficients. As
a result it is considerably harder to use for an arbitrary
set of coefficients. It may reduce area over a straight FIR
filter running at the same clocks per sample, but at a
considerable cost in design time and flexibility. You also
give up regularity in the structure, which may reduce the
overall performance. Essentially, both the multiplier block
and distributed arithmetic approaches are rearrangements
of the bitwise product terms. The multiplier block takes
advantage of duplicate terms by adding the inputs before
they are multiplied by the term.
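
To make the "rearrangement of the bitwise product terms" point concrete for distributed arithmetic, here is a rough software model. It is only a sketch under stated assumptions: the name da_dot, the unsigned inputs, and the one-bit-per-step loop are chosen for illustration and do not come from any particular tool or library.

```python
def da_dot(coeffs, xs, nbits):
    """Dot product sum(c * x) for unsigned nbits-wide inputs, DA style.

    Hypothetical helper for illustration; real DA hardware would also
    handle signed inputs and split large tables.
    """
    k = len(coeffs)
    # lut[addr] holds the sum of the coefficients whose address bit is set.
    lut = [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
           for addr in range(1 << k)]
    acc = 0
    for b in range(nbits):
        # Gather bit b of every input into one table address.
        addr = 0
        for i, x in enumerate(xs):
            addr |= ((x >> b) & 1) << i
        # One table lookup and one shifted add per input bit.
        acc += lut[addr] << b
    return acc
```

The same partial products are summed as in a multiply-accumulate, just grouped by input bit position instead of by coefficient, so the table contents still depend on the specific coefficients.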
Michael Spencer wrote:
Hello,
Has anyone compared FPGA implementations of full-rate
digital FIR filters based on the use of Multiplier Blocks
vs. traditional FIRs with constant coefficient
multipliers? By full rate, I mean: one output result per
clock cycle and no interpolation or decimation.
For anyone not familiar, a multiplier block is a network
of shifters and adders that performs multiplications by
several coefficients efficiently by exploiting common
sub-expressions. The multiplier block can be exploited in
FIR filters by transposing the standard filter so that the
products of all the coefficients with the current
input-sample are required simultaneously.
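
As a rough illustration of the idea, the sketch below models a transposed-form FIR whose products all come from one small shift-and-add network. The coefficient set (7, 21, 35, 21, 7) and the names multiplier_block and transposed_fir are made up for this example; they are not taken from the paper or from any library.

```python
COEFFS = (7, 21, 35, 21, 7)  # hypothetical symmetric coefficient set

def multiplier_block(x):
    """Return 7x, 21x and 35x using only shifts and three adds/subtracts."""
    x7 = (x << 3) - x       # 8x - x
    x21 = (x7 << 1) + x7    # 14x + 7x, reuses the shared 7x term
    x35 = (x7 << 2) + x7    # 28x + 7x, reuses the shared 7x term
    return {7: x7, 21: x21, 35: x35}

def transposed_fir(samples):
    """Transposed direct-form FIR: every tap uses the current input sample."""
    n = len(COEFFS)
    regs = [0] * (n - 1)    # registers between the tap adders
    out = []
    for x in samples:
        prods = multiplier_block(x)
        # The block is hard-wired to COEFFS; other values would need a new network.
        p = [prods[c] for c in COEFFS]
        out.append(p[0] + regs[0])
        # Update in increasing order so regs[i + 1] is still the old value.
        for i in range(n - 2):
            regs[i] = p[i + 1] + regs[i + 1]
        regs[n - 2] = p[n - 1]
    return out
```

Here three adders cover all five taps because 21x and 35x are both built from the shared 7x term, and the symmetric coefficients reuse the same products.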
Also, by representing the coefficients in the
Canonical-Signed-Digit number system (a small number of
+1s and -1s) along with common sub-expression sharing, the
multiplier block can get even smaller.
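
For readers who have not met CSD before, a minimal conversion sketch follows. The function names to_csd and csd_value are invented for this example, and the routine is only checked here for non-negative integers.

```python
def to_csd(n):
    """Return the CSD digits (-1, 0, +1) of n, least significant first."""
    digits = []
    while n != 0:
        if n & 1:
            # Pick +1 when n mod 4 == 1 and -1 when n mod 4 == 3,
            # which forces the next digit to be zero.
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits

def csd_value(digits):
    """Rebuild the integer from CSD digits, least significant first."""
    return sum(d << i for i, d in enumerate(digits))
```

For instance, to_csd(7) gives [-1, 0, 0, 1], that is 7 = 8 - 1, so a multiply by 7 needs one subtract rather than the two adds implied by the binary form 111.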
For example, the multiplier block for a 100-tap FIR filter
(fp=0.10 and fs=0.12) can be realized with only 61 adds
(zero explicit multiplications). See filter example #4 in
"FIR Filter Synthesis Algorithms for Minimizing the Delay
and the Number of Adders":
http://ics.kaist.ac.kr/~dk/papers/TCAD2001.pdf
If the adder depth is constrained to a maximum of four,
then the authors' algorithm can do the multiplier block in
69 additions.
It would seem that this approach would be very efficient
in a target such as the Xilinx Spartan-IIE (with no
dedicated multipliers).
Another question: If we only need one result per K clock
periods (K ~= 1000 for audio applications), could a
multiplier block approach realized with, say, bit-serial
addition be more efficient than some other approach such
as distributed arithmetic?
Comments welcome. Thanks.
-Michael
______________________
Michael E. Spencer, Ph.D.
President
Signal Processing Solutions, Inc.
Web: http://www.spsolutions.com
--
Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com
"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759