Dear List,
I am trying to implement a 16-tap FIR low-pass filter and have written
the convolution in VHDL (in which I am a beginner). The input sequence
'x' is a 1-bit sequence of 1's and 0's. It is to be converted to 1's
and -1's and convolved with the impulse response 'h'. My goal is for
the convolution portion of the filter to be completely asynchronous
(not clock driven) and parallel. That is, on each clock cycle, 16 bits
of the input sequence are convolved with the impulse response to give
a single 12-bit output. Each element of the impulse response 'h' is a
10-bit signed integer. The input sequence 'x' is a known sequence, and
I am sure the output sequence 'y' will always fit into 12 bits.
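To make the arithmetic concrete, the operation described above can be written out as below. The subscripted symbols x_k and h_k are just shorthand introduced here for bit k of 'x' and coefficient k of 'h'; the numbers come from the coefficient list in the code that follows. The bound shows why the 12-bit claim depends on 'x' being a known sequence rather than holding for every possible input.

```latex
y = \sum_{k=0}^{15} h_k \,(2x_k - 1), \qquad x_k \in \{0, 1\}

\sum_{k=0}^{15} h_k = 1776
\quad\Rightarrow\quad y = 1776 \text{ when all } x_k = 1,
\qquad y = -1776 \text{ when all } x_k = 0

|y| \le \sum_{k=0}^{15} |h_k| = 2\,(4 + 2 + 28 + 53 + 17 + 128 + 345 + 511) = 2176 > 2047
```

So an arbitrary 16-bit pattern could in principle exceed the signed 12-bit range of -2048 to 2047; the all-ones and all-zeros patterns do not.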
Here is the code:
-- 16-tap FIR Low-Pass Filter Convolution Function
--
-- When convolved with the code it will produce a maximum value that
-- will fit into 12 bits
library ieee;
use ieee.std_logic_1164.all;
-- numeric_std only: std_logic_arith declares its own conflicting 'signed'
use ieee.numeric_std.all;
entity fir_lpf_conv is
  port (
    x: in std_logic_vector(15 downto 0);
    y: out std_logic_vector(11 downto 0)
  );
end fir_lpf_conv;

architecture fir_lpf_conv_arch of fir_lpf_conv is
  type coef_type is array(0 to 15) of integer range -511 to 511;
  constant h: coef_type :=
    (4,-2,-28,-53,-17,128,345,511,511,345,128,-17,-53,-28,-2,4);
  -- Start at 0: the default (-511 per element) would make the first
  -- evaluation of 'sum' fall outside its range in simulation
  signal mult: coef_type := (others => 0);
  signal sum: integer range -2047 to 2047;
begin
  -- One add/subtract selector per tap: +h(i) for a '1', -h(i) otherwise
  blabla: for i in x'range generate
    mult(i) <= h(i) when x(i)='1' else -h(i);
  end generate;

  sum <= mult(0) + mult(1) + mult(2) + mult(3) + mult(4) + mult(5)
       + mult(6) + mult(7) + mult(8) + mult(9) + mult(10) + mult(11)
       + mult(12) + mult(13) + mult(14) + mult(15);

  y <= std_logic_vector(to_signed(sum,12));
end fir_lpf_conv_arch;
I haven't simulated it yet, but I have a sneaking feeling it will not
do what I expect. Even if it does do what I want, I'd like to
understand why.
The code should multiply each h element by the corresponding x element
(with zeros converted to -1s) in parallel, AND THEN sum the results
into 'sum', AND THEN put the 'sum' result into 'y'. My use of AND THEN
in that statement makes me think I need sequential code. That is, the
multiply should be done in parallel and the sum should be done in
parallel, but the sum should use the results of the multiply. However,
whenever I look at sequential code it is clock or event driven, and I
don't think that's what I need. All of this should be done in less
than 1/2 clock cycle.
I could see the compiler synthesizing the above code in two different
ways:
1. Multiply in parallel AND THEN add the results in parallel. (this
would be good)
2. Multiply in parallel and add in parallel. The parallel sum will use
the previous values stored in 'mult', and possibly some updated values
in 'mult' depending on the exact timing. (this would be bad)
So my question is: if the code is correct, then what is the rule for
synthesis? How does the compiler know that I want 'AND THEN' behavior?
If the code is incorrect, what do I write to get 'AND THEN' behavior
that is not clock driven?
I also have a couple of less important questions:
1. Is there a better way to write my sum using a for loop? I couldn't
get it to compile.
2. I really don't need the intermediate signal 'sum'. I'd like to just
sum into 'y', but I get a type error because the synthesizer doesn't
know whether the stuff on the right is signed or unsigned.
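For reference, here is a minimal sketch of what a loop-based version without the intermediate signals might look like. It is not the code above: the names fir_lpf_conv_loop, rtl, conv and acc are hypothetical, and it assumes only numeric_std is used. The process has a sensitivity list but no clock, so it describes combinational logic. A variable takes its new value immediately, so each loop iteration sees the running total. The accumulator range of +/-2176 is the sum of the absolute coefficient values; the final conversion still assumes the result fits into 12 bits.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity fir_lpf_conv_loop is  -- hypothetical name, same ports as above
  port (
    x : in  std_logic_vector(15 downto 0);
    y : out std_logic_vector(11 downto 0)
  );
end entity fir_lpf_conv_loop;

architecture rtl of fir_lpf_conv_loop is
  type coef_type is array (0 to 15) of integer range -511 to 511;
  constant h : coef_type :=
    (4, -2, -28, -53, -17, 128, 345, 511, 511, 345, 128, -17, -53, -28, -2, 4);
begin
  -- Combinational process: re-evaluated whenever x changes, no clock.
  conv : process (x)
    -- 2176 = sum of |h(i)|, so no partial sum can leave this range.
    variable acc : integer range -2176 to 2176;
  begin
    acc := 0;
    for i in h'range loop
      if x(i) = '1' then
        acc := acc + h(i);  -- input bit 1 counts as +1
      else
        acc := acc - h(i);  -- input bit 0 counts as -1
      end if;
    end loop;
    -- Assumes the known input sequence keeps acc within 12 signed bits.
    y <= std_logic_vector(to_signed(acc, 12));
  end process conv;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity fir_lpf_conv_loop_tb is  -- hypothetical testbench
end entity fir_lpf_conv_loop_tb;

architecture sim of fir_lpf_conv_loop_tb is
  signal x : std_logic_vector(15 downto 0) := (others => '0');
  signal y : std_logic_vector(11 downto 0);
begin
  dut : entity work.fir_lpf_conv_loop
    port map (x => x, y => y);

  check : process
  begin
    x <= x"0000";  -- every tap negated: -(sum of h) = -1776
    wait for 10 ns;
    assert to_integer(signed(y)) = -1776
      report "all-zeros case failed" severity error;

    x <= x"0180";  -- only taps 7 and 8 added: 2*(511 + 511) - 1776 = 268
    wait for 10 ns;
    assert to_integer(signed(y)) = 268
      report "0x0180 case failed" severity error;

    wait;  -- stop the process
  end process check;
end architecture sim;
```

The small testbench only checks two input patterns whose results can be worked out by hand from the coefficient list; it says nothing about timing after synthesis.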
Thank You!
Brian