balancing IIR filter (after adding extra registers)

zak

Guest
I designed a low-pass IIR filter in a Stratix IV but I have a speed problem. I
need to run it at 245 MHz but can only achieve about 180. I was advised by
experts to insert extra registers, and this improved speed, but the output of
the filter went wrong.

I was advised to balance the filter since I inserted extra registers. But
how?

I did some modeling and realized, to my surprise, that it seems just not
possible to balance any IIR filter (though it is possible with an FIR filter).

Does anybody have any idea about balancing IIR filters? The difficulty is in
the feedback terms.

The filter I am using is Y[n] = (1-alpha)*X[n] + alpha*Y[n-1]

Thanks in advance



---------------------------------------
Posted through http://www.FPGARelated.com
 
On Thu, 12 Jan 2012 06:49:42 -0600, zak wrote:

(snip)

There have to be books on this...

What's holding up the train? The addition? The multiplication? The
logic in between?

I don't do FPGA design anywhere near full time -- does the Stratix IV
have hardware multiply? Hardware add? Perhaps even hardware MAC? If it
has a hardware multiply-and-add, then you need to make sure you're using
it efficiently.

If all else fails and you just have to put in delays, then all is not
lost (presuming that you can stand some delay in the output). You're
designing a pretty elementary low-pass filter, so the first thing you can
do is just see what happens when you stick some extra delay in there.

Let

y_n = a^2 * y_{n-2} + (1 - a^2) * x_{n-1}

This should be easier to realize than your difference equation.

Now perform a z-transform on this (see:
http://www.wescottdesign.com/articles/zTransform/z-transforms.html, and
please forgive any broken links, &c):

Y(z) = z^-2 * a^2 * Y(z) + z^-1 * (1 - a^2) * X(z)

and solve for the transfer function:

H(z) = Y(z) / X(z) = (1 - a^2) z / (z^2 - a^2) = (1 - a^2) z / ((z - a)(z + a))

If you limit |a| < 1, then H(z) is stable, (and an unstable system is the
first "wrong" that you might encounter) but while it has a generally low-
pass character up to Fs/4 (Fs = sampling frequency), the response rises
after that back to unity -- and that's bad.

If you doctor this up a bit with a felicitously placed zero, then you can
get

H(z) = Y(z) / X(z) = 0.5 (1 - a^2) (z + 1) / ((z - a)(z + a))

There are a number of ways that you can achieve this, but your result is
going to be a filter with unity gain at DC (good), the same general
transfer function as your example difference equation (good), except at
Fs/2 where the response will be zero (better than yours), and --
hopefully -- the extra delay in the difference equation will be enough to
let you pipeline your math enough to realize this thing and get the speed
you need.
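
As a quick numerical check of the three responses discussed above, here is a minimal Python sketch. The value a = 0.9 and the function names are assumptions chosen for illustration only; the second and third functions are the two transfer functions given above, and the first is the original one-pole filter.

```python
import cmath
import math

def h_original(z, a):
    # y_n = (1 - a) * x_n + a * y_{n-1}
    return (1 - a) * z / (z - a)

def h_delayed(z, a):
    # y_n = a^2 * y_{n-2} + (1 - a^2) * x_{n-1}
    return (1 - a**2) * z / (z**2 - a**2)

def h_with_zero(z, a):
    # Same poles as h_delayed, plus a zero at z = -1 (that is, at Fs/2)
    return 0.5 * (1 - a**2) * (z + 1) / (z**2 - a**2)

def gain(h, f_over_fs, a=0.9):
    # Magnitude of h on the unit circle; frequency given as a fraction of Fs
    z = cmath.exp(2j * math.pi * f_over_fs)
    return abs(h(z, a))

for name, h in [("original", h_original),
                ("delayed", h_delayed),
                ("with zero", h_with_zero)]:
    # Gains at DC, Fs/4 and Fs/2
    print(name, [round(gain(h, f), 4) for f in (0.0, 0.25, 0.5)])
```

All three should show unity gain at DC; the delayed version climbs back to unity at Fs/2, and the version with the added zero goes to zero there.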

--
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?

Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com
 
"zak" <kazimayob2@n_o_s_p_a_m.aol.com> wrote in message
news:Ne-dndbs46R7S5PSnZ2dnUVZ_s2dnZ2d@giganews.com...
(snip)

My guess is you need to add some registers on your inputs and outputs (and
add latency). I guess this implementation gets put into a DSP core within a
tiny area, and the I/Os have to run a long distance before getting there or
getting out. The feedback has dedicated routing inside a DSP and should be
very fast. Do you know whether this gets implemented in a DSP, or does the tool
try to build it with gates?

To really be able to help, I would like to see the source, the timing report,
and details about the inputs and outputs of this IIR (for example, are they
I/O pins?).
 
"zak" <kazimayob2@n_o_s_p_a_m.aol.com> wrote:

(snip)

You can't use many registers, and just adding registers will make
routing worse, not better.

Your filter seems to consist of 2 multipliers and an adder. The first
optimisation you can do is using one's complement instead of two's
complement. When using one's complement you don't need to sign extend
the multiplicands. In Xilinx FPGAs the multipliers get faster when
you use fewer bits.

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
nico@nctdevpuntnl (punt=.)
--------------------------------------------------------------
 
Thanks all for the replies.

My main concern was not the timing per se, as I may eventually get over it,
but specifically "Can we balance a given IIR filter" if we have to add
extra registers?

In my simple filter design there is a chain of [a subtractor => a multiplier =>
a subtractor] without any register in between. Obviously this causes long
paths and needs to be broken by registers according to RTL methodology.

I understand Tim is suggesting redesigning the IIR with inherent registers in
it. It is an interesting idea and I managed to verify that the suggested final
filter is better than mine, but I believe it will still have some long
paths.

Regards

Zak

 
On Sat, 14 Jan 2012 15:49:06 -0600, zak wrote:

(snip)

Actually what I was suggesting was a difference equation that you might
be able to realize with a structure that has more pipelining, not
something that you would attempt to implement directly.

Pipelining is for you to do -- I'm just being the math egghead.

Whether just one clock worth of delay is going to be enough to do all the
pipelining you need -- I dunno.

OTOH, the math itself imposes no limit to the amount of delay you can
have in the filter -- you can have three, four, or 1000 clocks worth.
But each delay you add puts a null in the response and increases the
overall delay of the filter; at some point the null will encroach on your
desired response and that would be a Bad Thing.

The difference equation is easy:

y_n = d^N * y_{n-N} + ((1 - d^N) / N) * sum_{k=0}^{N-1} x_{n-k}

This gives you the transfer function

H(z) = ((1 - d^N) / N) * z * (z^(N-1) + ... + z + 1) / (z^N - d^N),

the denominator is basically the same old difference equation, only with
as much delay as you need for pipelining. The numerator describes a CIC
filter, which is the Easiest FIR of All.

Presumably, in order to pipeline this effectively you'd have to add an
additional N counts of delay -- at some point the filter output is going
to be useless to you just because of delay, if nothing else.
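
To make the general-N difference equation concrete, here is a minimal behavioural sketch in Python. It models only the math, not the pipelining; the function name `delayed_lowpass` and the values d = 0.9, N = 4 are assumptions for illustration.

```python
def delayed_lowpass(x, d, N):
    """y_n = d^N * y_{n-N} + ((1 - d^N) / N) * sum_{k=0}^{N-1} x_{n-k},
    with zero initial state."""
    fb = d ** N              # feedback coefficient, applied N samples back
    ff = (1 - fb) / N        # feed-forward gain on the N-sample boxcar sum
    y = []
    for n in range(len(x)):
        boxcar = sum(x[n - k] for k in range(N) if n - k >= 0)
        y_old = y[n - N] if n >= N else 0.0
        y.append(fb * y_old + ff * boxcar)
    return y

# Step response: should settle close to 1.0 (unity gain at DC)
step = [1.0] * 400
print(round(delayed_lowpass(step, d=0.9, N=4)[-1], 6))

# A tone at Fs/4 sits on a null of the 4-sample boxcar, so it dies away
tone = [1.0, 0.0, -1.0, 0.0] * 100
print(abs(delayed_lowpass(tone, d=0.9, N=4)[-1]) < 1e-9)
```

The feedback only ever reaches back N samples, which is where the room for N pipeline stages comes from.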

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
 
On Thu, 12 Jan 2012 06:49:42 -0600, zak wrote:

(snip)

I got another thought.

What frequency are you filtering _to_? Why are you using an IIR at all?
If you are filtering heavily enough you should be able to prefilter with
a CIC, decimate, and run your IIR filter (if you still need it) at a
lower rate. Would that meet your requirements?

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
 
Tim Wescott <tim@seemywebsite.please> wrote:

(snip)

The way I usually think about this, partly because of the way the ones
I work on are used, is that with added pipelining you can run
interleaved data streams. This is now popularly known as simultaneous
multithreading: instead of processing one data stream faster, you process
many data streams at about the same speed. (Once in a while, remind
the marketing department of the difference. Too often they quote
the faster speed without qualification.)

-- glen
 
Let me explain myself further.

A filter (FIR or IIR) obviously has its own terms (the z terms of the transfer
function), which are implemented as registers, as you know (let us call them
term registers).

On the other hand, device speed may require its own registers (I call them
pipeline registers).

I am not worried about input-to-output delay (let it be tens of clock
periods), i.e. I can insert registers at the input and output freely.
But inside the filter stages I need to take care to keep the filter transfer
function accurate. For an FIR, computations are forward, and the rule I found is
that if I need to delay any FIR term I should delay all its other terms equally.
For an IIR filter, there are both forward and feedback computations. I can delay
the forward terms equally and the result stays correct up to its end, but I
cannot do that for the feedback terms.

Example: suppose y(n) = a*x(n) + b*y(n-1)

Obviously this means current output = a*current input + b*previous output.
Suppose I wanted to use a structure that ended up with no register between the
result of b*y(n-1) and the adder. So I decided to add a pipeline register. This
implies that I am adding b*y(n-2), which can be correct if I add it to
a*x(n-1). So I delayed the x input, and this makes the adder input a*x(n-1).
But this also means the feedback term now becomes b*y(n-3), naturally.

Am I missing the obvious?

Zak





 
On Sun, 15 Jan 2012 07:40:54 -0600, zak wrote:

(snip)

Let us say that you want to do an operation

a = b * c + d,

and that this operation can only be done with three stages of pipeline
delay, such that a is good at the beginning of the 3rd clock after you
start:

a_n = b * c_{n-3} + d

(Let's also say that you're doing this in a true pipeline, so that a_3,
a_4, ... are all good assuming that c_0, c_1, ... are good)

Let c_{n-3} = a_{n-3}, which we can do by definition because a is good at
the beginning of the 3rd clock.

Now we have

a_n = b * a_{n-3} + d

Do I need to continue, or is it all obvious?
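
The argument above can be sketched clock by clock in Python. This is a behavioural model under stated assumptions, not RTL: the function name `pipelined_loop`, the zero reset state, and the use of a `deque` to stand in for the pipeline registers are all illustrative choices.

```python
from collections import deque

def pipelined_loop(b, d, latency, n_clocks):
    """Model of a = b*c + d where the result appears `latency` clocks after
    c is presented, with the output wired straight back to c."""
    # Results still in flight, oldest first; all registers reset to zero
    pipe = deque([0.0] * latency, maxlen=latency)
    history = []
    for _ in range(n_clocks):
        a = pipe[0]                # result leaving the pipeline on this clock
        history.append(a)
        pipe.append(b * a + d)     # fed back; emerges `latency` clocks later
    return history

hist = pipelined_loop(b=0.5, d=1.0, latency=3, n_clocks=30)
# Every output obeys a_n = b * a_{n-3} + d once the pipeline has filled
print(all(abs(hist[n] - (0.5 * hist[n - 3] + 1.0)) < 1e-12 for n in range(3, 30)))
```

So the pipelined loop is a perfectly good recursion; it is just a recursion on a_{n-3} rather than a_{n-1}.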

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
 
On Jan 12, 12:49 pm, "zak" <kazimayob2@n_o_s_p_a_m.aol.com> wrote:
(snip)

If you can design the module so it processes a specified (or
parameterised) number of channels it is fairly straightforward. (You
can design it this way and then simply use only one channel). Say
your "processing path" i.e. the x.k + y.(1-k) calculation has a
pipeline delay of 3 clock cycles overall, then you create a pipeline
delay from output back to the input of 3 - 3 = 0 clock cycles (i.e.
direct connection). However if you have number_of_channels set to say
16, then this pipeline delay would be 16-3 = 13 clock cycles long.

That way, the previous output for each channel lines up with the
current input for that same channel. The number of channels has to be
a minimum of this pipeline delay for it to work.

If you add a clock enable to the whole shebang, then you can enable
the logic for the specified number of channels at the start of the new
sample frame. At the next sample frame, repeat and the pipelining
takes care of itself. Everything inside the module must be controlled
by the clock enable though.
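
A minimal Python sketch of the multi-channel scheme described above, checked against an unpipelined per-channel model. The names `interleaved_iir` and `reference`, the test data, and k = 0.25 are assumptions for illustration; the 16 channels and 3-clock processing path are the figures from the text.

```python
from collections import deque

def interleaved_iir(samples, k, n_channels, latency):
    """Time-multiplexed y = x*k + y_prev*(1 - k) for interleaved channels.

    samples arrive as ch0, ch1, ..., ch(C-1), ch0, ...; the arithmetic is
    modelled as a pipeline `latency` clocks long (latency >= 1).
    """
    if latency < 1 or n_channels < latency:
        raise ValueError("need latency >= 1 and n_channels >= latency")
    pipe = deque([0.0] * latency)               # arithmetic pipeline registers
    fb = deque([0.0] * (n_channels - latency))  # balancing delay back to input
    out = []
    for x in samples:
        result = pipe.popleft()      # value leaving the arithmetic pipeline
        fb.append(result)
        y_prev = fb.popleft()        # this channel's previous output
        pipe.append(x * k + y_prev * (1 - k))
        out.append(result)           # output lags the input by `latency` clocks
    return out

def reference(samples, k, n_channels):
    """Unpipelined model: one independent one-pole filter per channel."""
    state = [0.0] * n_channels
    out = []
    for t, x in enumerate(samples):
        c = t % n_channels
        state[c] = x * k + state[c] * (1 - k)
        out.append(state[c])
    return out

xs = [float((7 * t) % 5) for t in range(64)]
piped = interleaved_iir(xs, k=0.25, n_channels=16, latency=3)
print(all(abs(p - r) < 1e-12 for p, r in zip(piped[3:], reference(xs, 0.25, 16))))
```

With `n_channels` equal to `latency` the balancing delay is empty, which corresponds to the direct connection mentioned above.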
 
Thanks Dave,

That is indeed a solution to inserting extra registers, provided of course
one can afford extra slots, or just use a clock enable and survive higher
clock rates (exploiting multicycle constraints).

In principle I believe you agreed with me on the nature of the problem.

Zak




 
On Jan 16, 9:19 pm, "zak" <kazimayob2@n_o_s_p_a_m.n_o_s_p_a_m.aol.com>
wrote:
(snip)

Yes I agree. Here's a potential solution for you using the principle
of superposition:

1. Implement a number of the multi-channel filter module instances I
suggested in parallel (assuming you have enough logic resource). e.g.
Let's say 4:
2. Feed instance 0 with sample 0, 4, 8... and zero values for
1,2,3,5,6,7,9,10,11 etc
3. Feed instance 1 with sample 1,5,9... and zero values for
0,2,3,4,6,7,8,10 etc
4. Feed instance 2 with sample 2,6,10... and zero values for
0,1,3,4,5,7,8,9,11 etc
5. Feed instance 3 with sample 3,7,11... and zero values for
0,1,2,4,5,6,8,9,10 etc

and so on - you would need a mux to select the real sample or zero
value for the input to each filter module.

Sum the results together (you might have to pipeline this some more to
achieve the required clock rate).

The result will be identical to a single instance processing all the
samples but with higher raw performance.
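
The superposition part of this claim is easy to check numerically. The sketch below is only a linearity check in Python, with assumed names (`one_pole`, `split_and_sum`) and an assumed k = 0.1; each instance here still runs the ordinary sample-by-sample recursion, so it says nothing about whether the split buys any clock speed.

```python
def one_pole(x, k):
    """y = k*x + (1 - k)*y_prev, zero initial state."""
    y, out = 0.0, []
    for s in x:
        y = k * s + (1 - k) * y
        out.append(y)
    return out

def split_and_sum(x, k, n_instances):
    """Feed instance i the real sample when n % n_instances == i and zero
    otherwise, then add the instance outputs together."""
    total = [0.0] * len(x)
    for i in range(n_instances):
        stuffed = [s if n % n_instances == i else 0.0 for n, s in enumerate(x)]
        for n, v in enumerate(one_pole(stuffed, k)):
            total[n] += v
    return total

x = [float((3 * n) % 7) for n in range(50)]
single = one_pole(x, 0.1)
summed = split_and_sum(x, 0.1, 4)
print(max(abs(a - b) for a, b in zip(single, summed)) < 1e-9)
```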
 
On Jan 16, 9:19 pm, "zak" <kazimayob2@n_o_s_p_a_m.n_o_s_p_a_m.aol.com>
wrote:
(snip)

Yes. I think you may be stuffed. I can't help thinking there's a way
utilising parallel instances of the multi-channel module I mentioned
and the principle of superposition but whichever way I look at it, it
doesn't seem to work out. I think you need faster logic or clever
calculation. Maybe splitting x.k+(1-k)y into y+x.k-y.k might help.
Also, using convenient powers of 2 for k will help since these can be
done using a simple part select, in Verilog speak. Basically it all needs to
be done within one clock cycle. That is asking a lot at those speeds.
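
The rearrangement y + x.k - y.k is the same as y + k.(x - y), so with k = 2^-shift the multiplier reduces to a subtract, a shift and an add. A minimal integer sketch in Python follows; the function name `one_pole_shift` and the test values are assumptions, and the `>>` here stands in for the part select.

```python
def one_pole_shift(samples, shift):
    """Integer one-pole low-pass with k = 2**-shift: y += (x - y) >> shift."""
    y = 0
    out = []
    for x in samples:
        # Python's >> on integers is an arithmetic (flooring) shift
        y += (x - y) >> shift
        out.append(y)
    return out

out = one_pole_shift([1000] * 200, shift=3)
print(out[:5], out[-1])
```

One thing to watch in this form: because the shift truncates, a rising step stalls slightly short of its target once x - y drops below 2**shift, so extra fractional bits in y may be needed if that matters.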
 
On Jan 12, 6:49 am, "zak" <kazimayob2@n_o_s_p_a_m.aol.com> wrote:
(snip)

Classic problem: need for speed

* Add registers: to reduce the delay due to your logic (some are better
than others)

* Manual routing ... yeah: man vs. machine

* Split the process: if we can't do it at 245 MHz we will do it at
245/2 MHz
 
On Wed, 18 Jan 2012 07:16:23 -0800, Mawa_fugo wrote:

(snip)

The OP understands that. The complication is that he's trying to
implement an IIR filter, which in its simplest form requires that you use
data that depends on the state of a computation that is only one clock
old. Taken at face value, this means that you can't use pipelining.

I have offered him a couple of different suggestions on how to get around
this problem, but he does not seem interested in them. I'm not sure if
he is failing to understand that they are valid solutions, or if he is
just more comfortable moaning about the problem being "impossible" rather
than being willing to address it as difficult.

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
 
(Hi Tim,)
sorry I must admit I waded straight in myself without reading all of
the thread however:

Let c_{n-3} = a_{n-3}, which we can do by definition because a is good at
the beginning of the 3rd clock.
I'm not sure I follow this. Please continue.

As we have one sample per clock cycle on a single channel stream,
x{0}, x{1}, x{2} etc. what we want is:

y{0} = A * x{-1} + B * y{-1}

But the best we can do at the desired clock rate (assuming we have a
pipeline delay of 3) is:

y{0} = A * x{-3} + B * y{-3}

Is this doable?
 
On Jan 18, 6:49 pm, davew <david.wo...@gmail.com> wrote:
(snip)

Or to put it in words, if it takes 3 clocks to produce an output y
from x, but you need y on the next clock cycle to combine with the
next value of x then you simply can't have it, or can you?
 
On Wed, 18 Jan 2012 10:53:09 -0800, davew wrote:

(snip)

Pretty much.

So I was trying to elucidate what you _can_ do if you have some imposed
delay, which is to make a stable IIR filter that happens to work for
minimum delays that are greater than 1.

There are limits, the two chief ones being bandwidth and delay. Your
answer is going to have some pretty healthy delays both because of the
computation of the feedback portion of the filter (which is limited to
delays of N or more) and the feed-forward part of your filter (which
needs to have at least N terms if you're going to get a sensible
frequency response). The bandwidth (as a proportion of your sample
rate) that you can sensibly "ask for" gets ever narrower as your delays
get greater (really, it's probably better to say that your bandwidth in
real terms reaches a plateau above which it's hard to get).

In amongst all the math don't miss the point that I also made, that you
can pre-filter with something easy like a CIC, then decimate, then either
be done or follow that with an IIR filter.

--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
 
Tim Wescott wrote:
(snip)

From:

y{0} = A * x{-3} + B * y{-3}

A point that should not be missed is that you really have three
interleaved IIR filters each running at 1/3 the sample rate. If
the signal bandwidth is high enough, there will be a significant
output component at 1/3 the sample rate due to this. As a worst
case, an input with a lot of energy at 1/3 the sample rate would
be almost unfiltered.
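
This is easy to confirm numerically. A minimal Python sketch, with A = 0.1 and B = 0.9 assumed purely for illustration (chosen so the DC gain A / (1 - B) is one):

```python
import cmath
import math

def gain_delay3(f_over_fs, A, B):
    """|H| for y_n = A * x_{n-3} + B * y_{n-3}; frequency as a fraction of Fs."""
    z = cmath.exp(2j * math.pi * f_over_fs)
    return abs(A * z**-3 / (1 - B * z**-3))

A, B = 0.1, 0.9
for label, f in (("DC", 0.0), ("Fs/6", 1 / 6), ("Fs/3", 1 / 3)):
    print(label, round(gain_delay3(f, A, B), 4))
```

Since z^3 = 1 at Fs/3, the gain there equals the DC gain, while the deepest attenuation falls at Fs/6.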

-- Gabor
 
