Ways to create a variable multi-tap delay line; and if/gener

Marty Ryba

Guest
Hi gang,

I had some interesting results over the last couple of weeks that got me
thinking there may be even more ways to attack this problem. Picture one
signal coming in (one bit for now, but maybe more later). I want copies of
this signal at various delays from "now". I have a related question
regarding the (less used) if/generate syntax: can this not be in a process?
I tried to have a for...generate around a process with one little localized
if/generate to handle the "first buffer" case in the second approach.
Modelsim (5.8d PE) didn't like it, so I coded the first buffer separately.

The first delay line method creates one buffer, one input address, and an
array of output addresses. Each output is then read from the respective
output address, and all of them get incremented on each read/write cycle
(Clock Enable high). I noticed it synthesized fairly large, so I tried a
second way, reasoning that, since the synthesis tool couldn't assume anything
about the arrangement of the addresses, it had to register the data out
the wazoo to keep all the parallel reads happy.

The second way was a lot more code. I created multiple buffers, each with
its own input and output address. I made them shorter, but still the sum of
them was larger than the first buffer. The output of the first feeds the
second, and so on. The only disadvantage is that there is now a tighter
constraint on the relative spacing of the delays, but I can live with that.

Any ideas on even cuter ways to code this up?

Here's the bit of code regarding if/generate that caused problems; it also
illustrates my second way of doing it:
gen_delay: for N in 0 to NUM_DELAYS-1 generate
  delay_process: process(CLK)
  begin
    if rising_edge(CLK) then
      ....
      if CE = '1' then -- valid data coming in
        G1: if N = 0 generate
          DBuffer(N)(InAddress(N)) <= X_IN; -- first buffer takes input signal
        end generate G1;
        GN: if N /= 0 generate
          DBuffer(N)(InAddress(N)) <= DBuffer(N-1)(OutAddress(N-1)); -- others take output of previous
        end generate GN;
        InAddress(N) <= InAddress(N) + 1;
        OutAddress(N) <= OutAddress(N) + 1;
        X_OUT(N) <= DBuffer(N)(OutAddress(N));
      end if;
    end if;
  end process delay_process;
end generate gen_delay;
 
"Tricky" <Trickyhead@gmail.com> wrote in message
news:5de2d9e2-14d5-41d4-b27a-dd0ee515b366@25g2000hsx.googlegroups.com...
As for the example:
Why not use a dual port ram:

have the write side address incrementing by 1 every clock cycle. And
then set the read address to (wr_addr - delay) with delay being the
variable delay you want?

Otherwise, using the usual shift register/taps is going to eat up a
lot of logic, and with a large mux could possibly cause horrible
timing problems.

That's what I did (sorry I didn't post the whole thing). The problem is that
if I have one input and *more* than one simultaneous output, it is no longer
a DPRAM but an xPRAM (more than two ports), for which there is no quick
synthesis. That's why I ended up (so far) with N smaller DPRAMs. As another
poster pointed out, if the input rate were slower, I could use some tricks to
serialize things. In this case, new data can come in on every 66 MHz clock
cycle.

Thanks for the tip regarding the optimization of if statements with
constants ("doh!" moment when I read that).
 
On Aug 18, 7:31 am, "Marty Ryba" <martin.ryba.nos...@verizon.net>
wrote:
[snip]

if-generate is a concurrent statement and you can't use it inside a
process.

Regards,
JK
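
To illustrate JK's point, here is a minimal sketch of where if/generate can legally sit: in the concurrent region, wrapping a whole process rather than a statement inside one. The entity name gen_chain, the signal name stage, and the generic default are made up for the example, and the body is a plain one-clock-per-stage chain rather than the addressed buffers of the original post.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity, used only to show legal placement of if/generate.
entity gen_chain is
  generic (NUM_DELAYS : positive := 4);
  port (
    CLK   : in  std_logic;
    CE    : in  std_logic;
    X_IN  : in  std_logic;
    X_OUT : out std_logic_vector(NUM_DELAYS-1 downto 0)
  );
end entity gen_chain;

architecture rtl of gen_chain is
  signal stage : std_logic_vector(NUM_DELAYS-1 downto 0) := (others => '0');
begin
  gen_delay : for N in 0 to NUM_DELAYS-1 generate

    -- if/generate is a concurrent statement, so it wraps the process
    G1 : if N = 0 generate
      first_stage : process (CLK)
      begin
        if rising_edge(CLK) then
          if CE = '1' then
            stage(N) <= X_IN;        -- first stage takes the input signal
          end if;
        end if;
      end process first_stage;
    end generate G1;

    GN : if N /= 0 generate
      next_stage : process (CLK)
      begin
        if rising_edge(CLK) then
          if CE = '1' then
            stage(N) <= stage(N-1);  -- others take the previous stage
          end if;
        end if;
      end process next_stage;
    end generate GN;

  end generate gen_delay;

  X_OUT <= stage;
end architecture rtl;
```

The cost of this form is that the shared part of the process body has to be repeated in both branches, which is why a plain if on the generate constant inside a single process is often the shorter choice.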
 
On Aug 18, 12:25 pm, JK <krishna.januman...@gmail.com> wrote:
if-generate is a concurrent statement and you can't use it inside a
process.

Regards,
JK

Hi
I really did not understand the use of the input and output addresses.
To me, this looks like a plain shift register:

DBuffer(0) <= X_IN;  -- stage 0 is the undelayed input
gen_delay: for N in 1 to NUM_DELAYS-1 generate
  delay_process: process(CLK)
  begin
    if rising_edge(CLK) then
      if CE = '1' then -- valid data coming in
        DBuffer(N) <= DBuffer(N-1);
      end if;
    end if;
  end process delay_process;
end generate gen_delay;
X_OUT <= DBuffer(NUM_DELAYS-1);  -- last stage (index NUM_DELAYS is out of range)

Also, I think there is no need to use if/generate (as JK said, it is not
legal inside a process). We can simply write plain if statements and expect
the synthesis tool to optimize them:

if CE = '1' then -- valid data coming in
  if N = 0 then
    DBuffer(N)(InAddress(N)) <= X_IN; -- first buffer takes input signal
  end if;
  if N /= 0 then
    DBuffer(N)(InAddress(N)) <= DBuffer(N-1)(OutAddress(N-1)); -- others take output of previous
  end if;
end if;

regards
 
You don't need to put a generate inside a process; just use normal
if-then-else statements. When a synthesizer sees that a conditional
statement is based on a constant (N in this case), it will only generate
logic for the valid branch, and the unreachable branches are ignored.

if/for-generate is mostly used for easy auto-generation of repeatable bits
of logic. It is processed at elaboration, when a simulation is started,
not during simulation. So generates can only be conditioned on
constants.
 
As for the example:

Why not use a dual port ram:

have the write side address incrementing by 1 every clock cycle. And
then set the read address to (wr_addr - delay) with delay being the
variable delay you want?

Otherwise, using the usual shift register/taps is going to eat up a
lot of logic, and with a large mux could possibly cause horrible
timing problems.
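
A minimal single-tap sketch of the dual-port RAM idea above, assuming a power-of-two buffer depth so that (wr_addr - delay) wraps naturally. The entity name ram_delay, the generic ADDR_BITS, and the DELAY port are made up for the example. Whether this maps onto block RAM depends on the synthesis tool, and the exact latency and the DELAY = 0 case depend on read-during-write behaviour that is not worked out here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: one write port, one read port, variable delay.
entity ram_delay is
  generic (ADDR_BITS : positive := 8);
  port (
    CLK   : in  std_logic;
    CE    : in  std_logic;
    DELAY : in  unsigned(ADDR_BITS-1 downto 0);
    X_IN  : in  std_logic;
    X_OUT : out std_logic
  );
end entity ram_delay;

architecture rtl of ram_delay is
  type ram_t is array (0 to 2**ADDR_BITS-1) of std_logic;
  signal ram     : ram_t := (others => '0');
  signal wr_addr : unsigned(ADDR_BITS-1 downto 0) := (others => '0');
begin
  process (CLK)
  begin
    if rising_edge(CLK) then
      if CE = '1' then
        -- write side: address increments by 1 on every enabled clock
        ram(to_integer(wr_addr)) <= X_IN;
        wr_addr <= wr_addr + 1;
        -- read side: unsigned subtraction wraps modulo 2**ADDR_BITS
        X_OUT <= ram(to_integer(wr_addr - DELAY));
      end if;
    end if;
  end process;
end architecture rtl;
```

Each additional simultaneous tap would need another read port, which is where a single dual-port RAM stops being enough.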
 
On Aug 17, 10:31 pm, "Marty Ryba" <martin.ryba.nos...@verizon.net>
wrote:
[snip]

So to make sure I understand, your delay module has N outputs, and
from the input to any output number k there's a delay of Dk samples.
The first approach wrote to one big RAM, then read out from addresses
offset from the input by D0, then D1, then D2... through DN. The
second approach built a bunch of cascaded delays, where the first
implemented (D2-D1) samples, the next implemented (D3-D2) samples
etc. This might be what you need in, for example, a FIR filter with
very sparse coefficients.

What you can achieve depends on what the data rate is with respect to
the clock. If the data can come fast (i.e. N clock enables back to
back), then you need to physically instantiate at least N output
registers, and the second approach is probably best. If there are
more than N clocks between clock enables, the first approach is
probably best, where you read out one 'tap' per clock cycle and then
store it into its own register. If this holds, you might even be able
to "serialize" the operation you do with the taps, and avoid the
intermediate registers altogether.

- Kenn
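
A rough sketch of the slow-data case Kenn describes, assuming clock enables arrive more than NUM_TAPS clocks apart: one RAM, one read port, and one tap read out per clock into its own register. The entity name serial_taps, the generics, and the tap delays (3, 6, 9, ...) are invented for the example, and the exact sample offsets are not worked out here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: taps are read one per clock after each new sample.
entity serial_taps is
  generic (ADDR_BITS : positive := 8;
           NUM_TAPS  : positive := 4);
  port (
    CLK   : in  std_logic;
    CE    : in  std_logic;  -- assumed spaced by more than NUM_TAPS clocks
    X_IN  : in  std_logic;
    X_OUT : out std_logic_vector(NUM_TAPS-1 downto 0)
  );
end entity serial_taps;

architecture rtl of serial_taps is
  type ram_t   is array (0 to 2**ADDR_BITS-1) of std_logic;
  type delay_t is array (0 to NUM_TAPS-1) of natural;

  -- Made-up tap delays: tap k sits 3*(k+1) samples back
  function init_delays return delay_t is
    variable d : delay_t;
  begin
    for k in d'range loop
      d(k) := 3 * (k + 1);
    end loop;
    return d;
  end function;

  constant DELAYS : delay_t := init_delays;
  signal ram     : ram_t := (others => '0');
  signal wr_addr : unsigned(ADDR_BITS-1 downto 0) := (others => '0');
  signal tap     : natural range 0 to NUM_TAPS := NUM_TAPS;  -- NUM_TAPS = idle
begin
  process (CLK)
  begin
    if rising_edge(CLK) then
      if CE = '1' then
        ram(to_integer(wr_addr)) <= X_IN;  -- store the new sample
        wr_addr <= wr_addr + 1;
        tap <= 0;                          -- restart the tap readout
      elsif tap < NUM_TAPS then
        -- single read port, one tap per clock, each kept in its own register
        X_OUT(tap) <= ram(to_integer(wr_addr - DELAYS(tap)));
        tap <= tap + 1;
      end if;
    end if;
  end process;
end architecture rtl;
```

If a clock enable arrives before the readout finishes, the remaining taps are simply not updated for that sample, so this only holds at the slower data rates described above, not with new data on every clock.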
 
