Pipelining a multi-dimensional array.

Novlednes

Hi,

I'm trying to pipeline the multi-dimensional array T in the following
code snippet. The pipelining is done in process p_Pipe. However, there
is also a driver on T(0) in process p_Sim. No problem, you would think,
since the for loop in p_Pipe only assigns to elements T(1), T(2) and
T(3). For some reason, my simulator (Modelsim) appears to translate the
for loop into a driver on T(0) as well, leading to all Xs on T(0). But
when I unroll the for loop into individual assignments (see the inline
comments) for T(1), T(2) and T(3), the problem disappears: the
pipeline behaves as expected.

What I cannot put my finger on is why the individual assignments
behave differently from the for loop. Does anyone have some insight
into that?


library IEEE;
use IEEE.std_logic_1164.all;

entity tb_MultiDimArray is
end tb_MultiDimArray;

architecture Simulation of tb_MultiDimArray is

   subtype t_Word is std_logic_vector(31 downto 0);
   type t_T       is array(0 to 3) of t_Word;

   signal Clk : std_logic := '1';
   signal T   : t_T := (others => (others => '1'));

begin

   p_Clock : process
   begin
      loop
         Clk <= not Clk;  -- do not forget to initialize the clock !
         wait for 5 ns;
      end loop;
   end process p_Clock;

   p_Pipe : process(Clk)
   begin
      if rising_edge(Clk) then
         -- T(1) <= T(0);
         -- T(2) <= T(1);
         -- T(3) <= T(2);

         for i in 0 to 2 loop
            T(i+1) <= T(i);
         end loop;

      end if; -- rising_edge
   end process p_Pipe;

   p_Sim : process
   begin
      wait until falling_edge(Clk);
      T(0) <= (others => '0');
      wait; -- Will wait forever.
   end process p_Sim;

end Simulation;
 
It's because you have the same signal, T, spread across two
processes. T(0) and T(1 to 3) are all part of the same signal, so
you end up with multiple drivers on that signal.

To fix it, you'll either have to put T in the same process like this:
p_Pipe : process(Clk)
   variable first : boolean := true;
begin

   if falling_edge(Clk) then
      -- Drive T(0) once, on the first falling edge only.
      if first then
         T(0) <= (others => '0');
         first := false;
      end if;

   elsif rising_edge(Clk) then
      -- T(1) <= T(0);
      -- T(2) <= T(1);
      -- T(3) <= T(2);

      for i in 0 to 2 loop
         T(i+1) <= T(i);
      end loop;

   end if; -- rising_edge
end process p_Pipe;

or do something like this for the shift register:

   if rising_edge(Clk) then
      T <= some_other_signal & T(0 to T'high-1);
   end if;

The signal some_other_signal can then be driven from another process.
 
On Wed, 8 Apr 2009 07:07:24 -0700 (PDT), Novlednes wrote:

there is a driver on T(0) in process p_Sim. No problem you would think,
since the for loop in p_Pipe only addresses elements T(1), T(2) and
T(3). For some reason, my simulator (Modelsim) appears to translate the
for loop in a driver on T(0) as well

That is correct.

It's a well-known VHDL "gotcha". FOR-loops are
dynamically elaborated; in other words, even if the
loop bounds are constant, the simulator does not know
that at compile time. Consequently, in the
following example....

architecture Foo of Bar is
   signal S: std_logic_vector(3 downto 0);
begin
   P: process begin
      for i in 2 to 3 loop
         S(i) <= '1';
      end loop;
      wait;  -- suspend forever; a process needs a wait or a sensitivity list
   end process;
end;

....the process P drives ALL FOUR elements of S,
even though you and I can easily see that the FOR-loop
can only iterate over elements 2 and 3.

Sometimes you can work around that using a
GENERATE-loop, which is elaborated statically.
And it is always possible to work around it using
additional intermediate signals.
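
As a hedged illustration of the GENERATE-loop workaround, here is a minimal sketch that applies it to the testbench from the original post. The entity name tb_PipeGenerate and the labels g_Pipe and p_Stage are made up for this example. It assumes that one clocked process per pipeline stage is acceptable; inside each generated copy the parameter i is a static constant, so T(i+1) is a static name and the process should only get a driver for that one element.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity tb_PipeGenerate is   -- hypothetical name, not from the thread
end tb_PipeGenerate;

architecture Simulation of tb_PipeGenerate is

   subtype t_Word is std_logic_vector(31 downto 0);
   type t_T       is array(0 to 3) of t_Word;

   signal Clk : std_logic := '1';
   signal T   : t_T := (others => (others => '1'));

begin

   Clk <= not Clk after 5 ns;

   -- One process per stage. The generate parameter i is static within
   -- each copy, so each p_Stage drives T(i+1) only and leaves T(0) alone.
   g_Pipe : for i in 0 to 2 generate
      p_Stage : process(Clk)
      begin
         if rising_edge(Clk) then
            T(i+1) <= T(i);
         end if;
      end process p_Stage;
   end generate g_Pipe;

   -- The only driver on T(0), as in the original testbench.
   p_Sim : process
   begin
      wait until falling_edge(Clk);
      T(0) <= (others => '0');
      wait;
   end process p_Sim;

end Simulation;
```

With this structure the zeros written to T(0) should ripple through T(1), T(2) and T(3) on successive rising edges, without the Xs caused by a second driver on T(0).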

If you really want a fun-filled day, you may care
to scan the VHDL LRM for mentions of "longest static
prefix", where you will find all the gory details.

Good luck.
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
 
Thank you both for your answers, Tricky and Jonathan. The dynamic
elaboration of for-loops was not so well-known to me yet. Good to know
though. I'll keep the LRM lookup for some other day ;-)

Regards
 
Jonathan Bromley wrote:

...the process P drives ALL FOUR elements of S,
even though you and I can easily see that the FOR-loop
can only iterate over elements 2 and 3.

Sometimes you can work around that using a
GENERATE-loop, which is elaborated statically.
And it is always possible to work around it using
additional intermediate signals.

Or I can use variables in a single process entity
for internal regs, and reserve the signal assignments
for ports only.

-- Mike Treseler
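
The single-process style mentioned above might look like the following minimal sketch. It is an assumption about what such a design could look like, not code from the thread: the entity name PipeVar, the ports D and Q, and the variable v_T are all hypothetical. Because variables update immediately, the loop shifts from the last stage backwards so that each stage picks up the previous stage's old value.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity PipeVar is   -- hypothetical entity; only the ports are signals
   port (
      Clk : in  std_logic;
      D   : in  std_logic_vector(31 downto 0);
      Q   : out std_logic_vector(31 downto 0)
   );
end PipeVar;

architecture Rtl of PipeVar is

   subtype t_Word is std_logic_vector(31 downto 0);
   type t_T       is array(0 to 3) of t_Word;

begin

   p_Pipe : process(Clk)
      -- Internal registers kept as a process variable, so the
      -- question of multiple signal drivers never comes up.
      variable v_T : t_T := (others => (others => '1'));
   begin
      if rising_edge(Clk) then
         -- Shift from the far end first: variable assignments take
         -- effect immediately, so the order matters here.
         for i in 2 downto 0 loop
            v_T(i+1) := v_T(i);
         end loop;
         v_T(0) := D;

         -- Signal assignment reserved for the output port.
         Q <= v_T(3);
      end if;
   end process p_Pipe;

end Rtl;
```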
 
Jonathan is correct; from the language/simulation point of view, at
compilation time, the for loop's index values are not known.

However, for synthesis, the notion of "static" is a little different
(and "static" may not even be the correct term). The impact of this is
that for synthesis, for-loops are unrolled, and references to the
index are implemented the same way constants are. Unlike simulation,
the bounds of the for-loop index must be static for synthesis (so that
the loop can be unrolled). You can have an exit statement to exit a
loop early based on dynamic conditions, but the loop index itself must
be statically bound (again, for synthesis only!).

A quick example:

   for i in 0 to 3 loop
      if i = addr then
         reg(i)   := data_a;
         reg(i+4) := data_b;
      end if;
   end loop;

is likely to be implemented differently than:

   for i in 0 to 3 loop
      if i = addr then
         reg(addr)   := data_a;
         reg(addr+4) := data_b;
      end if;
   end loop;

The behavior is the same, but the hardware may not be optimized the
same, because (i+4) is static in synthesis, whereas (addr+4) is not.

Andy
 
Hi,

You can try this (I hope this is what you intended in the first
place):

library IEEE;
use IEEE.std_logic_1164.all;

entity tb_MultiDimArray is
end tb_MultiDimArray;

architecture Simulation of tb_MultiDimArray is

   type t_T is array(0 to 3) of std_logic_vector(31 downto 0);
   signal Clk : std_logic := '1';
   signal T   : t_T := (others => (others => '1'));

begin

   Clk <= not Clk after 5 ns;

   p_Pipe : process(Clk)
      variable vZEROS : std_logic_vector(31 downto 0) := (others => '0');
   begin
      if rising_edge(Clk) then
         T <= vZEROS & T(0 to 2);
      end if; -- rising_edge
   end process p_Pipe;

end Simulation;

