Dual Port RAM Inference

On May 9, 2:26 pm, Jacko <jackokr...@gmail.com> wrote:
The BRAM does not have the necessary dual address decoders. The best
option is to clock at half speed and multiplex. Read before write is
most usual.
All Xilinx BRAMs have dual address decoders, and each port also has
the option of read before or after write or retain previous output.
It seems there is no argument about the hardware, but there is about
the software...
Peter Alfke
 
On May 10, 12:15 am, Peter Alfke <al...@sbcglobal.net> wrote:
...
Peter,
This time there is (almost) no argument about the software either (see
my previous post).
XST (your software [Xilinx]) infers the BRAM with two R/W ports, with
both the "READ FIRST" and the "WRITE FIRST" options...

Maybe the only software (VHDL) argument could be "how to infer a dual
port BRAM with different bus sizes for the two ports".

regards
Sandro
 
On May 9, 4:31 pm, Sandro <sdro...@netscape.net> wrote:
...
Thought I would chime in on some of the comments and observations from
this thread. Starting with the most recent comment: if you need
different port widths, either between the read and write sides of the
same port or between the two ports of a dual-port RAM, you do need to
instantiate. None of XST, Synplify or Precision supports inferring RAMs
with different port widths. I can comment from the XST side that we
have investigated this and plan to offer it some day; to date, however,
we have not been able to include this capability.

As Sandro explains, you should be able to infer a common-clock dual-port
RAM (assuming the same port widths) in any of the READ_FIRST,
WRITE_FIRST or NO_CHANGE modes. It is fairly straightforward to code
this in Verilog; for VHDL, as explained, you do need to use a shared
variable. I am more familiar with Verilog than VHDL, but my
understanding is that the shared variable is necessary for proper
simulation when accessing the same array at the same time. Most of the
coding examples for these RAMs can be found in the Xilinx Language
Templates, which are accessible from Xilinx Project Navigator. Open the
Templates and look in VHDL or Verilog --> Synthesis Constructs -->
Coding Examples --> RAM to see several examples. In the Single-Port
descriptions you can see the differences between READ_FIRST,
WRITE_FIRST and NO_CHANGE modes. Unfortunately, not all of these have
been adapted for the dual-port case there, but in theory they should
work. I will see if we can get the templates updated in 11.2 to include
all of the dual-port examples. One other note: if you are inferring a
BRAM in which you never plan to read from the same port at the time you
are writing, describe NO_CHANGE mode. It will save power, but not many
realize this.
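
For readers without the Language Templates at hand, the three read
modes can be sketched for a single port roughly as follows. This is an
illustrative Verilog sketch, not a copy of the Xilinx templates: the
module, parameter and signal names are assumptions, and whether a given
tool version maps each form onto the intended BRAM mode should be
confirmed in the synthesis report.

```verilog
// READ_FIRST: the output shows the OLD contents of the addressed word.
module ram_read_first #(parameter DW = 9, parameter AW = 11) (
  input  wire          clk, en, we,
  input  wire [AW-1:0] addr,
  input  wire [DW-1:0] din,
  output reg  [DW-1:0] dout
);
  reg [DW-1:0] mem [0:(1<<AW)-1];
  always @(posedge clk)
    if (en) begin
      if (we) mem[addr] <= din;
      dout <= mem[addr];            // non-blocking read sees the old word
    end
endmodule

// WRITE_FIRST: during a write the output shows the NEW data.
module ram_write_first #(parameter DW = 9, parameter AW = 11) (
  input  wire          clk, en, we,
  input  wire [AW-1:0] addr,
  input  wire [DW-1:0] din,
  output reg  [DW-1:0] dout
);
  reg [DW-1:0] mem [0:(1<<AW)-1];
  always @(posedge clk)
    if (en) begin
      if (we) begin
        mem[addr] <= din;
        dout      <= din;           // written data is passed to the output
      end else
        dout <= mem[addr];
    end
endmodule

// NO_CHANGE: during a write the output register keeps its previous value.
module ram_no_change #(parameter DW = 9, parameter AW = 11) (
  input  wire          clk, en, we,
  input  wire [AW-1:0] addr,
  input  wire [DW-1:0] din,
  output reg  [DW-1:0] dout
);
  reg [DW-1:0] mem [0:(1<<AW)-1];
  always @(posedge clk)
    if (en) begin
      if (we) mem[addr] <= din;
      else    dout <= mem[addr];    // output only updates on a read
    end
endmodule
```

The only difference between the three is what happens to dout in a
cycle where we is high, which is exactly the distinction between the
modes described above.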

Memory collisions (writing to a memory address on one port of a
dual-port RAM while reading or writing the same address on the other
port) are described in the device User Guides and the Synthesis and
Simulation Design Guide, so I hope that most people understand what
they are and what should be done to avoid them. When inferring
dual-port BRAM, however, you do need to take more care. A behavioral
RTL simulation will not model or flag a collision, so you can very well
simulate a collision behaviorally and get a seemingly valid result
while the implementation gives something different. This is not covered
by static timing analysis, as it is a dynamic situation. A timing
simulation can catch and flag it, but many choose not to run timing
simulations, so in lieu of that some synthesis tools have decided to
arbitrate accesses to the same memory location with additional logic
around the BRAM. Both Synplicity and Precision do this; XST does not.
Most people who are aware of this disable the collision avoidance logic
using a synthesis attribute, as it can slow the RAM down, use more
resources and add power to the FPGA design, and in many cases it is not
needed. If you do disable it, though, you need to take extra care to
ensure that an undetected collision will not give undesired results in
your design.

I too try to avoid instantiating BRAM, but one advantage instantiation
does give you is that it will alert you to a memory collision, since
this is modeled in the UNISIM model. As mentioned before, a timing
simulation (no matter how the RAM was entered) can also detect this.
In-system testing cannot. The reason is that collisions are as
unpredictable as a timing error: a system may behave one way in one
device under one environmental condition (temperature or voltage)
during a collision, and behave differently in another device or under a
different environmental condition. I would not trust in-system testing
to catch this any more than I would a timing violation.

Hopefully this clears up some of the issues identified in this
thread. I often do infer RAMs in my designs, but there are certain
circumstances (such as different port widths) that necessitate
instantiation, so we are still not in a full RTL world when it comes to
RAMs. However, more situations than most people know can be inferred
with relative ease (dual-port, byte enables, read modes and
initialization from an external file can all be inferred now).
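
As one illustration of the last item, initialization from an external
file is commonly written in Verilog with $readmemh. The sketch below
assumes a hypothetical file name (init.hex) and made-up module and
signal names; tool support for this form should be checked against the
synthesis documentation for the version in use.

```verilog
module ram_init_from_file #(
  parameter DW = 8,
  parameter AW = 10,
  parameter INIT_FILE = "init.hex"   // hypothetical file, one hex word per line
) (
  input  wire          clk,
  input  wire          we,
  input  wire [AW-1:0] addr,
  input  wire [DW-1:0] din,
  output reg  [DW-1:0] dout
);
  reg [DW-1:0] mem [0:(1<<AW)-1];

  // Load the initial contents once. In simulation this runs at time zero;
  // a synthesis tool that supports this form uses the file as the initial
  // contents of the memory.
  initial $readmemh(INIT_FILE, mem);

  always @(posedge clk) begin
    if (we) mem[addr] <= din;
    dout <= mem[addr];               // READ_FIRST style read
  end
endmodule
```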

Regards,

-- Brian Philofsky
-- Xilinx Applications
 
On May 10, 6:00 am, Brian <scubabr...@gmail.com> wrote:
...
Brian,
thanks for your answer ... you saved me from wasting time
trying to figure out how the RAM can be represented (in VHDL) as two
arrays with "different geometry" (read: dual port with different
bus sizes).


Peter Alfke wrote:
...
What does the user community expect from us (Xilinx)?
...
(...winking to Peter) that is what the user
community expects from you (Xilinx) ;-)

regards
Sandro
 
On Sat, 9 May 2009 21:00:40 -0700 (PDT), Brian wrote:

Thought I would chime in on some of the comments and observations from
this thread.
Many thanks for an authoritative and very helpful post. I'd
worked some of that stuff out for myself, but there are
a few important things you've taught me. Much appreciated.

Neither XST, Synplify or Precision support RAMs with different port
widths.
Thanks for saving me some investigation time. This happens to
be a feature that I don't need in my current project, but it's
worth being aware of the restriction.

shared variable is
necessary for proper simulation when accessing the same array at the
same time.
Yes, if you want WRITE_FIRST behaviour. It should be possible
to avoid it if you want READ_FIRST, but since we need shared
variables for the WRITE_FIRST template it seems sensible to
use them in all forms. Alternatively, a doubly-clocked
template would allow the variable to stay local to the
process - but that's stretching things a bit.

One downside of the shared variable is that it's not strictly
VHDL language-compliant, since it doesn't use protected types.
Those simulators that understand protected types (VHDL >= 2000)
will usually issue a warning for it. This warning can safely
be ignored (although, of course, it is in a sense alerting
you to the risk of write-write collision!).

In the Single-Port descriptions you
can see the differences between READ_FIRST, WRITE_FIRST and NO_CHANGE
mode however unfortunately for the dual port not all have been adapted
Yes, and sometimes it's far from obvious how to extrapolate
from the single-port to dual-write description.

I will see if in 11.2 we can get the
templates updated to include all of the dual port examples for these.
Simply having authoritative and complete examples in the docs
would be a really good start.

One other note, if you are inferring a BRAM in which you never plan to
read from the same port at the time you are writing, describe
NO_CHANGE mode. It will save power but not many realize this.
I certainly didn't. Can you (or Peter???) elaborate a bit more?
Is the power saving significant in practice?

A behavioral RTL simulation will not alert
or model a collision
An assertion could easily be added. I would be very suspicious
of an inference template that did not include such an assertion
to generate at least a warning.
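
The kind of check suggested here can be sketched in Verilog (the other
language used in this thread) as a simulation-only block placed next to
the inferred RAM. This is an assumption-laden sketch rather than a
vendor template: the module and signal names are illustrative, and the
translate_off/translate_on comments are the commonly used synthesis
pragmas, whose exact spelling should be checked for the tool in use.

```verilog
module mem2rw1clk_checked #(parameter DW = 8, parameter AW = 10) (
  input  wire          clk,
  input  wire          en0, wren0,
  input  wire [AW-1:0] addr0,
  input  wire [DW-1:0] wdata0,
  output reg  [DW-1:0] rdata0,
  input  wire          en1, wren1,
  input  wire [AW-1:0] addr1,
  input  wire [DW-1:0] wdata1,
  output reg  [DW-1:0] rdata1
);
  reg [DW-1:0] mem [0:(1<<AW)-1];

  // Port 0, READ_FIRST
  always @(posedge clk)
    if (en0) begin
      if (wren0) mem[addr0] <= wdata0;
      rdata0 <= mem[addr0];
    end

  // Port 1, READ_FIRST
  always @(posedge clk)
    if (en1) begin
      if (wren1) mem[addr1] <= wdata1;
      rdata1 <= mem[addr1];
    end

  // synthesis translate_off
  // Simulation-only collision check: both ports are active on the same
  // address in the same cycle and at least one of them is writing.
  always @(posedge clk)
    if (en0 && en1 && (wren0 || wren1) && (addr0 == addr1))
      $display("%0t WARNING %m: port collision at address %h",
               $time, addr0);
  // synthesis translate_on
endmodule
```

The check only reports the collision; it does not change what the RAM
model returns, so the RTL simulation result for that cycle should still
be treated as unreliable.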

some synthesis tools have decided to
arbitrate the access to the same memory locations with additional
logic around the BRAM. Both Synplicity and Precision do this however
XST does not.
Yes, I'd noticed that. My preference would by far be for XST's
approach, but with a suitable assertion in the template.
Best of all would be the synth tool checking for existence
of such an assertion, and complaining if it was absent :)

Most people who are aware of this, disable the addition
of the collision avoidance logic using a synthesis attribute
I have not yet had the persistence to track down those
attributes in the vendor docs. Thanks for alerting me
to their existence.

--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
 
My code is virtually this same thing. The tool tells me I am trying
to infer two ports using distributed memory which it can't do.

Rick

On May 9, 5:26 pm, Sandro <sdro...@netscape.net> wrote:
Rick,
I hope this can help...
below you can find the code to infer a dual-port RAM with
both ports sharing the same clock.
I suppose the secret could be using a shared variable (instead of a
signal) as the RAM...

regards
Sandro

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;     -- conv_std_logic_vector
use ieee.std_logic_unsigned.all;  -- conv_integer on std_logic_vector

entity ramInference is
  generic (
    g_data_w : natural := 9;
    g_addr_w : natural := 11
    );
  port (
    i_clkA  : in  std_logic;
    --i_clkB  : in  std_logic;
    i_enA   : in  std_logic;
    i_weA   : in  std_logic;
    i_addrA : in  std_logic_vector (g_addr_w - 1 downto 0);
    i_dataA : in  std_logic_vector (g_data_w - 1 downto 0);
    o_dataA : out std_logic_vector (g_data_w - 1 downto 0);

    i_enB   : in  std_logic;
    i_weB   : in  std_logic;
    i_addrB : in  std_logic_vector (g_addr_w - 1 downto 0);
    i_dataB : in  std_logic_vector (g_data_w - 1 downto 0);
    o_dataB : out std_logic_vector (g_data_w - 1 downto 0)
    );
end ramInference;

architecture Behavioral of ramInference is

  constant c_ram_sz : natural := 2**(g_addr_w);

  type t_ram is array (c_ram_sz - 1 downto 0) of
    std_logic_vector (g_data_w - 1 downto 0);

  -- Initial values are sized to g_data_w; an 8-bit literal such as X"05"
  -- would not match the 9-bit default data width.
  shared variable v_ram : t_ram := (
    1      => conv_std_logic_vector(16#05#, g_data_w),
    2      => conv_std_logic_vector(16#08#, g_data_w),
    3      => conv_std_logic_vector(16#1A#, g_data_w),
    -- ...
    others => (others => '0')
    );

begin

  p_portA : process (i_clkA)
  begin
    if rising_edge(i_clkA) then
      if (i_enA = '1') then
        -- READ FIRST
        o_dataA(g_data_w - 1 downto 0) <= v_ram(conv_integer(i_addrA));
        -- WRITE AFTER
        if (i_weA = '1') then
          v_ram(conv_integer(i_addrA)) := i_dataA(g_data_w - 1 downto 0);
        end if;
      end if;
    end if;
  end process;

  p_portB : process (i_clkA)
  begin
    if rising_edge(i_clkA) then
      if (i_enB = '1') then
        -- WRITE FIRST
        if (i_weB = '1') then
          v_ram(conv_integer(i_addrB)) := i_dataB(g_data_w - 1 downto 0);
        end if;
        -- READ AFTER
        o_dataB(g_data_w - 1 downto 0) <= v_ram(conv_integer(i_addrB));
      end if;
    end if;
  end process;

end Behavioral;
 
I know I'm a little late on this thread, but offer my two cents,
on what we use, and a warning as well.

We use dual-port RAMs (same clock) with inference, and don't have
trouble. It's in Verilog, and it's READ_FIRST. So two strikes
against it for what you're looking for, Rick (you want VHDL and
WRITE_FIRST, I believe). We call this our "mem2rw1clk" module.

But here's what we do (minus header/etc):

always @( posedge clk )
begin
  if( en0 )
  begin
    if( wren0 )
      mem[ addr0 ] <= wdata0;
    rdata0 <= mem[ addr0 ];
  end
end

always @( posedge clk )
begin
  if( en1 )
  begin
    if( wren1 )
      mem[ addr1 ] <= wdata1;
    rdata1 <= mem[ addr1 ];
  end
end

So, two almost identical always blocks, operating on the same RAM.
Since we use non-blocking assignments, the READ_FIRST is implied
(correctly by XST).

Works, and we've been using it for many designs no trouble.

Now the warning:

We use almost the EXACT same structure for implementing a pseudo
dual port, i.e. an independent READ port and a WRITE port
(same clock), "mem1r1w1clk". This is the type of memory you'd
use for a synchronous FIFO. The logic is again clearly coded for
READ_FIRST.

Well, XST was (sometimes) inferring WRITE_FIRST. So, simulation
vs implementation mismatch. It only mattered in a few places
we were specifically ALWAYS reading the same location as we
were writing in the same cycle. You get quite different results.
Spent 2-3 weeks on the bench figuring out this one.

So - check the XST report to make sure it's inferring the
correct READ_FIRST vs. WRITE_FIRST behaviour. XST can get things
wrong here.
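
The post does not show the "mem1r1w1clk" source, so the following is
only a guess at its general shape, with assumed port names, together
with a small self-checking testbench for the same-address case. The
testbench reports which behaviour it observes; rerunning it against a
post-synthesis or timing netlist is one way to spot the kind of
mismatch described above, in addition to reading the XST report.

```verilog
// Hypothetical reconstruction: one write port, one read port, one clock.
module mem1r1w1clk #(parameter DW = 8, parameter AW = 10) (
  input  wire          clk,
  input  wire          wren,
  input  wire [AW-1:0] waddr,
  input  wire [DW-1:0] wdata,
  input  wire          rden,
  input  wire [AW-1:0] raddr,
  output reg  [DW-1:0] rdata
);
  reg [DW-1:0] mem [0:(1<<AW)-1];

  always @(posedge clk) begin
    if (wren) mem[waddr] <= wdata;
    if (rden) rdata <= mem[raddr];   // non-blocking: reads the old word
  end
endmodule

module tb_mem1r1w1clk;
  reg        clk  = 0;
  reg        wren = 0, rden = 0;
  reg  [9:0] waddr = 0, raddr = 0;
  reg  [7:0] wdata = 0;
  wire [7:0] rdata;

  mem1r1w1clk dut (.clk(clk), .wren(wren), .waddr(waddr), .wdata(wdata),
                   .rden(rden), .raddr(raddr), .rdata(rdata));

  always #5 clk = ~clk;

  initial begin
    // First write: store 8'hA5 at address 3.
    @(negedge clk); wren = 1; waddr = 3; wdata = 8'hA5;
    // Second write: store 8'h5A at address 3 while reading address 3.
    @(negedge clk); wdata = 8'h5A; rden = 1; raddr = 3;
    // Inspect the read data produced by the simultaneous access.
    @(negedge clk); wren = 0; rden = 0;
    if      (rdata === 8'hA5) $display("old data read: READ_FIRST");
    else if (rdata === 8'h5A) $display("new data read: WRITE_FIRST");
    else                      $display("unexpected rdata = %h", rdata);
    $finish;
  end
endmodule
```

In an RTL simulation of this sketch the non-blocking assignments give
the old data, which is the READ_FIRST behaviour the code describes.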


Regards,

Mark
 
On May 11, 12:43 pm, Mark <m...@cacurry.net> wrote:
...
I am surprised about the interest in write_first vs read_first.
The read output during a write operation came really about as an
afterthought. ("It's easy, the port is already there, so it costs
nothing").
But why do you want to read from the same location that you are
writing to?
Especially when you are reading what you already know, since you
simultaneously are writing it (which was the original mode).
Then we found that read-before-write was an easy modification, and
more valuable.
But still: why do you read from the write address, when you have a
separate read port with its own dedicated addressing available?

But, judging from the interest in this thread, it seems to be
valuable.
Peter Alfke
 
On May 11, 7:46 pm, rickman <gnu...@gmail.com> wrote:
My code is virtually this same thing.  The tool tells me I am trying
to infer two ports using distributed memory which it can't do.
Rick,
I don't know! It works fine for me (WebPACK ISE 10.1.03, Linux).

Did you try using a shared variable? In your previous example
I saw only "<=" instead of ":=" to "assign" a shared variable...

Sandro
 
On May 11, 1:29 pm, pe...@xilinx.com wrote:

I am surprised about the interest in write_first vs read_first.
The read output during a write operation came really about as an
afterthought. ("It's easy, the port is already there, so it costs
nothing").
But why do you want to read from the same location that you are
writing to?
For "READ_FIRST" it makes sense. You're reading an old value at
the same time you're writing a new value. For us it's an image
processing algorithm, where pixels are going into a line buffer.
We needed line[n-1] pixel value now (the READ data), along with
the current value (the WRITE data). On the next line the
previously written data is now line[n-1], repeat. So the
address of the READ, and WRITE are ALWAYS the same (the column
address). So (depending on how you count things) this consumes
one RAM port.
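
The line-buffer use described here can be sketched as a single-port
READ_FIRST RAM addressed by the column counter. The module and signal
names below are assumptions made for illustration; the original design
is not shown in the thread.

```verilog
// One video line of storage. For each valid pixel, the word read out is
// the pixel stored at the same column on the previous line, and the
// current pixel replaces it for use on the next line.
module line_buffer #(parameter DW = 8, parameter AW = 11) (
  input  wire          clk,
  input  wire          pix_valid,
  input  wire [AW-1:0] col,        // column address, shared by read and write
  input  wire [DW-1:0] pix_in,     // pixel on the current line
  output reg  [DW-1:0] pix_above   // same column, previous line
);
  reg [DW-1:0] line [0:(1<<AW)-1];

  always @(posedge clk)
    if (pix_valid) begin
      line[col] <= pix_in;         // write the new pixel ...
      pix_above <= line[col];      // ... while reading the old one
    end
endmodule
```

Because the read is registered, pix_above appears one clock after the
pix_in it belongs to, so a design using this sketch would need to delay
pix_in by one cycle before combining the two. If the synthesis tool
were to map this as WRITE_FIRST, pix_above would simply echo pix_in,
which is why the inferred mode matters so much in this case.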

Especially when you are reading what you already know, since you
simultaneously are writing it (which was the original mode).
Then we found that read-before-write was an easy modification, and
more valuable.
I agree, "WRITE_FIRST" has more limited utility. I didn't know
the history, that it was previously the only available mode.

But still: why do you read from the write address, when you have a
separate read port with its own dedicated addressing available?
Yes, Xilinx has "True Dual Port", but I'd rather
code to the minimum that I need in a tech-independent manner,
and let the tool build from what's available.

If the tool can't build it I'd rather it barf and quit, rather
than just build something willy-nilly that doesn't match the
description. (okay, a bit snarky - I guess I'm still a little
sore over all that time in the lab debugging an XST issue...)

--Mark
 
On May 11, 4:04 pm, Mark <m...@cacurry.net> wrote:
...
Mark, there are clearly several different ways to implement your
design: single port with read-before-write (the most elegant way),
or dual-port with duplicated offset addressing,
or even time-sequenced read-then-write, time permitting.
It is frustrating to know that it can be done, but not be able to do
it.
Maybe you expect the synthesizers to be more versatile and smarter
than they really are.
Peter Alfke
 
On May 11, 4:39 pm, pe...@xilinx.com wrote:
...
I'm not asking the synthesis tool to optimize across multiple
solutions.

A time-sequenced read-then-write would be an architecture change that
is certainly outside the scope of a synthesis tool.

Inferring "Read_First" vs. "Write_First" behaviour correctly is quite
easily within the scope of the tool.

I just wanted to warn folks to check their template results closely
in the log files for these inferred RAMS. I was bitten, and
don't want others to repeat.

--Mark
 
On May 11, 4:04 pm, Mark <m...@cacurry.net> wrote:
...
Yes, Xilinx has "True Dual Port", but I'd rather
code to the minimum that I need in a tech-independent manner,
and let the tool build from what's available.
...
Mark, I listened to Obama's comment that his wife "has the right to
bear (bare) arms". That was a very clever pun, but it would not
translate into any other language.
If you expect your talk to be translated automatically (or even by
humans) into French, German, or Chinese, you have to avoid all such
clever constructs, and go for boring middle-of-the-road statements.
Same with logic design. If you design generically, you miss out on
many subtleties.
This is not meant as an excuse for Xilinx to misunderstand relatively
simple BRAM constructs...
Peter Alfke
 
On May 11, 4:29 pm, pe...@xilinx.com wrote:
...
But why do you want to read from the same location that you are
writing to?
...
That's easy. In my case I am using the ram as a stack, two actually.
Each port has to read whatever was last written because it does *not*
have an independent read port and the read data has to reflect the top
of the stack at all times. The ram is shared as two stacks to save
space since the entire depth of the block ram is not needed.

Thinking that the read port does not need to reflect the last written
data is a very limited perspective. *I* may know what was written,
but whatever is connected to the read port does not know it unless the
read port reflects it.
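
Rick's code is not shown in the thread, so the following Verilog sketch
is only one possible reading of the arrangement he describes: a
dual-port RAM split into two halves, one stack per port, with each port
in WRITE_FIRST mode so that a pushed word shows up on that port's
output. All names are assumptions, and the stack-pointer logic that
drives ptrA and ptrB is deliberately left out.

```verilog
module dual_stack_ram #(parameter DW = 16, parameter AW = 10) (
  input  wire          clk,
  // Stack A uses the lower half of the RAM.
  input  wire          weA,
  input  wire [AW-2:0] ptrA,    // top-of-stack address for this cycle
  input  wire [DW-1:0] pushA,
  output reg  [DW-1:0] topA,
  // Stack B uses the upper half of the RAM.
  input  wire          weB,
  input  wire [AW-2:0] ptrB,
  input  wire [DW-1:0] pushB,
  output reg  [DW-1:0] topB
);
  reg [DW-1:0] mem [0:(1<<AW)-1];

  // The address MSB selects the half, so the two ports never share a word.
  wire [AW-1:0] addrA = {1'b0, ptrA};
  wire [AW-1:0] addrB = {1'b1, ptrB};

  // Port A, WRITE_FIRST: a pushed word appears on topA in the same cycle
  // in which it is written.
  always @(posedge clk)
    if (weA) begin
      mem[addrA] <= pushA;
      topA       <= pushA;
    end else
      topA <= mem[addrA];

  // Port B, WRITE_FIRST, same structure on the other half.
  always @(posedge clk)
    if (weB) begin
      mem[addrB] <= pushB;
      topB       <= pushB;
    end else
      topB <= mem[addrB];
endmodule
```

Since the two ports address disjoint halves, the port-to-port
collisions discussed earlier in the thread cannot occur in this
arrangement.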

Rick
 
