SRAM

On 8/11/17 3:04 AM, rickman wrote:
Richard Damon wrote on 8/11/2017 12:09 AM:
On 8/10/17 10:39 PM, rickman wrote:
Allan Herriman wrote on 8/10/2017 2:02 AM:
On Wed, 09 Aug 2017 22:33:40 -0400, rickman wrote:

brimdavis@gmail.com wrote on 8/8/2017 8:37 PM:
KJ wrote:

It's even easier than that to synchronously control a standard async
SRAM. Simply connect WE to the clock and hold OE active all the time
except for cycles where you want to write something new into the SRAM.

As has been explained to you in detail by several other posters, your
method is not 'easier' with modern FPGAs and SRAMs.

The simplest way to get a high speed clock {gated or not} off the chip,
coincident with other registered I/O signals, is to use the dual-edge
IOB flip-flops as I suggested.

The DDR technique I mentioned would run synchronous single-cycle read
or write cycles at 50 MHz on a Spartan-3 Starter Kit with an (IIRC)
10 ns SRAM, 66 MHz if using a duty-cycle-skewed clock to meet the WE
pulse width requirements.
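
A rough sanity check of those numbers, assuming a WE pulse-width
minimum of about 8 ns, which is typical for a 10 ns async SRAM (check
the actual part's datasheet):

\[
t_{WP,\mathrm{avail}} = \frac{T}{2} = \frac{1}{2 \times 50\,\mathrm{MHz}} = 10\,\mathrm{ns} \ge 8\,\mathrm{ns},
\qquad
\frac{1}{2 \times 66\,\mathrm{MHz}} \approx 7.6\,\mathrm{ns} < 8\,\mathrm{ns}
\]

So at 66 MHz a symmetric clock leaves the WE-low phase too short, and
skewing the duty cycle toward roughly 55/45 recovers the needed
~8.3 ns.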

Another advantage of the 'forwarding' method is that one can use the
internal FPGA clock resources for clock multiply/divides etc. without
needing to also manage the board-level low-skew clock distribution
needed by your method.

I can't say I follow what you are proposing. How do you get the clock
out of the FPGA with a defined time relationship to the signals clocked
through the IOB? Is this done with feedback from the output clock using
the internal clocking circuits?


About a decade back, mainstream FPGAs gained greatly expanded IOB
clocking abilities to support DDR RAM (and other interfaces such as
RGMII).
In particular, one can forward a clock out of an FPGA pin phase-aligned
with data on other pins. You can also use one of the internal PLLs to
generate phase-shifted clocks, and thus have a phase shift on the pins
between two data signals or between the clock and the data signals.

This can be done without needing feedback from the pins.
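
A minimal sketch of such clock forwarding, assuming a Xilinx 7-series
part and the unisim ODDR primitive (port names per that library; older
families use equivalents such as ODDR2, and UltraScale uses ODDRE1):

// Forward a copy of the internal clock off-chip through an IOB DDR
// flip-flop, so the clock takes the same output path as the data pins.
module clk_forward (
    input  wire clk,       // internal clock from a PLL/MMCM
    output wire sram_clk   // forwarded clock, phase-aligned with data
);
    ODDR #(
        .DDR_CLK_EDGE("SAME_EDGE"),
        .INIT(1'b0),
        .SRTYPE("SYNC")
    ) u_oddr_clk (
        .Q(sram_clk), .C(clk), .CE(1'b1),
        .D1(1'b1),    // drive '1' after each rising edge
        .D2(1'b0),    // drive '0' after each falling edge
        .R(1'b0), .S(1'b0)
    );
endmodule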


You should try reading a datasheet occasionally - they can be very
informative. Just in case someone has blocked Google where you are,
here's an example:

https://www.xilinx.com/support/documentation/user_guides/ug571-ultrascale-selectio.pdf

Thank you for the link to the 356-page document. No, I have not
researched how every brand of FPGA implements DDR interfaces, mostly
because I have not designed a DDR memory interface in an FPGA. I did
look at the document and didn't find info on how the timing delays
through the IOB might be synchronized with the output clock.

So how exactly does the tight alignment of a clock exiting a Xilinx FPGA
maintain alignment with data exiting the FPGA over time and differential
temperature? What will the timing relationship be, and how tightly can
it be maintained?

Just waving your hands and saying things can be aligned doesn't explain
how it works. This is a discussion. If you aren't interested in
discussing, then please don't bother to reply.


Thinking about it: YES, FPGAs normally have a few pins that can be
configured as dedicated clock drivers, and it will generally be
guaranteed that if those pins are driving out a global clock, then any
other pin with output clocked by that clock will change so as to have a
known hold time (over specified operating conditions). That is the way
to run a typical synchronous interface.
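
Written out as timing inequalities (generic symbols, not any particular
datasheet's names): with WE driven by the forwarded clock, the SRAM
latches address and data on the rising WE edge, so the scheme requires

\[
t_{CO,\min}(\mathrm{addr/data}) \ge t_{H}(\mathrm{SRAM}) + t_{skew}
\qquad\text{and}\qquad
T_{clk} - t_{CO,\max} \ge t_{SU}(\mathrm{SRAM}),
\]

that is, the fastest outputs must not violate the SRAM's hold
requirement after the WE edge, and the slowest must still settle a
setup time before the next one.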

Since this method requires the WE signal to be the clock, you need to
find a part that has either a write mask signal, or perhaps is
multi-ported so that one port could be dedicated to writes and another
port could be used to read what is needed (the original part for this
thread wouldn't be usable with this method).

I'm not sure you read the full thread. The method for generating the WE
signal is to use the two DDR FFs to drive a logic one during one half of
the clock and to drive the write signal during the other half of the
clock. I misspoke above when I called it a "clock". The *other* method
involved using the actual clock as WE and gating it with the OE signal,
which won't work on all async RAMs.
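
A minimal sketch of that DDR write-pulse trick, again assuming a
7-series-style ODDR (the original Spartan-3 example would use that
family's dual-edge IOB flip-flops instead):

// Generate an active-low WE pulse in the second half of each write
// cycle: WE_N stays high while the address/data outputs settle after
// the rising edge, pulses low, then returns high at the next edge.
module we_pulse (
    input  wire clk,
    input  wire wr,        // asserted for one clock per write cycle
    output wire sram_we_n  // active-low WE to the async SRAM
);
    ODDR #(
        .DDR_CLK_EDGE("SAME_EDGE"),
        .INIT(1'b1),
        .SRTYPE("SYNC")
    ) u_oddr_we (
        .Q(sram_we_n), .C(clk), .CE(1'b1),
        .D1(1'b1),   // first half-cycle: WE_N held high
        .D2(~wr),    // second half-cycle: low only on a write
        .R(1'b0), .S(1'b0)
    );
endmodule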

So with the DDR method *all* of the signals will exit the chip with
nominally zero timing skew relative to each other. This is literally the
edge of the async RAM spec. So you need some delay on the other signals
relative to WE to allow for variation in the timing of individual
outputs. It seems the method suggested is to drive the CS and WE signals
hard and lighten the drive on the other outputs.

This method does not rely on any guaranteed spec from the FPGA maker.
It uses trace capacitance to create a delay, delta-t = C * delta-V / I,
to speed or slow the rising edges of the various outputs. This relies on
overcompensating for the FPGA spec by means that depend on details of
the board layout. It reminds me of the early days of generating timing
signals for DRAM with logic delays.
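
To put illustrative numbers on that (assumed values, not from any
datasheet): a 10 pF trace swinging 1.5 V to the input threshold at
12 mA versus 4 mA of drive gives

\[
\Delta t = \frac{C\,\Delta V}{I} = \frac{10\,\mathrm{pF} \times 1.5\,\mathrm{V}}{12\,\mathrm{mA}} \approx 1.3\,\mathrm{ns}
\qquad\text{vs.}\qquad
\frac{10\,\mathrm{pF} \times 1.5\,\mathrm{V}}{4\,\mathrm{mA}} \approx 3.8\,\mathrm{ns},
\]

a couple of nanoseconds of relative skew that depends directly on board
capacitance and drive settings.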

Yeah, you might get it to work, but the layout will need to be treated
with care and respect, even more so than an impedance-controlled trace.
It will need to be characterized over temperature and voltage, and you
will have to design in enough margin to allow for process variations.

Driving them all with DDR signals isn't putting you at the edge of the
spec; rather, you are only at the edge of the spec assuming everything
is matched with nominal zero skew. Since we KNOW that output matching is
not perfect, and we don't have a manufacturer-guaranteed bias (a
guarantee that if any output is faster, it will be this one), we are
starting outside the guaranteed specs. Yes, you can pull some tricks to
try and get back into spec and establish a bit of margin, but this puts
the design over into the realms of 'black arts', and it is best avoided
if possible.
 
On Saturday, August 12, 2017 at 12:18:59 PM UTC-4, Richard Damon wrote:
Since we KNOW that output matching is not perfect

Both you and rickman seem to be missing the entire point
of my original post; i.e. you wrote earlier:
4) Discrete Pulse generation logic, have logic on
the board with delay lines to generate the write pulse
so that WE will pulse low shortly after the address
is stable, and comes back high shortly before the
address might change again.

The built-in dual-edge I/O logic on many FPGAs provides
EXACTLY this capability, but with much better PVT tracking.

Although my ancient Spartan-3 example code didn't
explicitly adjust the edge delay with either a DCM/PLL
or IOB delay element (although I did mention DCM duty cycle
tweaks), this is very straightforward to do in many recent
FPGA families.

Yes, you can pull some tricks to try and get
back into spec and establish a bit of margin
but this puts the design over into the realms
of 'black arts'

Relying on characterized minimums to meet hold times
is not a 'black art'; it is the underlying reason why
synchronous digital logic works at all.

For example, connecting two sections of a 74LS74 in series
at the board level requires that Tplh/Tphl (min) of
the first flip-flop be greater than the hold time of
the next. (And yes, I realize that some logic technologies
require dummy routing or buffers in the datapath to avoid
hold time violations.)
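
The same requirement written out (the illustrative hold number is from
memory of the LS datasheets, which famously do not specify a minimum
clock-to-Q):

\[
t_{CO,\min}(\mathrm{FF}_1) \ge t_{H}(\mathrm{FF}_2) + t_{skew,clk}
\]

e.g. with a 74LS74 hold time on the order of 5 ns, the chain works only
if the first flop's fastest clock-to-Q, less any clock skew, still
exceeds that 5 ns.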

Particularly with a vendor evaluation board like the one I used
for that 2004 Spartan-3 Starter Kit SRAM example, it is
far more likely that signal integrity problems on the
WE line, rather than buffer delay behavior, cause trouble.

-Brian
 
On 8/14/17 6:42 PM, Brian Davis wrote:
[snip]

Relying on characterized minimums to meet hold times
is not a 'black art'; it is the underlying reason why
synchronous digital logic works at all.

[snip]

The 'Black Art' I was referring to was NOT datasheet min/max values, but
using strong/weak drive and bus loading to convert timing that doesn't
meet guaranteed performance into something that likely 'works' (since,
given two outputs, there will be some skew, and absent some unusual
specification that pin x will always be faster than pin y, we need to
allow that x might be slower than y).
 
Richard Damon wrote:
The 'Black Art' I was referring to was NOT datasheet min/max values
but using strong/weak drive and bus loading to convert timing
that doesn't meet guaranteed performance
As I explained to rickman several posts ago, the Spartan-3
I/O buffer base delay vs. IOSTANDARD/SLEW/DRIVE is fully
characterized and published by Xilinx:

Xilinx characterizes and publishes I/O buffer switching
parameters vs. IOSTANDARD/SLEW/DRIVE settings; this
information is both summarized in the datasheet and used
in generating the timing reports, providing the base delay
of the I/O buffer independent of any external capacitive loading.
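
As a hedged sketch of how those settings get applied per pin (Verilog
attribute syntax as supported by the Xilinx tools; a UCF/XDC constraint
file is the other common route), echoing the drive-strength skew trick
discussed earlier in the thread:

// Per-pin buffer settings; the timing tools then report the
// characterized buffer delay for each IOSTANDARD/SLEW/DRIVE combo.
module sram_pins (
    input  wire        we_i,
    input  wire [17:0] addr_i,
    // WE driven hard: fast slew, higher drive, earliest edges
    (* IOSTANDARD = "LVCMOS33", SLEW = "FAST", DRIVE = "12" *)
    output wire        sram_we_n,
    // Address driven lighter: slow slew, lower drive, later edges
    (* IOSTANDARD = "LVCMOS33", SLEW = "SLOW", DRIVE = "4" *)
    output wire [17:0] sram_addr
);
    assign sram_we_n = ~we_i;
    assign sram_addr = addr_i;
endmodule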

If you don't want to rely on a guaranteed minimum delay
between I/O buffer types at the FAST device corner, that's
fine with me, but please stop with the baseless criticism.

-Brian
 
On 8/15/17 5:59 PM, Brian Davis wrote:
Richard Damon wrote:

The 'Black Art' I was referring to was NOT datasheet min/max values
but using strong/weak drive and bus loading to convert timing
that doesn't meet guaranteed performance

As I explained to rickman several posts ago, the Spartan-3
I/O buffer base delay vs. IOSTANDARD/SLEW/DRIVE is fully
characterized and published by Xilinx:

Looking at the datasheet for the Spartan-3 family, I see no minimum
times published. There is a footnote saying that minimums can be
obtained from the timing analyzer. The maximums are fully specified,
including adders for various cases, allowing you to do some of the
analysis early in the design; but to get minimum timing you need to
first get the program and a license to use it (the license may be free,
but you need to give them enough information that they can contact you
later asking about your usage).

Being only available in the program, and not truly "published",
indicates to me some significant uncertainty in the numbers. Timing
reports out of programs like this tend to be disclaimed as valid only
for THAT particular design and the particular operating conditions
requested. You need to run the analysis over as many condition
combinations as they provide and hope (they never seem to promise it)
that the important factors are at least monotonic between the data
points, so you can get true min/max values.
 
