Hi,


OK, I've left the LoRa chips aside for a while, so... now back to FPGAs.

I have two Olimex iCE40 boards where I would like to use the onboard
SRAM. The RAM chip is a Samsung K6R4016V1D-10 (256K words x 16 bits).

The datasheets are here:
https://www.olimex.com/Products/_resources/ds_k6r4016v1d_rev40.pdf
The most important pages are page 7 (for "read"), pages 8 and 9 (for
"write") and page 10 (for the functional description of the pins).


I am trying to interpret the datasheet to see how to use the chip. I
think I understand how to read or write one word, but I am still puzzled
about how to do bulk-write transfers.


* For read, it seems to be simple:
set /WE high and /OE low (*)

1/ put the address on the address-bus
2/ 10 ns later, read the data from the data-out
(*) ignoring the /CS, /LB and /UB pins to keep things simple.

In bulk transfer, it is like this:
- set address 1 on the Address bus
- 10 ns later:
-> read the data of address 1 from data-out
-> (at the same time) set address 2 on the address bus
- 10 ns later:
-> read the data of address 2 from data-out
-> (at the same time) set address 3 on the address bus
(etc)
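
In (untested) Verilog, the read side of what I have in mind would look
something like this (signal names invented; the clock period must of
course cover the 10 ns access time plus the FPGA's own pin delays):

module sram_rd_pipe (
    input  wire        clk,
    input  wire [17:0] next_addr,  // address to issue this cycle
    input  wire [15:0] sram_d,     // data pins from the RAM
    output reg  [17:0] sram_a,     // address pins to the RAM
    output reg  [15:0] rd_data     // data for the address issued last cycle
);
    // /WE is held high and /OE low the whole time.
    always @(posedge clk) begin
        sram_a  <= next_addr;      // set address N...
        rd_data <= sram_d;         // ...while capturing data for address N-1
    end
endmodule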


* For write, to write one single word, I think it goes like this:

1/ set /WE low and /OE high to go to "write" mode
-> at the same time, set the address on the address bus
-> do not yet put the data on the data bus (as it is still in "output" mode)
2/ 10 ns later:
-> put the data on the data bus (by then, the data bus has switched to
"data-in")
3/ another 10 ns later:
-> set /WE high and /OE low to leave "write" mode

But I am still puzzled about how to do a "bulk write" of data. The
datasheet does not mention anything about what happens if you leave the
chip in "write" mode and just change the address on the address bus (as
is done for bulk read).

If there is no separate bulk-write protocol, it looks like a write to
the chip takes three times as many steps as a bulk read (3 steps
compared to one single step).


Is this a correct interpretation of the datasheet?

Can somebody who has already interfaced an FPGA with SRAM confirm or
deny this? Or is there another trick for doing a bulk write on an SRAM
chip?




Cheerio! Kr. Bonne.
 
Static RAM chips do not have a bulk mode; it's not needed, you write to
them one word at a time. It's EEPROM, flash, and similar memory with
their complicated setup that need a bulk mode, as they are slow and bulk
mode is faster; some only have a bulk mode.



--
Cecil - k5nwa
 
Hi Cecil,


Thanks for your reply.


I agree it's not a bulk-mode as such.

What I meant was that when doing multiple reads one after the other, you
can stitch them together:


Correct me if I am wrong, but the way I interpret the datasheet, the
"read data from the data bus" can be done at the same time as the "set
next address on the address bus". This -I think- means you can "overlap"
two consecutive reads, resulting in one read per clock cycle.

At least, that is -I guess- what the "t OH" (Output Hold from Address
Change) means in the "read cycle(1)" timing waveform on page 7 of the
datasheet.



But I do not see how (or if) something similar can be done for "write"
operations; perhaps I am missing something.




Kristoff




 
This looks to be a fairly standard asynchronous static RAM.

The basic requirement for a write cycle is that there is a Tas (address
setup) for which the address bus must be stable before you can pull the
WE line low, a Twp, the minimum length of time you need to hold the WE
signal low, and a Taw, the address hold for which you need to keep the
address bus stable after WE goes high.

Since Tas >= 0 and Taw >= 0, it is easy to think that you can just clock
the WE signal on the same clock edge as the address, but that requires
that the FPGA and the board layout have ZERO skew, which is basically
impossible.

As you note, it is easy to read at full speed, cycle after cycle: you
just need to clock in new addresses, and one cycle later you can read
the results. Note, this is not really a 'burst' operation, just running
full cycles one after the other (the burst terminology tends to imply
there is some setup you do, after which you can read a given number of
locations without needing to do the setup again).

For write with this sort of part there are several options:

1) Simplest: do everything on rising edges and take 3 clock cycles to
write. Cycle 1: change the address; cycle 2: drop WE; cycle 3: raise WE
and hold the address.

2) Slightly more complicated: again do things on rising edges, but have
something to delay the WE signal slightly. 2 cycles: 1) set the address,
and with a slight delay drop WE; 2) hold the address, and after a slight
delay raise WE.

3) Instead of a slight delay in WE, drive WE on the falling edge of the
clock: again 2 cycles as above, with the slight delay being the 1/2-cycle
delay of the falling edge.

4) Discrete pulse-generation logic: have logic on the board with delay
lines to generate the write pulse, so that WE pulses low shortly after
the address is stable and comes back high shortly before the address
might change again. This lets you do a write every cycle.

5) Like the discrete pulse generation, but in the FPGA, using a
higher-speed clock. If you can be sure that the WE pulse path is faster
or slower than the address bus (including FPGA skew), you could use a
400-500 MHz clock and create a 7.5/8 ns pulse on WE; if you can enforce
that, you can use a 700 MHz clock and generate a 5-clock-cycle pulse
(7.14 ns) in the middle of the 10 ns cycle.

This is one of the limitations of asynchronous RAMs: write cycles take
more 'edges' to perform, thus needing either more cycles or something to
generate higher-speed edges.
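
To make option 3 concrete, here is a minimal, untested Verilog sketch
(signal names invented, a 256K x 16 part assumed, and /CS, /LB and /UB
ignored as in the original post):

module sram_wr2 (
    input  wire        clk,
    input  wire        wr_req,     // request a write (sampled when not mid-write)
    input  wire [17:0] wr_addr,
    input  wire [15:0] wr_data,
    output reg  [17:0] sram_a,
    inout  wire [15:0] sram_d,
    output reg         sram_we_n,  // idles high after the first falling edge
    output wire        sram_oe_n
);
    reg [15:0] dout;
    reg        wr_phase = 1'b0;    // cycle 1: address/data launched
    reg        hold     = 1'b0;    // cycle 2: address/data held

    // Rising edges run the two-cycle sequence.
    always @(posedge clk) begin
        wr_phase <= wr_req && !wr_phase;
        hold     <= wr_phase;
        if (wr_req && !wr_phase) begin
            sram_a <= wr_addr;
            dout   <= wr_data;
        end
    end

    // /WE is clocked on the FALLING edge: it drops in the middle of
    // cycle 1 and rises in the middle of cycle 2, so the address is
    // stable for half a clock on either side of the /WE pulse.
    always @(negedge clk)
        sram_we_n <= ~wr_phase;

    // Drive the bus and disable /OE for the whole write window
    // (bus-turnaround margin against the RAM's own drivers is ignored
    // here for brevity).
    assign sram_d    = (wr_phase | hold) ? dout : 16'bz;
    assign sram_oe_n = wr_phase | hold;
endmodule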


 
On Saturday, July 22, 2017 at 8:32:52 PM UTC+2, kristoff wrote:
Correct me if I am wrong, but the way I interpret the datasheet, the
"read data from the data bus" can be done at the same time as the "set
next address on the address bus". This -I think- means you can "overlap"
two consecutive reads, resulting in one read per clock cycle.

SRAM doesn't have a clock, you just have to comply with the required timing

But I do not see how (or if) something similar can be done for "write"
operations; perhaps I am missing something.

write happens on the rising edge of /WE

-Lasse
 
On 7/22/17 3:56 PM, lasselangwadtchristensen@gmail.com wrote:
write happens on the rising edge of /WE

-Lasse

Actually, with asynchronous parts, things don't happen 'on edges' but on
levels (you measure timing requirements edge to edge). Asynchronous
SRAMs tend to be a sea of RS flip-flops, and when write is low, the
addressed flip-flops have their set or reset line asserted; so if you
wanted to name a time when the write happened, it would be the falling
edge, with a propagation-delay/hold requirement.

Toh is the minimum guaranteed propagation delay from address to data,
just like Taa is the maximum delay from address to data. (Trc actually
isn't a critical parameter for the RAM itself, but a nominal system
parameter. With asynchronous SRAM, changing the address inputs faster
than Trc won't cause any problems, except that you won't get valid data
out until you stop doing it.)
 
Hi Richard,


Thank you for your reply.

Your message really helped to better understand the timing waveforms.


I'll start with the simplest setup and after that experiment with using
the falling edge of the clock to clear the /WE signal (option 3).



Kristoff
 
On 7/22/17 7:46 PM, kristoff wrote:
I'll start with the simplest setup and after that experiment with using
the falling edge of the clock to clear the /WE signal (option 3).



Kristoff

One thing to keep in mind: having a 10 ns memory part does NOT mean you
can talk to it with a 100 MHz (10 ns) clock. You will need to add in the
clock-to-output time on your address bus, and the needed setup time on
the data bus coming back in. If you want the best performance, you want
both of these to use FFs in the I/O blocks of the FPGA if possible, as
those have much lower propagation delays.
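
(To put made-up numbers on it: with, say, 6 ns clock-to-output on the
FPGA's address pins, the 10 ns Taa of this part, and 2 ns of input setup
on the data pins, a pipelined read needs an 18 ns period -- about
55 MHz, not 100 MHz.)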

Asynchronous devices can be harder to use, but can give you
significantly improved read performance if you are worried about
latency, as synchronous interfaces can cost clock cycles. (On the other
hand, synchronous interfaces can often write faster, as you can just
stream the data and the latency isn't important.)
 
I think what Richard wrote is the clearest explanation of why there is no
bulk write with async RAM. The write is controlled by the level of the
AND of WR- and CS-. So while these two signals are low, it is expected
that the address does *not* change. If the address changes, the RAM cell
selected will change, and extraneous cells can be selected as the address
lines settle. By writing to location 3 and then 4 without removing WR or
CS, you can be writing to any combination of locations 0 to 7 during the
transition (3 = 011 and 4 = 100 differ in all three low address bits, so
any value can be glimpsed while they change). Since none of this meets
timing, the writes will be random garbage, not even the data you are
trying to write to locations 3 and 4.

When both WR and CS are asserted, keep the address stable, and keep the
data stable for the last N ns before either control line is deasserted.

--

Rick C
 
On 22/07/2017 20:56, lasselangwadtchristensen@gmail.com wrote:
SRAM doesn't have a clock, you just have to comply with the required timing

There are some forms of clocked SRAM. ZBT was one type introduced by IDT.

I assume it still exists?


--
Mike Perkins
Video Solutions Ltd
www.videosolutions.ltd.uk
 
On 7/29/17 3:32 PM, Mike Perkins wrote:
There are some forms of clocked SRAM. ZBT was one type introduced by IDT.

I assume it still exists?

The datasheet pointed to was a classical asynchronous static RAM, which
doesn't have a clock.

There ARE synchronous static RAMs, which do have a clock pin. Synchronous
devices tend to be a bit easier to interface to synchronous systems,
which most FPGA systems are. Sometimes you lose a bit in latency when
using them, though.
 
I don't know the details of how SRAM is constructed, but there was a
strong market for it until maybe about 10 years ago. Then growth of SRAM
sizes pretty much stopped as new devices dwindled. DRAM has continued to
improve at the cutting edge of semiconductor technology along with
flash, but SRAM is now the red-headed stepchild. I guess the
functionality of SRAM has largely been incorporated into FPGAs. If more
size is needed than is convenient in an FPGA, DRAM is used. It may have
longer latency, but speed is certainly not lacking.

--

Rick C
 
On 31/07/17 05:26, rickman wrote:
I don't know the details of how SRAM is constructed, but there was a
strong market for it until maybe about 10 years ago.

Roughly speaking, DRAM needs one transistor and a capacitor per cell;
SRAM needs more transistors (six, in the classic cell). So SRAM costs a
good deal more per bit than DRAM. Once speeds reached the point where
bus speeds were the limiting factor for throughput rather than the
memory speed, and after DRAM started having internal refresh rather than
external refresh (needing active read/re-write cycles from the memory
controller), DRAM was almost as fast as SRAM but much cheaper. SRAM
still wins on latency (and lower standby power), but as you say, the
SRAM has moved on-chip (FPGAs, caches in processors, on-chip RAM in
microcontrollers) for even lower latency.
 
kristoff wrote:
I'll start with the simplest setup and after that experiment with using
the falling edge of the clock to clear the /WE signal (option 3).

Generating a synchronously gated WE in a single cycle with a 1x clock can be done fairly easily by using the FPGA's dual-edge output flip-flop primitives.

I posted some notes on this technique (for a Spartan-3) to the fpga-cpu group many years ago:
https://groups.yahoo.com/neo/groups/fpga-cpu/conversations/messages/2076
https://groups.yahoo.com/neo/groups/fpga-cpu/conversations/messages/2177

That S3 example code can be found here:
https://sites.google.com/site/fpgastuff/ram_test.zip

The dual-edge I/O primitive for the ICE family would be SB_IO or SB_IO_OD, see:
https://www.latticesemi.com/~/media/latticesemi/documents/technicalbriefs/sbticetechnologylibrary201701.pdf
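
A rough, untested sketch of that /WE generation for the iCE parts, per
the SB_IO description in that document ('wr_cycle', high for one clock
per write, is an invented name; address and data are assumed to come
from ordinary rising-edge I/O registers on the same clk):

module we_ddr (
    input  wire clk,
    input  wire wr_cycle,
    output wire sram_we_n
);
    SB_IO #(
        .PIN_TYPE(6'b010001)   // [5:2]=0100: DDR output; [1:0]=01: plain input
    ) we_pin (
        .PACKAGE_PIN(sram_we_n),
        .OUTPUT_CLK(clk),
        .D_OUT_0(1'b1),        // after the rising edge: /WE high (address settling)
        .D_OUT_1(~wr_cycle)    // after the falling edge: /WE low if writing
    );
endmodule

/WE thus falls half a cycle after the address is launched and rises on
the next rising edge, delay-matched to any address change since both
come from I/O-block registers.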

-Brian
 
On Tuesday, August 1, 2017 at 7:21:32 PM UTC-4, brim...@gmail.com wrote:

Generating a synchronously gated WE in a single cycle with a 1x clock can be done fairly easily by using the FPGA's dual-edge output flip-flop primitives.


It's even easier than that to synchronously control a standard async SRAM. Simply connect WE to the clock and hold OE active all the time except for cycles where you want to write something new into the SRAM.
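
As a concrete (untested) reading of that suggestion, with invented names
and assuming the clock can be driven out to the pin cleanly:

module sram_we_clk (
    input  wire        clk,
    input  wire        next_is_write,
    input  wire [17:0] next_addr,
    input  wire [15:0] next_data,
    output reg  [17:0] sram_a,
    inout  wire [15:0] sram_d,
    output wire        sram_we_n,
    output wire        sram_oe_n
);
    reg [15:0] wr_data_q;
    reg        wr_cycle = 1'b0;

    assign sram_we_n = clk;       // /WE is literally the clock: low for the
                                  // second half of each cycle, so the write
                                  // commits on the rising edge
    assign sram_oe_n = wr_cycle;  // /OE stays on except on write cycles
    assign sram_d    = wr_cycle ? wr_data_q : 16'bz;

    always @(posedge clk) begin   // everything else is launched once per cycle
        sram_a    <= next_addr;
        wr_data_q <= next_data;
        wr_cycle  <= next_is_write;
    end
endmodule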

Kevin Jennings
 
KJ wrote on 8/6/2017 8:01 AM:
It's even easier than that to synchronously control a standard async SRAM. Simply connect WE to the clock and hold OE active all the time except for cycles where you want to write something new into the SRAM.

That would depend a *lot* on the details of the setup and hold times for the
async SRAM, no? You can do what you want with data for much of the clock
cycle, but the address has to meet setup and hold for the entire WE time.
That's typically more than half a clock cycle and makes it hard to use it on
every clock cycle.

--

Rick C
 
On Sunday, August 6, 2017 at 6:40:25 PM UTC+2, rickman wrote:
That would depend a *lot* on the details of the setup and hold times for the
async SRAM, no? You can do what you want with data for much of the clock
cycle, but the address has to meet setup and hold for the entire WE time.
That's typically more than half a clock cycle and makes it hard to use it on
every clock cycle.

and just using the clock gives you the headache of trying to control
routing delays on data vs. WE

using the dual-edge output flip-flop makes it all much more controllable
 
On Sunday, August 6, 2017 at 12:40:25 PM UTC-4, rickman wrote:
KJ wrote on 8/6/2017 8:01 AM:
It's even easier than that to synchronously control a standard async SRAM. Simply connect WE to the clock and hold OE active all the time except for cycles where you want to write something new into the SRAM.

That would depend a *lot* on the details of the setup and hold times for the
async SRAM, no? You can do what you want with data for much of the clock
cycle, but the address has to meet setup and hold for the entire WE time.
That's typically more than half a clock cycle and makes it hard to use it on
every clock cycle.

Address (and data) setup and hold times are easily met. As a first-order approximation, the setup time will be T/2 - Tco(max). The address hold time will be Tco(min).
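
(With invented numbers: a 20 ns cycle and an FPGA Tco between 2 and 7 ns
gives 20/2 - 7 = 3 ns of address setup before WE falls, and 2 ns of hold
after it rises; both exceed the part's 0 ns requirements.)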

What is your source for the statement "That's typically more than half a clock cycle"? The ancient Cypress CY62256N lists both of these requirements (Tsa and Tha) as 0 ns [1].

The technique works. You get single-cycle read or write on 100% of the clock cycles, timing is met, period... and it worked 20+ years ago on a product I designed [2].

Kevin Jennings

[1] http://www.cypress.com/file/43841/download page 7.
[2] USPTO 6,169,703 (Patent status = Expired) https://patentimages.storage.googleapis.com/8a/c2/3d/f566f483f9e961/US6169703.pdf
 
On Sunday, August 6, 2017 at 1:30:46 PM UTC-4, lasselangwad...@gmail.com wrote:

and just using the clock gives you the headache of trying to control
routing delays on data vs. WE

using the dual-edge output flip-flop makes it all much more controllable

Not true. There is nothing special that needs to be done to "control routing delays on data vs. WE". Do you have any basis for that statement?

Using the method I described is absolutely the same as connecting up two 74X374 flip flops, nothing more, nothing less. How is that a 'headache'?

Kevin Jennings
 
KJ wrote on 8/6/2017 1:40 PM:
Not true. There is nothing special that needs to be done to "control routing delays on data vs. WE". Do you have any basis for that statement?

As long as the signals are registered in the output FFs, that's true. But
you can't register the clock! So the routing delays will be *very*
important if running a fast async SRAM.

If the dual-edge output flip-flops are used, the "clock" on WE is in
effect registered too, in essence giving all signals the same delays
within a tolerance.

Using the method I described is absolutely the same as connecting up two 74X374 flip flops, nothing more, nothing less. How is that a 'headache'?

Huh?

--

Rick C
 
