LPDDR on spartan-3e

jonpry wrote:
Hi All,

I have a Spartan-3E board with a piece of LPDDR on it. After
modifying the initialization logic in the MiG sources, I was able to get the
user_example running in the simulator. In hardware I can see that the
chip is bursting out what was written to it previously. However,
inside the FPGA the data read back from the memory is not correct.
I originally suspected the DQS delay circuitry and built a simple
module that makes the MiG design cycle through all 6 DQS taps at one
per second. None of the taps gives a good read back. I am confused
as to what could be causing the problem.

I've noticed that in the simulator, things go badly if I run the
design too slowly. Anything slower than a 12 ns period causes read errors.
I haven't managed to track down the source of this, but it seems to be
related to some confusion in the data generator.

Any advice would be appreciated.

Thanks,

Jon Pry
 
I've looked into this a little further. It appears that at slower
clock speeds, rst_dqs_delay does not go low until slightly after the
last DQS clock in the burst, causing the FIFO write flag to stay
enabled until after the first word of the next transfer, be it a read or
a write, which scrambles the data patterns. I've yet to
determine whether this is really happening in hardware. I'm also not
convinced by the MiG behavioral test bench, as it does not include
assignment delays anywhere.
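
Roughly, the capture enable works like this (a behavioral sketch with made-up signal names, not the actual MiG RTL):

```verilog
// Behavioral sketch of a DQS-framed FIFO write enable.  Signal names
// are illustrative, not the actual MiG RTL.  rst_dqs_delay is the
// delayed burst envelope; it is supposed to drop right after the last
// DQS edge of the burst.  If it falls late (slow clock, fixed delay),
// fifo_we is still high when the first word of the NEXT transfer
// arrives, and the read data patterns get skewed by one word.
module dqs_fifo_we_sketch (
    input  wire dqs,            // read strobe from the memory
    input  wire rst_dqs_delay,  // delayed burst envelope
    output reg  fifo_we         // write enable of the capture FIFO
);
    always @(posedge dqs or negedge rst_dqs_delay) begin
        if (!rst_dqs_delay)
            fifo_we <= 1'b0;    // a late falling edge means a late disable
        else
            fifo_we <= 1'b1;
    end
endmodule
```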

Ideally I would like to run my LPDDR at very low speed to rule out
signal integrity problems until the logic is proven. There is no DLL
in the memory, and it seems to operate fine down at 10 MHz. That being
said, I have tried the design at all manner of speeds with little
difference. Any experience out there with getting MiG slowed down?
 
jonpry wrote:
[snip]
Can't say I am a great fan of MIG. The design seems incredibly bloated and
not very easy to get to run at a reasonable speed. I ended up writing my
own DDR2 controller.

You should check that MIG and the device allow you to run at such a slow
speed. You really need a good simulation to start with, with all timings
verified. Check the datasheet to verify that no timings are being violated.
Can you not look at the data on a scope to see if you are getting the
correct signals and verify timing? Memory can be a pain to get working, so
you need to be as meticulous as possible.

Regards

Jon

---------------------------------------
Posted through http://www.FPGARelated.com
 
On Dec 9, 11:47 am, "maxascent"
<maxascent@n_o_s_p_a_m.n_o_s_p_a_m.yahoo.co.uk> wrote:
[snip]
There are other issues with LPDDR if you mean the "mobile" low-power
parts. The start-up initialization sequence is different, as are the
I/O standards and the DQS timing. They do _not_ have delay-locked
loops in them, so read timing almost works better using your internal
clock than the DQS signal. Also, I used them with Lattice parts that
have special stunt logic for DDR, and had to scrap their DQS recovery
because of the I/O standard and the fact that their preamble
detector didn't work unless you had SSTL (it used the difference
in voltage level between AC and DC low in the standard).

-- Gabor
 
[snip]
The MIG controller is a bit complicated, but it does appear to be the
correct architecture for LPDDR parts. On the scope I can see that the
memory is indeed working properly, so its timings must be fine.
Whether or not the memory is meeting the FPGA's timing is a different
story.

I've managed to confirm that what happens in simulation when the
clock is slower than 12 ns is indeed what happens in hardware at
any speed from 10 to 100 MHz. I guess I will need to rewrite the DQS
FIFO enable logic. I think if DQS arrives after some margin, the logic is
just broken and won't turn off.

They do _not_ have delay-locked
loops in them, so read timing almost works better using your internal
clock than the DQS signal.
This argument seems backwards to me. There is almost no point in using
DQS on regular DDR parts, because the DLL phase-aligns it to the
master clock, giving you a multitude of good options. But with no DLL,
there is no phase guarantee, forcing you to use a truly source-synchronous
design.
 
On Dec 9, 12:06 pm, jonpry <jon...@gmail.com> wrote:
[snip]
The point I was making is that the DQS pins of the mobile DDR memories
are not phase-aligned to the DQ signals. For normal DDR memories the
DQS is edge-aligned to the DQ, which is not easy to use directly, but with a
90-degree phase-shift circuit in the FPGA (MIG uses this where possible)
you can very accurately center-sample the data. It is not so easy to get
a center sampling point using the DQS output of a mobile DDR device.

You can run mobile DDR much slower than standard DDR because
it doesn't have the DLL. You can even gate the clock to it if you
follow the rules in the data sheet. When running more slowly, the data eye
gets bigger and is easier to hit without the added complexity of the
DQS signals.

-- Gabor
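
The center-sampling scheme above, reduced to a behavioral sketch (hypothetical signal names; the real MIG capture path adds per-strobe delay elements and FIFOs):

```verilog
// Edge-aligned DDR read data sampled with a 90-degree shifted clock.
// Because DQ toggles on the edges of clk0, the edges of clk90 fall in
// the middle of each data eye -- this is center-sampling.
module ddr_center_sample (
    input  wire clk90,   // system clock shifted by 90 degrees
    input  wire dq,      // one edge-aligned data pin from the memory
    output reg  q_rise,  // bit launched on the rising edge of clk0
    output reg  q_fall   // bit launched on the falling edge of clk0
);
    always @(posedge clk90) q_rise <= dq;
    always @(negedge clk90) q_fall <= dq;
endmodule
```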
 
[snip]
I have read your other posts on the subject. You mentioned some
difference between mobile and standard DDR's DQ/DQS relationship.
After seeing this, I made an IMHO heroic effort to find out what this
difference is, and turned up not much. Maybe it is my particular chip,
but from the datasheet:

DQS edge-aligned with data for READs; center-aligned with data for WRITEs

Which sounds an awful lot like what you were describing for DDR. From
an implementation perspective, it seems like it would be trivial to
clock out DQS right along with the DQ. I can't imagine why they would do
anything else.

My operation seems to be working now, at least at 50 MHz. I get
problems at 100, but there are several things that could be causing
that. I ended up short-circuiting the rst_dqs_div loopback that goes
outside of the chip. This allowed the flag enough time to get the read
FIFO turned off at the end of the burst. I am still not totally sure
why this fix was needed in the test bench, let alone the hardware, and
I don't understand the consequences of removing it.
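
The bypass amounts to something like this (names are illustrative; the real MiG pin names differ):

```verilog
// Before: the envelope leaves the chip and comes back through a pair
// of pins so it picks up a delay comparable to the DQ/DQS round trip.
//   OBUF u_out (.I(rst_dqs_div_int), .O(rst_dqs_div_out));  // to pin
//   IBUF u_in  (.I(rst_dqs_div_in),  .O(rst_dqs_div));      // from pin
//
// After: feed the envelope straight back internally, so it falls in
// time to shut off the read FIFO write enable at the end of the burst.
assign rst_dqs_div = rst_dqs_div_int;
```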

Gabor, thanks for your help anyway; your original posts inspired me
to use LPDDR in my design, since Xilinx is more negative about the
whole thing.

~Jon
 
jonpry <jonpry@gmail.com> wrote:

[snip]
MIG is way too bloated especially on Spartan devices.

[snip]
This argument seems backwards to me. There is almost no point in using
DQS on regular DDR parts, because the DLL phase-aligns it to the
master clock, giving you a multitude of good options. But with no DLL,
there is no phase guarantee, forcing you to use a truly source-synchronous
design.
There is a phase guarantee (if you clock the memory from the FPGA), but
you need to calculate it yourself. On a Spartan-3E you should be
able to achieve 100 to 125 MHz depending on the speed grade. An added
bonus of writing your own DDR controller is that you can use any I/O
pin to connect the memory.

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
nico@nctdevpuntnl (punt=.)
--------------------------------------------------------------
 
[snip]
My board is not fully assembled yet. Normally the clock would be
supplied from the PLLs in an OMAP3 processor, but that part is not
populated yet, so I am supplying the clock from off-board. Living in
the third world, I have a shortage of clock sources at hand: a 50 MHz
oscillator, and whatever I can synthesize on another FPGA board. The
synthetic clocks do not seem to work at even 50 MHz, so I'm going to
put off further speed testing until the clocking situation improves.
It may work for all I know; everything looks fine on the scope at
100 MHz anyway.
 
jonpry <jonpry@gmail.com> wrote:

[snip]
Why aren't you using the clock multipliers in the Spartan-3E? You can
feed it almost any clock you want and create any clock frequency you
need. If the memory is connected to the FPGA, you should also clock
the memory from the FPGA. This eliminates the clock-input-to-clock-net
timing uncertainty.

 
[snip]
Mainly because the MIG design requires a CLK90, and there is no
CLKFX90. I think this is the root of the trouble. If I use a synthetic
clock from another FPGA, it has DLL jitter in it, and although the DLL
in my design seems to stay locked for a few seconds while running on
such a source, things don't seem to work quite right for that period of
time.

The memory is being clocked from the FPGA; it's just that I can't get a
low-jitter input at anything other than 50 MHz right now.
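
In other words, the DCM's fixed-phase taps come off CLKIN, not CLKFX, so a single DCM can't do the whole job (attribute values here are just examples):

```verilog
// A single Spartan-3E DCM: CLK90 is 90 degrees of the INPUT frequency,
// and the synthesized CLKFX output has no phase-shifted sibling, so
// there is no "CLKFX90".  Attribute values are examples only.
module single_dcm_sketch (
    input  wire clk50,    // 50 MHz oscillator
    output wire clk0,     // 50 MHz, 0 degrees
    output wire clk90,    // 50 MHz, 90 degrees -- NOT 100 MHz
    output wire clk100,   // 100 MHz from CLKFX, no 90-degree version
    output wire locked
);
    wire clk0_raw, clk0_buf;

    DCM_SP #(
        .CLKIN_PERIOD   (20.0),
        .CLKFX_MULTIPLY (2),    // 50 MHz * 2 / 1 = 100 MHz
        .CLKFX_DIVIDE   (1),
        .CLK_FEEDBACK   ("1X")
    ) u_dcm (
        .CLKIN  (clk50),
        .CLKFB  (clk0_buf),     // CLK0 fed back through a BUFG
        .RST    (1'b0),
        .CLK0   (clk0_raw),
        .CLK90  (clk90),
        .CLKFX  (clk100),
        .LOCKED (locked)
    );
    BUFG u_bufg (.I(clk0_raw), .O(clk0_buf));
    assign clk0 = clk0_buf;
endmodule
```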
 
On Dec 10, 1:46 pm, jonpry <jon...@gmail.com> wrote:
[snip]
I forgot to mention that for a time I tried cascaded DCMs in the one
FPGA, but was not able to get them to lock. Presumably I just did
something wrong. It was just easier to debug with the DLLs in different
chips, because I can have a bank of different frequencies and just
plug in a wire and see what happens. Now that it is generally
working, I suppose I could try cascaded DCMs again.
 
On Dec 10, 5:04 pm, jonpry <jon...@gmail.com> wrote:
[snip]
Cascading DCMs does not work particularly well. There is more jitter
on the FX outputs of the DCMs than on the CLK0, CLK90, and 2X outputs,
so the second DCM ends up with a jittery input clock. If you want
to try cascading at 100 MHz, just use the CLK2X output of the first
DCM to get a somewhat less jittery 100 MHz and feed that into
the DCM for the 90-degree shift. Unfortunately the Spartan-3 series
don't have PLLs, which would be a better choice for frequency
synthesis.

Regards,
Gabor
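
The suggestion above might be sketched like this (illustrative only; reset sequencing and attribute values would need checking against the DCM application notes):

```verilog
// Cascade sketch: DCM1 doubles 50 MHz to 100 MHz on its CLK2X output
// (less jitter than CLKFX), then DCM2 derives the 90-degree phase a
// MIG-style controller needs.
module dcm_cascade_sketch (
    input  wire clk50,      // 50 MHz board oscillator
    output wire clk100,     // 100 MHz, 0 degrees
    output wire clk100_90,  // 100 MHz, 90 degrees
    output wire locked
);
    wire clk2x, clk2x_buf, clk0_2, clk0_2_buf, clk90_raw;
    wire lock1, lock2;

    DCM_SP #(.CLKIN_PERIOD(20.0), .CLK_FEEDBACK("2X")) u_dcm1 (
        .CLKIN  (clk50),
        .CLKFB  (clk2x_buf),
        .RST    (1'b0),
        .CLK2X  (clk2x),
        .LOCKED (lock1)
    );
    BUFG u_b1 (.I(clk2x), .O(clk2x_buf));

    // Hold the second DCM in reset until the first has locked.
    DCM_SP #(.CLKIN_PERIOD(10.0), .CLK_FEEDBACK("1X")) u_dcm2 (
        .CLKIN  (clk2x_buf),
        .CLKFB  (clk0_2_buf),
        .RST    (~lock1),
        .CLK0   (clk0_2),
        .CLK90  (clk90_raw),
        .LOCKED (lock2)
    );
    BUFG u_b2 (.I(clk0_2),    .O(clk0_2_buf));
    BUFG u_b3 (.I(clk90_raw), .O(clk100_90));

    assign clk100 = clk0_2_buf;
    assign locked = lock1 & lock2;
endmodule
```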
 
jonpry <jonpry@gmail.com> wrote:

[snip]

Mainly because the MIG design requires a CLK90, and there is no
CLKFX90. I think this is the root of the trouble.

???? Read the Spartan-3E datasheet and user manual. The DCM has 90-,
180-, and 270-degree phase-shifted outputs, and the ability to shift
these clocks in small steps as well.

If I use a synthetic clock from another FPGA, it has DLL jitter in it,
and although the DLL in my design seems to stay locked for a few
seconds while running on such a source, things seem to not work quite
right for that period of time.

The DLL jitter is somewhere around 100 ps p-p maximum. If your design
fails by that margin, you'd better not start producing it. The FPGA is
quite tolerant of input jitter, BTW.

Your problem sounds like the pulses from your clock source are not
wide enough or are otherwise distorted. Get a >400 MHz scope and check
whether pulse widths and rise times are within the specs the FPGA
requires.

The memory is being clocked from the fpga, just I can't get a low
jitter input at anything other than 50mhz right now.
You can cascade DCMs as well but you should read the Xilinx
application notes to get it right.

 
