400 Mb/s ADC


Jeff Peterson

We are building a new radio telescope called PAST
(http://astrophysics.phys.cmu.edu/~jbp/past6.pdf)
which we will install at the South Pole or in Western China.

To make this work, we will need to sample (6 to 8 bit precision) dozens
of analog voltages at 400 Msample/sec and feed these data streams into
PCs, one PC per sampler.

The flash ADCs we need are available (Maxim), but we are finding it
difficult to get the data into the PC.

One simple way would be to use SCSI Ultra640, but so far I have not
found any 640 adapters on the market. Is any 640 adapter available?
Anything coming soon?

Or we could go right into a PCI-X bus. Has anyone out there
done this at 400 Mb/s? Is this hard to do? An FPGA core license
for this seems expensive ($9K), with no guarantee of 400 MByte/s rates.

is there a better way?

thanks

-Jeff Peterson
 
"Nik Simpson" <n_simpson@bellsouth.net> wrote in message news:<DiMub.3510$gU2.827@bignews6.bellsouth.net>...
Jeff Peterson wrote:
We are building a new radio telescope called PAST
(http://astrophysics.phys.cmu.edu/~jbp/past6.pdf)
which we will install at the South Pole or in Western China.

To make this work, we will need to sample (6 to 8 bit precision) dozens
of analog voltages at 400 Msample/sec and feed these data streams into
PCs. One PC per sampler.

How big is a sample?
8 bits.


The flash ADCs we need are available (Maxim), but we are finding it
difficult to get the data into the PC.

One simple way would be to use SCSI ultra640, but so far I have not
found any 640 adapters on the market. Is any 640 adapter available?
anything coming soon?

or we could go right into a PCI-X bus. has anyone out there
done this at 400 Mb/s? is this hard to do? FPGA core license
for this seems expensive ($9K), with no guarantee of 400 MByte/s rates.

is there a better way?


Not clear from this whether you mean Mbit/s or MBytes/sec. If you mean
Mbit/s then obviously that's not a hard problem to solve. If, as I suspect,
you do mean MBytes/sec, then a PC (by the conventional definition) isn't
going to cut it, because typical PC motherboards don't support PCI-X at any
frequency; they are still limited to 33 MHz/32-bit PCI, which just isn't good
enough.
I do mean 400 MBytes/sec, and yeah, PCI 33/32 won't cut it.
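
For a quick sense of the margins, here is a back-of-the-envelope peak-bandwidth table for the buses mentioned in this thread (peak numbers only; sustained DMA throughput on real hardware is a good deal lower):

# Peak theoretical bandwidth = bus width (bytes) x clock.  Sustained rates on
# real hardware are lower; this only shows which buses are even in the running
# for a 400 MB/s stream.
REQUIRED_MB_S = 400

buses = {
    "PCI 32-bit / 33 MHz":    (4, 33e6),
    "PCI 64-bit / 66 MHz":    (8, 66e6),
    "PCI-X 64-bit / 100 MHz": (8, 100e6),
    "PCI-X 64-bit / 133 MHz": (8, 133e6),
}

for name, (width_bytes, clock_hz) in buses.items():
    peak = width_bytes * clock_hz / 1e6
    print(f"{name:24s} peak ~{peak:5.0f} MB/s  ({peak / REQUIRED_MB_S:.1f}x the target)")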

So the first step will be identifying a motherboard (probably with a
workstation or server classification) that supports PCI-X at 100 MHz or more,
which gives a peak theoretical throughput of 800 MB/s, but a sustained rate
probably closer to 400 MB/s. Then you need to define what you are doing with
the data; for example, you could be:

1. Just capturing the data, performing some operation on it, storing the
results and throwing away the samples.
We accumulate averages (of cross products of Fourier transforms).

2. Actually planning to capture 400 MB/s to disk for a sustained period,
which has some pretty hairy implications for storage capacity.

We won't store the raw data, just a very much reduced set.

I do know of one site doing something on a similar scale, and that's a US
Air Force project called the Starfire Optical Range
(http://www.sor.plk.af.mil/SOR/) at Kirtland AFB. I don't believe this
project is heavily classified (I certainly didn't have to sign anything
before helping them with the storage subsystem in 2000), so it might be worth
contacting them to see if they can help you spec out a system.
 
Yes, repacking might allow a 64/66 PCI bus to accept the data. I worry
that we will spend lots of time and money, but the margin will be
insufficient for it to actually work. I have heard that some PCI
cores are not too efficient.
Spend money and time on what? With regards to PCI, I am pretty sure it will
work. You can ask the PCI crowd on the PCI mailing list
(http://www.pcisig.com/developers/technical_support/pci_forum); they will
tell you for sure. And it doesn't have to be a core, you could use
industry-proven silicon, e.g. from PLX. I would be more worried about
processing all this data in your PC. I don't think any PC can do FFTs while
keeping up with such a data flow. Let's say you want to do a 1024-point FFT.
At 400 MSPS it will take only 2.56 us to accumulate a new block of data.
The latest and greatest ADI ADSP-TS201S can do a 1024-point complex FFT
in 16.8 microseconds. I doubt any of the Intel chips can do it faster.
AFAIK, TI DSPs aren't faster either. So, in my opinion, you will either need
an array of fast DSPs or some sort of FPGA-based processing. Trying to do
this kind of processing in the host doesn't sound feasible to me.


/Mikhail
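
Putting those two figures side by side (a rough sketch that ignores the cross products, the accumulation, and all data movement):

# Block-arrival time at 400 MSPS vs. the quoted 1024-point FFT time on a fast DSP.
FFT_SIZE    = 1024
SAMPLE_RATE = 400e6          # samples per second
DSP_FFT_US  = 16.8           # ADSP-TS201S 1024-pt complex FFT, figure quoted above

block_us = FFT_SIZE / SAMPLE_RATE * 1e6       # ~2.56 us to fill one block
engines  = DSP_FFT_US / block_us              # FFT engines needed for 100% duty cycle
duty_one = block_us / DSP_FFT_US              # duty cycle a single DSP can sustain

print(f"a new block every {block_us:.2f} us")
print(f"~{engines:.1f} such DSPs needed to keep up continuously")
print(f"a single DSP keeps up with only ~{duty_one:.0%} of the data")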
 
Jeff Peterson wrote:
"Nik Simpson" <n_simpson@bellsouth.net> wrote in message news:<vQPub.41$zi3.40@bignews3.bellsouth.net>...
we accumulate averages (of cross products of Fourier transforms)

So the basic problem is getting 400 MB/s of data into memory and processing
it, but are you reading 400 MB every second, or sampling, say, once every ten
seconds? If it's every second, then you've got a bigger problem, because I'd
be surprised if you can process it fast enough to get the job done before
the next sample comes along.
We will take about 64K samples, then we can pause while processing...
However, all the time we are pausing we are losing data, so we do want to
keep the duty cycle up. A 50% duty cycle is not a problem; 5% would be.
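
In concrete numbers (a small sketch of what those duty cycles mean per 64K-sample burst):

# Processing time available per 64K-sample burst at 400 MSPS, for two duty cycles.
SAMPLES_PER_BURST = 64 * 1024
SAMPLE_RATE       = 400e6

capture_us = SAMPLES_PER_BURST / SAMPLE_RATE * 1e6    # ~164 us to take the burst

for duty in (0.50, 0.05):
    # duty = capture / (capture + processing)
    processing_us = capture_us * (1 / duty - 1)
    print(f"{duty:.0%} duty cycle -> ~{processing_us:,.0f} us to crunch each burst")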
As stated elsethread, if you give up trying to get this throughput
on a conventional PC platform, you probably can do this on a "big enough"
FPGA. From your memory needs alone (64K x 6 bits x some overhead in which
to do your FFT) you're probably looking north of an XC2V2000, and the
single-chip price is measured in the thousands of US$. For the c.a.f
group to estimate with any precision the smallest practical part, you
need to do things like compute the number of bits of precision you need
for your butterflies. The 96 18x18 multipliers on an XC2V3000 would
come in real handy, especially if they didn't need to be cascaded for
more precision. If you can make your design work at 200 MS/s (DDR),
even 32 multipliers would let you run the FFT as fast as data points
stream in -- although that would also require 16 x 64K x 18 bits of
storage, out of reach for the current Xilinx offerings at least.
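
For scale, that storage figure works out as follows (the block-RAM comparison in the comment is an approximation, not a datasheet number):

# The "16 x 64K x 18 bits" buffering figure from the paragraph above, in absolute terms.
FACTOR          = 16
SAMPLES         = 64 * 1024
BITS_PER_SAMPLE = 18

total_bits = FACTOR * SAMPLES * BITS_PER_SAMPLE
print(f"{total_bits / 1e6:.1f} Mbit  (~{total_bits / 8 / 2**20:.2f} MiB)")   # ~18.9 Mbit

# Even the biggest Virtex-II parts of that generation had only a few Mbit of
# block RAM on chip, which is why this lands "out of reach" without going to
# external SRAM/SDRAM.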

I know who I'd ask first for help (ahem-ray-cough).

- Larry
 
jbp@cmu.edu (Jeff Peterson) wrote in message news:<369b6e8b.0311190715.4d66f38f@posting.google.com>...
We are building a new radio telescope called PAST
(http://astrophysics.phys.cmu.edu/~jbp/past6.pdf)
which we will install at the South Pole or in Western China.

To make this work, we will need to sample (6 to 8 bit precision) dozens
of analog voltages at 400 Msample/sec and feed these data streams into
PCs. One PC per sampler.

The flash ADCs we need are available (Maxim), but we are finding it
difficult to get the data into the PC.
You should definitely talk to high energy physics people, like the
STAR experiment at BNL or ALICE at CERN. Talk to the data acquisition
and Level 3 trigger people there. You probably can just buy boards
with fast links and DSPs from them.

If you want to design it yourself, here are some comments:
1)
If you use a busmaster device and you want to read data with a 50%
duty cycle, you can buffer the events in your readout board and reduce
the data rate to 200 MByte/s. You add one event of latency (see the rough
numbers sketched below).

2)
The fastest slots on a PC mainboard are the memory expansion slots.
It's an easy hardware interface to design, and if you use a server
mainboard with multiple memory channels you get a hell of a lot of
bandwidth. I remember seeing a crypto accelerator on a DIMM somewhere,
and Sun used to place graphics boards in memory slots.

3)
If your political environment is similar to high energy physics, then
if you can reduce the duty cycle it does not really matter how
expensive the readout boards are. With a large FPGA on a PCI board you
can try to perform all computations on the board and achieve a 100%
duty cycle.

Kolja Sulimma
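
Some rough numbers for points 1 and 3 above (the 64K-sample event, the 8-bit samples, and the once-per-second readout are assumptions carried over from earlier in the thread, not figures from this post):

# Point 1: buffer one event on the readout board, drain it over the whole period.
EVENT_BYTES = 64 * 1024          # one 64K-sample event of 8-bit samples (assumed)
SAMPLE_RATE = 400e6
DUTY        = 0.5

capture_s = EVENT_BYTES / SAMPLE_RATE     # ~164 us burst into the on-board buffer
period_s  = capture_s / DUTY              # at 50% duty, a new event every ~328 us
print(f"sustained host rate with one event of buffering: "
      f"{EVENT_BYTES / period_s / 1e6:.0f} MB/s, latency = one event")

# Point 3: compute FFTs, cross products and running averages on the board and
# only ship the accumulated spectra.  Illustrative numbers: 1024 bins, 32-bit
# accumulators, read out once per second.
BINS, ACC_BYTES, READOUT_HZ = 1024, 4, 1
print(f"host rate with full on-board processing: "
      f"{BINS * ACC_BYTES * READOUT_HZ / 1e3:.0f} kB/s per spectrum")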
 
The fastest slots on a PC Mainboard are the memory expansion slots.
It's an easy to design hardware interface and if you use a server
mainboard with multiple memory channels you get a hell lot of
bandwidth.
...and forget Windows support. Only a specially hacked Linux will be your
friend.

and SUN used to place graphics boards in memory slots.
Sorry? Sun used SBus for them, which is not a memory slot.

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com
 
The fastest slots on a PC Mainboard are the memory expansion slots.
It's an easy to design hardware interface and if you use a server
mainboard with multiple memory channels you get a hell lot of
bandwidth.

...and forget Windows support. Only the specially hacked Linux will be your
friend.
????
They need to write their own driver anyway.

I do not know much about Windows driver programming, but it should be
possible for a driver developer to map arbitrary physical address
ranges to user space.
You need chipset-specific code to enable access to the DIMM after
boot, because it must start out disabled to prevent Windows from using the
memory. But as they use the board only in a single setup, this is no
problem at all.
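
To illustrate the Linux side of that idea (a swapped-in illustration, not Windows driver code; the physical address is hypothetical and the range must already be excluded from the kernel's memory map, e.g. with a mem=-style boot option):

# Once a physical range has been hidden from the kernel, user space can map it
# through /dev/mem without any custom driver.  Run as root; PHYS_BASE is made up.
import mmap
import os
import struct

PHYS_BASE = 0x20000000          # hypothetical start of the hidden capture buffer
WINDOW    = 64 * 1024           # map one 64 KB event

fd  = os.open("/dev/mem", os.O_RDONLY | os.O_SYNC)
buf = mmap.mmap(fd, WINDOW, mmap.MAP_SHARED, mmap.PROT_READ, offset=PHYS_BASE)

first_samples = struct.unpack_from("16B", buf, 0)   # peek at the first 16 bytes
print(first_samples)

buf.close()
os.close(fd)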
Anyway, an experiment of that type is likely to use a real-time OS,
neither Windows nor plain vanilla Linux. Maybe OS-9 or VxWorks.

Sorry? Sun used S-Bus for them, which is not memory slot.
They did, but they also had UMA architectures based on DIMMs.

Kolja Sulimma
 
"Jeff Peterson" <jbp@cmu.edu> wrote in message
news:369b6e8b.0311190715.4d66f38f@posting.google.com...
We are building a new radio telescope called PAST
(http://astrophysics.phys.cmu.edu/~jbp/past6.pdf)
which we will install at the South Pole or in Western China.

To make this work, we will need to sample (6 to 8 bit precision) dozens
of analog voltages at 400 Msample/sec and feed these data streams into
PCs. One PC per sampler.

The flash ADCs we need are available (Maxim), but we are finding it
difficult to get the data into the PC.

One simple way would be to use SCSI ultra640, but so far I have not
found any 640 adapters on the market. Is any 640 adapter available?
anything coming soon?

or we could go right into a PCI-X bus. has anyone out there
done this at 400 Mb/s? is this hard to do? FPGA core license
for this seems expensive ($9K), with no guarantee of 400 MByte/s rates.

is there a better way?

thanks

-Jeff Peterson

Why don't you get an AGP graphics processor and try to connect your ADCs to
the GPU memory bus?
Run a PCI card for graphics on the PC.

The GPUs are programmable, so you might even be able to do some processing
inside...

Since you only need 400 MSamples/s, you could live with the Maxims.

If you want to get some real speed, then maybe something like the Atmel
TS8308500 (500 Mspl/s), TS8388B (1 Gspl/s) or TS83102G0B (2 Gspl/s) could be
of interest.
Going up to gigasamples per second would make your problem worse though
:)

http://www.atmel.com/dyn/products/datasheets.asp?family_id=611

--
Best Regards
Ulf at atmel dot com
These comments are intended to be my own opinion and they
may, or may not be shared by my employer, Atmel Sweden.
 
You need chipset specific code to enable access to the dimm after
boot, because it must start disabled to prevent windows from using the
memory.
Easier! Just add /MAXMEM to Windows's BOOT.INI, and it will skip some of the
BIOS-reported memory.
So, at second sight, the thing looks easier.

Anyway, an experiment of that type is likely to use an real time OS
anyway, neither windows nor plain vanilla linux. Maybe OS9 or VxWorks.
Surely.

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com
 
Maxim S. Shatskih wrote:
You need chipset specific code to enable access to the dimm after
boot, because it must start disabled to prevent windows from using
the memory.

Easier! Just add /MAXMEM to Windows's BOOT.INI, and it will skip some
of the BIOS reported memory.
So, at second sight, the thing looks easier.
The trick is knowing which physical memory slots are affected by the
BOOT.INI statement. An alternative is simply to grab physical memory address
space for a device driver during the boot sequence and lock Windows out of
it; DataCore uses that approach for its cache in SANsymphony.



--
Nik Simpson
 
On a sunny day (Thu, 20 Nov 2003 00:06:40 -0500) it happened "MM"
<mbmsv@yahoo.com> wrote in <bphi42$1nue87$1@ID-204311.news.uni-berlin.de>:

yes, repacking might allow a 64/66 PCI to accept the data. i worry
that we will spend lots of time and money, but the margin will be
insufficient for it to actually work. i have heard that some PCI
cores are not too efficient.

Spend money and time on what? With regards to PCI, I am pretty sure it will
work. You can ask PCI crowd on the PCI mailing list
(http://www.pcisig.com/developers/technical_support/pci_forum), they will
tell you for sure.And it doesn't have to be a core, you could use an
industry proven silicon, e.g. from PLX. I would be more worried about
processing all this data in your PC. I don't think any PC can do FFT's while
keeping up with such a data flow. Let's say you want to do 1024 point FFT.
At 400 MSPS it will take only 2.56 us to accumulate a new block of data.
The latest and greatest ADI ADSP-TS201S can do a 1024-point complex FFT time
in 16.8 microseconds. I doubt any of the Intel chips can do it faster.
AFAIK, TI DSP's aren't faster either. So, in my opinion you will either need
an array of fast DSP's or some sort of FPGA based processing. Trying to do
this kind of processing in host doesn't sound feasible to me.


/Mikhail


A little while ago in sci.crypt there was some talk about the first optical processor.
Basically this is an LED array with multipliers that can do 125 million complex 128-point
FFTs, or 500,000 16K-point DFTs, per second.
http://www.lenslet.com/newsItem.asp?showArchive=&newsId=184
www.lenslet.com
The thing itself is a normal DSP with the optical array (you can buy that separately too).
With normal logic, if you interfaced an FPGA you could perhaps go faster; those gallium
arsenide LEDs switch at 20 GHz...
No idea what it costs, perhaps less than you think.
Download the datasheet PDF, maybe it is of use...
JP
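
Taking the quoted figure at face value, a quick check against the 400 MSPS stream (assuming back-to-back, non-overlapping 128-point transforms):

# Sanity check of the quoted figure against the 400 MSPS requirement.
SAMPLE_RATE  = 400e6
FFT_SIZE     = 128
FFTS_CLAIMED = 125e6            # "125 million complex 128-point FFTs ... per second" (quoted above)

ffts_needed = SAMPLE_RATE / FFT_SIZE        # back-to-back, non-overlapping transforms
print(f"need ~{ffts_needed / 1e6:.2f} M FFT/s, claim is {FFTS_CLAIMED / 1e6:.0f} M FFT/s "
      f"(~{FFTS_CLAIMED / ffts_needed:.0f}x headroom)")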
 
The same holds true for CoolRunner XPLA3 devices (pin compatible with
Altera devices, and they can be powered from fruit).


Ulf Samuelsson wrote:

"Jeff Peterson" <jbp@cmu.edu> wrote in message
news:369b6e8b.0311190715.4d66f38f@posting.google.com...


We are building a new radio telescope called PAST
(http://astrophysics.phys.cmu.edu/~jbp/past6.pdf)
which we will install at the South Pole or in Western China.

To make this work, we will need to sample (6 to 8 bit precision) dozens
of analog voltages at 400 Msample/sec and feed these data streams into
PCs. One PC per sampler.

The flash ADCs we need are available (Maxim), but we are finding it
difficult to get the data into the PC.

One simple way would be to use SCSI ultra640, but so far I have not
found any 640 adapters on the market. Is any 640 adapter available?
anything coming soon?

or we could go right into a PCI-X bus. has anyone out there
done this at 400 Mb/s? is this hard to do? FPGA core license
for this seems expensive ($9K), with no guarantee of 400 MByte/s rates.

is there a better way?

thanks

-Jeff Peterson




Why dont you get an AGP Graphics processor, and try to connect your ADCs to
the GPU Memory Bus.
Run a PCI card for graphics on the PC.

The GPUs are programmable , so you might even be able to do some processing
inside...

Since you only need 400 MSamples/S, you could live with the Maxims.

If you want to get some real speed, then maybe something like the Atmel
TS8308500 (500 Mspl/s), TS8388B (1 Gspl/s) or TS83102G0B (Gspl/s) could be
of interest.
Going up to Giga Samples per second, would make your problem worse though
:)

http://www.atmel.com/dyn/products/datasheets.asp?family_id=611
 
If you want to get some real speed, then maybe something like the Atmel
TS8308500 (500 Mspl/s), TS8388B (1 Gspl/s) or TS83102G0B (Gspl/s) could be
of interest.
Going up to Giga Samples per second, would make your problem worse though
:)

http://www.atmel.com/dyn/products/datasheets.asp?family_id=611
I think we will try Atmel... Jeff
 
The fastest slots on a PC Mainboard are the memory expansion slots.
It's an easy to design hardware interface and if you use a server
mainboard with multiple memory channels you get a hell lot of
bandwidth. I remember seeing a cryptoaccelerator on a DIMM somewhere
and SUN used to place graphics boards in memory slots.
Hmmm... interesting idea.
 
Ulf Samuelsson wrote:

If you want to get some real speed, then maybe something like the Atmel
TS8308500 (500 Mspl/s), TS8388B (1 Gspl/s) or TS83102G0B (Gspl/s) could be
of interest.
Going up to Giga Samples per second, would make your problem worse though
:)

http://www.atmel.com/dyn/products/datasheets.asp?family_id=611
When did Atmel start making flash ADCs? Can someone actually buy these
now? How much do they cost?

see:

http://dustbunny.physics.indiana.edu/~paul/hallDrd

for our "merely" 250 Msps particle physics project.


Paul Smith
Indiana University Physics
 
"Paul Smith" <ptsmith@no_spam.indiana.edu> wrote in message
news:bpleh0$fdm$1@hood.uits.indiana.edu...
Ulf Samuelsson wrote:


If you want to get some real speed, then maybe something like the Atmel
TS8308500 (500 Mspl/s), TS8388B (1 Gspl/s) or TS83102G0B (2 Gspl/s)
could be
of interest.
Going up to Giga Samples per second, would make your problem worse
though
:)

http://www.atmel.com/dyn/products/datasheets.asp?family_id=611


When did Atmel start making flash ADCs? Can someone actually buy these
now? How much do they cost?
Yep, they have been around for quite some time.

see:

http://dustbunny.physics.indiana.edu/~paul/hallDrd

for our "merely" 250 Msps particle physics project.
My guess is maybe $500 for the 1 Gsample 8-bit devices, and
2-3x the price for the 2 Gsample 10-bit device.
This is for commercial spec; mil-spec devices are more expensive.

How many do you need?

--
Best Regards
Ulf at atmel dot com
These comments are intended to be my own opinion and they
may, or may not be shared by my employer, Atmel Sweden.



 
"Ulf Samuelsson" <ulf@NOSPAMatmel.com> wrote in message
news:bpis10$q70$1@public2.atmel-nantes.fr...
If you want to get some real speed, then maybe something like the Atmel
TS8308500 (500 Mspl/s), TS8388B (1 Gspl/s) or TS83102G0B (Gspl/s) could be
of interest.
Going up to Giga Samples per second, would make your problem worse though
:)
Maxim also has the MAX104 (1 GHz) and MAX108 (1.5 GHz), and other lower
speeds in the range of 500 MHz and up.
 
In article <369b6e8b.0311191630.15ece843@posting.google.com>, Jeff Peterson wrote:
we will take about 64K samples, then can pause while processing...
however all the time we are pausing we are losing data. so we do want to
keep the duty cycle up. 50% duty cycle is not a problem. 5% would be.
It sounds like you *really* want to do the FFTs on PCs, rather than on
an outboard DSP. (For your application that might make sense, as you're
only building one or a few of these things, and software development
will be hugely easier.) You might try "striping" the data across
multiple PCs. After the ADC, block the data into 64K "packets", and
have 2 or 3 links to FPGA "NICs" that receive blocks of 64K samples
(plus maybe a sequence number) and DMA them over PCI.

Even if you do try to convert to SCSI or Fibre Channel and use standard
adapters, you might still want to consider the striping idea.
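
A toy sketch of that striping/reassembly scheme (the 2-3 link count and the 64K block size come from the post; everything else here is purely illustrative):

# Tag each 64K-sample block with a sequence number, deal the blocks round-robin
# across N links, and re-order them on the receiving side.
N_LINKS = 3

def stripe(blocks):
    """Yield (link, seq, block) with blocks dealt round-robin across the links."""
    for seq, blk in enumerate(blocks):
        yield seq % N_LINKS, seq, blk

def reassemble(tagged):
    """Restore original order from (seq, block) pairs, whatever link carried them."""
    return [blk for _, blk in sorted(tagged)]

print(f"per-link rate: ~{400 / N_LINKS:.0f} MB/s")   # comfortably within 64/66 PCI

# Quick self-check with dummy stand-ins for 64K-sample blocks:
blocks = [bytes([i]) * 4 for i in range(6)]
tagged = [(seq, blk) for _link, seq, blk in stripe(blocks)]
assert reassemble(tagged) == blocks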

--
......................................................................
Peter Desnoyers (617) 661-1979 pjd@fred.cambridge.ma.us
162 Pleasant St.
Cambridge, Mass. 02139
 
On Sat, 22 Nov 2003 15:50:45 +0100, "Morten Leikvoll"
<m-leik@online.nospam> wrote:

"Ulf Samuelsson" <ulf@NOSPAMatmel.com> wrote in message
news:bpis10$q70$1@public2.atmel-nantes.fr...

If you want to get some real speed, then maybe something like the Atmel
TS8308500 (500 Mspl/s), TS8388B (1 Gspl/s) or TS83102G0B (Gspl/s) could be
of interest.
Going up to Giga Samples per second, would make your problem worse though
:)

Maxim also has the MAX104 (1 Ghz,) and MAX108 (1.5Ghz) and other lower
speeds in the range 500Mhz and up.
I worked on something similar that never got built. We wanted to
record the output of multiple 200 MHz 12-bit ADCs and store it on a
hard drive. There was some company that sold an optical-interface
hard drive that could manage the speed, but we had decided to use
multiple hard drives and split the data among the drives to achieve the
desired throughput, similar to what RAID does. As for FFTs, they
are done way faster in an FPGA than in any processor, including DSP
processors. The only problem is the cost of the FPGA if you need
really big FFTs (4K, 24 bits wide). The card I was working on was
basically an RF spectrum analyzer for a missile, so things needed to be
done in a hurry. The hard drive thing was for a prototype, so we wanted
to store the RF we were seeing without processing it.

Ray
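
For scale, a rough count of how many drives a software-striping scheme like that would need for the original poster's stream (the per-drive rate is an assumption about circa-2003 disks, not a spec):

# Rough drive count for software striping of the original 400 MB/s stream.
import math

STREAM_MB_S    = 400
PER_DRIVE_MB_S = 50        # assumed sustained sequential write rate per drive

drives = math.ceil(STREAM_MB_S / PER_DRIVE_MB_S)
print(f"~{drives} drives striped in software to sustain {STREAM_MB_S} MB/s")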
 
