Location constraints questions

Thomas Heller

I have VHDL code that creates a tapped delay line, using the carry
chain of a spartan3 chip. The cells (up to 128) are created in a
generate loop:

   gen_delaychain: for i in 0 to N-1 generate
     delaycell_inst : delaycell port map (
       clock    => clock,
       cin      => cy(i),
       cout     => cy(i+1),
       cout_reg => cy_reg(i));
   end generate;

As the delay chain has to run vertically through the chip (Spartan3E),
I have RLOC constraints in the UCF file:

   INST "gen_delaychain[0].delaycell_inst/MUXCY_1" RLOC="X0Y0";
   INST "gen_delaychain[1].delaycell_inst/MUXCY_1" RLOC="X0Y0";
   INST "gen_delaychain[2].delaycell_inst/MUXCY_1" RLOC="X0Y1";
   INST "gen_delaychain[3].delaycell_inst/MUXCY_1" RLOC="X0Y1";
... and so on.

First question:

Is it possible to define these constraints in the VHDL code, in the
generate loop, so that I don't have to write lots of these lines?

Second question:

Is it possible to have these RLOC constraints so that the carry chain
runs vertically through the chip, and have at the same time an absolute
LOC constraint for one of these cells to specify the absolute location
of the whole chain?

Third question:

If I want to use more than one of these carry chains, how can I specify
the relative location of the cells in each chain, but let ISE determine
the y-position of the chains itself?

Thanks,
Thomas
 
Thomas Heller wrote:

I have VHDL code that creates a tapped delay line, using the carry
chain of a spartan3 chip. The cells (up to 128) are created in a
generate loop:
I have no answer for your questions, but are you sure it is a good idea to
use a delay line with a chain of gates? I've read that the delay can change
very much with temperature, voltage or different batches for your FPGA (if
you need it for a product). And it changes each time you change your VHDL
program and the routing is different, but maybe you can solve this with
your location constraints.

--
Frank Buss, http://www.frank-buss.de
piano and more: http://www.youtube.com/user/frankbuss
 
On 16.01.2011 20:23, Frank Buss wrote:
Thomas Heller wrote:

I have VHDL code that creates a tapped delay line, using the carry
chain of a spartan3 chip. The cells (up to 128) are created in a
generate loop:

I have no answer for your questions, but are you sure it is a good idea to
use a delay line with a chain of gates? I've read that the delay can change
very much with temperature, voltage or different batches for your FPGA (if
you need it for a product). And it changes each time you change your VHDL
program and the routing is different, but maybe you can solve this with
your location constraints.

Of course the whole chain must be calibrated, and changes in prop-delay
depending on temperature or voltage must be compensated.

I'm trying to do high-resolution timing measurements with 100ps or so
resolution, as described in several papers. One example is this:

http://lss.fnal.gov/archive/2009/pub/fermilab-pub-09-608-e.pdf

Thomas
 
Thomas Heller wrote:

Of course the whole chain must be calibrated, and changes in prop-delay
depending on temperature or voltage must be compensated.

I'm trying to do high-resolution timing measurements with 100ps or so
resolution, as described in several papers. One example is this:

http://lss.fnal.gov/archive/2009/pub/fermilab-pub-09-608-e.pdf
Thanks, looks interesting.

--
Frank Buss, http://www.frank-buss.de
piano and more: http://www.youtube.com/user/frankbuss
 
Thomas Heller <theller@ctypes.org> wrote:

On 16.01.2011 20:23, Frank Buss wrote:
Thomas Heller wrote:

I have VHDL code that creates a tapped delay line, using the carry
chain of a spartan3 chip. The cells (up to 128) are created in a
generate loop:

I have no answer for your questions, but are you sure it is a good idea to
use a delay line with a chain of gates? I've read that the delay can change
very much with temperature, voltage or different batches for your FPGA (if
you need it for a product). And it changes each time you change your VHDL
program and the routing is different, but maybe you can solve this with
your location constraints.

Of course the whole chain must be calibrated, and changes in prop-delay
depending on temperature or voltage must be compensated.

I'm trying to do high-resolution timing measurements with 100ps or so
resolution, as described in several papers. One example is this:
If the input signal you are measuring consists of many events, you could
try to use the delay line in the DCM to create a sampler by changing
the phase a little.

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
nico@nctdevpuntnl (punt=.)
--------------------------------------------------------------
 
On 16.01.2011 22:10, Nico Coesel wrote:
Thomas Heller<theller@ctypes.org> wrote:
I'm trying to do high-resolution timing measurements with 100ps or so
resolution, as described in several papers. One example is this:

If the input signal you are measuring consists of many events, you could
try to use the delay line in the DCM to create a sampler by changing
the phase a little.

The input signal is not repetitive, so this is not possible.

Thomas
 
On Sun, 16 Jan 2011 20:04:04 +0100, Thomas Heller <theller@ctypes.org> wrote:

I have VHDL code that creates a tapped delay line, using the carry
chain of a spartan3 chip. The cells (up to 128) are created in a
generate loop:

First question:

Is it possible to define these contraints in the VHDL code, in the
generate loop so that I don't have to write lots of these lines?
Here is one paper describing roughly this approach.

http://www.riverside-machines.com/pub2/xilinx/vhdl_rpm/place1.htm

Linked from

http://www.riverside-machines.com/pub2/xilinx/vhdl_rpm/top.htm

which may have more information.

- Brian
 
Thomas Heller <theller@ctypes.org> wrote:

On 16.01.2011 22:10, Nico Coesel wrote:
Thomas Heller<theller@ctypes.org> wrote:
I'm trying to do high-resolution timing measurements with 100ps or so
resolution, as described in several papers. One example is this:

If the input signal you are measuring consists of many events, you could
try to use the delay line in the DCM to create a sampler by changing
the phase a little.

The input signal is not repetitive, so this is not possible.
Then you do have a time-consuming problem :)

Did you think about placing the delay lines by hand by using the
floorplan editor?

I'd try to fix the delay lines in place as much as possible and have
ISE route the rest of the circuits around them. You could write a piece
of software to generate the UCF files. This would save a lot of typing
and it is easy to change. I have used that method for a design which
used a very complicated multiplexing scheme which was stored in a
block RAM.

It would also be nice if you could have the delay lines in a separate
netlist which is included during the build (like an IP core). You could
also generate this together with the UCF files. Then the whole delay
line blocks are just drop-in objects. It might take some time to
figure out, but it might be cleaner and more reproducible than using
VHDL. Another pitfall is the optimizer; it should not be allowed to
optimize the delay lines away.

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
nico@nctdevpuntnl (punt=.)
--------------------------------------------------------------
 
Thomas Heller <theller@ctypes.org> wrote:
(snip)

I'm trying to do high-resolution timing measurements with 100ps or so
resolution, as described in several papers. One example is this:

http://lss.fnal.gov/archive/2009/pub/fermilab-pub-09-608-e.pdf
Sounds pretty neat, though, as it says, you need to do the
complete calibration.

Some years ago I worked with an analog ASIC-based TDC that, as I
understand it, charged a capacitor at constant current, starting at
the input pulse and ending at the clock. The voltage was then measured
with an ADC. I don't remember now (it was built into the ASIC, and I
didn't have anything to do with that) how it kept the calibration
right. It was an 8-bit ADC, with full scale corresponding to one clock
period. It supplied the low bits; a counter at 66 MHz supplied the
high-order bits. Maybe it used a PLL to adjust the current.

It seems to me that you need FF's with a known time, or at least
constant delay, from the carry chain taps. I don't remember seeing
that in your macros, but I wasn't looking for them, either.

A thought that I didn't see in the Fermilab paper, and maybe won't
work: Use the delay through the whole delay line (or another one
in the same FPGA) to adjust the voltage and/or temperature to keep
the delay constant. Also, maybe you can work with the DCM in the FPGA.

-- glen
 
Nico Coesel <nico@puntnl.niks> wrote:
(snip)

It would also be nice if you could have the delay lines in a separate
netlist which is included during build (like an IP core). You could
also generate this together with the UCF files. Then the whole delay
line block(s) are just drop-in objects. It might take some time to
figure it out but it might be cleaner and more reproducible than using
VHDL. Another pitfall is the optimizer; it should not be allowed to
optimize the delay lines.
In the XC4000 days, I remember using RPMs from Verilog. I would
write a dummy (empty) module, or maybe one with very simple logic,
generate the netlist, and then, in a step I don't remember exactly,
replace the dummy module with the RPM. The result then goes
into P&R as usual. The RPM includes relative positioning, and
also specific carry chain use, but otherwise can be moved around
by P&R.

-- glen
 
On Jan 16, 2:04 pm, Thomas Heller <thel...@ctypes.org> wrote:
I have VHDL code that creates a tapped delay line, using the carry
chain of a spartan3 chip.  The cells (up to 128) are created in a
generate loop:

   gen_delaychain: for i in 0 to N-1 generate
     delaycell_inst : delaycell port map (
       clock    => clock,
       cin      => cy(i),
       cout     => cy(i+1),
       cout_reg => cy_reg(i));
   end generate;

As the delay chain has to run vertically through the chip (Spartan3E),
I have RLOC constraints in the UCF file:

   INST "gen_delaychain[0].delaycell_inst/MUXCY_1" RLOC="X0Y0";
   INST "gen_delaychain[1].delaycell_inst/MUXCY_1" RLOC="X0Y0";
   INST "gen_delaychain[2].delaycell_inst/MUXCY_1" RLOC="X0Y1";
   INST "gen_delaychain[3].delaycell_inst/MUXCY_1" RLOC="X0Y1";
... and so on.

First question:

Is it possible to define these constraints in the VHDL code, in the
generate loop, so that I don't have to write lots of these lines?

Second question:

Is it possible to have these RLOC constraints so that the carry chain
runs vertically through the chip, and have at the same time an absolute
LOC constraint for one of these cells to specify the absolute location
of the whole chain?

Third question:

If I want to use more than one of these carry chains, how can I specify
the relative location of the cells in each chain, but let ISE determine
the y-position of the chains itself?

Thanks,
Thomas
Interesting. I am currently working on something similar using an
external programmable delay line.

I looked at the paper, but I'm not sure I understand exactly what you
are trying to measure really. I guess my concern would not be the
issues of timing consistency in the active portions like the carry
chain. Rather my concern would be the routing. If you look at a
typical timing report, much if not most of a delay is in the routing.
This can be minimized, but that is the issue. If you don't at least
hand place, and better yet hand route, the routing delays will dominate
the timing differences between the taps and defeat the concept.

But then I may have missed something in this thread or not fully
understood what I read in the paper.

Rick
 
rickman <gnuarm@gmail.com> wrote:
(snip)

Interesting. I am currently working on something similar using an
external programmable delay line.

I looked at the paper, but I'm not sure I understand exactly what you
are trying to measure really. I guess my concern would not be the
issues of timing consistency in the active portions like the carry
chain. Rather my concern would be the routing.
There isn't much of a question on routing for the carry chain,
especially if you make RPMs out of it. It seems to me that you
do have to be careful to get the FF's a constant delay from
the chain, but if they are in the same CLB, then I believe it
shouldn't be too far off. Well, there is also the clock tree
delay, which also has to be consistent enough. That is, more
consistent than it does for ordinary designs.

Then there is the part in the paper about a bubble, which could
occur when the delay is slightly different, such that one tap does
come before the previous tap. There is extra logic to fix up
that case.
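
One common way to make the readout tolerant of such bubbles is to count the ones in the registered tap vector instead of searching for the first 1-to-0 transition. This is not necessarily the fix-up logic used in the paper; it is only a minimal sketch of the idea. The entity name `ones_counter`, the port names, and the 8-bit result width are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ones_counter is                           -- hypothetical name
  generic (N : positive := 128);                 -- number of taps (must be <= 255 here)
  port (
    clock : in  std_logic;
    taps  : in  std_logic_vector(N-1 downto 0);  -- registered tap outputs
    count : out unsigned(7 downto 0));           -- number of taps that read '1'
end entity;

architecture sketch of ones_counter is
begin
  process (clock)
    variable ones : natural range 0 to N;
  begin
    if rising_edge(clock) then
      -- Count every '1' in the vector. Two neighbouring taps that
      -- arrive in swapped order (a bubble) do not change the total.
      ones := 0;
      for i in taps'range loop
        if taps(i) = '1' then
          ones := ones + 1;
        end if;
      end loop;
      count <= to_unsigned(ones, count'length);
    end if;
  end process;
end architecture;
```

For a pattern such as 1110100, a first-zero search reports 3 while the count reports 4; the count stays the same whichever of the two swapped taps actually switched first. A 128-input adder in one clock cycle may need pipelining in a real design.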

If you look at a
typical timing report, much if not most of a delay is in the routing.
This can be minimized, but that is the issue. If you don't hand place
at least, and better yet, hand route the routing delays will dominate
the timing differences between the taps and defeat the concept.
I don't know about the OP, but these are commonly used in PET
scanners to determine the time between the arrival of the two
gamma rays. The time resolution gives the scanner resolution.

But then I may have missed something in this thread or not fully
understood what I read in the paper.
-- glen
 
On 16.01.2011 23:21, Brian Drummond wrote:
On Sun, 16 Jan 2011 20:04:04 +0100, Thomas Heller<theller@ctypes.org> wrote:

I have VHDL code that creates a tapped delay line, using the carry
chain of a spartan3 chip. The cells (up to 128) are created in a
generate loop:

First question:

Is it possible to define these constraints in the VHDL code, in the
generate loop, so that I don't have to write lots of these lines?

Here is one paper describing roughly this approach.

http://www.riverside-machines.com/pub2/xilinx/vhdl_rpm/place1.htm

Linked from

http://www.riverside-machines.com/pub2/xilinx/vhdl_rpm/top.htm

which may have more information.
It seems that these papers describe exactly what I need.
Thanks for the links!

Thomas
 
On 16.01.2011 23:32, glen herrmannsfeldt wrote:
Thomas Heller<theller@ctypes.org> wrote:
(snip)

I'm trying to do high-resolution timing measurements with 100ps or so
resolution, as described in several papers. One example is this:

http://lss.fnal.gov/archive/2009/pub/fermilab-pub-09-608-e.pdf

Sounds pretty neat, though, as it says, you need to do the
complete calibration.

Some years ago I worked with an analog ASIC based TDC that
(as I understand it), it charged a capacitor at constant current
starting at the input pulse, and ending at the clock. Then the
voltage is measured with an ADC. I don't remember now (it was
built into the ASIC, and I didn't have anything to do with that)
how it kept the calibration right. That is, it was an 8 bit ADC,
with full scale at the clock frequency. It supplied the low bits,
a counter at 66MHz supplied the high order bits. Maybe a PLL to
adjust the current.
There are, of course, several ways to do high resolution time
measurements.

It seems to me that you need FF's with a known time, or at least
constant delay, from the carry chain taps. I don't remember seeing
that in your macros, but I wasn't looking for them, either.
I have instantiated the whole delay cell in one CLB. Clock skew is
still an issue.

A thought that I didn't see in the Fermilab paper, and maybe won't
work: Use the delay through the whole delay line (or another one
in the same FPGA) to adjust the voltage and/or temperature to keep
the delay constant. Also, maybe you can work with the DCM in the FPGA.
Several years ago I implemented eight 16-bit DACs in a small Spartan-II
device as pulse-width-modulated outputs with external analog low-pass
filters, adjusting the power supply voltage in a loop. It worked pretty
well, but I want to avoid that approach in the current design.

Thomas
 
On Jan 16, 7:04 pm, Thomas Heller <thel...@ctypes.org> wrote:
First question:

Is it possible to define these constraints in the VHDL code, in the
generate loop, so that I don't have to write lots of these lines?
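-- Assumes "attribute RLOC : string;" is declared in the enclosing
-- architecture and that FDE comes from the UNISIM library.
SHIFT_REGS : for i in 0 to nregs-1 generate
  -- Builds "X0Y0", "X1Y0", ... from the loop index
  constant rloc_str : string := "X" & integer'image(i) & "Y0";
  attribute RLOC of FDE_INST : label is rloc_str;
begin
  FDE_INST : FDE port map (
    D  => cdc_TIG_sync_regs(i),
    Q  => cdc_TIG_sync_regs(i+1),
    CE => '1',
    C  => clk
  );
end generate SHIFT_REGS;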
 
On 17.01.2011 10:07, Chris Higgs wrote:
On Jan 16, 7:04 pm, Thomas Heller<thel...@ctypes.org> wrote:
First question:

Is it possible to define these constraints in the VHDL code, in the
generate loop, so that I don't have to write lots of these lines?

SHIFT_REGS : for i in 0 to nregs-1 generate
  constant rloc_str : string := "X" & integer'image(i) & "Y0";
  attribute RLOC of FDE_INST : label is rloc_str;
begin
  FDE_INST : FDE port map (
    D  => cdc_TIG_sync_regs(i),
    Q  => cdc_TIG_sync_regs(i+1),
    CE => '1',
    C  => clk
  );
end generate SHIFT_REGS;
Perfect!

Thanks,
Thomas
 
On Jan 17, 12:25 am, glen herrmannsfeldt <g...@ugcs.caltech.edu>
wrote:
rickman <gnu...@gmail.com> wrote:

(snip)

Interesting.  I am currently working on something similar using an
external programmable delay line.
I looked at the paper, but I'm not sure I understand exactly what you
are trying to measure really.  I guess my concern would not be the
issues of timing consistency in the active portions like the carry
chain.  Rather my concern would be the routing.  

There isn't much of a question on routing for the carry chain,
especially if you make RPMs out of it.  It seems to me that you
do have to be careful to get the FF's a constant delay from
the chain, but if they are in the same CLB, then I believe it
shouldn't be too far off.  Well, there is also the clock tree
delay, which also has to be consistent enough.  That is, more
consistent than it does for ordinary designs.
Getting the carry to the FF is what it is all about! No, unless the
newer architectures are different, you can't run the carry directly
into a FF. Actually, as I think about this, I realize that the
easiest way to get the carry out is to set up the adder to be adding 1
to -1 with the 1 being the signal you want to time. Then you can
capture the sum into a FF at each bit. So that would give you no
variation in delay.


Then there is the part in the paper about a bubble, which could
occur when the delay is slightly different, such that one tap does
come before the previous tap.  There is extra logic to fix up
that case.  

If you look at a
typical timing report, much if not most of a delay is in the routing.
This can be minimized, but that is the issue.  If you don't hand place
at least, and better yet, hand route the routing delays will dominate
the timing differences between the taps and defeat the concept.

I don't know about the OP, but these are commonly used in PET
scanners to determine the time between the arrival of the two
gamma rays.  The time resolution gives the scanner resolution.
I am working on a design that has to align an outgoing pulse to an
incoming pulse using a delay line with 10 ps resolution. Then it uses
a 100 MHz reference to calibrate the delay line and make updates to
the delay after the incoming pulse has gone away. Pretty interesting
design. I'm only doing the initial part of the FPGA where the 100 MHz
"low accuracy" stuff is. The rest of the board is PECL. Still, it's
pretty interesting.

Rick
 
For the record, here are the answers to my questions ;-)

On 16.01.2011 20:04, Thomas Heller wrote:
I have VHDL code that creates a tapped delay line, using the carry
chain of a spartan3 chip. The cells (up to 128) are created in a
generate loop:

gen_delaychain: for i in 0 to N-1 generate
delaycell_inst : delaycell port map (
clock => clock,
cin => cy(i),
cout => cy(i+1),
cout_reg => cy_reg(i));
end generate;

As the delay chain has to run vertically through the chip (Spartan3E),
I have RLOC constraints in the UCF file:

INST "gen_delaychain[0].delaycell_inst/MUXCY_1" RLOC="X0Y0";
INST "gen_delaychain[1].delaycell_inst/MUXCY_1" RLOC="X0Y0";
INST "gen_delaychain[2].delaycell_inst/MUXCY_1" RLOC="X0Y1";
INST "gen_delaychain[3].delaycell_inst/MUXCY_1" RLOC="X0Y1";
... and so on.

First question:

Is it possible to define these constraints in the VHDL code, in the
generate loop, so that I don't have to write lots of these lines?
Chris Higgs provided the code which does this:

SHIFT_REGS : for i in 0 to nregs-1 generate
  constant rloc_str : string := "X" & integer'image(i) & "Y0";
  attribute RLOC of FDE_INST : label is rloc_str;
begin
  FDE_INST : FDE port map (
    D  => cdc_TIG_sync_regs(i),
    Q  => cdc_TIG_sync_regs(i+1),
    CE => '1',
    C  => clk
  );
end generate SHIFT_REGS;
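
Applied to a vertical carry chain, the same technique might look like the sketch below. It assumes two carry stages per slice, as in the UCF lines from the question, so the Y coordinate is the loop index divided by two. The entity name `carry_chain_rloc` and the ports `hit` and `taps` are made up for illustration, and the tap registers of the real delay cell are left out; only the placement of the MUXCY primitives is shown. How the signal enters the bottom of the chain is left to the tools here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
library unisim;                      -- Xilinx primitive library (MUXCY)
use unisim.vcomponents.all;

entity carry_chain_rloc is           -- hypothetical name
  generic (N : positive := 128);     -- number of carry stages
  port (
    hit  : in  std_logic;                        -- signal entering the chain
    taps : out std_logic_vector(N-1 downto 0));  -- one unregistered tap per stage
end entity;

architecture sketch of carry_chain_rloc is
  attribute RLOC : string;           -- must be declared before it is used
  signal cy : std_logic_vector(N downto 0);
begin
  cy(0) <= hit;

  gen_delaychain : for i in 0 to N-1 generate
    -- Two stages share one slice, so the Y coordinate is i/2:
    -- stages 0,1 -> X0Y0; stages 2,3 -> X0Y1; and so on.
    constant rloc_str : string := "X0Y" & integer'image(i / 2);
    attribute RLOC of muxcy_inst : label is rloc_str;
  begin
    muxcy_inst : MUXCY port map (
      CI => cy(i),
      DI => '0',
      S  => '1',                     -- always pass the carry input through
      O  => cy(i+1));
    taps(i) <= cy(i+1);
  end generate gen_delaychain;
end architecture;
```

Integer division in VHDL truncates, which is what gives the 0, 0, 1, 1, 2, 2, ... sequence for the Y coordinate.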

Second question:

Is it possible to have these RLOC constraints so that the carry chain
runs vertically through the chip, and have at the same time an absolute
LOC constraint for one of these cells to specify the absolute location
of the whole chain?
The RLOC_ORIGIN constraint does this.
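
A minimal sketch of how this could be written as VHDL attributes, assuming the synthesis tool passes `RLOC_ORIGIN` and `U_SET` through to the implementation tools. The explicit `U_SET` name `delaychain` is an addition of this sketch so that all stages are certain to land in one set; the origin value `X10Y0`, the entity name `anchored_chain`, and the port names are arbitrary examples.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
library unisim;                      -- Xilinx primitive library (MUXCY)
use unisim.vcomponents.all;

entity anchored_chain is             -- hypothetical name
  generic (N : positive := 128);
  port (
    hit  : in  std_logic;
    taps : out std_logic_vector(N-1 downto 0));
end entity;

architecture sketch of anchored_chain is
  attribute RLOC        : string;
  attribute U_SET       : string;
  attribute RLOC_ORIGIN : string;
  signal cy : std_logic_vector(N downto 0);

  -- Stage 0 carries its relative position, the set name, and the
  -- absolute origin that anchors the whole set on the die.
  attribute RLOC        of stage0 : label is "X0Y0";
  attribute U_SET       of stage0 : label is "delaychain";
  attribute RLOC_ORIGIN of stage0 : label is "X10Y0";  -- arbitrary example
begin
  cy(0) <= hit;

  stage0 : MUXCY port map (CI => cy(0), DI => '0', S => '1', O => cy(1));

  -- Remaining stages: relative positions only, same set as stage 0.
  gen_rest : for i in 1 to N-1 generate
    constant rloc_str : string := "X0Y" & integer'image(i / 2);
    attribute RLOC  of muxcy_inst : label is rloc_str;
    attribute U_SET of muxcy_inst : label is "delaychain";
  begin
    muxcy_inst : MUXCY port map (
      CI => cy(i), DI => '0', S => '1', O => cy(i+1));
  end generate gen_rest;

  taps <= cy(N downto 1);
end architecture;
```

The origin is attached to a single member of the set; every other member keeps only its relative coordinates.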

Third question:

If I want to use more than one of these carry chains, how can I specify
the relative location of the cells in each chain, but let ISE determine
the y-position of the chains itself?
ISE automatically generates HU_SET constraints, which put the RLOCs in
separate groups.
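
If you would rather not depend on the automatically generated set names, the grouping can also be spelled out. The sketch below gives each chain its own `U_SET` name and no `RLOC_ORIGIN`, so the relative shape of each chain is fixed while the placer remains free to position each set. The set names `chain_a` and `chain_b`, the entity name `two_chains`, and the port names are assumptions of this sketch, not something from the thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
library unisim;                      -- Xilinx primitive library (MUXCY)
use unisim.vcomponents.all;

entity two_chains is                 -- hypothetical name
  generic (N : positive := 128);
  port (
    hit_a, hit_b   : in  std_logic;
    taps_a, taps_b : out std_logic_vector(N-1 downto 0));
end entity;

architecture sketch of two_chains is
  attribute RLOC  : string;
  attribute U_SET : string;
  signal cy_a, cy_b : std_logic_vector(N downto 0);
begin
  cy_a(0) <= hit_a;
  cy_b(0) <= hit_b;

  -- Chain A: relative placement within set "chain_a"
  gen_a : for i in 0 to N-1 generate
    constant rloc_str : string := "X0Y" & integer'image(i / 2);
    attribute RLOC  of mux_a : label is rloc_str;
    attribute U_SET of mux_a : label is "chain_a";
  begin
    mux_a : MUXCY port map (
      CI => cy_a(i), DI => '0', S => '1', O => cy_a(i+1));
  end generate gen_a;

  -- Chain B: same relative shape, but a separate set, so its RLOCs
  -- are not interpreted relative to chain A.
  gen_b : for i in 0 to N-1 generate
    constant rloc_str : string := "X0Y" & integer'image(i / 2);
    attribute RLOC  of mux_b : label is rloc_str;
    attribute U_SET of mux_b : label is "chain_b";
  begin
    mux_b : MUXCY port map (
      CI => cy_b(i), DI => '0', S => '1', O => cy_b(i+1));
  end generate gen_b;

  taps_a <= cy_a(N downto 1);
  taps_b <= cy_b(N downto 1);
end architecture;
```

Both chains use the same relative coordinates; because they belong to different sets, those coordinates do not collide.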

All this is documented here:

http://www.xilinx.com/itp/xilinx7/books/data/docs/cgd/cgd0155_116.html

RTFM'ly
Thomas
 
