Send a pulse across clocks

Leo

Guest
Hello, I want to send a pulse from one clock domain to another such that, from the moment the pulse is generated in the source clock domain, it arrives on the first rising edge of the destination clock and lasts exactly one destination clock period. I know this problem is not generic and is subject to timing constraints and the clock frequency/phase relationship. So the question is: how best to implement it in RTL with Xilinx technology, and what timing constraints should be applied from the source to the destination FF? In particular, the destination clock runs at a little over half of the source clock frequency; the phase between them is unknown and may change over time.

Currently my idea is to use the asynchronous preset/clear of the destination FF, which would mean that on a specific part (Xilinx Spartan-6 or 7-series FPGAs) there must be some relation between the clocks that satisfies the minimum pulse width and propagation of the asynchronous signal.

Any help is appreciated.
 
On Thursday, January 22, 2015, 14:46:17 (UTC-3), Tim Wescott wrote:
> On Thu, 22 Jan 2015 06:05:23 -0800, Leo wrote:
>
> > Hello, I want to send a pulse from one clock domain to another, knowing
> > that from the time event that this pulse is generated in the source
> > clock domain it arrives in the first rising edge of the destination
> > clock domain and lasts exactly one clock period of the destination clock
> > domain.
>
> Huh? That's not at all clear.
>
> You're saying that the pulse always arrives on "the first rising edge of
> the destination clock domain" -- you mean the first rising edge after
> power up? Always? Or do you mean the first rising edge ever?
>
> Then you say the pulse lasts exactly one period of the destination clock
> domain -- how can it do that, when it's being sourced in a different
> domain?
>
> > Now, I know this problem is not generic and is subject to timing
> > constraints and clock frequency/phase relationship. So the question
> > would be, how to implement it best in RTL with Xilinx technology and
> > what time constraints to apply from source to destination FF. In
> > particular the destination clock is a little over half of the source
> > clock frequency, the phase between them is unknown and may change over
> > time.
> >
> > Currently my idea is to use the asynchronous preset/clear of the
> > destination FF, which would mean that on a specific part (Xilinx
> > Spartan 6 or 7-series FPGAs) there must be some relation between the
> > clocks that allows the minimum pulse width and propagation of the
> > asynchronous signal.
>
> If the pulse _from_ the faster clock domain always lasts two source
> clocks, then you should be able to reliably capture it with a plain old
> register. However, we get back to your unclear opening paragraph.
>
> Assuming that the pulse is of some shorter width, then yes you need to do
> something different. I don't know that you necessarily have LUT-by-LUT
> access to the preset (I'd have to spelunk through data sheets, or ask
> here). If you don't, you may have to build an RS FF, which should be
> possible with just a few lines of code. If you do that then you'll have
> some hold time requirement which the tools may or may not be able to
> calculate.
>
> > Any help is appreciated.
>
> If you want what I think you want, it would be helpful to allow a few
> clock cycles of delay in the destination, to allow for synchronization (or
> metastability killing -- whatever). You haven't said how quickly you need
> to act on this pulse.
>
> --
>
> Tim Wescott
> Wescott Design Services
> http://www.wescottdesign.com

You are right; after re-reading, it isn't clear. I want the pulse generated on a rising edge of the source clk to be transferred to the destination clk domain on the first rising edge of the destination clk after the pulse was generated. In other words, I want the destination clock to catch the pulse as fast as possible, and I also need the pulse to last a single cycle of the destination clk (I plan to shift it down a shift register to delay it).

I can live with latency as long as it is always the same number of destination clk cycles.

I need it because from the time that this pulse is generated in the source clk, I need to count exactly 14 clk cycles of the destination clk to catch correct data going down a pipeline on the destination clk.
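
The shift-register delay mentioned above could look roughly like the following VHDL sketch. The entity, generic, and signal names are hypothetical; the default of 14 is taken from the post, and in practice the latency of whatever synchronizer sits in front of it would have to be subtracted from the count.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: delays a one-cycle pulse by DELAY destination clocks.
entity pulse_delay is
  generic (
    DELAY : positive := 14  -- count from the post; this sketch assumes DELAY >= 2
  );
  port (
    clk       : in  std_logic;  -- destination clock
    pulse_in  : in  std_logic;  -- one-cycle pulse, already in the clk domain
    pulse_out : out std_logic   -- pulse_in delayed by DELAY clk cycles
  );
end entity pulse_delay;

architecture rtl of pulse_delay is
  signal taps : std_logic_vector(DELAY - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Shift towards the MSB; the new sample enters at bit 0.
      taps <= taps(DELAY - 2 downto 0) & pulse_in;
    end if;
  end process;

  -- The sample taken DELAY rising edges ago.
  pulse_out <= taps(DELAY - 1);
end architecture rtl;
```

Leaving the register without a reset is a deliberate choice in this sketch: it keeps the description a plain shift chain, which synthesis tools can usually map onto dedicated shift-register resources.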
 
On Thursday, January 22, 2015, 14:50:13 (UTC-3), Rob Gaddi wrote:
On Thu, 22 Jan 2015 06:05:23 -0800 (PST)
Leo wrote:

[snip]

Pulse-toggle-pulse synchronizer. My go-to answer for this problem;
there has to be a bizarre and compelling reason for me to do otherwise.

Sending domain has a T-flop; so the pulse causes the output to
transition. We don't care which way. The output of the T-flop goes
into one flop, then a second, then a third, on the receiving domain.
First gets rid of metastability. XOR of the second and the third is
your reconstructed pulse.

--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order. See above to fix.
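
A minimal VHDL sketch of the pulse-toggle-pulse structure described above. The entity, port, and signal names are hypothetical, and the ASYNC_REG attribute assumes a Xilinx tool flow; treat it as an illustration of the idea rather than a verified implementation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity and port names; only the structure comes from the post.
entity pulse_toggle_sync is
  port (
    src_clk   : in  std_logic;
    src_pulse : in  std_logic;  -- one src_clk cycle wide
    dst_clk   : in  std_logic;
    dst_pulse : out std_logic   -- one dst_clk cycle wide
  );
end entity pulse_toggle_sync;

architecture rtl of pulse_toggle_sync is
  signal toggle : std_logic := '0';                       -- T-flop, source domain
  signal sync   : std_logic_vector(2 downto 0) := "000";  -- three flops, destination domain

  -- Assumption: Xilinx synthesis attribute marking flops that receive
  -- asynchronous data so the tools keep them close together.
  attribute ASYNC_REG : string;
  attribute ASYNC_REG of sync : signal is "TRUE";
begin
  -- Source domain: each input pulse flips the level; direction does not matter.
  process (src_clk)
  begin
    if rising_edge(src_clk) then
      if src_pulse = '1' then
        toggle <= not toggle;
      end if;
    end if;
  end process;

  -- Destination domain: sync(0) is the only flop that samples the
  -- asynchronous level; sync(1) and sync(2) see a settled value.
  process (dst_clk)
  begin
    if rising_edge(dst_clk) then
      sync <= sync(1 downto 0) & toggle;
    end if;
  end process;

  -- A level change between the second and third flop is one output pulse.
  dst_pulse <= sync(1) xor sync(2);
end architecture rtl;
```

The sketch assumes source pulses are spaced more than one destination clock period apart, so that every toggle level is sampled at least once. The path from toggle to sync(0) would still need a suitable timing exception or maximum-delay constraint in the project's constraint file.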

That is a good answer, thank you. The only question I have is: does the XOR output always have the same delay in destination clk cycles (3 cycles), or might there be cases where it takes more or fewer?
 
On Thursday, January 22, 2015 at 1:21:37 PM UTC-5, Leo wrote:

> That is a good answer thank you. The only question I have is: does the XOR output always have the same delay in destination clk cycles (3 cycles), or there might be a case where it takes more or less ?

Since the clock phases are unknown and (presumably) different and shifting relative to each other, it could take longer. The most likely outcome is that it occasionally takes one extra clock. If you consider metastability, there could be yet another extra clock, with an exponentially decreasing probability of even more. There is no way to guarantee that the very next destination clock edge will be the one that grabs the pulse, since nothing in the design guarantees that you won't violate setup/hold time requirements.

So the delay in clock cycles will look something like this:
3 : Happens when setup/hold time is not violated
4 : May happen when setup/hold time is violated but no metastability
5 or more : Will happen when setup/hold time is violated and metastability occurs

Kevin Jennings
 
On Thu, 22 Jan 2015 10:21:34 -0800 (PST)
Leo <capossio.leonardo@gmail.com> wrote:

[snip]

> That is a good answer thank you. The only question I have is: does the XOR output always have the same delay in destination clk cycles (3 cycles), or there might be a case where it takes more or less ?

On rereading, it's not quite what you first asked for (a minimum-latency
solution), nor what you later asked for (a deterministic-latency
solution).

The problem is, with truly asynchronous clocks, shortening latency
increases risk, and deterministic latency is impossible. Think about
a scope, triggering on your source clock edge. The relative location
of your destination clock edge is uniformly distributed everywhere on
the screen, up to and including mere picoseconds either way from the
source edge.

That means that when you hit the first flop on the destination domain
(and that needs to be ONE flop, never ever ever two with an expectation
they'll behave the same) you have a guarantee that sometimes you'll
violate the setup timing of that flop. When you do that, it goes
metastable -- the internal nodes hit a linear state that is neither 1
or zero, and stay there until the positive feedback pushes them to one
rail or the other, an amount of time that is probabilistic rather than
deterministic. Whether it'll settle to the old or new value is
anyone's guess.

But the moral of the story is: accept that there's 1 clock of
variability in that latency and design your system to live with it. If
you can't, then you can't have an async clock crossing there.

--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order. See above to fix.
 
On Thu, 22 Jan 2015 10:21:34 -0800, Leo wrote:

[snip]

> That is a good answer thank you. The only question I have is: does the
> XOR output always have the same delay in destination clk cycles (3
> cycles), or there might be a case where it takes more or less ?

When the source and destination clock edges are close to happening at the
same time there will be an uncertainty of which destination clock catches
the transition on the T-flop -- but that uncertainty will be with you
always anyway.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
 
Rob Gaddi wrote:

[snip]

> That means that when you hit the first flop on the destination domain
> (and that needs to be ONE flop, never ever ever two with an expectation
> they'll behave the same) you have a guarantee that sometimes you'll
> violate the setup timing of that flop. When you do that, it goes
> metastable -- the internal nodes hit a linear state that is neither 1
> or zero, and stay there until the positive feedback pushes them to one
> rail or the other, an amount of time that is probabilistic rather than
> deterministic. Whether it'll settle to the old or new value is
> anyone's guess.

This is a bit over-pessimistic. The setup and hold time of a flop are
spec'd over PVT. The metastability window is much much smaller than
the window specified by the setup and hold times. What you do get
between the setup and hold time is a window when you can't know
for sure over all temperature, voltage, and process whether the
change in the input will propagate to the output on that clock
cycle, or whether it will not go through until the next cycle.

> But the moral of the story is: accept that there's 1 clock of
> variability in that latency and design your system to live with it. If
> you can't, then you can't have an async clock crossing there.

Again if you want to be pessimistic, you could add the setup/hold
time violation window to the destination clock period and call
that the variation in latency you will see. However this ignores
the fact that the device will be operating at the same PVT
point on one cycle that it is on the next cycle.

Anyway the toggle business is a very good idea since it means you
don't need to deal with asynchronous set/reset in the destination
clock domain to stretch a short pulse.

--
Gabor
 
On Thu, 22 Jan 2015 14:05:05 -0500
GaborSzakacs <gabor@alacron.com> wrote:

> Rob Gaddi wrote:
>
> [snip]
>
> > That means that when you hit the first flop on the destination domain
> > (and that needs to be ONE flop, never ever ever two with an expectation
> > they'll behave the same) you have a guarantee that sometimes you'll
> > violate the setup timing of that flop. When you do that, it goes
> > metastable -- the internal nodes hit a linear state that is neither 1
> > or zero, and stay there until the positive feedback pushes them to one
> > rail or the other, an amount of time that is probabilistic rather than
> > deterministic. Whether it'll settle to the old or new value is
> > anyone's guess.
>
> This is a bit over-pessimistic. The setup and hold time of a flop are
> spec'd over PVT. The metastability window is much much smaller than
> the window specified by the setup and hold times. What you do get
> between the setup and hold time is a window when you can't know
> for sure over all temperature, voltage, and process whether the
> change in the input will propagate to the output on that clock
> cycle, or whether it will not go through until the next cycle.

Right, but with an async clock crossing you are GUARANTEED to
occasionally end up inside of that admittedly small window.
Likewise, when you do, the resolution time follows (if I recall
correctly) an exponential PDF. On modern FPGAs, a metastable flop
becomes more likely than not to have resolved within something absurd
like 100 ps, making the chances that it's still unresolved after a
reasonable clock period something on the order of being hit by a bus
that is being hit by lightning.

But you still don't know whether you're going to get a 0 or a 1.

--
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order. See above to fix.
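
As a rough worked example of the exponential argument above: assume, hypothetically, a pure exponential tail, read the post's "100 ps" as the median resolution time, and pick an illustrative 10 ns destination clock period. Setup time and routing delay are ignored, so the numbers only indicate the order of magnitude.

```latex
% Assumption: the chance of still being unresolved decays exponentially.
P(t_{\mathrm{res}} > t) = e^{-t/\tau}

% "More likely resolved than not" after 100 ps fixes the median:
e^{-100\,\mathrm{ps}/\tau} = \tfrac{1}{2}
\quad\Rightarrow\quad
\tau = \frac{100\,\mathrm{ps}}{\ln 2} \approx 144\,\mathrm{ps}

% Hypothetical 10 ns clock period = 100 median intervals:
P(t_{\mathrm{res}} > 10\,\mathrm{ns}) = 2^{-100} \approx 7.9 \times 10^{-31}
```

This is a probability per metastable event, not a failure rate, and it says nothing about which value the flop settles to.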
 
Tim Wescott <seemywebsite@myfooter.really> wrote:

> Assuming that the pulse is of some shorter width, then yes you need to do
> something different. I don't know that you necessarily have LUT-by-LUT
> access to the preset (I'd have to spelunk through data sheets, or ask
> here). If you don't, you may have to build an RS FF, which should be
> possible with just a few lines of code. If you do that then you'll have
> some hold time requirement which the tools may or may not be able to
> calculate.

Not so long ago, I wrote this:

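module S2504(out, in, phi1, phi2);
  output out;
  input  in, phi1, phi2;
  reg q1, q2;
  reg [511:0] s1, s2;
  wire R, S;
  // Sample the input on the rising edge of each phase.
  always @(posedge phi1) q1 <= in;
  always @(posedge phi2) q2 <= in;
  // Shift each 512-bit register on the falling edge of the other phase.
  always @(negedge phi2) s1 <= {s1[510:0], q1};
  always @(negedge phi1) s2 <= {s2[510:0], q2};
  // Cross-coupled NAND latch driven by the two phases
  // (this is the combinatorial loop the tools warn about).
  assign R = ~(S & phi1), S = ~(R & phi2);
  // The latch state picks which register drives the output.
  assign out = (R & s2[511]) | (S & s1[511]);
endmodule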

There is a warning message about a combinatorial loop, but I believe
it works.

In case it isn't obvious, it is a shift register with a two-phase
clock that shifts on the falling edge of either phase. The output
is from the last shift (falling edge).

The real 2504 is a pMOS dynamic shift register.

-- glen
 
On Thursday, January 22, 2015, 11:05:25 (UTC-3), Leo wrote:

[snip]

Thanks to all. The "pulse-toggle-pulse synchronizer" proposed by Rob Gaddi is what I need. The current architecture allows me to "aim" at an edge of the receiver clock (within certain limits), so that no undefined latency issues arise (due to setup/hold violations or metastability).
 
On 1/22/2015 12:50 PM, Rob Gaddi wrote:
[snip]

> Pulse-toggle-pulse synchronizer. My go-to answer for this problem;
> there has to be a bizarre and compelling reason for me to do otherwise.
>
> Sending domain has a T-flop; so the pulse causes the output to
> transition. We don't care which way. The output of the T-flop goes
> into one flop, then a second, then a third, on the receiving domain.
> First gets rid of metastability. XOR of the second and the third is
> your reconstructed pulse.

I think this is similar to the circuit I use, except I use feedback from
the To clock domain to make the From FF toggle. This makes it
impossible for too-frequent triggers in the first domain to screw up the
pulse in the other domain.

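FromLogic : process (FromClk, FromReset)
begin
  if (FromReset = '1') then
    FromSync <= '0';
  elsif (rising_edge(FromClk)) then
    -- FromSync can change again only after the To domain has seen the
    -- previous edge. As posted there is no trigger qualifier on this
    -- assignment, so without one FromSync toggles continuously.
    FromSync <= not ToSync;
  end if;
end process FromLogic;

ToLogic : process (ToClk, ToReset)
begin
  if (ToReset = '1') then
    ToSync   <= '0';
    ToSync_d <= '0';
  elsif (rising_edge(ToClk)) then
    ToSync   <= FromSync;  -- samples the From-domain level
    ToSync_d <= ToSync;    -- one-cycle delayed copy for edge detection
  end if;
end process ToLogic;

-- Either edge of FromSync yields a one-ToClk-cycle pulse.
PulseOut <= ToSync XOR ToSync_d;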

I haven't tested the above code, so I won't guarantee it works correctly.
The idea is that either edge of FromSync creates a pulse in the To
clock domain. Only one edge can be generated until the first edge is
"seen" by the logic in the To domain. The relative speeds of the two
clock domains are not important.

Some people add another FF in the To domain to help with metastability.
But what you really need is slack time which should be no problem if
you spec it in the timing constraint of the appropriate logic paths.

--

Rick
 
Since the destination runs at a lower speed than your source, all you
need is a level-alternating (toggle) scheme at the source to stretch out
your pulse width, and at the destination side a synchronizer along with a
dual-edge detector. BTW, in this case you do not have to worry about
timing constraint violations.

Abdullah



---------------------------------------
Posted through http://www.FPGARelated.com
 
Hello Leo,

you may want to have a look at my pulse synchronizer, which is available on GitHub at: https://github.com/noasic/noasic/blob/master/components/pulse_synchronizer.vhd

Regards,

Guy Eschemann
FPGA Consultant
http://noasic.com
 
