Andy Ross
Guest
I'm a long-time software guy with a stale-but-reasonably-robust
background in digital electronics, am in the process of teaching
myself Verilog, and have stumbled into a gotcha that I'm having
trouble understanding. Or rather: I think I *do* understand it (and
have fixed it in my code), but am having trouble understanding why the
scheduler rules permit this insanity in the first place. So please
read to the bottom -- I'm not asking for help with the circuit, I'm
trying to understand the language's behavior.
Here's the exposition. My circuit design (a USB 1.1 receiver, as it
happens) has an edge detector in it using a single synchronous
flip-flop. Stripped of all the other stuff, it looks basically like
this (I've been led to believe by most sources that this is more or
less the canonical way one writes such a circuit; if there are bugs
please tell me!):
module edge_detector(input clk, x, output edg);
reg x0;
assign edg = (x != x0);
always @(posedge clk) x0 <= x;
endmodule
So that being done, I want to write a testbench, provide some
stimulus, and get out a waveform I can look at in (in this case)
gtkwave. So here's the code for that:
module test;
reg clk=1, x=0;
edge_detector ed(.clk(clk), .x(x));
always #(0.5) clk = !clk;
initial begin
$dumpfile("edge_detector.vcd");
$dumpvars(1, ed);
ed.x0 = 0;
repeat(100) #4 x = !x;
$finish;
end
endmodule
Basically: instantiate the module, clock it with a period of 1, and
every four clocks toggle the value of the input. Again, this is
pretty much standard stuff from everything I'm finding on the web and
in books. The desired output, of course, is that I see the "ed.x"
line toggle once every four periods of "ed.clk", the "ed.x0" line
should be a 1-clock-delayed version of "ed.x", and that on every edge
of x I see "ed.edg" rise to a logic 1 and stay there until the next
rising edge of clk.
But that's not what happens in the simulator! Running exactly this
code under Icarus 0.9.1, I see the x and x0 lines move in lockstep,
and edg NEVER TRIGGERS AT ALL.
And this doesn't appear to be a bug in Icarus: as far as I can tell,
the scheduling rules allow it. For example: just before T=4 the x0
line is still zero, then the "x = !x" line fires first and toggles x
from 0 to 1 with a blocking assignment (this is the edge I want to
detect). Then the "clk = !clk" trigger fires and creates a rising
edge, which in turn triggers the synchronous block in the device to
save the current value of x (which is now one!) in the x0 flipflop.
So we exit the T=4 time step with x and x0 having *both* toggled, and
thus no edge is detected!
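One way to watch this ordering directly is to print what a clocked process sees at each rising edge. The following is only a rough sketch: the `race_probe` module name and the `timescale` directive are assumptions added for illustration (the directive is there so that the 0.5 delay is representable), and the printed values depend on the order in which a given simulator happens to run the two testbench processes, so no particular output is claimed.

```verilog
`timescale 1ns/100ps  // assumption: lets #(0.5) mean half a time unit

module edge_detector(input clk, x, output edg);
  reg x0;
  assign edg = (x != x0);
  always @(posedge clk) x0 <= x;
endmodule

// Hypothetical probe testbench, not part of the original design.
module race_probe;
  reg clk = 1, x = 0;
  wire edg;
  edge_detector ed(.clk(clk), .x(x), .edg(edg));

  always #(0.5) clk = !clk;

  // Wakes on the same clock event as the flip-flop's always block,
  // so it reads x under the same conditions the flip-flop does.
  always @(posedge clk)
    $display("t=%0t posedge: x=%b x0=%b", $realtime, x, ed.x0);

  initial begin
    ed.x0 = 0;
    // Blocking assignment in the same time step as a rising clock edge:
    // whether the probe prints the old or the new x is the race itself.
    repeat (4) #4 x = !x;
    $finish;
  end
endmodule
```

If the line printed at T=4 already shows the new value of x, the flip-flop has sampled that new value too, which matches the lockstep waveform described above.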
The "solution", of course, is to use a nonblocking assignment to x in
the testbench to prevent it from being seen prematurely by the
flip-flop, and (once I puzzled this out, of course) that's what I've
done. The core of the problem here seems basically to be that I have
testbench code running in the same time step as device code and racing
against each other.
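For reference, the nonblocking variant of the testbench can be sketched roughly as follows. The `test_nba` module name, the explicit `edg` wire, and the `timescale` directive are assumptions added for illustration; the only change that matters is `<=` in place of `=` on the stimulus line.

```verilog
`timescale 1ns/100ps  // assumption: lets #(0.5) mean half a time unit

module edge_detector(input clk, x, output edg);
  reg x0;
  assign edg = (x != x0);
  always @(posedge clk) x0 <= x;
endmodule

// Hypothetical name; same structure as the original testbench.
module test_nba;
  reg clk = 1, x = 0;
  wire edg;
  edge_detector ed(.clk(clk), .x(x), .edg(edg));

  always #(0.5) clk = !clk;

  initial begin
    $dumpfile("edge_detector.vcd");
    $dumpvars(1, ed);
    ed.x0 = 0;
    // Nonblocking: the new value of x is computed now but only written
    // once all active events of this time step, including the flip-flop
    // sampling x, have been processed.
    repeat (100) #4 x <= !x;
    $finish;
  end
endmodule
```

With this change the flip-flop evaluates the old x at the rising edge, x then updates in the same time step, and edg stays high until the following rising edge.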
But that's insane! First off, no one writes test benches like that.
I've looked at a bunch of references at this point, and *everybody*
writes their Verilog test code in a traditional/imperative style using
blocking assignments. Doing otherwise would be a nightmare.
But more importantly, I don't see how this can ever be made to work
reliably. As far as I can tell, making a blocking assignment can only
be made "safe" from this kind of race condition if you can guarantee
either that no other code is sampling it (or any combinatorial signal
derived from it!) using a nonblocking assignment, or that it isn't
possible for your testbench code to occupy the same time step as
device code. You see the "don't mix assignment types" rule all the
time, but this problem appears to be much more serious: the converse
of the above is that you can't use a nonblocking assignment unless you
know that every reg variable referenced by the RHS can *never* be
assigned to with "=".
So we finally come to my questions: how is this not a *huge* booby
trap in Verilog testbench development? Am I just incredibly unlucky
that I discovered this so early? If not, what were the language
designers thinking when they decided to allow inevitable race
conditions without synchronization primitives? How do professionals
deal with the issue: use nonblocking assignments always? Shift the
time steps so that testbench code is never synchronous to any device
clocks? Are there well-known conventions people follow, or is this
just something that everyone needs to figure out on their own?
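To make the "shift the time steps" option concrete, here is a minimal sketch in which the stimulus changes on the falling clock edge, so a blocking assignment never shares a time step with the rising edge that the flip-flop samples on. The `test_negedge` module name and the `timescale` directive are assumptions for illustration, and this is only one possible arrangement, not a claim about what any particular team does.

```verilog
`timescale 1ns/100ps  // assumption: lets #(0.5) mean half a time unit

module edge_detector(input clk, x, output edg);
  reg x0;
  assign edg = (x != x0);
  always @(posedge clk) x0 <= x;
endmodule

// Hypothetical testbench: stimulus is tied to negedge clk.
module test_negedge;
  reg clk = 1, x = 0;
  wire edg;
  edge_detector ed(.clk(clk), .x(x), .edg(edg));

  always #(0.5) clk = !clk;

  initial begin
    $dumpfile("edge_detector.vcd");
    $dumpvars(1, ed);
    ed.x0 = 0;
    repeat (100) begin
      // Wait four falling edges, then toggle x half a period away
      // from the next rising edge.
      repeat (4) @(negedge clk);
      x = !x;
    end
    $finish;
  end
endmodule
```

In this arrangement edg should go high at the falling edge where x toggles and clear at the next rising edge, so the pulse is half a clock period wide rather than a full one.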
I was really enjoying learning Verilog until I hit this issue. Now I
find myself thinking through ways I can isolate it to just synthesis
code; I'm currently playing with Verilator's C++ environment, and am
seriously thinking about jumping ship. Basically: what am I missing?
Andy