Non-blocking versus blocking

On May 20, 1:28 pm, Patrick Maupin <pmau...@gmail.com> wrote:
On May 20, 3:08 am, Jan Decaluwe <j...@jandecaluwe.com> wrote:

Cummings' guideline to use only blocking assignments for combinatorial
logic is problematic, because it creates an unnecessary exception that
encourages something that is inherently dangerous.

Can you show a real-world example of this danger?

The fact that communication based on blocking assignments works for
combinatorial logic is a coincidence and actually not that trivial to
prove. It depends not only on the inherent nature of combinatorial
logic, but also on "sensible usage".

Blocking assignments to "registers" inside non-clocked blocks and
continuous assignments to "wires" are essentially the same thing.  I
don't see this as any kind of coincidence.  If proof of a non-clocked
set of gates requires some sort of inherent propagation delay, that
might make the proof suspect, unless you can also prove that the
propagation delay is the correct magnitude.
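
The equivalence being described can be sketched like this (a and b are assumed inputs; names are illustrative):

```verilog
// 1) Blocking assignment to a "reg" in a non-clocked block:
reg y1;
always @* y1 = a & b;

// 2) Continuous assignment to a "wire":
wire y2;
assign y2 = a & b;
```

For a purely combinational function like this, both describe the same hardware and both update without any nonblocking-assignment delay.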

Cummings' guidelines are problematic in general because they
artificially discuss races in the context of synthesizable logic. But
Verilog, the language, doesn't care about synthesis. Races are races,
and there are plenty of race opportunities in high level models and
test benches also. Those cases need a working guideline too.

In general, the bad effects of a race in your test bench will be that
the test fails.  In general the bad effects of a post-synthesis race
are either, again, that the test fails (if you are lucky) or that the
silicon fails (if you are unlucky).  So why is it a problem to explain
things in the context of synthesis?

So here it is: Decaluwe's universal guideline for race-free HDL
assignments.
"""
Use non-blocking assignments (signals) for communication.
Use blocking assignments (variables) for local computation.
"""

[ Rest of comment on this snipped. ]

Yes, this will work.  But in practice, checking that these
guidelines have been followed can be more time-consuming than
checking other guidelines, and code written using just these
guidelines can be conceptually harder to understand than code
written using the alternatives.  But that's (obviously) just my
opinion.

This last guideline is essentially what VHDL does. Variables only
have scope within a process, so they cannot be used for communication
between processes. What guidelines would accomplish the same thing
and be easier to verify? Sounds like something a tool should be
checking for you.

Rick
 
On May 20, 7:28 pm, Patrick Maupin <pmau...@gmail.com> wrote:
On May 20, 3:08 am, Jan Decaluwe <j...@jandecaluwe.com> wrote:

Cummings' guideline to use only blocking assignments for combinatorial
logic is problematic, because it creates an unnecessary exception that
encourages something that is inherently dangerous.

Can you show a real-world example of this danger?
Use blocking assignments for communication anywhere else, and you
immediately have race problems. Just look in the manuals of mainstream
synthesis vendors for plenty of problematic examples.

"As a matter of forming good coding habits", avoid blocking
assignments for communication altogether, and stop worrying about
races.

The fact that communication based on blocking assignments works for
combinatorial logic is a coincidence and actually not that trivial to
prove. It depends not only on the inherent nature of combinatorial
logic, but also on "sensible usage".

Blocking assignments to "registers" inside non-clocked blocks and
continuous assignments to "wires" are essentially the same thing.  I
don't see this as any kind of coincidence.
That is not what I was talking about.

Suppose you start manipulating clocks combinatorially, using assigns
or blocking assignments. I don't see why the resulting transactions
would be race-free. I immediately add that this is probably
nonsensical usage, and that in practice, blocking assignment-based
communication works for the special case of "meaningful" combinatorial
logic. But do you see why I call that a coincidence? It feels like
plain luck based on shaky foundations.

Cummings' guidelines are problematic in general because they
artificially discuss races in the context of synthesizable logic. But
Verilog, the language, doesn't care about synthesis. Races are races,
and there are plenty of race opportunities in high level models and
test benches also. Those cases need a working guideline too.

In general, the bad effects of a race in your test bench will be that
the test fails.  In general the bad effects of a post-synthesis race
are either, again, that the test fails (if you are lucky) or that the
silicon fails (if you are unlucky).  So why is it a problem to explain
things in the context of synthesis?
Given the fact that races are costly in any case, do you see the value
of having one single guideline that covers everything?

So here it is: Decaluwe's universal guideline for race-free HDL
assignments.
"""
Use non-blocking assignments (signals) for communication.
Use blocking assignments (variables) for local computation.
"""

[ Rest of comment on this snipped. ]

Yes, this will work.  But in practice, checking that these
guidelines have been followed can be more time-consuming than if
other guidelines are followed
I have a single guideline that covers everything; Cummings has a set
of guidelines that covers synthesis only and that introduces a number
of special cases. It seems obvious that my proposal will make the work
of the reviewer or the linting tool both more meaningful and much
simpler.

Jan
 
On May 20, 10:34 pm, Jonathan Bromley <s...@oxfordbromley.plus.com>
wrote:

I suspect that Jan, like me, is unhappy about Verilog's
completely uncontrolled concurrency model.
Absolutely correct. Rather unhappy :)

In 1990, I designed a chip with Verilog, using blocking assignments
only because that's all there was at the time. Non-blocking behavior
was provided by using modules and ports - otherwise Verilog would have
been completely unusable at the time.

Much later, in 2000, I found out that simulation vendors had exploited
loopholes in the Verilog standard to take away non-blocking behavior
from ports. In other words, my trusted coding style and legacy code
had now become nondeterministic and unreliable.

When I complained about this, I experienced not a single grain of
sympathy or understanding from the Verilog design community. Instead,
they started to explain the loopholes to me. The message was that I
had been stupid to think that it was possible to design with Verilog
in a reliable way at the time.

At that moment, I developed a fundamental distrust of Verilog's
zero-delay (non)model, and of Verilog's design community and its
gurus in the same pass. I learned to forgive VHDL almost anything
just for giving us the delta cycle algorithm. When using Verilog, I
tried to use it as much like VHDL as possible. And whenever I hear
someone sing the praises of Verilog's ease of use, I invariably think:
you don't have enough experience.

For my outcry in despair at the time, see:

http://groups.google.com/group/comp.lang.verilog/browse_frm/thread/edc6d326a821c9a9/67ec9f89afe22b3b

Jan
 
On May 21, 12:12 pm, Jan Decaluwe <j...@jandecaluwe.com> wrote:

For my outcry in despair at the time, see:

http://groups.google.com/group/comp.lang.verilog/browse_frm/thread/ed...
At least Evan Lavelle backed you up :)
--
Jonathan Bromley
 
On May 21, 2:37 pm, Jonathan Bromley <s...@oxfordbromley.plus.com>
wrote:
On May 21, 12:12 pm, Jan Decaluwe <j...@jandecaluwe.com> wrote:

For my outcry in despair at the time, see:

http://groups.google.com/group/comp.lang.verilog/browse_frm/thread/ed...

At least Evan Lavelle backed you up :)
--
Jonathan Bromley
That's right. My thanks to him, even if they are a little out of date :)
 
On May 21, 2:44 am, Jonathan Bromley <s...@oxfordbromley.plus.com>
wrote:
Patrick,

I don't have time to write a complete reply just now,
but I can't let this piece of nonsense go unchallenged:

[me]> > And then there's Verilog:
- anyone can mess with any shared variable at any time,

snip
[Pat]

If you want to
insure that another process can never see incoherent state between
variable X and variable Y, just make sure you update them at exactly
the same time.

That is simply untrue:

always @(posedge clock)
  X = some_expression;
always @(posedge clock)
  Y = some_other_expression;
always @(posedge clock) begin : observer
  if (some relationship between X and Y) ....;
end
Yes, I should have known that you would be both pedantic and
dismissive.
What I said is not nonsense *if* you always use non-blocking
assignments in sequential blocks. Of course you know I believe that
is the right answer, so of course you show an opposite example.

<snipped lots more examples of blocking assignments>

No semaphore, mutex, or monitor required.

You're kidding, right?  In Verilog you make this
work correctly by wheeling out the evaluate/update
model made possible by nonblocking assignment.
Ahhh, so you *do* understand. But yes, a C semaphore or mutex or
monitor requires something *below* the level of the language, whereas
the Verilog language itself provides the needed primitives, and, let's
face it, "evaluate/update" is what has to happen with clocked
processes in all but the simplest cases.

In other environments you would use other exclusion
or synchronisation primitives.  Either way, you need
some discipline.
Coding requires discipline. With Verilog, the kind of IPC we were
discussing requires no *extra* discipline beyond what sequential
coding requires in general.

Verilog is _much_ worse than C in this regard because
it has concurrency constructs built in to the language,
but they are completely undisciplined.  In C, to get
concurrency you must appeal to some library or toolkit;
if properly designed, that library will provide not
only the parallelism but also the synchronisation
primitives that you need, so you get a proper way to
do things as a single package.
You can easily be as undisciplined in C. The fact that C has no
concurrency of its own, while C + package gives you concurrency,
doesn't mean that C + package enforces anything on the user.

Regards,
Pat
 
On May 21, 3:19 am, nemo <gnu...@gmail.com> wrote:

This last guideline is essentially what VHDL does.  Variables only
have scope within a process so can not be used for communication
between processes.  What guidelines would accomplish the same thing
and be easier to verify?  Sounds like something a tool should be
checking for you.
Obviously opinions vary, but if you always use non-blocking
assignments in sequential blocks, and always use blocking assignments
in combinatorial blocks, and adhere to a few other guidelines, it's
very easy to spot issues at a glance.
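
A minimal sketch of that style (names are illustrative):

```verilog
// Sequential block: non-blocking assignments only.
always @(posedge clk)
    q <= d;

// Combinatorial block: blocking assignments only.
always @* begin
    sum = a + b;
    d   = sum & mask;
end
```

With this split, a reviewer can tell at a glance from the assignment operator alone whether a block is clocked or combinatorial.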

Regards,
Pat
 
On May 21, 4:07 am, Jan Decaluwe <j...@jandecaluwe.com> wrote:
On May 20, 7:28 pm, Patrick Maupin <pmau...@gmail.com> wrote:

On May 20, 3:08 am, Jan Decaluwe <j...@jandecaluwe.com> wrote:

Cummings' guideline to use only blocking assignments for combinatorial
logic is problematic, because it creates an unnecessary exception that
encourages something that is inherently dangerous.

Can you show a real-world example of this danger?

Use blocking assignments for communication anywhere else, and you
immediately have race problems. Just look in the manuals of mainstream
synthesis vendors for plenty of problematic examples.
So by "anywhere else" you mean in a sequential block. Yes, I don't do
that.

"As a matter of forming good coding habits", avoid blocking
assignments for communication altogether, and stop worrying about
races.
Well, I just avoid blocking assignments at all in sequential blocks.
As we have discussed previously, that can also be a viable strategy.


[ stuff snipped]

Suppose you start manipulating clocks combinatorially, using assigns
or blocking assignments. I don't see why the resulting transactions
would be race-free. I immediately add that this is probably
nonsensical usage, and that in practice, blocking assignment-based
communication works for the special case of "meaningful" combinatorial
logic. But do you see why I call that a coincidence? It feels like
plain luck based on shaky foundations.
I see. I agree that (as you have also pointed out in other posts) the
foundations of Verilog were somewhat shaky (at least in the sense of
not being regular and extremely well thought out) and that some of the
subsequent fixes have created a non-orthogonal language.
"Coincidence" sounds like a lucky accident we got there at all, when
at the end of the day, it was through a lot of hard work, so "hack"
might be a better word :)

[ stuff snipped ]

Given the fact that races are costly in any case, do you see the value
of having one single guideline that covers everything?
Yes, but I also see the cost. For example, in a large block with
intermingled blocking/non-blocking assignments, there might be a lot
of scrolling back and forth to determine if a variable is local or
not. For another example, if you need the local variable in another
block, now you have to go back and check all the uses of it and
possibly do some recoding.

I have a single guideline that covers everything; Cummings has a set
of guidelines that covers synthesis only and that introduces a number
of special cases.
Cummings certainly covers synthesis. What in his guidelines will
break for non-synthesis?

It seems obvious that my proposal will make the work
of the reviewer or the linting tool both more meaningful and much
simpler.
Well, I've shown you an example of how I would code something, so you
know my mileage varies on that.

Regards,
Pat
 
[Jonathan Bromley]

The second issue is that blocking assignment in combinational
logic allows you to implement clock gating without a delta
delay.
What do you mean by that (implementing clock gating)?

I know,

always @(*) begin
  a = b;
  c = a;
end

these statements are executed sequentially; a = b blocks the c = a;

But what about these?

always @(*) begin
  a = b; //statement1
  c = d; //statement2
end

Is there any order of those statements?

As for the nonblocking,
Is there any order between nonblocking assignments in the same
execution path according to the Verilog standard?

always @(posedge clk) begin
  a <= b;
  c <= d;
end

Ali
 
On Sat, 22 May 2010 10:24:52 -0700 (PDT), Ali Karaali wrote:

The second issue is that blocking assignment in combinational
logic allows you to implement clock gating without a delta
delay.

What do you mean by that(implementing clock gating)?
If you're using an FPGA, please don't worry about it.

When designing ASICs for low power consumption, folk
often use clock gating to suppress the clock to some
part of the design when it's not active. This is
*not* usually a good idea in an FPGA design, but in
an ASIC it can be a useful technique. So let's
suppose that the designer has correctly crafted a
clock gating signal ClockIsActive, and has carefully
arranged its timing so that
MasterClock & ClockIsActive
will behave nicely, with no unpleasant glitches.
OK, so now we try...

always @(posedge MasterClock)
  Qm <= something;

always @*
  GatedClock <= MasterClock & ClockIsActive;

always @(posedge GatedClock)
  Qg <= Qm;

The problem is that GatedClock is delayed by
one nonblocking assignment delay (a delta cycle)
in just the same way as Qm is delayed. So now
there is a simulation race condition between the
change of Qm and the (posedge GatedClock) that's
used to sample it.

However, if you change the clock gate logic to

always @*
  GatedClock = MasterClock & ClockIsActive;

or, just as good,

assign GatedClock = MasterClock & ClockIsActive;

then the clock gate (and any other combinational logic)
updates earlier than the nonblocking assignments, and the
sampling of Qm in the GatedClock domain works correctly.

Of course, this is merely a trick to make zero-delay
simulation work correctly. In the finished device it's
very important to apply appropriate timing constraints
to ensure that the real logic also behaves in this way.


always @(*) begin
  a = b;
  c = a;
end
these statements are executed sequentially; a = b blocks the c = a;

But what about these?
always @(*) begin
  a = b; //statement1
  c = d; //statement2
end
Is there any order of those statements?
Yes, certainly. Sequential statements in a begin...end block
are definitely executed in order. The whole point about blocking
assignment is that the assignment has completed (has taken effect)
before the next statement is executed. Just like normal software.

As for the nonblocking,
Is there any order between nonblocking assignments in the same
execution path according to the Verilog standard?

always @(posedge clk) begin
  a <= b;
  c <= d;
end
Yes! The two assignments execute sequentially. Consequently the
order of activity is:

1) Evaluate "b"
2) Make a delayed assignment of that value to "a" - put the
assignment on the queue of future activity, but don't do it yet
3) Evaluete "d"
4) Make the delayed assignment to "c"
5) Do any other "always" blocks that were triggered by
(posedge clk), in the same way as I just described
6) When all such activity is finished, update the variables
that have scheduled delayed assignments, BUT DO IT IN THE
SAME ORDER IN WHICH THE CORRESPONDING ASSIGNMENTS WERE
EXECUTED (first-in first-out). Therefore, first update
"a" and then update "c".

Hope this helps you to see what's happening.
--
Jonathan Bromley
 
On Fri, 21 May 2010 11:40:22 -0700 (PDT), Patrick Maupin wrote:

Yes, I should have known that you would be both pedantic and
dismissive.
"Pedantic" I regard as a compliment in this context, because
it's too easy to say things that are unclear or have loopholes;
pedantry is useful. "Dismissive" is less welcome; I'm sorry
you felt that. I think we are diametrically opposed on some
"cultural" issues about how these things are best done, but
that doesn't mean I dismiss your position; I'm just rather
keen to expose any weaknesses I perceive in it :)

Of course you know I believe that
[the use of NBA to give meaning to "at the same time"]
is the right answer, so of course you show an opposite example.
Yes, because it is important to be pedantic. The notion of
"at the same time" is not at all simple in Verilog (take
a look at the LRM description of the scheduler!) and it
is completely inappropriate to use such a phrase when
trying to be precise about what makes sense and what doesn't.
By contrast, VHDL's extremely rigid evaluate/update model
makes it rather easy to reason about "at the same time"
provided you avoid a very tiny set of well-documented
loopholes in the language (the broken form of shared
variable in VHDL93, sharing access to files).

Ahhh, so you *do* understand.
Well, of course I do to some extent; and, of course, so
do you. We both know perfectly well how to achieve the
results we need. The difference seems to be that you
are a vigorous apologist for the status quo, while I
would love to have things be very different. However,
I'm sufficiently pragmatic to know that I can't make
a real difference. That doesn't, and shouldn't, stop
me having a good old rant about it from time to time :)

But yes, a C semaphore or mutex or
monitor requires something *below* the level of the language
I don't think I agree with that. C is a sufficiently low-level
language that you can create those primitives within the core
language if you wish to do so.

whereas
the Verilog language itself provides the needed primitives, and, let's
face it, "evaluate/update" is what has to happen with clocked
processes in all but the simplest cases.
Right; but Verilog does leave Pandora's box wide open, doesn't it?
Since Verilog does not _enforce_ the use of nonblocking assignment
to shared variables, it effectively has free uncontrolled sharing
of variables among concurrent processes. And yet it didn't, until
SystemVerilog, have even a mutex or semaphore available! This
situation is tolerable only because many (most??) Verilog users
are working within a framework that looks pretty much like
synthesisable logic, where a few fairly simple coding guidelines
are enough to keep you out of trouble. Anyone who's tried to
write a reasonably sophisticated testbench in Verilog, with
multiple threads of control, either is familiar with the
problem of mutual exclusion and has tricks for dealing with
it, or has their head buried in the sand.

--
Jonathan Bromley
 
On May 21, 1:40 pm, Patrick Maupin <pmau...@gmail.com> wrote:
You can easily be as undisciplined in C.  The fact that C has no
concurrency, but C + package allows you concurrency doesn't mean that
C + package enforces anything on the user.

And just like C + package, Verilog does not enforce anything either.

Andy
 
On May 22, 4:38 pm, Jonathan Bromley <s...@oxfordbromley.plus.com>
wrote:
On Fri, 21 May 2010 11:40:22 -0700 (PDT), Patrick Maupin wrote:
Yes, I should have known that you would be both pedantic and
dismissive.

"Pedantic" I regard as a compliment in this context, because
it's too easy to say things that are unclear or have loopholes;
pedantry is useful. "Dismissive" is less welcome; I'm sorry
you felt that.
Well, you did use the word "nonsense" rather than simply explaining
that my statement required a (bit, lot?) more qualification...

I think we are diametrically opposed on some
"cultural" issues about how these things are best done, but
that doesn't mean I dismiss your position; I'm just rather
keen to expose any weaknesses I perceive in it :)
Sure, and the pedantry is fine and helpful in that regard.

Of course you know I believe that

[the use of NBA to give meaning to "at the same time"]

is the right answer, so of course you show an opposite example.

Yes, because it is important to be pedantic. The notion of
"at the same time" is not at all simple in Verilog (take
a look at the LRM description of the scheduler!) and it
is completely inappropriate to use such a phrase when
trying to be precise about what makes sense and what doesn't.
But, if all assignments in clocked processes are nonblocking, "at the
same time" can make a great deal of sense when dealing with
combinations of those signals.

By contrast, VHDL's extremely rigid evaluate/update model
makes it rather easy to reason about "at the same time"
provided you avoid a very tiny set of well-documented
loopholes in the language (the broken form of shared
variable in VHDL93, sharing access to files).
Sure, but VHDL has its own set of problems.

Ahhh, so you *do* understand.

Well, of course I do to some extent; and, of course, so
do you. We both know perfectly well how to achieve the
results we need. The difference seems to be that you
are a vigorous apologist for the status quo, while I
would love to have things be very different.
I wouldn't mind terribly if things were different in some ways, but on
balance, I prefer Verilog over VHDL. For one thing, I
programmatically generate a lot of RTL, and I find it extremely easy
to generate Verilog. It's low-level enough that it doesn't get in the
way of trying to place application-specific abstractions on top of it.

However,
I'm sufficiently pragmatic to know that I can't make
a real difference. That doesn't, and shouldn't, stop
me having a good old rant about it from time to time :)
Be my guest. But it would be nice if you could rant in such a way
that the actual target of the rant is a bit clearer at the top of the
post :)

But yes, a C semaphore or mutex or
monitor requires something *below* the level of the language

I don't think I agree with that. C is a sufficiently low-level
language that you can create those primitives within the core
language if you wish to do so.
Well, it's been a decade and a half since I've done any of that sort
of thing in C, but IIRC, the core language doesn't have, e.g., atomic
operations. Now it may be that "i += 1" is usually atomic on a
processor that supports atomic increment; but with multiprocessing,
all bets are off unless you use (e.g., on the x86) a LOCK prefix in
front of the INC instruction. (In a multiprocessing environment, even
disabling interrupts is insufficient to guarantee
synchronization.) On a multiprocessing x86, you really need to be able
to do things like LOCK with XCHG or XADD for a full set of primitives,
and I don't think the C language directly supports this; at least it
didn't back when I was doing a lot of C.

whereas
the Verilog language itself provides the needed primitives, and, let's
face it, "evaluate/update" is what has to happen with clocked
processes in all but the simplest cases.

Right; but Verilog does leave Pandora's box wide open, doesn't it?
Absolutely.

Since Verilog does not _enforce_ the use of nonblocking assignment
to shared variables, it effectively has free uncontrolled sharing
of variables among concurrent processes.
Right, but as you know, I always use nonblocking assignments in
clocked processes, so really and truly, this is a non-issue for me.

And yet it didn't, until
SystemVerilog, have even a mutex or semaphore available!
And, to my knowledge, C still doesn't either...

This
situation is tolerable only because many (most??) Verilog users
are working within a framework that looks pretty much like
synthesisable logic, where a few fairly simple coding guidelines
are enough to keep you out of trouble.
Yes. As I mentioned, myopically speaking :), that's the closest to
the final product.

Anyone who's tried to
write a reasonably sophisticated testbench in Verilog, with
multiple threads of control, either is familiar with the
problem of mutual exclusion and has tricks for dealing with
it, or has their head buried in the sand.
Sure, but at the end of the day, you can probably get into trouble
with synchronization (where trouble is defined as "unexpected
behavior") in any language. Personally, although there will always be
room for improvement, I think that the "new world order" (at least for
the world where I live) where synthesizable stuff is in Verilog and
testbenches are moving to System Verilog, is really not too bad.

Regards,
Pat
 
On May 22, 3:40 pm, Andy <jonesa...@comcast.net> wrote:
On May 21, 1:40 pm, Patrick Maupin <pmau...@gmail.com> wrote:

You can easily be as undisciplined in C.  The fact that C has no
concurrency, but C + package allows you concurrency doesn't mean that
C + package enforces anything on the user.

And just like C + package, Verilog does not enforce anything either.
Yes. By "as undisciplined" I meant "equally undisciplined." Sorry if
that was unclear.

Regards,
Pat
 
On May 21, 8:59 pm, Patrick Maupin <pmau...@gmail.com> wrote:
On May 21, 4:07 am, Jan Decaluwe <j...@jandecaluwe.com> wrote:

Use blocking assignments for communication anywhere else, and you
immediately have race problems. Just look in the manuals of mainstream
synthesis vendors for plenty of problematic examples.

So by "anywhere else" you mean in a sequential block.  Yes, I don't do
that.
No, I do mean anywhere else, in particular also for verification-
related tasks where the RTL paradigm does not apply, and which is
typically the bulk of the work.

(When I use the word "sequential" in the context of HDL coding, I mean
RTL. Of course, if you would define it as "anything non-
combinatorial", we are saying the same thing.)

"As a matter of forming good coding habits", avoid blocking
assignments for communication altogether, and stop worrying about
races.

Well, I just avoid blocking assignments at all in sequential blocks.
As we have discussed previously, that can also be a viable strategy.
I know what you do for synthesis, now let's see about test benches.

Given the fact that races are costly in any case, do you see the value
of having one single guideline that covers everything?

Yes, but I also see the cost.  For example, in a large block with
intermingled blocking/non-blocking assignments, there might be a lot
of scrolling back and forth to determine if a variable is local or
not.  For another example, if you need the local variable in another
block, now you have to go back and check all the uses of it and
possibly do some recoding.
Correct. Note that VHDL makes this task much easier and less costly,
as safe communication is enforced by the language.

I have a single guideline that covers everything; Cummings has a set
of guidelines that covers synthesis only and that introduces a number
of special cases.

Cummings certainly covers synthesis.  What in his guidelines will
break for non-synthesis?
Short answer: nothing, but that is irrelevant because nobody wants
similar restrictions beyond synthesis.

Long answer: this is a puzzling question.

First, the fact that Cummings talks specifically about synthesis tells
me indirectly that he doesn't want to impose anything similar for test
benches and higher level modeling. I don't want to give the impression
of criticizing him for something he doesn't say; I have enough work
with what he actually does say already :)

Then, his guidelines are formulated in RTL jargon, which makes them
not applicable at higher levels. We first need a translation step that
extracts the "spirit" of his guidelines.

The spirit of his guidelines is: use only a single type of assignment
in any given always block. To me, it is close to obvious that no
Verilog verification engineer will find this acceptable. But of
course, what I find obvious doesn't count, so I will state my
hypothesis explicitly in a form that can be disproven.

My hypothesis is that anyone serious about using Verilog for
verification will, in the same always block:
1) want reliable communication, hence non-blocking assignments.
2) systematically need variable semantics internally for modeling,
hence blocking assignments.

If this is right, the spirit of Cummings is not applicable and you
basically need my guideline. Actually, I suspect that this is what
most Verilog verification engineers are doing already, and I would be
rather surprised if you didn't.
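
For illustration, a hypothetical verification block of the kind the hypothesis describes, mixing both assignment types:

```verilog
integer count;     // variable: local bookkeeping
reg     mismatch;  // signal: read by other processes

always @(posedge clk) begin
    count = count + 1;                  // blocking: variable semantics
    mismatch <= (dut_out !== ref_out);  // non-blocking: communication
end
```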

Jan
 
On May 23, 8:46 am, Jan Decaluwe <j...@jandecaluwe.com> wrote:
On May 21, 8:59 pm, Patrick Maupin <pmau...@gmail.com> wrote:

On May 21, 4:07 am, Jan Decaluwe <j...@jandecaluwe.com> wrote:
Use blocking assignments for communication anywhere else, and you
immediately have race problems. Just look in the manuals of mainstream
synthesis vendors for plenty of problematic examples.

So by "anywhere else" you mean in a sequential block.  Yes, I don't do
that.

No, I do mean anywhere else, in particular also for verification-
related tasks where the RTL paradigm does not apply, and which is
typically the bulk of the work.
OK. I saw "synthesis" in the original, and thought that was what we
were talking about. Also, you mentioned that you never use blocking
assignments for "communication." Depending on what you mean by
"communication", that could be extremely limiting in a testbench, so I
assumed the context was RTL.

(When I use the word "sequential" in the context of HDL coding, I mean
RTL. Of course, if you would define it as "anything non-
combinatorial", we are saying the same thing.)
Well, if we have to have more categories than combinatorial and
sequential, I would call the third category "advanced." "Advanced" is
really sequential, but is usually a tiny portion of the design. It
can happen in both testbenches and synthesizable logic, however. For
example, in synthesizable logic it could be I/O latches or clock gaters.
In testbenches, the same thing, plus clock generation and a few other
corner cases.

In general, you don't need that much advanced stuff, and in general
(especially in RTL) the junior guy isn't coding it, so I have to say
up front, that while it would be nice for a set of guidelines to cover
"advanced", it really isn't that bad if the rules have to be bent a
bit for the advanced stuff.

"As a matter of forming good coding habits", avoid blocking
assignments for communication altogether, and stop worrying about
races.

Well, I just avoid blocking assignments at all in sequential blocks.
As we have discussed previously, that can also be a viable strategy.

I know what you do for synthesis, now let's see about test benches.
Testbenches have slightly relaxed rules, which probably violate your
rule about not using blocking assignments for communication. For
example, if I am generating a signal that is an asynchronous input to
RTL, there is no reason not to use nonblocking assignments to create
it. However, we still follow the rule of not mixing blocking and
nonblocking in the same block.
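For instance, independent stimulus might look something like this
(names and delay values made up):

```verilog
// Sketch: an asynchronous DUT input generated by the testbench.
// The signal depends on nothing inside the DUT, so the choice of
// assignment type is not critical; the block simply avoids mixing
// blocking and nonblocking assignments.
reg async_req;
initial async_req = 0;
always begin
  #37  async_req <= 1;   // delays deliberately unrelated to the clock
  #113 async_req <= 0;
end
```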

Given the fact that races are costly in any case, do you see the value
of having one single guideline that covers everything?

Yes, but I also see the cost.  For example, in a large block with
intermingled blocking/non-blocking assignments, there might be a lot
of scrolling back and forth to determine if a variable is local or
not.  For another example, if you need the local variable in another
block, now you have to go back and check all the uses of it and
possibly do some recoding.

Correct. Note that VHDL makes this task much easier and less costly as
safe communication is enforced by the language.
You say that like the cost is unbearable, and like VHDL doesn't have
other costs.

I have a single guideline that covers everything; Cummings has a set
of guidelines that covers synthesis only and that introduces a number
of special cases.
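In code, the guideline looks the same in every context. A sketch
(names made up, with a/b/carry/clk assumed declared elsewhere):

```verilog
// Blocking assignments for local computation, a nonblocking
// assignment for the value that other blocks read.
reg [7:0] sum;   // communicated to other blocks: nonblocking only
reg [8:0] tmp;   // purely local: blocking only
always @(posedge clk) begin
  tmp = a + b;        // local computation: variable semantics
  tmp = tmp + carry;
  sum <= tmp[7:0];    // communication: signal semantics
end
```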

Cummings certainly covers synthesis.  What in his guidelines will
break for non-synthesis?

Short answer: nothing, but that is irrelevant because nobody wants
similar restrictions beyond synthesis.
Certainly, one of Cummings's major objectives is to ensure that
simulation and synthesis results match, and there is no requirement
for that in a testbench. But in that vein, if I understand your
guideline about blocking assignments correctly, I don't want *that*
restriction for a testbench at all.

Long answer: this is a puzzling question.

First, the fact that Cummings talks specifically about synthesis tells
me indirectly that he doesn't want to impose anything similar for test
benches and higher level modeling. I don't want to give the impression
of criticizing him for something he doesn't say; I have enough work
with what he actually does say already :)
Well, you agree that some guidelines are required for testbenches. In
fact, if I understand you correctly, you yourself have a guideline
that I don't follow for testbenches. So, as a starting point, you
could do much worse than reading Cummings's papers and understanding
the reasoning behind his guidelines.

Then, his guidelines are formulated in RTL jargon, which makes them
not applicable at higher levels. We first need a translation step that
extracts the "spirit" of his guidelines.
I can agree with that.

The spirit of his guidelines is: use only a single type of assignment
in any given always block.
I do that.

To me, it is close to obvious that no
Verilog verification engineer will find this acceptable.
Our dedicated testbench engineers are much more adamant than I am
about following this guideline, so I really don't know how you arrived
at this conclusion.

But of
course, what I find obvious doesn't count, so I will state my
hypothesis explicitly in a form that can be disproven.

My hypothesis is that anyone serious about using Verilog for
verification will, in the same always block:
1) want reliable communication, hence non-blocking assignments.
When a testbench is generating a completely independent signal for
simulation purposes (for an async input), non-blocking assignments may
not be required at all.

2) systematically need variable semantics internally for modeling,
hence blocking assignments.
When a testbench is manipulating dependent signals, we code that like
the RTL -- the blocking and nonblocking assignments are in *different*
always blocks. Honestly, this is not a terrible burden.
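A sketch of that style (names made up; clk and dut_valid assumed to
exist in the testbench):

```verilog
// The testbench's dependent signals are coded like RTL, with
// blocking and nonblocking assignments in different always blocks.
reg [7:0] expected, next_expected;

always @* begin                 // combinational: blocking only
  next_expected = expected + 1;
end

always @(posedge clk) begin     // sequential: nonblocking only
  if (dut_valid)
    expected <= next_expected;
end
```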

If this is right, the spirit of Cummings is not applicable and you
basically need my guideline. Actually, I suspect that this is what
most Verilog verification engineers are doing already, and I would be
rather surprised if you didn't.
If I understand your guideline correctly, it says any varying inputs
to my DUT need to be created from nonblocking assignments inside the
testbench. I certainly don't need or want that guideline.

Regards,
Pat
 
 
On May 23, 10:06 pm, Ali Karaali <ali...@gmail.com> wrote:

always@(*) begin
   a = b;
   c = a;
end

When b is assigned to a, will the always-star block
be activated again, because
the right-hand side of the assignment has changed?
Yes, but....

When activated, does the block start immediately, or only after the
current execution path finishes?
That's not quite the right question. It's important to see
that "always@*" is not a Verilog keyword. What's really
happening is...

  always
    @* begin
      ...
    end

And, of course, "always" is a procedural infinite loop,
exactly equivalent to "initial forever" or, if you
prefer, "initial while(1)". Some people think, wrongly,
that "always @*" represents a block of code that is
automatically launched whenever an input signal
changes value. That is not the case. The infinite
loop starts to execute at time 0. It immediately
freezes at the @* event control, waiting until
there is a value change on one or more of its inputs;
the event control is then released, and execution
proceeds. The software (procedural code) in the
begin...end block then executes to the end, and
then the "always" construct causes execution to
loop back to the beginning and the whole thing
starts all over again, waiting for @*.

So let's use this understanding to follow what
happens in your example:

1)
At time 0, the "always" statement starts to execute.
2)
Immediately, its execution freezes on @*.
3)
Some time later, some other code changes the value
of variable 'b'.
4)
The @* event control is released; execution of your
code proceeds. First, 'a' is updated with the new
value of 'b'. Next, 'c' is updated with the new
value of 'a' (which came from 'b', of course). This
is just normal software-like execution. Blocking
assignment (with no delay) works exactly like normal
assignment in C or other imperative languages.
5)
Execution has now reached the "end", so the "always"
iterates and execution once more hits the @*. Note
that both 'a' and 'c' already have their correct new
values, after only one iteration of the code.

So far, so clear. Now things get a little more
difficult. As you say, there is a value change
on 'a'. Does that release the @*? I would argue
that it does not, because 'a' already has its new
value at the moment when execution reaches @*.
There is no further value change on 'a'.
However, some simulators (I believe) compute
"value change" by looking at the value of a
variable before and after the execution of code
at a given point in time. Such a simulator might
now see that 'a' has changed since the beginning
of the time-slot, and therefore might choose to
release the @* for a second time. THIS DOES NOT
MATTER because your code describes proper
combinational logic and the second iteration,
if it occurs, will give exactly the same results
as before.

So the right values are assigned after two iterations,
but it takes a very small time?
Possibly.... see above.

Now it's your chance to show how well you understand
all this... let's try another example....

  always @* begin
    a <= b;
    c <= a;
  end

Can you "tell the story" of this code, in the same
way as I did for your example? The final result
is almost the same, but the processing is very
different.........

Jonathan Bromley
 
On 24 May, 11:29, Jonathan Bromley <s...@oxfordbromley.plus.com>
wrote:
[ Jonathan's explanation snipped. ]

Now it's your chance to show how well you understand
all this... let's try another example....

  always @* begin
    a <= b;
    c <= a;
  end

Can you "tell the story" of this code, in the same
way as I did for your example?  The final result
is almost the same, but the processing is very
different.........

Jonathan Bromley
For example:
@t = 0: a = 3, b = 5, c = 2;

1-) At time = 0, the always block starts to execute.
2-) It freezes on @*.
3-) (@t = 2, b = 7) b is changed somewhere, so @* is released and
execution continues after the begin keyword.
4-) "assign b to a" goes into a queue, so a doesn't change yet.
(@t = 2) a = 3, b = 7, c = 2 (a = 7 is in the queue)
5-) "assign a to c" goes into the queue too, so c doesn't change either.
(@t = 2) a = 3, b = 7, c = 2 (c = 3 is in the queue)
6-) All the statements have finished,
so the queued updates take effect:
(@t = 2 + delta_t) b is assigned to a:
a = 7, b = 7, c = 2
(@t = 2 + delta_t) and a (a = 3, the value of 'a' at t = 2) is assigned
to c:
a = 7, b = 7, c = 3
7-) And freeze at @*.
8-) But a was changed, so @* is released.
9-) "assign b to a" goes into the queue.
(@t = 2 + delta_t) a = 7, b = 7, c = 3
10-) "assign a to c" goes into the queue too.
(@t = 2 + delta_t) a = 7, b = 7, c = 3 (c = 7 is in the queue)
11-) All the statements have finished.
(@t = 2 + 2*delta_t) and a is assigned to c:
a = 7, b = 7, c = 7

But I need to think about all of this again.
About step 6: how much time is there between the two assignments,
a = 7 and c = 3?
It isn't another delta time, right?


Ali
 
Ali,

  always @* begin
    a <= b;
    c <= a;
  end

Can you "tell the story" of this code, in the same
way as I did for your example?  The final result
is almost the same, but the processing is very
different.........
Hey, wait a minute. This isn't what's supposed to
happen here. The idea is that lazy students ask
idiotic questions, and we tell them they are being
lazy. But now you go and ruin it: not only have
you asked a completely reasonable question based
on your own thoughtful concerns, but you also are
prepared to do some real work to dig deeper. Be
very careful... this might become a habit :)

For example:
@t = 0: a = 3, b = 5, c = 2;

1-) At time = 0, the always block starts to execute.
2-) It freezes on @*.
3-) (@t = 2, b = 7) b is changed somewhere, so @* is released and
execution continues after the begin keyword.
4-) "assign b to a" goes into a queue, so a doesn't change yet.
(@t = 2) a = 3, b = 7, c = 2 (a = 7 is in the queue)
5-) "assign a to c" goes into the queue too, so c doesn't change either.
(@t = 2) a = 3, b = 7, c = 2 (c = 3 is in the queue)
6-) All the statements have finished,
Correct so far.

so the queued updates take effect:
(@t = 2 + delta_t) b is assigned to a:
a = 7, b = 7, c = 2
(@t = 2 + delta_t) and a (a = 3, the value of 'a' at t = 2) is assigned
to c:
a = 7, b = 7, c = 3
7-) And freeze at @*.
Not quite. You have switched these around. FIRST the always
block loops around to @* and freezes. And the same thing
happens to any other always blocks that were running at the
same moment of simulation time. Then, when every one of
those always blocks is stuck at an @ or # event control,
the nonblocking assignments on the queue will take effect.

You say "+delta_t". That would be exactly correct for
VHDL. For Verilog, though, there is no exact idea of
a "delta cycle". But it's still a useful and sensible
idea.

8-) But a was changed, so @* is released.
9-) "assign b to a" goes into the queue.
(@t = 2 + delta_t) a = 7, b = 7, c = 3
10-) "assign a to c" goes into the queue too.
(@t = 2 + delta_t) a = 7, b = 7, c = 3 (c = 7 is in the queue)
11-) All the statements have finished.
and then back to freeze at @* again, and then...

(@t = 2 + 2*delta_t) and a is assigned to c:
a = 7, b = 7, c = 7

But I need to think about all of this again.
About step 6: how much time is there between the two assignments,
a = 7 and c = 3?
It isn't another delta time, right?
No; all those nonblocking updates happen together.

However, it is DEFINED that they happen in the same
sequence that they were executed. Often this is not
so important, but there's one place where it is
essential. Consider this clocked logic:

  always @(posedge clock) begin
    Q <= 0; // default assignment
    if (some_complicated_condition) begin
      if (some_other_condition) begin
        Q <= 1; // change your mind
      end
    end
  end

This is a very useful coding trick, but you can see
that the two nonblocking assignments MUST take effect
in the same order that they were executed, if this
code is to make sense.

Regards
--
Jonathan Bromley
 
