RTL 10 Commandments

Beanut

Guest
Here's a list of rules I've compiled over time from experience and web
sites like this one. Please provide feedback. My target audience is
engineers and students new to VHDL/Verilog who might not know what many
of us assume to be a given. I would like to correct any errors and
make sure no critical rules are missed (such as commandment 11, You
shall only use numeric_std).

Enjoy,
Beanut

------------------------------------
1. Always use synchronous processes.

2. Always reset signals.

3. Use an asynchronously asserted Reset, synchronous deassertion.

4. All asynchronous inputs must be double synchronized to prevent
metastability.
http://klabs.org/richcontent/MAPLDCon00/Presentations/Session_A/A5_Erickson_S.PDF

5. Never use latches (latch = curse word). Check the synthesis log for
warning messages where latches are inferred.
a. Latch glitches
b. Transparent latches cause oscillation
c. Difficult timing analysis

6. Only use clocks that are derivatives of master clock.

7. If rule #6 is violated, use double synchronizer for all signals
crossing clock domains.

8. Only use clocks for synchronous processes; do not substitute signals
for clocks (i.e. apply the 'event attribute to clock signals only).

process 1:
  if clk'event and clk = '1' then
    irq <= data_bus;
  end if;

process 2, bad example:
  if irq'event and irq = '1' then
    clear <= '1';
  end if;

process 2, good example:
  if clk'event and clk = '1' then
    if irq = '1' then
      clear <= '1';
    end if;
  end if;

9. Implement state machines with one of the following implementations:
a. 3 process state machine.
One synchronous process for updating state with next_state
One state machine process updating next_state only
One state machine process updating outputs only
b. 2 process state machine (preferred method)
One synchronous process for updating state with next_state
One state machine process updating next_state and outputs

10. State machines should assign outputs a hard coded value only. If
previous values are used in signal assignment statement then a latch
will be inferred to store that previous value.

Example:
Good: output <= '1'; or output_bus <= "0010";
Bad: out <= out xor input; or out_bus <= out_bus(3 downto 1) & '0';

If the previous value is needed, use a synchronous if/else statement with
embedded case or if/else statements.
-------------END----------------
 
Beanut wrote:
Here's a list of rules I've compiled over time from experience and web
sites like this one. Please provide feedback.
Good work!
I believe all the way down to 9.


6. Only use clocks that are derivatives of master clock.
(a single clock is preferred)

8. Only use clocks for synchronous processes; do not substitute signals
for clocks (i.e. apply the 'event attribute to clock signals only).

process 1:
  if clk'event and clk = '1' then
    irq <= data_bus;
  end if;

process 2, bad example:
  if irq'event and irq = '1' then
    clear <= '1';
  end if;

process 2, good example:
  if clk'event and clk = '1' then
    if irq = '1' then
      clear <= '1';
    end if;
  end if;
I believe in the commandment, but would prefer
"if rising_edge(clk)" in the examples.
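
As a sketch of that preference, the good example from commandment 8 could
be written with rising_edge as shown here. The entity name irq_clear and
the port list are assumptions added to make the fragment complete; only
clk, irq and clear come from the original example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper around the "good example" of commandment 8.
entity irq_clear is
  port (
    clk   : in  std_logic;
    irq   : in  std_logic;  -- assumed to be synchronous to clk already
    clear : out std_logic
  );
end entity irq_clear;

architecture rtl of irq_clear is
begin
  process (clk)
  begin
    -- irq is sampled as data on the clock edge; it is never used as a clock.
    if rising_edge(clk) then
      if irq = '1' then
        clear <= '1';  -- as in the original fragment, clear is only ever set
      end if;
    end if;
  end process;
end architecture rtl;
```

A real design would also need a reset or a condition that returns clear
to '0'; the original fragment does not show one, so none is invented here.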

9. Implement state machines with one of the following implementations:
a. 3 process state machine.
One synchronous process for updating state with next_state
One state machine process updating next_state only
One state machine process updating outputs only
b. 2 process state machine (preferred method)
One synchronous process for updating state with next_state
One state machine process updating next_state and outputs
I would remove all references to "state machines" from
the commandments because this is not a vhdl abstraction.
The vhdl version of this commandment would concern
how to make use of local variables in the process(es)
made synchronous by the first commandment.

10. State machines should assign outputs a hard coded value only. If
previous values are used in signal assignment statement then a latch
will be inferred to store that previous value.
This commandment is unnecessary since the first commandment
makes latches impossible to create.


-- Mike Treseler
 
Hi,
I like the commandments, but point 10 could be better defined:

10. State machines should assign outputs a hard coded value only. If
previous values are used in signal assignment statement then a latch
will be inferred to store that previous value.

All outputs should be assigned in the following way:
State_B : process(...)
begin
  Output1 <= '0'; -- all outputs are assigned
  Output2 <= '0'; -- a false value at the beginning,
                  -- before the case statement
  ...
  case ... is
    when ... =>
      if (...) then
        ...
        Output1 <= '1'; -- then it is assigned a true
                        -- value where it is needed
      elsif (...) then
        ...
        Output2 <= '1';
      else
        ...
        Output3 <= '1';
      end if;
    ...
  end case;
end process;

No latch is generated when the outputs are coded this way.
This is my experience.

Another important thing:
the next state must be assigned in every branch of the code; otherwise a
latch will be generated for the next-state signal.

Example:
when xxx =>
  if (..) then
    NextState_S <= State0_S;
  elsif (...) then
    NextState_S <= State1_S;
  end if;

Do you see the problem in the above code?
Correct answer:
when xxx =>
  if (..) then
    NextState_S <= State0_S;
  elsif (...) then
    NextState_S <= State1_S;
  else
    NextState_S <= State2_S; -- <-- if it is missing,
    -- a latch for NextState_S would be generated.
  end if;
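
A common alternative to writing every else branch by hand is to give the
next-state signal a default at the top of the combinational process. Below
is a minimal sketch of a two-process machine written that way. The entity
fsm_two_proc, its ports and the state names IDLE_S and RUN_S are made up
for illustration and are not from the posts above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-state machine, two-process style.
entity fsm_two_proc is
  port (
    clk, rst : in  std_logic;
    go, stop : in  std_logic;
    busy     : out std_logic
  );
end entity fsm_two_proc;

architecture rtl of fsm_two_proc is
  type state_t is (IDLE_S, RUN_S);
  signal State_S, NextState_S : state_t;
begin
  -- Synchronous process: the only place where state is stored.
  process (clk, rst)
  begin
    if rst = '1' then
      State_S <= IDLE_S;
    elsif rising_edge(clk) then
      State_S <= NextState_S;
    end if;
  end process;

  -- Combinational process: next state and outputs.
  process (State_S, go, stop)
  begin
    -- Defaults assigned first, so every path drives every signal
    -- and no latch can be inferred.
    NextState_S <= State_S;
    busy        <= '0';

    case State_S is
      when IDLE_S =>
        if go = '1' then
          NextState_S <= RUN_S;
        end if;
      when RUN_S =>
        busy <= '1';
        if stop = '1' then
          NextState_S <= IDLE_S;
        end if;
    end case;
  end process;
end architecture rtl;
```

With the defaults in place, a missing else branch simply means "stay in
the current state" rather than "infer a latch".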

Weng
 
As a Student, I greatly appreciate your effort. I knew 1 or 2, but
most of these are eye openers.

As soon as this thing is finalized, I'm going to post it on my wall :D
 
"Beanut" <fourbeans@gmail.com> writes:

Here's a list of rules I've compiled over time from experience and web
sites like this one. Please provide feedback. My target is for
engineers and students new to VHDL/Verilog who might not know what many
of us assume to be a given. I would like to correct any errors and
make sure no critical rules are missed (such as commandment 11, You
shall only use numeric_std).
[snip]

4. All asynchronous inputs must be double synchronized to prevent
metastability.
http://klabs.org/richcontent/MAPLDCon00/Presentations/Session_A/A5_Erickson_S.PDF
Beware of blindly double-sync'ing a bus. This will NOT give you what
you want.

[personal rant about designing async clock boundary fifos deleted]

5. Never use latches(latch=curse word). Check synthesis log for
warning messages where latches are used.
a. Latch glitches
b. Transparent latches cause oscillation
c. Difficult timing analysis
I agree, except that Latches are very useful in DDR interface
blocks. But hopefully this isn't something you assign to a freshman.

9. Implement state machines with one of the following implementations:
a. 3 process state machine.
One synchronous process for updating state with next_state
One state machine process updating next_state only
One state machine process updating outputs only
b. 2 process state machine (preferred method)
One synchronous process for updating state with next_state
One state machine process updating next_state and outputs
Actually, I don't see the value of this rule. But I'm not a beginner
anymore, so I know where I can cut corners and still avoid getting
burned :)

[snip]

11. Do not mix behavioural and structural in a single block.
Mixing them requires a synthesis run on the higher hierarchical
levels, and may take obscene amount of runtime. When the block is
fully structural, the subblocks may simply be linked together.


Regards,

Kai
--
Kai Harrekilde-Petersen <khp(at)harrekilde(dot)dk>
 
Hi Beanut,

My comments below:

"Beanut" <fourbeans@gmail.com> writes:

Here's a list of rules I've compiled over time from experience and web
sites like this one. Please provide feedback. My target is for
engineers and students new to VHDL/Verilog who might not know what many
of us assume to be a given. I would like to correct any errors and
make sure no critical rules are missed (such as commandment 11, You
shall only use numeric_std).

Enjoy,
Beanut

snip
2. Always reset signals.
Not always. Only when it matters, especially in FPGAs, as otherwise
you'll have to route that signal around to places that don't really
need it. See

http://www.xilinx.com/xlnx/xweb/xil_tx_display.jsp?sGlobalNavPick=&sSecondaryNavPick=&category=&iLanguageID=1&multPartNum=1&sTechX_ID=kc_smart_reset

<snip>
4. All asynchronous inputs must be double synchronized to prevent
metastability.
http://klabs.org/richcontent/MAPLDCon00/Presentations/Session_A/A5_Erickson_S.PDF
As someone else has said already - be careful doing this with buses...

<snip>
7. If rule #6 is violated, use double synchronizer for all signals
crossing clock domains.
See 4.:)

<snip>
9. Implement state machines with one of the following implementations:
a. 3 process state machine.
One synchronous process for updating state with next_state
One state machine process updating next_state only
One state machine process updating outputs only
b. 2 process state machine (preferred method)
One synchronous process for updating state with next_state
One state machine process updating next_state and outputs
I always do my state machines in a single process, it's easier to
maintain that way (I find). Never had a problem with synthesis so far...

10. State machines should assign outputs a hard coded value only. If
previous values are used in signal assignment statement then a latch
will be inferred to store that previous value.
If you do it all in one synch process, it's difficult to get latches :)
Another reason for the one process approach...
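
For readers who have not seen the single-process style, here is a minimal
sketch. The entity fsm_one_proc, its ports and the state names are
hypothetical. Because every assignment sits under rising_edge(clk), each
signal becomes a flip-flop, which is why latches are hard to create this
way. Note that busy is a registered output here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-state machine, single-process style.
entity fsm_one_proc is
  port (
    clk, rst : in  std_logic;
    go, stop : in  std_logic;
    busy     : out std_logic
  );
end entity fsm_one_proc;

architecture rtl of fsm_one_proc is
  type state_t is (IDLE_S, RUN_S);
  signal state : state_t;
begin
  process (clk, rst)
  begin
    if rst = '1' then
      state <= IDLE_S;
      busy  <= '0';
    elsif rising_edge(clk) then
      -- A signal that is not assigned in a branch simply keeps its
      -- registered value; that is a flip-flop with enable, not a latch.
      case state is
        when IDLE_S =>
          if go = '1' then
            state <= RUN_S;
            busy  <= '1';
          end if;
        when RUN_S =>
          if stop = '1' then
            state <= IDLE_S;
            busy  <= '0';
          end if;
      end case;
    end if;
  end process;
end architecture rtl;
```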

All IMHO!

Cheers,
Martin

--
martin.j.thompson@trw.com
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.trw.com/conekt
 
Martin Thompson <martin.j.thompson@trw.com> writes:
"Beanut" <fourbeans@gmail.com> writes:
Here's a list of rules I've compiled over time from experience and web
sites like this one. Please provide feedback. My target is for
engineers and students new to VHDL/Verilog who might not know what many
of us assume to be a given. I would like to correct any errors and
make sure no critical rules are missed (such as commandment 11, You
shall only use numeric_std).

Enjoy,
Beanut

snip
2. Always reset signals.

Not always. Only when it matters, especially in FPGAs, as otherwise
you'll have to route that signal around to places that don't really
need it. See

http://www.xilinx.com/xlnx/xweb/xil_tx_display.jsp?sGlobalNavPick=&sSecondaryNavPick=&category=&iLanguageID=1&multPartNum=1&sTechX_ID=kc_smart_reset
But for an ASIC, I think this is very good advice. Take a flop without
a reset, add a bit of gate-level simulation, and hey presto! you've
got X'es all over the place.

snip
9. Implement state machines with one of the following implementations:
[snip]

I always do my state machines in a single process, it's easier to
maintain that way (I find). Never had a problem with synthesis so far...
That's been my approach too for the last, uh, 8 years or so.

10. State machines should assign outputs a hard coded value only. If
previous values are used in signal assignment statement then a latch
will be inferred to store that previous value.


If you do it all in one synch process, it's difficult to get latches :)
Another reason for the one process approach...
Well, Synopsys DC will infer latches if you reset a signal, but never
assign to it. Blame the stupid software...


Kai
--
Kai Harrekilde-Petersen <khp(at)harrekilde(dot)dk>
 
Here's a quick summary of the ideas. Thanks for the feedback.

6. Use of a single clock is strongly encouraged. If not possible, use
clocks that are derivatives of the master clock.

8. Replace 'event with rising_edge

11. Do not mix behavioural and structural in a single block.
Mixing them requires a synthesis run on the higher hierarchical
levels, and may take obscene amount of runtime. When the block is
fully structural, the subblocks may simply be linked together.

Rules 9 and 10 might be overkill. I use a single process like many
here, but I see a lot of new coders using a lot of state machines
they found in a book without using the synchronous state <= next_state
process. I'm on board with discouraging the use of these 2&3 process
state machines if everyone here agrees. I was hesitant to enforce my
style on others as long as their style could be implemented safely.

I am an advocate of having a global reset and many still recommend it:
http://www.chipdesignmag.com/fpgadeveloper/august2005.html
I do not buy into the routing issue because I treat the reset as a high
fanout net. In an ASIC I would route it as a clock, in an FPGA I
connect it to one of the prerouted clocks. Does anyone else advocate
removing global reset?

Will someone provide some insight into the problem with synchronizing a
bus? I was intending commandment 4 for control signals, but I've never
had a bus problem.
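
For a single control signal, the double synchronizer of commandment 4 can
be sketched as below. The entity name sync_2ff and its port names are
assumptions for illustration. The flip-flops have no reset and rely on
initial values, which assumes an FPGA target that honors them. This is
only meant for one-bit signals, not for buses.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-flip-flop synchronizer for ONE asynchronous bit.
entity sync_2ff is
  port (
    clk      : in  std_logic;
    async_in : in  std_logic;  -- asynchronous to clk
    sync_out : out std_logic   -- safe to use in the clk domain
  );
end entity sync_2ff;

architecture rtl of sync_2ff is
  -- Initial values assume an FPGA target; an ASIC flow would add a reset.
  signal meta : std_logic := '0';  -- first stage, may go metastable
  signal sync : std_logic := '0';  -- second stage, gives meta time to settle
begin
  process (clk)
  begin
    if rising_edge(clk) then
      meta <= async_in;
      sync <= meta;
    end if;
  end process;

  sync_out <= sync;
end architecture rtl;
```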

Enjoy,
Beanut
 
Beanut wrote:

Here's a list of rules I've compiled over time from experience and web
sites like this one. Please provide feedback. My target is for
engineers and students new to VHDL/Verilog who might not know what many
of us assume to be a given.
You should always highlight this point, because e.g. latches are very
nice for low power design. For an HDL beginner I agree. So call these
rules "rules that are useful unless you know what you are doing". ;-)

Ralf
 
Beanut a écrit:
Rules 9 and 10 might be overkill. I use a single process like many
here, but I see a lot of new coders using a lot of state machines
they found in a book without using the synchronous state <= next_state
process. I'm on board with discouraging the use of these 2&3 process
state machines if everyone here agrees.
Hi
I used to write my state machines with 2 processes but found that
synchronous single-process ones were much easier to write.
Besides, rule one enforces the single-process style :o)

Nicolas
 
Beanut wrote:

9. Implement state machines with one of the following implementations:
a. 3 process state machine.
One synchronous process for updating state with next_state
One state machine process updating next_state only
One state machine process updating outputs only
b. 2 process state machine (preferred method)
One synchronous process for updating state with next_state
One state machine process updating next_state and outputs
Well, include me in the list of people who prefer the single-process
state machine.

-a
 
"Beanut" <fourbeans@gmail.com> writes:

Will someone provide some insight into the problem with synchronizing a
bus? I was intending commandment 4 for control signals, but I've never
had a bus problem.
If you use independent double-clocking of each bit on a bus, you risk
sampling a mixture of X(t) and X(t+1). Very bad idea.

Instead, you need to put it into a FIFO, with a gray-coded index, and
then double-clock that bus (only one bit changes per clock cycle, so
it's OK here), and then have a similar gray-coded read-pointer to get
the data out of the FIFO. This way, you never get garbled data (I've
left out a few details, but you should get the general point).

You can add bells and whistles like full/empty flags as well as
"flush" conditions, in order to sync up both sides of the fifo if one
side gets cleared due to a reset of the clock domain.

Also, a low-bandwidth FIFO (which is just a 1-element fifo with
elaborate controls to handle the clock boundary changes), can be made.

Since many people don't consider all the cases related to clock-domain
changes, I designed such a generic FIFO about 5 years ago with
configurable data width at work. After being scrutinized by a few of
my colleagues, it became a house rule that if you had to pass a
databus across a clock boundary, you had to use one of my "resync"
FIFOs.


Kai
--
Kai Harrekilde-Petersen <khp(at)harrekilde(dot)dk>
 
Sounds cool. Interested in sharing/trading the code? I'd prefer using
something verified over something I spit together on my own.

Enjoy!
 
"Beanut" <fourbeans@gmail.com> writes:

Sounds cool. Interested in sharing/trading the code? I'd prefer using
something verified over something I spit together on my own.
I'm sorry, but I cannot share it with you - it's company property.

But it should be fairly simple to design a (very) similar FIFO, based
on my general description.

The only thing that needs consideration is how many FIFO levels you
need, and how you are going to implement the gray-coded version of
"increment".

Hints:
* 8 FIFO levels may not be enough, due to round-trip time (across the
clock boundary).
* A simple graynext() can be done by doing graycode(grayuncode(N)+1).
I spent a weekend figuring out how to do a direct computation of the
graynext() function. There really is a general formula for that, for
gray sequences of length 2**N.
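
The simple graycode(grayuncode(N)+1) approach from the hint can be
sketched as a small package. The function names follow the hint; the
package name gray_pkg and the bodies are an illustration, not the
company code mentioned above, and the direct graynext() formula is not
reproduced here since it is not given in the post.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical helper package for gray-coded FIFO pointers.
package gray_pkg is
  function graycode   (b : unsigned) return unsigned;  -- binary -> gray
  function grayuncode (g : unsigned) return unsigned;  -- gray -> binary
  function graynext   (g : unsigned) return unsigned;  -- next gray value
end package gray_pkg;

package body gray_pkg is

  -- Each gray bit is the XOR of a binary bit and its upper neighbour.
  function graycode (b : unsigned) return unsigned is
  begin
    return b xor shift_right(b, 1);
  end function graycode;

  -- Undo the XOR chain, starting from the most significant bit.
  function grayuncode (g : unsigned) return unsigned is
    variable gv : unsigned(g'length - 1 downto 0) := g;
    variable b  : unsigned(g'length - 1 downto 0);
  begin
    b(b'high) := gv(gv'high);
    for i in b'high - 1 downto 0 loop
      b(i) := b(i + 1) xor gv(i);
    end loop;
    return b;
  end function grayuncode;

  -- Simple (not direct) increment: decode, add one, encode again.
  -- The addition wraps at 2**N, so the sequence is cyclic.
  function graynext (g : unsigned) return unsigned is
  begin
    return graycode(grayuncode(g) + 1);
  end function graynext;

end package body gray_pkg;
```

Since only one bit changes between graynext(g) and g, a pointer
registered in this form can be passed through a double synchronizer
without the mixed-sample problem described for ordinary buses.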

Cheers,


Kai
--
Kai Harrekilde-Petersen <khp(at)harrekilde(dot)dk>
 
"Simulation and Synthesis Techniques for Asynchronous FIFO Design with
Asynchronous Pointer Comparisons" by Clifford E. Cummings and Peter
Alfke is very detailed in code and it teaches how to design a
sophisticated asynchronous FIFO with thorough details.
Clifford E. Cummings website address is: www.sunburst-design.com. You
may download the paper from the website.

Weng
 
For example, if the MASTER_CLK delivers input at 1 sample (8 bits) per
40 microseconds, can I generate a SLAVE_CLK which can process the rest
of the circuit (implying every bit) at 5 microseconds (in effect, 5
microseconds per bit)?
The best way to do that is to use a single clock and as many clock
enables as you need, so you might have a single high speed clock, then
one clock enable for every bit and another for every byte.

Use the synchronous process template:

process(clock, reset)
begin
  if (reset = '1') then
    <reset your signal(s) here>
  elsif rising_edge(clock) then
    if (clock_enable = '1') then
      <do stuff here>
    end if;
  end if;
end process;

In that case you might have one process using bit_enable as the clock
enable, and another process using byte_enable as the clock enable.
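
The enables themselves can come from a counter running on the same
clock. A minimal sketch, where the entity name enable_gen and the generic
CLKS_PER_BIT are assumptions: with an assumed 20 MHz clock, 100 cycles
per bit gives the 5 microsecond bit period from the question.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical generator for the bit_enable and byte_enable strobes.
entity enable_gen is
  generic (
    CLKS_PER_BIT : positive := 100  -- assumed: clock cycles per bit period
  );
  port (
    clock       : in  std_logic;
    reset       : in  std_logic;
    bit_enable  : out std_logic;  -- one clock wide, once per bit
    byte_enable : out std_logic   -- one clock wide, once per 8 bits
  );
end entity enable_gen;

architecture rtl of enable_gen is
  signal clk_count : natural range 0 to CLKS_PER_BIT - 1;
  signal bit_count : natural range 0 to 7;
begin
  process (clock, reset)
  begin
    if reset = '1' then
      clk_count   <= 0;
      bit_count   <= 0;
      bit_enable  <= '0';
      byte_enable <= '0';
    elsif rising_edge(clock) then
      -- Strobes default to '0' and are raised for a single cycle.
      bit_enable  <= '0';
      byte_enable <= '0';
      if clk_count = CLKS_PER_BIT - 1 then
        clk_count  <= 0;
        bit_enable <= '1';
        if bit_count = 7 then
          bit_count   <= 0;
          byte_enable <= '1';
        else
          bit_count <= bit_count + 1;
        end if;
      else
        clk_count <= clk_count + 1;
      end if;
    end if;
  end process;
end architecture rtl;
```

Everything downstream still runs on the one clock, so no clock domain
crossing is created.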
 
6. Use of a single clock is strongly encouraged. If not possible,
use clocks that are derivatives of the master clock.
I would modify #6 like this:

6. Use a single clock and as many clock enables as necessary. Don't
use multiple clocks unless there's a really good reason. If it's
necessary to derive one clock from another, all asynchronous design
problems apply.

And a side note: most DPLLs will allow you to select the phase between
the input and output clocks, so if you know what you're doing and spend
a lot of time analyzing and dealing with all asynchronous design
issues, you can safely derive one clock from another. But deriving
clocks should still be avoided.

Another very similar version of the 10 commandments:
http://www.bawankule.com/verilogcenter/files/10_2.pdf
 
Hi Beanut,

Here's a quick summary of the ideas. Thanks for the feedback.

6. Use of a single clock is strongly encouraged. If not possible, use
clocks that are derivatives of the master clock.
The wonderful thing about engineering is the exceptions.
In general I like the rule, however, I have done a UART using
the main clock and a 16X load enable and I have done a UART
using a separate clock for the 16X clock and have to say that
it was much easier using the 16X clock and crossing the clock
domain than using the 16X load enable.


Rules 9 and 10 might be overkill. I use a single process like many
here, but I see a lot of new coders using a lot of state machines
they found in a book without using the synchronous state <= next_state
process. I'm on board with discouraging the use of these 2&3 process
state machines if everyone here agrees.
I use 2 process and discourage the use of 1 process.
While 1 process will work for many cases, if you
need an output combinationally, it must be coded separately
in an ad-hoc sense. While some experts handle this with
ease, I have also seen others inherit a one process
statemachine and fail to figure out how to modify it to make
it meet their requirements. As a result the original design
gets scrapped.

With the 2 process, I do take care with mealy outputs and
register them when possible. I also take care to make
sure off chip signals get registered.



I am an advocate of having a global reset and many still recommend it:
http://www.chipdesignmag.com/fpgadeveloper/august2005.html
I do not buy into the routing issue because I treat the reset as a high
fanout net.
I think the big deal here is that the global reset must settle in
one clock. Hence, you better do static timing analysis on it. If
it does not settle in one clock, then you will have one part of your
circuit out of reset while another part is in reset - yikes. That is,
one bit of your statemachine changes state while another bit
is still in reset.
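
One common way to reduce this risk is the scheme of commandment 3:
assert the reset asynchronously, but release it synchronously to the
clock. A minimal sketch follows; the entity reset_sync and its
active-low port names are assumptions. Static timing analysis on the
synchronized reset net is still needed, since the release must reach
every flip-flop within one clock period.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical reset synchronizer: asynchronous assertion,
-- synchronous deassertion.
entity reset_sync is
  port (
    clk       : in  std_logic;
    rst_n_in  : in  std_logic;  -- raw, asynchronous, active-low reset
    rst_n_out : out std_logic   -- reset to distribute to the design
  );
end entity reset_sync;

architecture rtl of reset_sync is
  signal meta : std_logic;
  signal sync : std_logic;
begin
  process (clk, rst_n_in)
  begin
    if rst_n_in = '0' then
      -- Assertion takes effect immediately, even without a running clock.
      meta <= '0';
      sync <= '0';
    elsif rising_edge(clk) then
      -- Release ripples through two flip-flops, so rst_n_out only
      -- goes high shortly after a clock edge.
      meta <= '1';
      sync <= meta;
    end if;
  end process;

  rst_n_out <= sync;
end architecture rtl;
```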

According to Xilinx application notes they do not guarantee
the timing of their GSR (global reset net) and seem to
recommend against using it in timing critical applications.
I would love for Xilinx to elaborate on this as I could not
determine if there was a frequency at which I could safely
use the GSR as my global reset.

My #1 rule is:
Always code using a block diagram as a flow chart.
I like to code the hardware left to right, top to bottom
(although there are other good ways to linearize the logic also).

If after synthesis you do not get what you wanted, re-draw/refine
the block diagram and re-code.

Cheers,
Jim
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jim Lewis
Director of Training mailto:Jim@SynthWorks.com
SynthWorks Design Inc. http://www.SynthWorks.com
1-503-590-4787

Expert VHDL Training for Hardware Design and Verification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
Jim Lewis wrote:

While 1 process will work for many cases, if you
need an output combinationally, it must be coded separately
in an ad-hoc sense. While some experts handle this with
ease, I have also seen others inherit a one process
statemachine and fail to figure out how to modify it to make
it meet their requirements. As a result the original design
gets scrapped.
Fair enough.

Nonetheless, a solution does exist.
Here is an example of tapping a combinational
node inside a synchronous process.

http://home.comcast.net/~mike_treseler/single_process.vhd
http://home.comcast.net/~mike_treseler/single_process.pdf

-- Mike Treseler
 
Mike,
Jim Lewis wrote:

While 1 process will work for many cases, if you
need an output combinationally, it must be coded separately
in an ad-hoc sense. While some experts handle this with
ease, I have also seen others inherit a one process
statemachine and fail to figure out how to modify it to make
it meet their requirements. As a result the original design
gets scrapped.


Fair enough.

Nonetheless, a solution does exist.
Here is an example of tapping a combinational
node inside a synchronous process.

http://home.comcast.net/~mike_treseler/single_process.vhd
http://home.comcast.net/~mike_treseler/single_process.pdf
I was thinking more of something where comb_v was an output
of the entity. The solution is similar in nature to what you
proposed except that it must be done outside of the process:

ab <= a and b ;
comb_o <= ab when c = '1' else not ab ;

The point is that if the logic requires this type of
solution, a 2 process statemachine can do a good job
keeping the logic/intent compartmentalized.

Note that as a general thought, I prefer all outputs of
a block to be registered, however, sometimes it is not possible
to do this and meet the latency requirements of a design.
Going further, there is no reason to add latency to a design
unless there is a timing problem without the register.

Also sometimes the output of the statemachine simply goes to
the Load Enable of another register and will effectively
be registered anyway. With a 1 process statemachine, the
register would need to be in the same process as the statemachine
and would also need to have the same reset conditions
(another of my rules, if you don't reset all registers in
a design, then don't code registers that have different
reset conditions in the same process).

Since designs tend to evolve, I find the 2 process serves me
well since it keeps the code flexible. Everyone has their
own design style and I am sure we all gravitate to what works
well for us and the problems we have to solve on a day to day
basis.

2 process does cost a little more typing, however, that is
not a problem for me as I type fast enough. I find I spend
much more time planning the design than I do typing it in.

It is ironic that I have moved in a more methodical
direction as when I started I was much closer to the
hardware and sometimes coded statemachines as logic
operators feeding into register bits. Sure glad I
did not have to maintain that code.

Cheers,
Jim
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jim Lewis
Director of Training mailto:Jim@SynthWorks.com
SynthWorks Design Inc. http://www.SynthWorks.com
1-503-590-4787

Expert VHDL Training for Hardware Design and Verification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
