FSM single process...BIG question



Hi all,
I have a problem efficiently unregistering an output from an FSM written in
single process style...

In the FSM's synchronous process (don't pay attention to the syntax...):

when IDLE =>
    if (pippo = '1') then
        next_state <= START;
        if (pluto = '1') then
            outp <= '0';
        else
            outp <= '1';
        end if;
    end if;

outp is registered...if I want to avoid that register I would have to (in another
process) duplicate a lot of logic:

if (state = IDLE) then
    if (pippo = '1' and pluto = '1') then
        outp <= '0';
    elsif (pippo = '1' and pluto = '0') then
        outp <= '1';
    else
        outp <= 'Z';  -- for example
    end if;
end if;

Is it possible to unregister the output without duplicating that logic (it
seems to me very poor coding...)...

I've read something about using variables...but I don't understand exactly
how...please show me an example...

Thanx for any help

You may already know this, but...

Your combinatorial process is not equivalent to the clocked process.
An unassigned signal is not driven to 'Z'; it creates a latch. Read
below for more information.

If this was just an unfortunate choice of "for example" please


First of all...thanks for your help....

The question is...basically, in a case like that it is better to use the 2 process
method...is that right??

I'm studying VHDL by myself, and in every textbook I've found, the best
way to code an FSM is with 2 processes...
I started writing my code (in 2 process style)...but, taking a look at code
written by others, I very often see only 1 process...crazy...
I tried to implement some parts in 1 process style because it seems to me
more "simple"...cleaner...but the output is registered and doesn't fit well
with the rest of the code (other FSMs)...

I've started to ask myself if there is a way to implement an FSM with
unregistered outputs in 1 process...

I've tried something like this:

process (clk, reset)
    variable v_tx_data : std_logic_vector(7 downto 0);
begin
    if (reset = '1') then
        act_txd_state <= TXD_IDLE;
        v_tx_data := (others => '0');
    elsif (clk'event and clk = '1') then

        case act_txd_state is

            when TXD_IDLE =>
                if (txValid = '1') then
                    if (dataIn = "00000000") then
                        v_tx_data := "01000000";
                    else
                        v_tx_data := "11000000";
                    end if;
                    act_txd_state <= TXD_ACTIVE;
                end if;

            when TXD_ACTIVE =>
                if (txReady = '1') then
                    act_txd_state <= TXD_END;
                end if;

            when TXD_END =>
                v_tx_data := dataIn;
                if (txValid = '0') then
                    act_txd_state <= TXD_IDLE;
                end if;
        end case;

    end if;

    -- v_tx_data is only written inside the clocked branch, so it is a FF
    tx_data <= v_tx_data;
end process;

Obviously v_tx_data is a FF...and the output is registered....I can't find
any way to unregister the output in a process like that...

Any hint??

Mike, in your example is the output registered or not?? In that case...how can I
apply your 1 process example to the
above code???

Yes, the code is very simple...I've already coded a version with 2 processes and
unregistered output...but, to understand better,
I would like to find out if it is possible in one process...

In general...is it better to use 1 process and registered outputs, or 2 processes and choose
whether or not to register the output???


Note that while out_port <= f(in, var), located after the clk/rst end
if statement, will produce combinatorial logic from in to out_port (or
out_sig), the simulation and synthesis will mismatch slightly. out_port
will only be updated on an event of clk or rst (i.e. both edges of
both) in simulation, but continuously in hardware (i.e. on any edge of
in). Whether this makes a difference depends on what is reading
out_port. This can be avoided by adding any "in" signals used this way
to the process sensitivity list.
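
A minimal sketch of that arrangement, reusing the pippo/pluto names from the
original question, might look like the following. The entity name, the START
transition and the '1' default on outp are assumptions made only to get a
complete example, and not every synthesis tool accepts statements after the
clocked if, so check yours before relying on it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity fsm_one_proc is  -- hypothetical name
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;
    pippo : in  std_logic;
    pluto : in  std_logic;
    outp  : out std_logic
  );
end entity fsm_one_proc;

architecture rtl of fsm_one_proc is
  type state_t is (IDLE, START);
begin
  -- pippo and pluto are in the sensitivity list only because the
  -- output assignment after the clocked branch reads them.
  fsm : process (clk, rst, pippo, pluto)
    variable state : state_t := IDLE;
  begin
    if rst = '1' then
      state := IDLE;
    elsif rising_edge(clk) then
      case state is
        when IDLE =>
          if pippo = '1' then
            state := START;
          end if;
        when START =>
          state := IDLE;  -- placeholder transition (assumption)
      end case;
    end if;

    -- Evaluated every time the process wakes up: a combinational
    -- decode of the registered state and the unregistered inputs.
    if state = IDLE and pippo = '1' and pluto = '1' then
      outp <= '0';
    else
      outp <= '1';  -- assumed default value
    end if;
  end process fsm;
end architecture rtl;
```

If pippo and pluto are left out of the sensitivity list, the simulator only
re-evaluates outp on clk or rst events, which is the mismatch described above.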

If pluto is an input to the process, then the only way to do it and
retain the sim/hw matching behavior for all cases is to use a separate
combinatorial process. I generally do not like combinatorial paths
from in to out within a state machine, and will seek to implement it
in a register if at all possible.

Combinatorial processes, especially ones as complex as state machines
with nested case & if statements, are prone to create latches. Avoid
the combinatorial process, and avoid the possibility of a latch. If
you cannot avoid the combinatorial process, then at least add default
assignments for all output signals at the very beginning of the
process, before any conditional statements are executed. This is much
easier to write, and to review/audit, than the old adage to include
an else for every if, etc.
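
As a rough illustration of the default-assignment habit, here is a small
combinatorial process. All names (comb_defaults, sel, a, b, y, z) are made up
for the example and have nothing to do with the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity comb_defaults is  -- hypothetical name
  port (
    sel  : in  std_logic_vector(1 downto 0);
    a, b : in  std_logic;
    y, z : out std_logic
  );
end entity comb_defaults;

architecture rtl of comb_defaults is
begin
  decode : process (sel, a, b)
  begin
    -- Defaults first: every output gets a value on every pass,
    -- so no branch below can leave one unassigned.
    y <= '0';
    z <= '0';

    case sel is
      when "00" =>
        y <= a;
      when "01" =>
        if b = '1' then
          z <= a;      -- no else needed, the default covers it
        end if;
      when others =>
        null;          -- defaults apply
    end case;
  end process decode;
end architecture rtl;
```

Without the two default lines, y would be unassigned whenever sel is not "00"
and z whenever the inner if fails, and each of those would infer a latch.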

Why do you want to avoid the output register? Is it for latency or
resource usage? Is there really no other solution?
I strongly suggest registering outputs whenever possible. For all
other cases I tend to use concurrent signal assignments. Of course I
also use a combinatorial process in cases where it would simplify the concurrent
statements (which is seldom the case).

-- fragment: active, fsm_state and idle are declared elsewhere
process (clk, rst)
begin
    if rst = active then
        sig_a <= '0';
    elsif rising_edge(clk) then
        sig_a <= input_a;
    end if;
end process;

comb_out <= (sig_a xor input_a) when fsm_state = idle else '0';

That's only because this is a simple FSM inside a project...and 1 clock
of latency on that signal doesn't fit with the other modules....
Yes...I agree with you...I tend to use registers when possible...using the 2
process style (till now) you have to take care of registering the outputs yourself,
therefore it is up to you to decide...


I would like to go deep into both methods...I can already manage 2 processes...now
I would like to improve my knowledge of the 1 process style, to better fit my style to
every situation...

My conclusion is...when your outputs are registered (probably a very good
choice) it is better to use the 1 process style...when some/all outputs are
unregistered it is better to dive into the 2 process style...which gives more flexibility
even if it is more prone to errors (sensitivity list, inferred latches...)

Do you agree???


Textbooks are seldom about state of the art in anything. Just because
they seem to be in unanimous agreement about one coding method does
not mean that the state of the art agrees. Two-process FSMs require
twice the declarations, require complex sensitivity lists, require
additional code to avoid latches, prohibit using variables to keep
local data local, thwart simulation optimizations that take advantage
of common sensitivity lists, and a host of other problems.

There are two different practical issues at play here. One is whether
an output is a combinatorial function of just the state, and the other is
whether the output is also a combinatorial function of some
combinatorial inputs.

The former can be mitigated by assigning registered outputs
"early" (i.e. when the FSM transitions to the state, not just when the
FSM is already in the state), or by assigning outputs from the state
variable after the clocked if statement ends.
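
A small sketch of the "early" assignment idea, under made-up names (early_out,
go, done, busy are all hypothetical): the registered output is written in the
same branch that changes the state, so both update on the same clock edge.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity early_out is  -- hypothetical name
  port (
    clk  : in  std_logic;
    rst  : in  std_logic;
    go   : in  std_logic;
    done : in  std_logic;
    busy : out std_logic
  );
end entity early_out;

architecture rtl of early_out is
  type state_t is (IDLE, RUN);
  signal state : state_t;
begin
  fsm : process (clk, rst)
  begin
    if rst = '1' then
      state <= IDLE;
      busy  <= '0';
    elsif rising_edge(clk) then
      case state is
        when IDLE =>
          if go = '1' then
            state <= RUN;
            busy  <= '1';  -- set on the transition into RUN
          end if;
        when RUN =>
          if done = '1' then
            state <= IDLE;
            busy  <= '0';  -- cleared on the transition out of RUN
          end if;
      end case;
    end if;
  end process fsm;
end architecture rtl;
```

Had busy been assigned as "when RUN => busy <= '1';" instead, it would go high
one clock after the state reaches RUN, which is the extra cycle of latency
discussed in this thread.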

You appear to want the output to be the latter: a combinatorial
function of the registered state and of the unregistered inputs.

While you can combine both into one process (with the combinatorial
input in the sensitivity list) as Mike has suggested, that is asking
for trouble, and if you don't correct the sensitivity list, then the
simulation and synthesis results will differ slightly, depending on
how/when the output is read by other processes/hardware. It is also
potentially prone to creating latches, to which a simple clocked
process is immune.

You really should create this output in a separate process (or
concurrent assignment), but there is no need to separate the FSM into
two processes.
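
Roughly, that split could look like the sketch below, again using the
pippo/pluto names from the first post. The entity name, the START transition
and the '1' value in the else branch are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity fsm_split_out is  -- hypothetical name
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;
    pippo : in  std_logic;
    pluto : in  std_logic;
    outp  : out std_logic
  );
end entity fsm_split_out;

architecture rtl of fsm_split_out is
  type state_t is (IDLE, START);
  signal state : state_t;
begin
  -- The whole FSM stays in one clocked process.
  fsm : process (clk, rst)
  begin
    if rst = '1' then
      state <= IDLE;
    elsif rising_edge(clk) then
      case state is
        when IDLE =>
          if pippo = '1' then
            state <= START;
          end if;
        when START =>
          state <= IDLE;  -- placeholder transition (assumption)
      end case;
    end if;
  end process fsm;

  -- Only the unregistered output lives outside the process. A
  -- concurrent assignment has an implicit, always complete
  -- sensitivity list, and the final else rules out a latch.
  outp <= '0' when (state = IDLE and pippo = '1' and pluto = '1') else '1';
end architecture rtl;
```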

The best way to avoid latches is to avoid combinatorial processes.
If you cannot avoid a combinatorial process, keep it as simple as
possible (doing only what has to be done combinatorially) and use up
front default assignments.

I would suggest you look at your design requirements, and decide what
must be combinatorial, and what can tolerate a clock delay.

Do you need to avoid a clock delay between tx_valid and tx_data?
Do you need to avoid a clock delay between data_in and tx_data?
Do you need to avoid a clock delay between tx_ready and tx_data?

Make sure you really need to avoid that clock delay. "It would be
better if..." does not count. A working design that is simple to
write, understand and maintain is much better than a design that is
none of those but happens to be a clock cycle faster.

Find a way to accomplish combinatorially only what needs to be
combinatorial. It appears that tx_data is merely a function of
tx_valid, tx_ready, and data_in. What exactly do you need an FSM for?
You could use an FSM for error handling (what happens if tx_valid goes
high, then low, and tx_ready never fired?). But you don't appear to be
using it for that (maybe just to simplify the example?).

I wasn't going to provide any code to the OP mainly because I didn't
have time to worry with it. But I saw that a verbal explanation
wasn't very good and I thought I could get away with a quick edit of
his code. Obviously I made a mistake, but that doesn't make
combinatorial processes bad. You can create latches in
concurrent statements too. Should you avoid using them as well?

I remember when I was just learning VHDL there were things about using
integers and variables that I just didn't get. I posted that someone
should avoid using them. I was chided for that and I have remembered
it since.

Just because you don't know how to use a feature effectively doesn't
mean no one else should.

Hi Andy,
as you can understand I'm not an expert.....

The former can be mitigated by assigning registered outputs
"early" (i.e. when the FSM transitions to the state, not just when the
FSM is already in the state), or by assigning outputs from the state
variable after the clocked if statement ends.

You appear to want the output to be the latter: a combinatorial
function of the registered state and of the unregistered inputs.
Could you elaborate on this with an example based on my FSM???

Find a way to accomplish combinatorially only what needs to be
combinatorial. It appears that tx_data is merely a function of
tx_valid, tx_ready, and data_in. What exactly do you need an FSM for?
You could use an FSM for error handling (what happens if tx_valid goes
high, then low, and tx_ready never fired?). But you don't appear to be
using it for that (maybe just to simplify the example?).

Uhmm...I've written an FSM mainly to describe the interface...the output
becomes "01000000"/"11000000" when tx_valid goes high, depending on the data
input...then it stays at that value until tx_ready goes high....at that point
(on the next clock cycle) tx_data will be equal to dataIn until tx_valid
goes to '0'...yes...I'm probably missing some error handling...the code is not
completely written...

Every output signal should change when the clock rises...and every input
signal should change/be checked on the same clock
edge....the input/output interfaces are synchronous to the clock...
How do you describe all this combinatorially???
Simpler ways are welcome....if you/somebody wants to help me... :))


I've tried one of your suggestions...tell me if I'm wrong (this is quickly
written code...probably there are mistakes...pay attention to

if (reset = '1') then
    act_txd_state := TXD_IDLE;
    v_delay := '0';
elsif (clk'event and clk = '1') then

    v_delay := '0';
    case act_txd_state is

        when TXD_IDLE =>
            if (txValid = '1') then
                act_txd_state := TXD_ACTIVE;
            end if;

        when TXD_ACTIVE =>
            if (txReady = '1') then
                v_delay := '1';
                act_txd_state := TXD_END;
            end if;

        when TXD_END =>
            if (txValid = '0') then
                act_txd_state := TXD_IDLE;
            end if;
    end case;

end if;

-- output decode, outside the clocked branch
if (act_txd_state = TXD_ACTIVE) or (v_delay = '1') then
    if (dataIn = "00000000") then
        tx_data <= "01000000";
    else
        tx_data <= "11000000";
    end if;
elsif (act_txd_state = TXD_END) then
    tx_data <= dataIn;
else
    tx_data <= "00000000";
end if;

This "should" be equivalent (not tested yet)...I've used the v_delay FF to
delay the output change between ACTIVE and END by 1 clock...I no longer have a
registered output...but I've been forced to add dataIn to the sensitivity
list (to avoid latches)...

Is that the change you were talking about??


Just to elaborate on Thomas Stanka's post...

Say that you've inherited a design that has a large FSM in a
clocked process, and the BRAMs were from an old technology and
interfaced to the FSM with combinatorial inputs. The new device has
registered BRAM inputs, so you want to get rid of a pipeline delay in
the FSM but want to keep the changes to a minimum.

I went through the exercise with the example to see how messy it would
be.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity state_test is

  port (
    clk     : in std_logic;
    reset   : in std_logic;
    txValid : in std_logic;
    txReady : in std_logic;
    dataIn  : in std_logic_vector(7 downto 0);

    tx_data_out : out std_logic_vector(7 downto 0));

end state_test;

architecture behav of state_test is
  type state_typ is (TXD_IDLE, TXD_ACTIVE, TXD_END);

  signal act_txd_state : state_typ := TXD_IDLE;
  signal tx_data_next, tx_data : std_logic_vector(7 downto 0);
begin

  -- Unregistered output: taken from the input of the data register.
  tx_data_out <= tx_data_next;

  tx_data_next <=
    "00000000" when reset = '1' else
    "01000000" when ((act_txd_state = TXD_IDLE) and (txValid = '1')
                     and (dataIn = "00000000")) else
    "11000000" when ((act_txd_state = TXD_IDLE) and (txValid = '1')
                     and (dataIn /= "00000000")) else
    dataIn when (act_txd_state = TXD_END) else
    tx_data;  -- the final branch was cut off in the post; holding
              -- the registered value here is an assumption

  sync_proc : process (clk, reset) is
  begin
    if (reset = '1') then
      act_txd_state <= TXD_IDLE;
      tx_data <= (others => '0');
    elsif (clk'event and clk = '1') then
      tx_data <= tx_data_next;

      case act_txd_state is

        when TXD_IDLE =>
          if (txValid = '1') then
            act_txd_state <= TXD_ACTIVE;
          end if;

        when TXD_ACTIVE =>
          if (txReady = '1') then
            act_txd_state <= TXD_END;
          end if;

        when TXD_END =>
          if (txValid = '0') then
            act_txd_state <= TXD_IDLE;
          end if;
      end case;

    end if;
  end process sync_proc;
end behav;

Yes...indeed that is the only way (or one of the few ways) to eliminate the
output FF...
In some way it is "similar" to what I wrote with variables, but using
concurrent signal assignment....actually it is not a complete mess...

Anyway...I was probably wrong...I don't need to avoid the registered output
(and a lot of you were right)...in fact I'm always assigning the "output"
of a FF...therefore I'm sure that when the FSM reaches a state the output will
change to the value required in that state on the same clock edge,
and I don't have to wait for another clock edge to have that value on the
output...am I correct???

I saw it in simulation...looking at the RTL synthesis diagram it wasn't clear
to me....I have, furthermore, the advantage of a registered output....

I will probably rewrite everything (or most of it) in 1 process
style...it seems clearer...I'm evaluating...


What I wrote about the output changing on the same clock edge as the state
is not completely true...the state changes immediately if the transition condition
is met...the output is still at the previous state's value...on the next clock
edge the output goes to the new state's value...this is the effect of the
output FF...I'm sorry for the misunderstanding....

...the difference is now clear....I've just simulated the two
situations...(1 process, 2 process-no register)...and checked the

Anyway...since all input signals "should be synchronous" to the rising edge
of the clock and the produced output MUST be synchronous, I think that the best
way to achieve that is registering the output and evaluating the inputs on the
rising clock edge...my previous implementation was wrong because the output was allowed to
change asynchronously...I have to correct it...and I think that the 1 process
style fits the requirements better...

If in some cases I need to unregister an output, I will use concurrent
statements outside of the process, as pointed out in the discussion...

Thanx to everyone who helped me...

