alternate synchronous process template

jens
Most of us are familiar with this synchronous process template...

synchronous_process_template: process(clock, reset)
begin
if reset = '1' then
-- initialize registers here
elsif rising_edge(clock) then
-- assign registers here
end if;
end process synchronous_process_template;

However one problem with it is if there are multiple signals being
assigned, but not all of them need the reset. If not all signals are
being reset, then a gated clock is created for those signals (which
will cause some compilers to puke and is a bad idea even if it doesn't
puke). One option is to use two processes (one with reset and one
without), but that doesn't always work that well, especially if there
are variables that all signals need access to.

Here's an alternate synchronous process template...

alternate_synchronous_process_template: process(clock, reset)
begin
if rising_edge(clock) then
-- assign all registers here
end if;
if reset = '1' then
-- initialize some or all registers here
end if;
end process alternate_synchronous_process_template;

For example...

if rising_edge(clock) then
s1 <= <whatever>;
s2 <= <whatever>;
end if;
if reset = '1' then
s1 <= (others => '0');
end if;

This template makes it possible to reset signal s1 but not s2. Running
it through Xilinx tools yielded the desired RTL results (and identical
to a two-process model).
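
For comparison, the two-process model mentioned above might look roughly like the following. This is only a sketch: the entity name, the 8-bit widths, and the d1/d2 inputs are assumptions added for illustration, standing in for the <whatever> expressions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: s1 has an asynchronous reset, s2 has no reset.
entity two_process_equivalent is
  port (
    clock : in  std_logic;
    reset : in  std_logic;                     -- active high, asynchronous
    d1    : in  std_logic_vector(7 downto 0);  -- stand-in for <whatever>
    d2    : in  std_logic_vector(7 downto 0);  -- stand-in for <whatever>
    s1    : out std_logic_vector(7 downto 0);
    s2    : out std_logic_vector(7 downto 0)
  );
end entity two_process_equivalent;

architecture rtl of two_process_equivalent is
begin
  -- Classic template: only the signal that needs a reset lives here.
  with_reset : process (clock, reset)
  begin
    if reset = '1' then
      s1 <= (others => '0');
    elsif rising_edge(clock) then
      s1 <= d1;
    end if;
  end process with_reset;

  -- Plain register, no reset term, so reset never gates its update.
  without_reset : process (clock)
  begin
    if rising_edge(clock) then
      s2 <= d2;
    end if;
  end process without_reset;
end architecture rtl;
```

The single-process template is meant to describe the same two registers while letting both assignments share any process variables.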
 
jens wrote:
Most of us are familiar with this synchronous process template...

synchronous_process_template: process(clock, reset)
begin
if reset = '1' then
-- initialize registers here
elsif rising_edge(clock) then
-- assign registers here
end if;
end process synchronous_process_template;

However one problem with it is if there are ...
Actually there are several problems with this 'synchronous process
template' starting with the fact that it is not synchronous....it has
an asynchronous reset input therefore it is not synchronous. The other
problem with using this as a general template is that 'most' of the
time asynchronous resets should not be used in the design at all.
People tend to forget that when the reset signal gets shut off and it
violates the setup time of the clock that darn near anything can happen
to any of the outputs of that process. Sometimes the symptoms are
benign (something shows up one clock later), other times they are not
(notably, state machines).

The reason for the more serious symptoms is violating the basic design
principle that no asynchronous input should ever go into computing the
output of more than one storage element (generally flip flops) if the
next state of those storage elements also depend on the current state.

This template should never really be used unless you know for a fact
that 'reset' is guaranteed to have a known and acceptable setup/hold
time relationship to 'clock'. If that's not the case, or if you have a
situation where 'reset' might be asserted at a time when 'clock' is not
guaranteed to be running, then it's far better to design a shift
register type of synchronizer to synchronize the reset signal before
distributing it to reset any logic. That way only the shift register
design needs the asynchronous reset 'template' and shift registers have
no inherent problems with asynchronous inputs other than metastability.

If you have multiple clock domains with things needing to be reset in
each domain, then you need to generate individual resets that are each
synchronized to the appropriate clock.

Here's an alternate synchronous process template...

alternate_synchronous_process_template: process(clock, reset)
begin
if rising_edge(clock) then
-- assign all registers here
end if;
if reset = '1' then
-- initialize some or all registers here
end if;
end if; <-- Oops, you missed this one.
end process alternate_synchronous_process_template;
Here's an equivalent (i.e. produces the exact same synthesis and
simulation results) but doesn't require the reader to read from the
'bottom up'.

alternate_synchronous_process_template: process(clock, reset)
begin
if rising_edge(clock) then
if reset = '1' then
-- initialize some or all registers here
else
-- assign all registers here
end if;
end if;
end process alternate_synchronous_process_template;

Good points though.

KJ
 
KJ,

Many systems require control of outputs and/or states even in the
absence of a clock; therefore an "asynchronous" reset is required.

The setup and hold of the "asynchronous" reset need only be with
respect to the deasserting edge (i.e. reset going away). That can be
accomplished by the following:

process (rstin, clk) is
variable rstreg : std_logic;
begin
if rstin = '1' then -- active high
rstout <= '1';
rstreg := '1';
elsif rising_edge(clk) then
rstout <= rst_reg; -- registered reference to rstreg
rstreg := '0';
end if;
end process;

In the above example, rstin is the asynchronous system reset input, and
rstout is the asynchronously asserted, synchronously deasserted reset
signal for the rest of the registers clocked by clk, and is tied to
their async reset inputs. This would need to be repeated for each
unrelated clock.

Another option is to use rstout from above to disable the clock, either
via individual clock enables on the flops, or via the enable feature of
the xilinx clock buffer, which does it "safely". Then send rstin to
the async rst inputs of all the flops. This has the advantage of one
net for all the async resets, and multiple versions of rstout for the
various clock buffers.

Finally, even your synchronous reset template does not meet the needs
of functioning registers even when reset is active. Because of the
elsif statement, those registers that were not reset will have their
clock enable inputs deasserted when reset is active, preventing them
from updating (because the elsif section does not execute if reset is
active). There really is no good way, in one process, to avoid the
separate "if reset" statement at the bottom of the process. On the
other hand, I kind of like the reset at the bottom, since when I'm
reading the code, I'm usually more interested in what happens while not
in reset, and that comes first. On the other other hand, you don't get
those warnings about feedback muxes when you forget to include a
register in the reset clause either.

Andy


 
Andy wrote:
KJ,

Many systems require control of outputs and/or states even in the
absence of a clock; therefore an "asynchronous" reset is required.

I believe that is exactly what I said....what I also said was that
'most' of the time asynchronous resets are not required and should not
be used because of the timing problems that one needs to consider when
using async reset.

The setup and hold of the "asynchronous" reset need only be with
respect to the deasserting edge (i.e. reset going away).
Again...that's exactly what I said

That can be
accomplished by the following:

process (rstin, clk) is
variable rstreg : std_logic;
begin
if rstin = '1' then -- active high
rstout <= '1';
rstreg := '1';
elsif rising_edge(clk) then
rstout <= rst_reg; -- registered reference to rstreg
rstreg := '0';
end if;
end process;

In the above example, rstin is the asynchronous system reset input, and
rstout is the asynchronously asserted, synchronously deasserted reset
signal for the rest of the registers clocked by clk, and is tied to
their async reset inputs. This would need to be repeated for each
unrelated clock.

And this would be an example of a one deep shift register used to
asynchronously set 'reset' but synchronously clear it,
which...again...is exactly what I said. I'll also add here though that
the typical approach to resynchronizing async inputs is to put them
through two flops, the output of the second being the now
'synchronized' input, the output of the first having a higher
probability of being metastable than the second. Your example, of
course, can be modified to add a second register; all I was saying is
that one should consider using a shift register. The depth of the
shift register would be a function of what you think the probabilities
are of metastability and how much you want to try to minimize its
effects.
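
A minimal sketch of such a reset shift register with a configurable depth is shown below. The entity name, the port names, and the DEPTH generic are assumptions for illustration, not code from this thread. The output asserts asynchronously and is released DEPTH rising clock edges after rst_in goes away.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical reset synchronizer: asynchronous assert, synchronous release.
entity reset_sync is
  generic (
    DEPTH : integer range 2 to 16 := 2  -- number of flip-flops in the chain
  );
  port (
    clk     : in  std_logic;
    rst_in  : in  std_logic;  -- asynchronous, active high
    rst_out : out std_logic   -- released synchronously to clk
  );
end entity reset_sync;

architecture rtl of reset_sync is
  signal shreg : std_logic_vector(DEPTH - 1 downto 0) := (others => '1');
begin
  -- Only this shift register uses the asynchronous reset template.
  process (clk, rst_in)
  begin
    if rst_in = '1' then
      shreg <= (others => '1');
    elsif rising_edge(clk) then
      -- Shift a '0' in at the bottom; the top bit clears after DEPTH edges.
      shreg <= shreg(DEPTH - 2 downto 0) & '0';
    end if;
  end process;

  rst_out <= shreg(DEPTH - 1);
end architecture rtl;
```

One instance per unrelated clock would be needed, each clocked by the domain it resets.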

Another option is to use rstout from above to disable the clock, either
via individual clock enables on the flops, or via the enable feature of
the xilinx clock buffer, which does it "safely". Then send rstin to
the async rst inputs of all the flops. This has the advantage of one
net for all the async resets, and multiple versions of rstout for the
various clock buffers.

Good approach for Xilinx maybe others too. But has the disadvantage of
tying the code to a particular device architecture maybe a tad too
closely....and IMO tying your code to a particular vendor for something
as pedestrian as a reset signal is not a very good approach.

Finally, even your synchronous reset template does not meet the needs
of functioning registers even when reset is active. Because of the
elsif statement, those registers that were not reset will have their
clock enable inputs deasserted when reset is active, preventing them
from updating
Not following you on this one at all. There was no clock enable, it
was simply a template that uses synchronous resets instead of async

I'm usually more interested in what happens while not
in reset
I am too, but I also don't like having to wade through the code only to
get to the bottom to find that none of this applies because I happen
to have reset active....this is a 'style' thing though.
, and that comes first.
After the two lines of code that say to check the reset first before
even bothering to look further for me, but to each his own.

KJ
 
KJ

Look again at the code I wrote; it IS a two deep shift register, with
metastability rejection built-in! Try it and see. (There is a typo: the
reference to rst_reg should read rstreg.)

Your suggested solution in the case of needing a reset if the clock is
not running would not work. After you synchronize it (one or both
edges), the fully synchronous circuits you suggest for the rest of the
chip would never respond to it without a clock! The point is, you must
have a half-synchronous reset, applied to the asynchronous resets of
your circuitry to ensure reset behavior in the absence of a clock,
combined with controlled behavior when it comes out of reset (and the
clock is active).

In your synchronous reset example, what happens if reset is active?
The elsif clause containing all the register assignments, including the
ones you don't want reset, does not execute. Therefore the synthesis
tool (synplicity) will use (not reset) as a clock enable on those
registers to prevent them from updating while reset is active.

Run your code through synthesis and see what you get...

Andy
 
Andy,

I think we're agreeing more than you might think
Look again at the code I wrote; it IS a two deep shift register, with
metastability rejection built-in! Try it and see. (There is a typo: the
reference to rst_reg should read rstreg.)
You're right, my bad, it is two deep.

Your suggested solution in the case of needing a reset if the clock is
not running would not work. After you synchronize it (one or both
edges), the fully synchronous circuits you suggest for the rest of the
chip would never respond to it without a clock! The point is, you must
have a half-synchronous reset, applied to the asynchronous resets of
your circuitry to ensure reset behavior in the absence of a clock,
combined with controlled behavior when it comes out of reset (and the
clock is active).

I don't disagree with any of that. My point was that in 'most' cases, this
control is not necessary. Asynchronous resets have their place in the bag
of tricks, but if you need a resettable flip flop it should not be your first
choice. You should use a synchronously resettable flip flop unless the
situation, for whatever reason, requires the outputs to be in a particular
state even with no clock.

In your synchronous reset example, what happens if reset is active?
The elsif clause containing all the register assignments, including the
ones you don't want reset, does not execute.
Understood and for 'most' cases (i.e. the ones that don't truly need to be
in any particular state immediately after reset) I would say that's just
fine. My method was simply to have the shift register essentially
'remember' that the reset came along, so that when the clock does start up,
it can provide that reset synchronously to the rest of the world. The way
it 'remembers' is that the shift register itself is asynchronously
resettable. Once the clock does start up it will still be outputting a
'reset' signal to the world that will shut off once the clock is running for
however many clocks deep I've made that shift register.

Therefore the synthesis
tool (synplicity) will use (not reset) as a clock enable on those
registers to prevent them from updating while reset is active.

As it should, that's what the code says to do.

Run your code through synthesis and see what you get...

I do, and it does just what I tell it to do

KJ
 
Oops....reset should NOT have been in the sensitivity list for my example of
a synchronous process template....apologies for the confusion.

corrected_alternate_synchronous_process_template: process(clock)
begin
if rising_edge(clock) then
if reset = '1' then
-- initialize some or all registers here
else
-- assign all registers here
end if;
end if;
end process corrected_alternate_synchronous_process_template;

KJ
 
KJ wrote:

Most of us are familiar with this synchronous process template...

synchronous_process_template: process(clock, reset)
begin
if reset = '1' then
-- initialize registers here
elsif rising_edge(clock) then
-- assign registers here
end if;
end process synchronous_process_template;

Actually there are several problems with this 'synchronous process
template' starting with the fact that it is not synchronous....it has
an asynchronous reset input therefore it is not synchronous.
Some people call a design having flipflops with asynchronous sets/resets
"partially synchronous". If you avoid this as you recommend it, it is
called "fully synchronous".


The reason for the more serious symptoms is violating the basic design
principle that no asynchronous input should ever go into computing the
output of more than one storage element (generally flip flops) if the
next state of those storage elements also depend on the current state.
Even a partially synchronous design can be reliable - even if there is
nothing known about setup/hold times in relation to the clock. The
reason is a "feature" of many circuits: Usually you have an async reset
and some /synchronous start condition/. As long as the sync start
condition is not true during reset you have plenty of time after reset
before something will happen in the design.

Example: A state machine waits for a key to be pressed. The key is
sampled and therefore this signal is synchronous. Even if the state
machine has an async reset - there is no problem.
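
A rough sketch of that key example follows, assuming a two-state machine and a two-flop sampler for the key; the entity, state, and signal names are made up for illustration. Because the sampled key is held at '0' by the reset, the next state equals the current state for the first clocks after reset is released, so the exact release time relative to the clock should not matter.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical state machine: asynchronous reset, synchronous start condition.
entity key_wait_fsm is
  port (
    clk    : in  std_logic;
    reset  : in  std_logic;  -- asynchronous, active high
    key    : in  std_logic;  -- raw key input, not synchronous to clk
    active : out std_logic
  );
end entity key_wait_fsm;

architecture rtl of key_wait_fsm is
  type state_t is (idle, running);
  signal state    : state_t;
  signal key_sync : std_logic_vector(1 downto 0);
begin
  process (clk, reset)
  begin
    if reset = '1' then
      state    <= idle;
      key_sync <= (others => '0');
    elsif rising_edge(clk) then
      -- Sample the key through two flops before the state machine sees it.
      key_sync <= key_sync(0) & key;
      -- The start condition cannot be true right after reset, since
      -- key_sync(1) was just cleared.
      if state = idle and key_sync(1) = '1' then
        state <= running;
      end if;
    end if;
  end process;

  active <= '1' when state = running else '0';
end architecture rtl;
```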

Yes, I agree, that one has to take care if one chooses a partially
synchronous or even an asynchronous (latches) design. On the other hand
you get some advantages in terms of area and (much more important) in
terms of power.

The fully synchronous design is a very good style for FPGAs, while the
asynchronous / partially synchronous design is very good for low-power
ASICs.


Ralf
 
Ralf Hildebrandt wrote:

The fully synchronous design is a very good style for FPGAs, while the
asynchronous / partially synchronous design is very good for low-power
ASICs.
If I use Andy's reset pulse with
the synched trailing edge (see above)
the "partially synchronous" problem
is solved, and now I can get better
use of resources on every fpga I
have benchmarked. The data is in
the comments at the bottom of
my reference design.

-- Mike Treseler
 
The fully synchronous design is a very good style for FPGAs, while the
asynchronous / partially synchronous design is very good for low-power
ASICs.

Actually the fully synchronous method that I mentioned is just as good for
low power ASICs. You're forgetting that synchronous designs work all the
way from DC (i.e. gated clocks) up to whatever is the maximum operating
frequency of the design. 'Synchronous' in no way implies a free running
clock.

If one only uses the async reset input of a flop to control the shift
register that is used to receive and then distribute the reset signals that
are now sync'ed to the appropriate clock domain (and then all other
processes using synchronous reset method) you'll consume no more power or
area or anything regardless of whether the overall design is synchronous,
gated clock or asynchronous.

KJ
 
If I use Andy's reset pulse with
the synched trailing edge (see above)
the "partially synchronous" problem
is solved, and now I can get better
use of resources on every fpga I
have benchmarked. The data is in
the comments at the bottom of
my reference design.
Mike,

I'm assuming by your statement 'better use of resources' you're referring to
the benchmark performance results that you have posted in 'uart.vhd' in
which it looks like you got identical performance for the 'a_rst' and
'v_rst' templates and those in turn are somewhat better than the 's_rst'
template....if not, then ignore the rest of this.

After looking at the benchmark though I think that you may have a
methodology problem in how you go about comparing the relative performance
of these templates. Generally what I do when benchmarking timing
performance is to add a 'wrapper' around the top level entity. That wrapper
takes all of the external inputs and adds a flip flop delay before sending
them into the entity that I'm trying to benchmark. Similarly I take all of
the outputs of the entity and put them through flip flop delays before
outputting them.

The reason for the flip flops is so that the timing analysis on the
benchmark will not be affected by I/O pad delays which are typically larger
than the internal delays. If you don't do this then to fairly compare
performance you need to compare computed max clock frequency as well as Tsu
and Tco in order to make a full comparison. But that comparison is
typically not useful either since, if the entity is to be embedded in the
final design, then you won't really have a Tsu and Tco from external pins to
worry about. The flip flop isolation allows a straight up comparison of
computed clock frequency to be made between the things being benchmarked.

I tried running the design through Quartus and the Tsu for the reset signal
input is actually larger than the clock period. Since your results only
list a single clock frequency result I believe that the 'asynchronous' forms
used in your benchmark are getting artificially better results than one
would actually expect to see in a real design.

Any thoughts/comments, am I missing something here?

KJ
 
My point about the not reset clock enable is that the behavior of the
two circuits is not identical, and therefore is NOT "an equivalent
(i.e. produces the exact same synthesis and simulation results) but
doesn't require the reader to read from the 'bottom up'. " as you said
it was.

In the case of synchronously reset circuits, the following example:

if rising_edge(clk) then
if reset = '1' then
outputa <= '0';
else
outputb <= outputb + 1;
outputa <= inputa;
end if;
end if;

behaves differently than:

if rising_edge(clk) then
outputb <= outputb + 1;
outputa <= inputa;
if reset = '1' then
outputa <= '0';
end if;
end if;

in that outputb stops counting during reset in the upper example, but
keeps going during the lower example. That in and of itself may or may
not be an issue, but when the reset signal becomes an additional input
to the logic cone on some of the flops, then it makes a difference in
both performance and behavior.
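
The difference can be checked in simulation with a small self-checking testbench along these lines. It is a sketch only: the testbench name, the 4-bit counters, and the flag_a/flag_b registers (standing in for outputa) are assumptions for illustration. Reset is held active for four rising edges, after which the two counters are compared.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity reset_style_tb is
end entity reset_style_tb;

architecture sim of reset_style_tb is
  signal clk     : std_logic := '0';
  signal reset   : std_logic := '1';                      -- held active throughout
  signal count_a : unsigned(3 downto 0) := (others => '0'); -- if/else form
  signal count_b : unsigned(3 downto 0) := (others => '0'); -- trailing-reset form
  signal flag_a, flag_b : std_logic := '0';               -- the only reset registers
begin
  -- Upper example: everything sits in the else branch of the reset test.
  nested : process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        flag_a <= '0';
      else
        count_a <= count_a + 1;
        flag_a  <= '1';
      end if;
    end if;
  end process nested;

  -- Lower example: reset overrides only the signals it names.
  trailing : process (clk)
  begin
    if rising_edge(clk) then
      count_b <= count_b + 1;
      flag_b  <= '1';
      if reset = '1' then
        flag_b <= '0';
      end if;
    end if;
  end process trailing;

  stimulus : process
  begin
    -- Four rising edges with reset active.
    for i in 1 to 4 loop
      clk <= '0'; wait for 5 ns;
      clk <= '1'; wait for 5 ns;
    end loop;
    assert count_a = 0
      report "if/else form was expected to hold its count during reset"
      severity error;
    assert count_b = 4
      report "trailing-reset form was expected to keep counting during reset"
      severity error;
    wait;  -- stop the simulation
  end process stimulus;
end architecture sim;
```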

In the OP, the author has simply shown a way to provide the behavior of
two separate processes (one with async reset, one without), in one
process, a valuable technique if one wants to minimize processes and
maximize use of variables.

There are many things that can be inferred from RTL that do not have
the option of an async reset, and mixing them with asynchronously reset
logic using the OP example is beneficial.

Andy
 
If I use Andy's reset pulse with
the synched trailing edge (see above)
the "partially synchronous" problem
is solved, and now I can get better
use of resources on every fpga I
have benchmarked. The data is in
the comments at the bottom of
my reference design.

For what it's worth I wasn't able to reproduce your "higher performance
using asynchronous resets" results (although I ran using a slightly
older version of Quartus). Sometimes synchronous was better, sometimes
asynchronous. The two different async approaches did yield the same
results.

I also benchmarked using the 'flip/flop wrapper' that I mentioned in a
previous post in order to make sure that max clock frequency results
are meaningful when comparing the two approaches (code posted at the
end).

---- Start of results ----
Top Level entity    Template_c   SW  Fmax1  Fmax2
==================  ===========  ==  =====  ======
uart                0 (async)    Q   -      216.26
uart                1 (sync)     Q   -      222.77
uart                2 (async 2)  Q   -      216.26

syn_benchmark_uart  0 (async)    Q   -      228.94
syn_benchmark_uart  1 (sync)     Q   -      260.69
syn_benchmark_uart  2 (async 2)  Q   -      228.94

uart                0 (async)    SQ  474.0  405.68
uart                1 (sync)     SQ  417.3  381.83
uart                2 (async 2)  SQ  474.0  405.68

syn_benchmark_uart  0 (async)    SQ  464.6  385.95
syn_benchmark_uart  1 (sync)     SQ  356.6  321.85
syn_benchmark_uart  2 (async 2)  SQ  464.6  385.95

- SW column is 'Software used' and was either
  1. Quartus 5.0 SP2 all the way through (Q)
  2. Synplify 8.4 front end, Quartus 5.0 SP2 fitting (SQ)
- Targeted device in all cases is Altera EP2S15F484C3. By the way,
  Mike, your code seems to have a typo for the part number in your
  uart.vhd file.
- Fmax1 is the pre-fit estimated timing reported by Synplify. Fmax2 is
  the final post-fit computed timing.
---- End of results ----
---- Start of added code ----
library IEEE;
use IEEE.STD_LOGIC_1164.all; -- nothing fancy
entity syn_benchmark_uart is
generic (template_c : natural := 2; -- 0=a_rst, 1=s_rst, others=>v_rst
char_len_c : natural := 8; -- An instance without generic map
tic_per_bit_c : natural := 4); -- gets these constant values.
port (
clock : in std_ulogic; -- port maps to std_logic with no conversion
reset : in std_ulogic;
address : in std_ulogic;
writeData : in std_logic_vector(char_len_c-1 downto 0);
write_stb : in std_ulogic;
readData : out std_logic_vector(char_len_c-1 downto 0);
read_stb : in std_ulogic;
serialIn : in std_ulogic;
serialOut : out std_ulogic
);
end syn_benchmark_uart;

architecture synth_only of syn_benchmark_uart is
signal reset_int : std_ulogic;
signal address_int : std_ulogic;
signal writeData_int : std_logic_vector(char_len_c-1 downto 0);
signal write_stb_int : std_ulogic;
signal readData_int : std_logic_vector(char_len_c-1 downto 0);
signal read_stb_int : std_ulogic;
signal serialIn_int : std_ulogic;
signal serialOut_int : std_ulogic;
begin

-----------------------------------------------------------------------------
-- Wrap all I/O signals of the entity with flip flops so that timing analysis
-- comparisons can be performed simply by looking for maximum clock frequency
-----------------------------------------------------------------------------
process(Clock)
begin
if rising_edge(Clock) then
reset_int <= reset;
address_int <= address;
writeData_int <= writeData;
write_stb_int <= write_stb;
read_stb_int <= read_stb;
serialIn_int <= serialIn;

readData <= readData_int;
serialOut <= serialOut_int;
end if;
end process;

The_Uart: entity work.uart generic map(
template_c => template_c,
char_len_c => char_len_c,
tic_per_bit_c => tic_per_bit_c)
port map(
clock => clock,
reset => reset_int,
address => address_int,
writeData => writeData_int,
write_stb => write_stb_int,
readData => readData_int,
read_stb => read_stb_int,
serialIn => serialIn_int,
serialOut => serialOut_int);

end architecture synth_only;
---- End of added code ----
 
KJ wrote:

For what it's worth I wasn't able to reproduce your "higher performance
using asynchronous resets" results (although I ran using a slightly
older version of Quartus). Sometimes synchronous was better, sometimes
asynchronous. The two different async approaches did yield the same
results.
If you have the quartus 5 numbers I will add them to
the comments.

I also benchmarked using the 'flip/flop wrapper' that I mentioned in a
previous post in order to make sure that max clock frequency results
are meaningful when comparing the two approaches (code posted at the
end).
I think you're on to something.
With your permission, I will duplicate your results,
add utilization and quartus 6 data,
and append this wrapper example to the table.

Thanks for putting in the time
and especially for publishing the results.
Running controlled tests like this is the only way
to cut through the fog of random opinion and vendor claims.

-- Mike Treseler
 
"Mike Treseler" <mike_treseler@comcast.net> wrote in message
news:4fo1nrF1jk2e5U1@individual.net...
KJ wrote:

I also benchmarked using the 'flip/flop wrapper' that I mentioned in a
previous post in order to make sure that max clock frequency results
are meaningful when comparing the two approaches (code posted at the
end).

Thanks for putting in the time
and especially for publishing the results.
Running controlled tests like this is the only way
to cut through the fog of random opinion and vendor claims.
Putting on my Xilinx hat for a second: all the characterization data that
goes into the datasheets for Xilinx IP cores is obtained in a standard,
controlled, systematic way, similar to that which KJ described above. We try
hard to eliminate the effects of I/O and excessive routing delays, to arrive
at figures that are representative, meaningful and realistic. There is no
"cheating"; we use the same tools as the customer gets. All figures are
actual numbers that were achieved, post map&PAR. Extra tool switches and
non-standard assumptions are (or at least should be!) stated along with the
numbers.

So I'm not taking offence at what you said (I know none was meant), but
would you like to explain further what you meant by this "fog"? Is there
anything you would like to see changed in the way we present performance
data? If so, what are your suggestions?

Of course, if it involves trying to control the marketing claims that make
it onto the sparkle sheets then we both know that's a lost cause. ;-)

Cheers,

-Ben-
 
In the OP, the author has simply shown a way to provide the behavior of
two separate processes (one with async reset, one without), in one
process, a valuable technique if one wants to minimize processes and
maximize use of variables.

I'm not sure that 'maximize use of variables' is any sort of useful
metric (function/performance/power/code maintainability are more
useful), but I agree with you and the original post author that what
was posted is an improvement over using two processes and belongs in
the bag of tricks.

But the original post also mentioned this in the context of this being
a good way to avoid unwanted gated clocks. And in my original post I
simply mentioned that an asynchronously resetable shift register used
to generate the reset signal to everything else in the design and then
using synchronous resets throughout the design avoids the entire
situation entirely and in most cases costs darn near nothing and
performs virtually the same.

The only functional difference between the two is that PRIOR to that
first rising edge of the clock the outputs are in a different state.
AFTER the first rising edge everything is the same. The reset signal
itself can come when the clock is shut off, it's just the result of
that reset that doesn't show up until the clock does start.

As a practical matter, that functional difference is generally of no
importance....for the simple reason that the reason that the clock
isn't running is usually because something has knowingly shut it off
(i.e. maybe to conserve power). In any case, that thing that controls
the clock certainly knows to ignore the outputs of a function that it
is not actively using so the fact that the outputs aren't the way you
think they should be really doesn't matter darn near all the time.

If you think the slight functional difference is important because this
signal is a 'really important' signal that absolutely must come up
correctly (i.e. launch the missiles) then think again. Before any
properly designed system would turn over control of that 'really
important' signal in the first place it would first test it to make
sure that it is working correctly (i.e. no false launches...no missed
launch commands). Only then would it allow that signal to control that
'really important' signal....and it would only do so after starting the
clock because the designer realizes that the outputs become valid after
the clock, not before.

If the clock isn't running because it is just busted then maybe the
slight functional difference does become important but only if it
prevents the system from properly diagnosing what field replaceable unit
needs replacing or being able to route around the failing component.

Differences in performance, area and power consumption between the two
approaches depend entirely on the underlying technology but a lot of
times it really is close and, again, not a real differentiator (keeping
in mind that 'synchronous' does not mean 'free running clock').

I use the terms 'generally' and 'darn near all the time' because I
recognize that there really may be situations where this subtle
difference may be important but I have yet to either need that
difference or have somebody describe the scenario where the state of
the outputs of a stopped clocked process are actually important prior
to the clock actually restarting. On the other hand I have run across
and had to fix numerous cases in other people's designs where resets
were handled improperly (i.e. "Gee where did THAT blip on reset come
from?...so that's why everything took a dump" or "Dang, I guess the
trailing edge on reset DOES need to be synced huh?").

Since there are so many newbies out here anyway, and they tend to be
working with FPGAs, I thought I'd point out what is likely for them to
be a better approach, not dissing the original poster.

There are many things that can be inferred from RTL that do not have
the option of an async reset,
Think you meant "the option of a sync reset"

and mixing them with asynchronously reset
logic using the OP example is beneficial.
Agreed....keeping in mind that using async resets requires more 'skill'
(for lack of a better word) than sync resets.

KJ
 
If you have the quartus 5 numbers I will add them to
the comments.
Not sure what you're asking for? The version of Quartus?
- SW column is 'Software used' and was either
1. Quartus 5.0 SP2 all the way through (Q)
2. Synplify 8.4 front end, Quartus 5.0 SP2 fitting (SQ)

I think you're on to something.
With your permission, I will duplicate your results,
add utilization and quartus 6 data,
and append this wrapper example to the table.

Go ahead and use the code. Might want to also add a comment to the
code that the 'wrapper' likely breaks functionality of the design,
it is only used to allow for apples/apples performance measurement
comparisons....a "clock performance testbench" of sorts.
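
For anybody following along who hasn't seen the wrapper, the general idea can be sketched like this. This is NOT the actual wrapper or uart code from this thread: the 'dut' entity, its ports and the wrapper name are all hypothetical, and I'm assuming the wrapper simply registers every input and output so that the reported Fmax covers register-to-register paths only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Stand-in design under measurement (hypothetical, not the uart).
entity dut is
  port (
    clock : in  std_logic;
    reset : in  std_logic;
    din   : in  std_logic_vector(7 downto 0);
    dout  : out std_logic_vector(7 downto 0)
  );
end entity dut;

architecture rtl of dut is
begin
  process (clock)
  begin
    if rising_edge(clock) then
      if reset = '1' then
        dout <= (others => '0');
      else
        dout <= not din;
      end if;
    end if;
  end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- NOTE: this wrapper adds a clock of latency to every input and output,
-- so it likely breaks the functionality of the wrapped design. It exists
-- only to allow apples/apples Fmax comparisons between coding styles.
entity syn_benchmark_dut is
  port (
    clock : in  std_logic;
    reset : in  std_logic;
    din   : in  std_logic_vector(7 downto 0);
    dout  : out std_logic_vector(7 downto 0)
  );
end entity syn_benchmark_dut;

architecture wrapper of syn_benchmark_dut is
  signal reset_r : std_logic;
  signal din_r   : std_logic_vector(7 downto 0);
  signal dout_i  : std_logic_vector(7 downto 0);
begin
  -- Register every pin so I/O timing does not enter the Fmax figure.
  io_regs : process (clock)
  begin
    if rising_edge(clock) then
      reset_r <= reset;
      din_r   <= din;
      dout    <= dout_i;
    end if;
  end process io_regs;

  u_dut : entity work.dut
    port map (clock => clock, reset => reset_r, din => din_r, dout => dout_i);
end architecture wrapper;
```

The extra flip-flops added by a wrapper like this show up in the utilization numbers, so compare FF counts between wrapped and unwrapped builds with that in mind.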

KJ
 
KJ wrote:

Not sure what you're asking for? The version of Quartus?
Sorry, I meant flops, luts, and Fmax like the other cases.

Go ahead and use the code. Might want to also add a comment to the
code that the 'wrapper' likely breaks functionality of the design,
it is only used to allow for apples/apples performance measurement
comparisons....a "clock performance testbench" of sorts.
I expect that functionality will be ok,
but the testbench might need some improvements.

Thanks again.

-- Mike Treseler
 
Ben Jones wrote:

Putting on my Xilinx hat for a second: all the characterization data that
goes into the datasheets for Xilinx IP cores is obtained in a standard,
controlled, systematic way ...
The intent of my reference design is to compare
code styles. I only include utilization and Fmax
to compare templates, not devices.

All the vendors have excellent device data sheets
and I believe they are based on reality.
However, none of the vendors have any serious
synthesis examples with all primitives inferred
from code. The casual user of vendor tools
is sucked right into the wizard vortex, and
most never make it out.

-- Mike Treseler


ps: using standard libraries in the
vhdl examples would be a plus.
 
Mike Treseler wrote:
KJ wrote:

Not sure what you're asking for? The version of Quartus?

Sorry, I meant flops, luts, and Fmax like the other cases.

-----------------------
Top level entity = uart
-----------------------
-- template_a_rst;
-------------------------------------------------------------------------------
-- Quartus 5.0 SP2               216 MHz  50 FF  73 ALUT  ep2s15sf484c3  by Kevin Jennings
-- Synplify 8.4+Quartus 5.0 SP2  405 MHz  50 FF  74 ALUT  ep2s15sf484c3  by Kevin Jennings
-------------------------------------------------------------------------------
-- template_s_rst;
-------------------------------------------------------------------------------
-- Quartus 5.0 SP2               222 MHz  50 FF  77 ALUT  ep2s15sf484c3  by Kevin Jennings
-- Synplify 8.4+Quartus 5.0 SP2  381 MHz  50 FF  77 ALUT  ep2s15sf484c3  by Kevin Jennings
-------------------------------------------------------------------------------
-- template_v_rst;
-------------------------------------------------------------------------------
-- Quartus 5.0 SP2               216 MHz  50 FF  73 ALUT  ep2s15sf484c3  by Kevin Jennings
-- Synplify 8.4+Quartus 5.0 SP2  405 MHz  50 FF  74 ALUT  ep2s15sf484c3  by Kevin Jennings
-------------------------------------------------------------------------------

-------------------------------------
Top level entity = syn_benchmark_uart
-------------------------------------
-- template_a_rst;
-------------------------------------------------------------------------------
-- Quartus 5.0 SP2               228 MHz  72 FF  93 ALUT  ep2s15sf484c3  by Kevin Jennings
-- Synplify 8.4+Quartus 5.0 SP2  385 MHz  72 FF  95 ALUT  ep2s15sf484c3  by Kevin Jennings
-------------------------------------------------------------------------------
-- template_s_rst;
-------------------------------------------------------------------------------
-- Quartus 5.0 SP2               260 MHz  72 FF  98 ALUT  ep2s15sf484c3  by Kevin Jennings
-- Synplify 8.4+Quartus 5.0 SP2  321 MHz  72 FF  96 ALUT  ep2s15sf484c3  by Kevin Jennings
-------------------------------------------------------------------------------
-- template_v_rst;
-------------------------------------------------------------------------------
-- Quartus 5.0 SP2               228 MHz  72 FF  93 ALUT  ep2s15sf484c3  by Kevin Jennings
-- Synplify 8.4+Quartus 5.0 SP2  385 MHz  72 FF  95 ALUT  ep2s15sf484c3  by Kevin Jennings
-------------------------------------------------------------------------------
 
