Regarding process time calculation

varun_agr wrote:

Sir,
I want to know how we can calculate the time taken by a process, or
whether we can see it anywhere in Xilinx ISE. We are using concurrent
programming and want to know the time taken by each process in behavioral
modelling.
e.g.:

process(sensitivity list)
  variable declarations
begin
  programming code
end process

Now we want the time taken by such a process.
Thanks
Varun Maheshwari


---------------------------------------
Posted through http://www.FPGARelated.com
 
"varun_agr" <VARUN_AGR@n_o_s_p_a_m.n_o_s_p_a_m.YAHOO.COM> writes:

Sir,
I want to know how we can calculate the time taken by a process, or
whether we can see it anywhere in Xilinx ISE. We are using concurrent
programming and want to know the time taken by each process in behavioral
modelling.
e.g.:
process(sensitivity list)
  variable declarations
begin
  programming code
end process
Now we want the time taken by such a process.
It's entirely dependent on what chip you implement your logic in.
You'll have to synthesise and then run place and route. The timing
analyser can then provide you with timing information.

What is normally done is to provide a clock that goes to every process
and wrap the logic in:

if rising_edge(clk) then
  -- logic description in here
end if;

You then tell the tools how fast the clock is that you intend to feed to
the device, and the timing tools will then tell you if the logic (and
internal wiring) you've described is fast enough to have finished
between two consecutive clock edges.
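
For example, a minimal sketch of a fully clocked entity in that style (the
entity, port and signal names here are made up purely for illustration):

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: a registered adder wrapped in a clocked process.
entity adder_reg is
  port (
    clk : in  std_logic;
    a   : in  unsigned(15 downto 0);
    b   : in  unsigned(15 downto 0);
    sum : out unsigned(16 downto 0)
  );
end entity adder_reg;

architecture rtl of adder_reg is
begin
  process(clk)
  begin
    if rising_edge(clk) then
      -- everything in here must settle within one clock period
      sum <= resize(a, 17) + resize(b, 17);
    end if;
  end process;
end architecture rtl;

In ISE you would then constrain the clock, for instance with a PERIOD
constraint in the UCF (something like TIMESPEC "TS_clk" = PERIOD "clk"
10 ns HIGH 50%;), and the post-place-and-route static timing report tells
you whether every register-to-register path fits within that period.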

This is very unlike software programming, where you write some code and
then instrument it to get timing information (which will be more or less
variable depending on data access patterns, caching and all the other
non-determinism that comes with a modern microprocessor system).

In hardware, the tools know everything they need to about the timing of
the chips and can give you a "cast-iron" statement as to how fast things
will be in the worst-case.

Cheers,
Martin

--
martin.j.thompson@trw.com
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.conekt.co.uk/capabilities/39-electronic-hardware
 
On Wed, 03 Aug 2011 01:42:24 -0500, varun_agr wrote:

Sir,
I want to know how we can calculate the time taken by a process, or
whether we can see it anywhere in Xilinx ISE. We are using concurrent
programming and want to know the time taken by each process in behavioral
modelling. e.g.:
process(sensitivity list)
  variable declarations
begin
  programming code
end process
Now we want the time taken by such a process. Thanks
Varun Maheshwari
If you mean how long it will take in the FPGA, see Martin's answer.

If you mean how long it will take in the tool, be explicit in your
question. There may be a way to get the computer time; look in your
documentation.

--
www.wescottdesign.com
 
On Wed, 03 Aug 2011 13:42:21 +0100, Martin Thompson wrote:

"varun_agr" <VARUN_AGR@n_o_s_p_a_m.n_o_s_p_a_m.YAHOO.COM> writes:

Sir,
I want to know how we can calculate the time taken by a process, or
whether we can see it anywhere in Xilinx ISE. We are using concurrent
programming and want to know the time taken by each process in behavioral
modelling. e.g.:
process(sensitivity list)
  variable declarations
begin
  programming code
end process
Now we want the time taken by such a process.

It's entirely dependent on what chip you implement your logic in. You'll
have to synthesise and then run place and route. The timing analyser
can then provide you with timing information.

What is normally done is to provide a clock that goes to every process
and wrap the logic in:

if rising_edge(clk) then
  -- logic description in here
end if;

You then tell the tools how fast the clock is that you intend to feed to
the device, and the timing tools will then tell you if the logic (and
internal wiring) you've described is fast enough to have finished
between two consecutive clock edges.

This is very unlike software programming, where you write some code and
then instrument it to get timing information (which will be more or less
variable depending on data access patterns, caching and all the other
non-determinism that comes with a modern microprocessor system).

In hardware, the tools know everything they need to about the timing of
the chips and can give you a "cast-iron" statement as to how fast things
will be in the worst-case.
If that's so, then why have I seen so many first-cut FPGA designs fail
when run over the full military temperature range? And why have I seen
FPGA designs fail during temperature cycling after months in production,
after some unknown process change by the vendor?

Tools know _a lot_ of what they need, and can give you a _very good_
statement of how fast things need to be in worst case. But if you really
want things to be cast-iron solid then you need to be conservative in how
you specify your margins, you need to design as if timing matters, and
you need to make absolutely sure that any wiring that is external to the
FPGA meets its timing, too.

--
www.wescottdesign.com
 
On Wed, 03 Aug 2011 16:33:46 +0000, glen herrmannsfeldt wrote:

Martin Thompson <martin.j.thompson@trw.com> wrote:
"varun_agr" <VARUN_AGR@n_o_s_p_a_m.n_o_s_p_a_m.YAHOO.COM> writes:

I want to know how we can calculate the time taken by a process, or
whether we can see it anywhere in Xilinx ISE. We are using concurrent
programming and want to know the time taken by each process in behavioral
modelling.
(snip)

It's entirely dependent on what chip you implement your logic in.
You'll have to synthesise and then run place and route. The timing
analyser can then provide you with timing information.

(snip)
This is very unlike software programming, where you write some code and
then instrument it to get timing information (which will be more or
less variable depending on data access patterns, caching and all the
other non-determinism that comes with a modern microprocessor system).

It will be variable, but if one wanted one could find a worst-case time
for most processors. One normally wants closer to average case. Assume
no cache hits, no instruction overlap, it could be done.
If you're doing something that's really hard real time (i.e. fails once
and the product -- or the operator -- dies), then for the critical cases
you want to know absolute worst-case maximums.

Few things are really that hard real time, though.

--
www.wescottdesign.com
 
glen herrmannsfeldt wrote:
Martin Thompson <martin.j.thompson@trw.com> wrote:
"varun_agr" <VARUN_AGR@n_o_s_p_a_m.n_o_s_p_a_m.YAHOO.COM> writes:

I want to know how we can calculate the time taken by a process, or
whether we can see it anywhere in Xilinx ISE. We are using concurrent
programming and want to know the time taken by each process in behavioral
modelling.
(snip)

It's entirely dependent on what chip you implement your logic in.
You'll have to synthesise and then run place and route. The timing
analyser can then provide you with timing information.

(snip)
This is very unlike software programming, where you write some code and
then instrument it to get timing information (which will be more or less
variable depending on data access patterns, caching and all the other
non-determinism that comes with a modern microprocessor system).

It will be variable, but if one wanted one could find a worst-case
time for most processors. One normally wants closer to average case.
Depends... for hard-real-time systems, the worst case execution time
(WCET) is also important, at least for the certification of a
safety-critical system.

Assume no cache hits, no instruction overlap, it could be done.
For cached and pipelined processors such crude assumptions will usually
give a hugely overestimated WCET bound, which may not be useful. There
are tools that use advanced processor and program analysis to compute
much better WCET bounds. See
http://en.wikipedia.org/wiki/Worst-case_execution_time and the first
referenced article therein.

Most current high-performance processors are so greedy and short-sighted
in their internal scheduling that they have so-called "timing anomalies",
which mean, for example, that a cache hit at a particular point in the
program may give a larger overall execution time than a cache miss at
that point. Finding the worst-case behaviour by manual methods or
intuitive reasoning is quite hard for such processors.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
. @ .
 
Martin Thompson <martin.j.thompson@trw.com> wrote:
"varun_agr" <VARUN_AGR@n_o_s_p_a_m.n_o_s_p_a_m.YAHOO.COM> writes:

I want to know how we can calculate the time taken by a process, or
whether we can see it anywhere in Xilinx ISE. We are using concurrent
programming and want to know the time taken by each process in behavioral
modelling.
(snip)

It's entirely dependent on what chip you implement your logic in.
You'll have to synthesise and then run place and route. The timing
analyser can then provide you with timing information.
(snip)
This is very unlike software programming, where you write some code and
then instrument it to get timing information (which will be more or less
variable depending on data access patterns, caching and all the other
non-determinism that comes with a modern microprocessor system).
It will be variable, but if one wanted one could find a worst-case
time for most processors. One normally wants closer to average case.
Assume no cache hits, no instruction overlap, it could be done.

In hardware, the tools know everything they need to about the timing of
the chips and can give you a "cast-iron" statement as to how fast things
will be in the worst-case.
They might do some rounding up in the timing, to be sure.

-- glen
 
glen herrmannsfeldt <gah@ugcs.caltech.edu> writes:

Martin Thompson <martin.j.thompson@trw.com> wrote:
This is very unlike software programming, where you write some code and
then instrument it to get timing information (which will be more or less
variable depending on data access patterns, caching and all the other
non-determinism that comes with a modern microprocessor system).

It will be variable, but if one wanted one could find a worst-case
time for most processors. One normally wants closer to average case.
Normally in a non-real-time system. I don't do many of those, so that
colours my comments ;)

Assume no cache hits, no instruction overlap, it could be done.
It could, yes, but you'd have to do it yourself. That's part of the
point - in FPGA-land we have tools that do it for us.

Also, absolutely-worst-case for a cached "modern" processor would be
dreadfully pessimistic, to the point of being very little use IMHO. For
example, how would a simple sort come out if you assumed no cache hits
at all throughout execution? We know that's too extreme, but how do we
put bounds on what is "sensible"?

More usefully (again IMHO) there are statistical methods for measuring
execution time and its variability down various code paths, which can be
used to provide arbitrarily high levels of confidence as to the
likelihood of missing a real-time deadline, as well as showing what to
optimise to improve the worst-case - this sometimes means using what
might be regarded in mainstream compsci as "inefficient" algorithms, as
they have better bounds on worst-case performance, even at the expense
of average performance. I ought to stop now, I'm no doubt "rambling to
the choir" as well as getting off-topic :)

In hardware, the tools know everything they need to about the timing of
the chips and can give you a "cast-iron" statement as to how fast things
will be in the worst-case.

They might do some rounding up in the timing, to be sure.
I'm not sure what you mean by that: The tools "know" worst-case timings
for the various silicon paths and add them all up. Rounding doesn't
come into it.

--
martin.j.thompson@trw.com
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.conekt.co.uk/capabilities/39-electronic-hardware
 
Tim Wescott <tim@seemywebsite.com> writes:

On Wed, 03 Aug 2011 13:42:21 +0100, Martin Thompson wrote:

In hardware, the tools know everything they need to about the timing of
the chips and can give you a "cast-iron" statement as to how fast things
will be in the worst-case.

If that's so, then why have I seen so many first-cut FPGA designs fail
when run over the full military temperature range? And why have I seen
FPGA designs fail during temperature cycling after months in production,
after some unknown process change by the vendor?
I can think of lots of reasons that don't come down to the quality of
the timing models... Or did you track the problem down to an error in
the timing analysis tools and models?

Tools know _a lot_ of what they need, and can give you a _very good_
statement of how fast things need to be in worst case.
I put "cast-iron" in quotes because nothing in real life is ever 100%.
They are very good though (IME) when used correctly... but only with the
things that they know about - the innards of the silicon. In the
context of the original question "how long will my logic take to run?" I
think that's reasonable.

But if you really
want things to be cast-iron solid then you need to be conservative in how
you specify your margins, you need to design as if timing matters, and
you need to make absolutely sure that any wiring that is external to the
FPGA meets it's timing, too.
Of course for a complete system, the engineers responsible still need to
do all the "external engineering" to make sure that they have (amongst other
things):

* Specified the timing constraints correctly
(and covered all the paths that matter!)
* Supplied "clean enough" power
* Supplied a "clean enough" clock
* Crossed clock domains properly (see the synchroniser sketch below)
* Taken into account the timings of external parts (as you mentioned)
* Used production-grade speedfiles
* ... and yes, put a bit more margin on top
(the amount of which depends on the criticality of the system)
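
On the clock-domain-crossing point, a minimal sketch of the usual
two-flip-flop synchroniser for a single-bit control signal (names are
illustrative only; multi-bit buses need a FIFO or a handshake instead):

library ieee;
use ieee.std_logic_1164.all;

-- Illustrative two-flop synchroniser: brings async_in (generated in some
-- other clock domain) safely into the clk_dst domain.
entity sync_2ff is
  port (
    clk_dst  : in  std_logic;
    async_in : in  std_logic;
    sync_out : out std_logic
  );
end entity sync_2ff;

architecture rtl of sync_2ff is
  signal meta : std_logic := '0';  -- first stage; may go metastable briefly
begin
  process(clk_dst)
  begin
    if rising_edge(clk_dst) then
      meta     <= async_in;  -- any metastability settles during this cycle
      sync_out <= meta;
    end if;
  end process;
end architecture rtl;

The path into the first flip-flop is normally excluded from ordinary
timing analysis (a false-path or max-delay constraint), which is exactly
why the constraints need to be written thoughtfully rather than left at
the defaults.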

Cheers,
Martin

--
martin.j.thompson@trw.com
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.conekt.co.uk/capabilities/39-electronic-hardware
 
On Thu, 04 Aug 2011 10:56:51 +0100, Martin Thompson wrote:

Tim Wescott <tim@seemywebsite.com> writes:

On Wed, 03 Aug 2011 13:42:21 +0100, Martin Thompson wrote:

In hardware, the tools know everything they need to about the timing
of the chips and can give you a "cast-iron" statement as to how fast
things will be in the worst-case.

If that's so, then why have I seen so many first-cut FPGA designs fail
when run over the full military temperature range? And why have I seen
FPGA designs fail during temperature cycling after months in
production, after some unknown process change by the vendor?

I can think of lots of reasons that don't come down to the quality of
the timing models... Or did you track the problem down to an error in
the timing analysis tools and models?
Most of my experience with this has been as an amused observer, rather
than an appalled participant. It stemmed from a corporate rule "thou
shalt make the synthesis tool happy about timing" and a rather aggressive
design group from out of state that was known to have scripts that would
do synthesis runs over and over until one happened to meet timing, then
stop and ship that file 'upstairs'.

The balance has stemmed from folks taking the Xilinx speed grades at face
value, and using what the synthesis tool says (using Xilinx's defaults)
at face value. Once the group learned that you have to force a bit of
margin into the process (and the group thinks that a design that fails to
synthesize once out of ten is a problem, as opposed to thinking that a
design that succeeds once out of twenty is 'shippable') then those
problems went away.

--
www.wescottdesign.com
 
Martin Thompson <martin.j.thompson@trw.com> wrote:

(snip on timing of processors and FPGAs)

More usefully (again IMHO) there are statistical methods for measuring
execution time and its variability down various code paths, which can be
used to provide arbitrarily high levels of confidence as to the
likelihood of missing a real-time deadline, as well as showing what to
optimise to improve the worst-case - this sometimes means using what
might be regarded in mainstream compsci as "inefficient" algorithms, as
they have better bounds on worst-case performance, even at the expense
of average performance. I ought to stop now, I'm no doubt "rambling to
the choir" as well as getting off-topic :)
Sounds right to me. Now, isn't that also true for FPGAs?

In hardware, the tools know everything they need to about the timing
of the chips and can give you a "cast-iron" statement as to how
fast things will be in the worst-case.

They might do some rounding up in the timing, to be sure.

I'm not sure what you mean by that: The tools "know" worst-case
timings for the various silicon paths and add them all up.
Rounding doesn't come into it.
Consider that some timings might be Gaussian distributed with
a tail to infinity. Also, that it is really difficult to get
the exact timings for all possible routing paths. As with the
real-time processor, you don't really want worst-case, but
just sufficiently unlikely to be exceeded times.

Some of the distributions won't quite be Gaussian, as some of
the worst-case die are rejected. Consider, though, that there
are a finite number of electrons on a FET gate, ever decreasing
as they get smaller.

Some of the favorite problems in physics class include determining
the probability of all the air molecules going to one half of
a room, or of a person quantum-mechanically tunneling through
a brick wall, running at a given speed. Both have a very low,
but non-zero, probability.

-- glen
 
Tim Wescott <tim@seemywebsite.com> writes:

Most of my experience with this has been as an amused observer, rather
than an appalled participant. It stemmed from a corporate rule "thou
shalt make the synthesis tool happy about timing" and a rather aggressive
design group from out of state that was known to have scripts that would
do synthesis runs over and over until one happened to meet timing, then
stop and ship that file 'upstairs'.
Ahh, I can see where that might cause hassle.

I'm not a fan of stochastic timing closure... I tend to fix timing
problems by grovelling through the timing reports on the failing paths
and fixing something in the RTL to make life easier for the tools
(I tend not to push silicon right to its limits, so I can get
away with that!)
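
A typical RTL fix of that sort is to break a long combinational path with
an extra register stage. A minimal sketch, with made-up entity and signal
names, and assuming the extra cycle of latency is acceptable:

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative multiply-accumulate split across two register stages, so
-- that the multiply and the add do not both have to fit in one period.
entity mac_pipelined is
  port (
    clk : in  std_logic;
    a   : in  signed(17 downto 0);
    b   : in  signed(17 downto 0);
    c   : in  signed(35 downto 0);
    y   : out signed(36 downto 0)
  );
end entity mac_pipelined;

architecture rtl of mac_pipelined is
  signal prod : signed(35 downto 0) := (others => '0');  -- pipeline register
  signal c_d  : signed(35 downto 0) := (others => '0');  -- matches the delay
begin
  process(clk)
  begin
    if rising_edge(clk) then
      prod <= a * b;                                 -- stage 1: multiply
      c_d  <= c;
      y    <= resize(prod, 37) + resize(c_d, 37);    -- stage 2: accumulate
    end if;
  end process;
end architecture rtl;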

Cheers,
Martin

--
martin.j.thompson@trw.com
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.conekt.co.uk/capabilities/39-electronic-hardware
 
glen herrmannsfeldt <gah@ugcs.caltech.edu> writes:

Martin Thompson <martin.j.thompson@trw.com> wrote:

(snip on timing of processors and FPGAs)

More usefully (again IMHO) there are statistical methods for measuring
execution time and its variability down various code paths, which can be
used to provide arbitrarily high levels of confidence as to the
likelihood of missing a real-time deadline, as well as showing what to
optimise to improve the worst-case - this sometimes means using what
might be regarded in mainstream compsci as "inefficient" algorithms, as
they have better bounds on worst-case performance, even at the expense
of average performance. I ought to stop now, I'm no doubt "rambling to
the choir" as well as getting off-topic :)

Sounds right to me. Now, isn't that also true for FPGAs?
The statistics or the algorithms?

Certainly, FPGAs often suit different algorithms.

And yes, the statistics are also there... I should've made that clear at
the start - I put "worst-case" in quotes without explaining what I
meant. Sorry about that.

In hardware, the tools know everything they need to about the timing
of the chips and can give you a "cast-iron" statement as to how
fast things will be in the worst-case.

They might do some rounding up in the timing, to be sure.

I'm not sure what you mean by that: The tools "know" worst-case
timings for the various silicon paths and add them all up.
Rounding doesn't come into it.

Consider that some timings might be Gaussian distributed with
a tail to infinity. Also, that it is really difficult to get
the exact timings for all possible routing paths.
Indeed. It's a little easier for the FPGA, especially in a highly
pipelined design, as the timing paths are broken up much more - so there
are fewer "long chains of logic and routing" than there might be "chains
of control flow through software".

As you say, it's a very similar problem, but the FPGA is much more
constrained.

As with the
real-time processor, you don't really want worst-case, but
just sufficiently unlikely to be exceeded times.
Indeed, and because the FPGA vendors give us a timing analyser that
people are going to have a reasonably high level of trust in, they
define "unlikely to be exceeded" pretty conservatively. Of course, if
your system is pushing the edges in some sense (esp. criticality of
operation) the prudent engineer will allow more margin etc.

Some of the distributions won't quite be Gaussian, as some of
the worst-case die are rejected. Consider, though, that there
are a finite number of electrons on a FET gate, ever decreasing
as they get smaller.

Some of the favorite problems in physics class include determining
the probability of all the air molecules going to one half of
a room, or of a person quantum-mechanically tunneling through
a brick wall, running at a given speed. Both have a very low,
but non-zero, probability.
I think we're in violent agreement. My point really was (and I shouldn't
have used the phrase "worst-case" without qualification) that getting an
arbitrarily high confidence of "meeting timing" is readily available
with FPGA tools, but in the software domain, it's only just starting to
be an automatable task, and that out-of-the-box commercial compilers
(AFAIK) are of no assistance in that respect.

Cheers,
Martin

--
martin.j.thompson@trw.com
TRW Conekt - Consultancy in Engineering, Knowledge and Technology
http://www.conekt.co.uk/capabilities/39-electronic-hardware
 
Martin Thompson <martin.j.thompson@trw.com> wrote:
(snip)

Ahh, I can see where that might cause hassle.

I'm not a fan of stochastic timing closure... I tend to fix timing
problems by grovelling through the timing reports on the failing paths
and fixing something in the RTL to make life easier for the tools
(I tend not to push silicon right to its limits, so I can get
away with that!)
After my previous post, I was thinking about how FPGA companies
have an incentive to quote faster devices, and so want the timing
reports to be fast, without an excessive margin.

There are always process variations, and, for the user, voltage
and temperature variations. For most, it shouldn't be so hard
to stay within the voltage range, though it is not so hard to
miss it by a little (a slightly defective power supply, say).

Temperature is a little harder to control. If one is too close
on timing, and the temperature is a little high, the design
could fail.

-- glen
 
On Fri, 05 Aug 2011 08:38:56 +0100, Martin Thompson wrote:

Tim Wescott <tim@seemywebsite.com> writes:

Most of my experience with this has been as an amused observer, rather
than an appalled participant. It stemmed from a corporate rule "thou
shalt make the synthesis tool happy about timing" and a rather
aggressive design group from out of state that was known to have
scripts that would do synthesis runs over and over until one happened
to meet timing, then stop and ship that file 'upstairs'.

Ahh, I can see where that might cause hassle.

I'm not a fan of stochastic timing closure... I tend to fix timing
problems by grovelling through the timing reports on the failing paths
and fixing something in the RTL to make life easier for the tools (I
tend not to push silicon right to its limits, so I can get away with
that!)
Interestingly enough, the engineering group that I was with (which was a
short walk away from the production line) felt exactly the way you did --
they would go as far as to look for the worst possible path, and if it
passed timing, but barely, they'd look for things sticking out and fix
them.

The other group consistently got rewarded for "getting done" quickly, and
somehow no one ever kept track of the number of times that their work was
what was stalling production (while we fixed it in our "spare time",
because we were closer).

--
www.wescottdesign.com
 
Tim Wescott wrote:

The balance has stemmed from folks taking the Xilinx speed grades at face
value, and using what the synthesis tool says (using Xilinx's defaults)
at face value. Once the group learned that you have to force a bit of
margin into the process (and the group thinks that a design that fails to
synthesize once out of ten is a problem, as opposed to thinking that a
design that succeeds once out of twenty is 'shippable') then those
problems went away.

Tim, are you talking about synthesis or the timing analysis after P&R?


Nial
 
On Fri, 05 Aug 2011 16:43:01 +0100, Nial Stewart wrote:

The balance has stemmed from folks taking the Xilinx speed grades at
face value, and using what the synthesis tool says (using Xilinx's
defaults) at face value. Once the group learned that you have to force
a bit of margin into the process (and the group thinks that a design
that fails to synthesize once out of ten is a problem, as opposed to
thinking that a design that succeeds once out of twenty is 'shippable')
then those problems went away.


Tim, are you talking about synthesis or the timing analysis after P&R?
Sorry -- muddled thinking working in concert with mostly being an
observer.

Timing analysis after P&R, yes.

--
www.wescottdesign.com
 
