mapper optimization

Brannon King

Guest
VHDL/Verilog compilers perform an optimization that I think should be done
in the mapper; I believe it is part of "slice packing." Maybe someone can
explain why it is done this way. I want to use my 3rd-party
structural EDIF, and currently I have to perform this optimization
manually. The optimization is this: suppose I have three OR gates cascaded
so that the output of the first feeds the second and the output of the
second feeds the third. The other inputs to the three gates all come from
the same top layer. It is possible to reorder those gates so that the first
two OR gates are in the same layer and the third takes its inputs from the
first two. Map/PAR seems to have a much easier time meeting the Timespec
when I start with the binary-tree (latter) ordering, yet I would think this
would be an easy optimization for the mapper to perform. Thoughts?
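
To make the reordering concrete, here is a minimal Python sketch of the two gate arrangements. The function names `chain` and `balanced` are made up for this illustration, and depth is counted in 2-input gate levels, which is only a rough stand-in for the LUT/slice levels the mapper actually sees.

```python
from itertools import product

def chain(inputs):
    """Cascaded form: ((a | b) | c) | d. Returns (value, gate_levels)."""
    value, depth = inputs[0], 0
    for x in inputs[1:]:
        value, depth = value | x, depth + 1
    return value, depth

def balanced(inputs):
    """Binary-tree form: (a | b) | (c | d). Returns (value, gate_levels)."""
    if len(inputs) == 1:
        return inputs[0], 0
    mid = len(inputs) // 2
    left_value, left_depth = balanced(inputs[:mid])
    right_value, right_depth = balanced(inputs[mid:])
    return left_value | right_value, max(left_depth, right_depth) + 1

# OR is associative, so both forms compute the same function.
# Check this exhaustively for four 1-bit inputs.
for bits in product((0, 1), repeat=4):
    assert chain(bits)[0] == balanced(bits)[0]

# Three gates either way, but the tree is one level shallower.
print(chain((0, 0, 0, 1))[1], balanced((0, 0, 0, 1))[1])  # prints: 3 2
```

The gate count is the same in both forms; only the longest input-to-output path changes, which is why the tree form is easier to fit inside a timing constraint.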
 
Brannon King wrote:
VHDL/Verilog compilers perform an optimization that I think should be done
in the mapper. I think it is part of the "slice packing." Maybe someone can
explain why this is done in this fashion. What I want is to use my 3rd-party
structural EDIF, and currently I'm having to perform this optimization
manually.
Consider getting the source code. A netlist is much more difficult to work
with.
-- Mike Treseler
 
"Brannon King" <bking@starbridgesystems.com> wrote in message
news:bu6s3c$4s2@dispatch.concentric.net...
VHDL/Verilog compilers perform an optimization that I think should be done
in the mapper. I think it is part of the "slice packing." Maybe someone can
explain why this is done in this fashion. What I want is to use my 3rd-party
structural EDIF, and currently I'm having to perform this optimization
manually. The optimization is this: Suppose I have three OR gates where they
are cascaded such that the output of the first goes into the second and the
output of the second goes into a third. The other inputs for the three gates
all come from the same top layer. It is possible to reorder those gates such
that the first two OR gates are in the same layer and the third has inputs
coming from the first two gates. The Map/Par seems to have a much easier
time with the Timespec when I start out with the binary (latter) ordered
gates, yet I would think it would be an easy optimization for the mapper to
perform. Thoughts?

Put your specific problem on the back burner for a moment. You raise a point
that has concerned me for a while: synthesis tools for FPGAs have very
minimal control over timing-driven synthesis, especially compared to
Synopsys Design Compiler for ASIC flows.

With an ASIC flow, the library primitives for the synthesis tool map 1-to-1
with the primitives the place-and-route tools operate on. Routing delays are
estimated from wire-load models. The synthesis tool optimizes based on the
delays it expects from logic levels and routes. There is then a method to
take the resulting netlist after it has been routed and the actual wire-load
delays extracted, and feed this back into the synthesis tool to
"recalibrate."

The fact that there is a "mapping" process done AFTER the synthesis tool
produces a netlist is a big deviation from that flow and makes it hard to
converge on a similar approach. Parts of mapping are inherently a
timing-driven process.

I am not advocating "moving" the map process to the synthesis tool ...
rather, I think the map process should be able to accept
directives/recommendations (in a .ucf file format) from the synthesis tool
on which paths are critical, so that it can reduce the number of CLB levels
on those paths where possible.

Anyway ... thanks for letting me vent.

--
Regards,
John Retta
Owner and Designer
Retta Technical Consulting Inc.

web : www.rtc-inc.com
 
Hi Brannon,

This type of optimization (and many others) could be done up-front by
synthesis tools -- and I think some of them will do this. The netlists that
3rd party synthesis tools produce are composed of technology-mapped logic
elements (LUT + some other stuff) and flip-flops, plus other gunk (RAMs,
IOs, ...). However, synthesis is limited since it must guess what your
critical path is. Not only must you make sure you inform your tools via
timing constraints, but they must guess what the down-stream P&R tool will
do.

Irrespective of what synthesis tool you use, Quartus can perform netlist
optimizations (aka localized resynthesis, or physical synthesis) in various
stages of the placement and routing flow. The optimizations are performed
based on the slack of various connections in your design, and thus the
decisions will be timing-driven. The P&R tool has the advantage of knowing
what the true critical path(s) of your design are, since it knows the
placement and then routing used, and thus is in a good position to make
these tweaks. By using Netlist Optimizations, Quartus can take a good
synthesis result and make it even better!
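
As a rough illustration of what "slack of a connection" means, here is a minimal Python sketch of a textbook slack calculation on a small timing graph. This is not Quartus's actual algorithm; the function name `connection_slacks`, the node names, and the delay values are all invented for the example.

```python
from collections import defaultdict

def connection_slacks(edges, required_ns):
    """edges: (src, dst, delay_ns) tuples forming a DAG.
    Returns {(src, dst): slack_ns}; negative slack means a timing failure."""
    succ, pred, nodes = defaultdict(list), defaultdict(list), set()
    for src, dst, delay in edges:
        succ[src].append((dst, delay))
        pred[dst].append((src, delay))
        nodes.update((src, dst))

    # Topological order (Kahn's algorithm).
    indegree = {n: len(pred[n]) for n in nodes}
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for dst, _ in succ[n]:
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)

    # Forward pass: latest arrival time at each node.
    arrival = {n: 0.0 for n in nodes}
    for n in order:
        for dst, delay in succ[n]:
            arrival[dst] = max(arrival[dst], arrival[n] + delay)

    # Backward pass: latest time each node may settle and still meet timing.
    required = {n: required_ns for n in nodes}
    for n in reversed(order):
        for dst, delay in succ[n]:
            required[n] = min(required[n], required[dst] - delay)

    return {(s, d): required[d] - (arrival[s] + w) for s, d, w in edges}

# A three-gate OR cascade between registers, 1 ns per connection,
# checked against a hypothetical 3.5 ns requirement.
edges = [
    ("ff_a", "or1", 1.0), ("or1", "or2", 1.0), ("or2", "or3", 1.0),
    ("ff_d", "or3", 1.0), ("or3", "ff_q", 1.0),
]
slacks = connection_slacks(edges, required_ns=3.5)
print(slacks[("or2", "or3")], slacks[("ff_d", "or3")])  # prints: -0.5 1.5
```

In this toy graph the connections along the cascade have negative slack while the late-arriving input `ff_d` has slack to spare, which is the kind of information a timing-driven tool can use to decide which logic to restructure.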

This option is NOT enabled by default in Quartus. The reason is that these
optimizations can result in node name changes between your P&R netlist and
your synthesis netlist, which can complicate the entry of constraints or
running your design through formal verification. And you only need Netlist
Optimizations when you are not meeting timing.

To enable Netlist Optimizations, go to the Settings dialog box in the
Assignments menu and select the Netlist Optimizations page. See AN297
(http://www.altera.com/literature/an/an297.pdf) and AN198
(http://www.altera.com/literature/an/an198.pdf) for more information on
these and other methods for improving your design performance in Quartus.

Or try out the "Design Space Explorer" tool. DSE will try a whole bunch of
Quartus settings for you to find those that yield the best results for your
design. Netlist Optimizations are just one of the things it tries. For
more information on DSE and other ways to close timing in your design see
AN198.

Regards,

Paul Leventis
Altera Corp.


 
