Writing PCI constraints in Altera

tushit

Hi,
I am fairly new to FPGAs. I am trying to write the constraints for the
PCI module on an Altera Stratix device. I am using Quartus II for all
synthesis and P&R.
The PCI spec says I need to ensure a setup time of 7 ns for all pins.
The PCI clock itself runs at 33 MHz. I want to know the following:
1) Is it okay if I just constrain the PCI clk of my design to 50 MHz
(the 30 ns period of the 33 MHz clock less 10 ns to ensure that the
setup time is met)? I realise this will be overkill for the internal
logic, but it may save me some effort.
2) The other way I can think of is to constrain the PCI clk to
33 MHz and specify an external delay of 7 or 8 ns on all the PCI
signals. While setting the PCI clk to 33 MHz I also ticked the option to
include external delays in the frequency calculation. Is this the
correct approach? Or do I need to set up the tco?
Thanks in advance.
Regards
Tushit

Hi Tushit,

You can get an idea of the type of constraints required by downloading
the Altera PCI Megacore and studying the constraint files that ship with
it. To download the PCI Megacore:

1. Open http://www.altera.com/products/ip/ipm-index.html
2. Type in PCI in the IP Megasearch box.
3. Click on the Try OpenCorePlus for PCI Compiler, 32 bit Master/Target.
4. Download the Free Evaluation.
5. Install it into a directory.
6. cd to pci_compiler-v3.0.0\pci_mt32\const_files
7. There are three constraint file scripts in this directory
a. mt32_66_30_ep1c12f324c7_q40.tcl
b. mt32_66_30_ep1s40f1020c6_q40.tcl
c. mt32_stratixii.tcl

Study the constraints created in mt32_stratixii.tcl, in particular the
procs set_pci_timing and constraint_file.

That should help answer your questions.

- Subroto Datta
Altera Corp.
 
> The PCI spec says I need to ensure a setup time of 7 ns for all pins.

If you need a setup time of 7 ns, simply add a TSU_REQUIREMENT of 7 ns.
You can add this assignment individually to each of your pins (using
the Assignment Editor), use wildcards to group pins together, or
simply add a global Tsu requirement (in the "Timing Settings" dialog).
Using Tcl (if you are a command-line type of guy), simply do:

set_global_assignment -name TSU_REQUIREMENT 7ns
or
set_instance_assignment -to * -name TSU_REQUIREMENT 7ns

(Use "quartus_sh --qhelp" for more info on Tcl)

> The PCI clock itself works at 33 MHz. I want to know the following:
> 1) Is it okay if I just constrain the PCI clk of my design to 50 MHz
> (30 ns for the 33 MHz clock and another 10 ns to ensure that the setup
> time is met)? I realise this will be an overkill on the internal logic
> but may save me some effort.

Not really. Tightening the core frequency constraint is not going to
help you with your I/O timing. If anything, it will make it worse. You
need an I/O timing constraint to get the fitter to optimize the I/O
path(s).

> 2) The other way I think to do this is to constrain the PCI clk to
> 33 MHz and specify the external delay on all the PCI signals to 7 or
> 8 ns. While setting PCI clk to 33 MHz I also ticked the option of
> including external delays in the frequency calculation. Is this the
> correct approach? Or do I need to set up the tco?

This will also work, especially in V4.0, where we added support for the
new INPUT_MAX_DELAY constraint (a great improvement over the
EXTERNAL_DELAY feature in V3.0). In your case, though, it seems like a
simple TSU requirement is all you need (at least in terms of timing
constraints on the design).

And yes, the Tsu will only optimize your input path. For the output
path, you need to specify a TCO_REQUIREMENT using the same methodology
(or OUTPUT_MAX_DELAY in V4.0).

As Subroto indicated, studying the PCI core provided by Altera is a
good way to learn how to do it.

-David Karchmer
Altera Corp.
 
Hi Tushit,

As Subroto said, the best thing to do is to study Altera's PCI core to
get all the constraints right.

Here's a quick summary of the constraints for 33 MHz PCI:

- 7 ns Tsu constraint on all inputs
- 11 ns Tco constraint on the outputs
- 33 MHz constraint on the PCI clock
- 0 ns Th constraint on the inputs

Don't forget the Th (hold-time) constraint, since the PCI spec requires
it.
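
As a rough sketch, the four constraints listed above could be written as global assignments in a project Tcl script. TSU_REQUIREMENT and TCO_REQUIREMENT are named earlier in this thread; TH_REQUIREMENT and FMAX_REQUIREMENT are assumed here to be the matching assignment names, so confirm them with "quartus_sh --qhelp" before relying on them. Global assignments also apply to non-PCI pins, so per-pin assignments may be the better choice in a larger design.

```tcl
# Sketch only: global timing requirements for a 33 MHz PCI interface.
# TH_REQUIREMENT and FMAX_REQUIREMENT are assumed assignment names.
set_global_assignment -name FMAX_REQUIREMENT "33 MHz"
set_global_assignment -name TSU_REQUIREMENT "7 ns"
set_global_assignment -name TCO_REQUIREMENT "11 ns"
set_global_assignment -name TH_REQUIREMENT "0 ns"
```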

The Tsu and Tco constraints can instead be converted to clock path
constraints with the INPUT_MAX_DELAY constraint as David said, but it
would be easier to just set them as Tsu and Tco, since then you don't
have to work out precisely what INPUT_MAX_DELAY you have to set.
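
To illustrate the arithmetic being avoided, here is a minimal plain-Tcl sketch. It assumes a 30 ns clock period and that the external delay is simply the part of the period not allotted to the FPGA; the variable names are made up for illustration. Under those assumptions the Tsu and Tco figures above translate to 23 ns and 19 ns of external delay.

```tcl
# Sketch: derive external-delay values from the Tsu/Tco numbers above.
# Assumption: external delay = clock period minus the FPGA's own budget.
set period_ns 30.0   ;# 33 MHz PCI clock period used in this thread
set tsu_ns     7.0   ;# required setup time at the FPGA input pins
set tco_ns    11.0   ;# required clock-to-output time at the FPGA output pins

# Time the rest of the bus may use before data arrives at an input pin.
set input_max_delay_ns  [expr {$period_ns - $tsu_ns}]
# Time the rest of the bus needs after data leaves an output pin.
set output_max_delay_ns [expr {$period_ns - $tco_ns}]

puts [format "input max delay  = %.1f ns" $input_max_delay_ns]
puts [format "output max delay = %.1f ns" $output_max_delay_ns]
```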

Vaughn

 
Hi,
Thanks for all the help. I wrote the constraints as you have
described, but I am not able to meet the setup time requirement. The
PCI design was done originally for an ASIC, and changing it would be a
big project by itself. My setup time on some paths is 11-12 ns. This is
because of a lot of combinational logic in the data path between pin and
register. Is it possible to add delay to the clock path only for the
registers that have the setup time violation? This would mean that I
would be trading off frequency for setup time.
Does Quartus do this for me through any optimization options? I did
see a tsu-freq trade-off, but that is the opposite of what I need.
Thanks again for all the help.
Regards
Tushit

 
Hi Tushit,

It sounds like you have too many levels of logic on your set-up path.
That is definitely the most difficult set of paths in PCI.

Quartus does not have an option to automatically delay the clock to a
register. There are (tricky) ways to do it by hand, but I wouldn't
recommend going down that route.

Which device and speed grade are you using? Which synthesis tool?
Knowing what you're using will help me give more focused answers.

Altera's PCI cores have 2 or 3 levels of logic on the Tsu critical
paths. The most critical paths are those involving trdy and irdy in
most cases, since those high-fanout signals are harder to localize.
So the most important thing to meeting PCI timing is to get a small
number of levels of logic on those paths. If you are using Quartus
Integrated Synthesis and finding it is not doing a good job on that
path, you can put lcell buffers in your HDL to tell the mapper where
you want the lcell boundaries. In most circuits this isn't necessary,
but PCI is a case where synthesis can fall short.

Another, simpler option is to turn on physical synthesis and see if
it improves your results. Physical synthesis knows what the placement
is, so it can make better-informed decisions than front-end synthesis
about what should be a logic cell.
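
For script-driven flows, the physical synthesis option can probably be switched on with a single global assignment. The assignment name below is an assumption about how the GUI option is exposed in Tcl, so check it with "quartus_sh --qhelp" for your Quartus II version.

```tcl
# Sketch only: enable physical synthesis of combinational logic.
# PHYSICAL_SYNTHESIS_COMBO_LOGIC is an assumed assignment name.
set_global_assignment -name PHYSICAL_SYNTHESIS_COMBO_LOGIC ON
```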

The good news is that if you get the levels of logic down to a
reasonable level, the fitter should do the rest automatically for you,
so long as you're using Quartus II 4.0 or later. We meet 66 MHz,
64-bit PCI with no place & route constraints in Stratix, so 33 MHz is
easy for the fitter.

Hope this helps. Let me know how it turns out!

Vaughn
Altera
 
Hi,
You are right, trdy, irdy, cben and framen are the problem areas.
I am using Quartus to do the synthesis and P&R. I looked at the timing
analysis report, and the report for the delay in the data path looks
like this (I have edited it slightly to make it readable):
------------------------------------------------------------------
Info: 1: + IC(0.000 ns) + CELL(0.976 ns) = 0.976 ns; Loc. = Pin_AT6; PIN Node = 'cben[3]'
Info: 2: + IC(2.595 ns) + CELL(0.213 ns) = 3.784 ns; Loc. = LC_X92_Y16_N1; COMB Node = '
Info: 3: + IC(0.364 ns) + CELL(0.213 ns) = 4.361 ns; Loc. = LC_X92_Y16_N3; COMB Node = '
Info: 4: + IC(0.139 ns) + CELL(0.087 ns) = 4.587 ns; Loc. = LC_X92_Y16_N4; COMB Node = '
Info: 5: + IC(0.351 ns) + CELL(0.087 ns) = 5.025 ns; Loc. = LC_X92_Y16_N9; COMB Node = '
Info: 6: + IC(1.121 ns) + CELL(0.332 ns) = 6.478 ns; Loc. = LC_X91_Y19_N8; COMB Node = '
Info: 7: + IC(0.139 ns) + CELL(0.087 ns) = 6.704 ns; Loc. = LC_X91_Y19_N9; COMB Node = '
Info: 8: + IC(0.352 ns) + CELL(0.087 ns) = 7.143 ns; Loc. = LC_X91_Y19_N3; COMB Node = '
Info: 9: + IC(2.143 ns) + CELL(0.213 ns) = 9.499 ns; Loc. = LC_X82_Y31_N6; COMB Node = '
Info: 10: + IC(0.340 ns) + CELL(0.087 ns) = 9.926 ns; Loc. = LC_X82_Y31_N9; COMB Node = '
Info: 11: + IC(1.658 ns) + CELL(0.087 ns) = 11.671 ns; Loc. = LC_X88_Y27_N8; COMB Node = '
Info: 12: + IC(1.527 ns) + CELL(0.087 ns) = 13.285 ns; Loc. = LC_X82_Y31_N2; COMB Node = '
Info: 13: + IC(1.641 ns) + CELL(0.087 ns) = 15.013 ns; Loc. = LC_X81_Y26_N0; COMB Node = '
Info: 14: + IC(0.139 ns) + CELL(0.087 ns) = 15.239 ns; Loc. = LC_X81_Y26_N1; COMB Node = '
Info: 15: + IC(0.593 ns) + CELL(0.087 ns) = 15.919 ns; Loc. = LC_X82_Y26_N5; COMB Node = '
Info: 16: + IC(0.366 ns) + CELL(0.213 ns) = 16.498 ns; Loc. = LC_X82_Y26_N1; COMB Node = '
Info: 17: + IC(0.918 ns) + CELL(0.364 ns) = 17.780 ns; Loc. = LC_X85_Y26_N2; REG Node = '
Info: Total cell delay = 3.394 ns
Info: Total interconnect delay = 14.386 ns
------------------------------------------------------------------
The delay in the clock path is about 4 ns, and this gives a tsu of 13 ns
or so.
The path goes through a lot of combinational nodes (I think 17!!).
Would manual fitting help?
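
The numbers in the report can be cross-checked with a short plain-Tcl sketch. The rows below are copied from the report, the 4 ns clock delay is the approximate figure quoted above, and the tsu estimate ignores any register setup term not already folded into the last CELL delay. The variable names are made up for illustration. The sums reproduce the report's totals (3.394 ns cell, 14.386 ns interconnect), which puts roughly four fifths of the data path delay in routing.

```tcl
# Rows from the timing report: interconnect delay (ns), cell delay (ns), node type.
set rows {
    0.000 0.976 PIN
    2.595 0.213 COMB
    0.364 0.213 COMB
    0.139 0.087 COMB
    0.351 0.087 COMB
    1.121 0.332 COMB
    0.139 0.087 COMB
    0.352 0.087 COMB
    2.143 0.213 COMB
    0.340 0.087 COMB
    1.658 0.087 COMB
    1.527 0.087 COMB
    1.641 0.087 COMB
    0.139 0.087 COMB
    0.593 0.087 COMB
    0.366 0.213 COMB
    0.918 0.364 REG
}

set ic_total    0.0
set cell_total  0.0
set comb_levels 0
foreach {ic cell type} $rows {
    set ic_total   [expr {$ic_total + $ic}]
    set cell_total [expr {$cell_total + $cell}]
    if {$type eq "COMB"} { incr comb_levels }
}

# Approximate clock path delay quoted in the post (an estimate, not a report value).
set clock_delay_ns 4.0
set data_delay_ns  [expr {$ic_total + $cell_total}]
set tsu_estimate   [expr {$data_delay_ns - $clock_delay_ns}]

puts [format "cell delay         = %.3f ns" $cell_total]
puts [format "interconnect delay = %.3f ns" $ic_total]
puts [format "routing share      = %.0f %%" [expr {100.0 * $ic_total / $data_delay_ns}]]
puts "combinational levels = $comb_levels"
puts [format "estimated tsu      = %.2f ns" $tsu_estimate]
```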

To check whether the routing delays could be reduced, I emptied my
device and ran synthesis and P&R with only the PCI module. I assumed
this would give a better P&R fit, but I still got a similar slack for
tsu. My device utilization with the full design is 75% of a Stratix
EP1S80, C6 speed grade. With only PCI this goes down to ~20%.

I also tried the physical synthesis of combo logic option but this
didn't help.

Someone suggested reducing the fanout of the signals by duplicating
them, but I assume Quartus must be doing that for me. I know Xilinx
has a "max fanout" setting, though I couldn't find it in Quartus. If I
need to do this manually, how do I do it?

If all else fails I will have to look into redesigning the combo logic
manually.
Thanks and regards
Tushit

 
> Someone suggested reducing the fanout of the signals by duplicating
> them, but I assume Quartus must be doing that for me. I know Xilinx
> has a "max fanout" setting, though I couldn't find it in Quartus. If I
> need to do this manually how will I do this?

To set the Max Fanout, use the Quartus II Assignment Editor. The steps are
as follows:

1. Click on Assignments->Assignment Editor.
2. Click on the Logic Options button in the top right.
3. Double-click on an empty cell in the To column. You can either type in
the name of the instance whose fan-out you want to restrict, or click on the
arrow button, which will bring up the Node Finder. You can select the name in
the Node Finder and hit OK.
4. In the Assignment Name field, select Maximum Fan-Out from the drop-down
list.
5. In the Value column, type in the fan-out number.

Alternatively, if you know from the timing report the name of the instance
whose fan-out you want to restrict, right-click on the name in the timing
report and select Locate to Assignment Editor. This will open the
Assignment Editor and populate the To column for you. Then follow steps 2, 4
and 5 above.
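
The Assignment Editor steps above should correspond to a single instance assignment in a Tcl script. This is a sketch under assumptions: MAX_FANOUT is taken to be the Tcl name of the Maximum Fan-Out logic option, the node name is hypothetical, and the limit of 16 is an arbitrary example value.

```tcl
# Sketch only: limit the fan-out of one node.
# "pci_top|some_high_fanout_reg" is a hypothetical node name; take the
# real name from the timing report or the Node Finder.
set_instance_assignment -name MAX_FANOUT 16 -to "pci_top|some_high_fanout_reg"
```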


- Subroto Datta
Altera Corp.
 
Hi Tushit,

I don't think you'll have much luck with manual placement and routing,
or emptying the device of other logic. The problem is simply too many
logic levels on the Tsu critical path.

Maximum fanout constraints aren't going to be much help here either,
since in the PCI cores I've seen the high-fanout signals are trdy and
irdy, and since those are sourced by IOs you can't duplicate them.

You'll have to redesign the Tsu-critical logic, or guide the
technology mapper to a better solution for Tsu by adding lcell buffers
to your HDL.

Regards,

Vaughn
Altera
 
Hi Vaughn, Subroto
Thanks for all your help. I am giving up on meeting the setup
time, since the project is a prototype of an ASIC design on an FPGA and
will not go to a customer. As long as the PCI works on some PC with
reasonable reliability we will be happy, and the design does seem to
work okay even with about 7 ns of negative slack on the setup time. I
think this may be because the PCI slot of my PC also supports 66 MHz
PCI, so the motherboard and the PCI chip on it may have a lower tco and
propagation delay than the PCI spec requires, giving me extra margin
for the tsu.
Thank you once again.
Regards
Tushit

 
