Question about OC PCI Cores

Nial Stewart · Sep 14, 2010

Hmm 4MB/s is very slow for PCI.

As above (/below), it's reasonable target performance: no x86 system
allows you to burst to a target card (seems ridiculous, but it is the case).

If you want better PCI performance you have to have master functionality
on your plug-in card.

Maximum performance depends on your PC architecture and what else is
plugged into your PCI bus. I believe that 60-70 MB/s is considered
a reasonable result for real-world systems.

Nial.

Sink0 · Sep 14, 2010

Hmm, 60-70 MB/s would be a very reasonable speed. I would be happy with
40-50 MB/s. Do you mean that, to achieve that speed, I would have to use the PCI
master of my PCI card to both read and write data from/to the x86? Or should I
send data from the x86 to the card with the PC as master, and send data from
the PCI board to the x86 with the PCI board as master? I am just curious: what
speed is achievable with a x1 PCI Express bus? Any idea?

Thank you

---------------------------------------
Posted through http://www.FPGARelated.com

Nial Stewart · Sep 14, 2010

No, all transactions have to be mastered by the plug-in card.

Nial.

Nico Coesel · Sep 14, 2010

"Sink0" <sink00@n_o_s_p_a_m.n_o_s_p_a_m.gmail.com> wrote:

On Sep 13, 7:02 pm, "Sink0" <sink00@n_o_s_p_a_m.n_o_s_p_a_m.gmail.com>
wrote:
Hmm, you have used it. Can you give me some help please? I can't
understand some points of the documentation. I am planning to use it with a
32-bit WB. But I could not understand the byte lane table in the
implementation section.

Unfortunately I don't have the VHDL code in front of me to remind me
of exactly how I connected up the wishbone side, but for me I was
using 32-bit registers, and for that using the core was pretty
straightforward.

What was the max speed you could achieve with PCI32lite?

I was getting a little over one cycle per microsecond, IIRC. So that's a
little over 4 MB/s.

Which driver are you
using? I will run my card on a Linux system too.

I wrote my own driver from scratch.

But were you able to
perform burst writes?

I didn't need burst writes and so didn't try. But I don't think you
will be able to do that with the pci32lite core - it's pretty basic,
fine for simple reading/writing to registers but not suitable for
higher performance data transfer. For that Master functionality is
needed.

Hmm 4MB/s is very slow for PCI. Can you send me your driver file? I got

Actually it is not. Like others pointed out: most x86 can't initiate
burst mode. The PCI card has to initiate burst mode. If the card
cannot do that you'll see the PCI bus has a lot of overhead for single
32bit transfers.
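
To put rough numbers on that overhead, here is a minimal sketch. It assumes a nominal 33 MHz, 32-bit bus and treats every transaction as a fixed number of overhead clocks plus one clock per data word. The overhead counts and the 64-word burst length are illustrative assumptions, not measurements; the single-word case is simply chosen to match the "about one access per microsecond" figure reported above.

```python
PCI_CLOCK_HZ = 33_000_000      # nominal 33 MHz PCI clock
BYTES_PER_WORD = 4             # 32-bit data path

def peak_mb_per_s(clock_hz=PCI_CLOCK_HZ, bytes_per_word=BYTES_PER_WORD):
    """Theoretical ceiling: one word on every clock, no overhead."""
    return clock_hz * bytes_per_word / 1e6

def throughput_mb_per_s(words_per_txn, overhead_clocks, clock_hz=PCI_CLOCK_HZ):
    """Each transaction costs overhead_clocks plus one clock per data word."""
    clocks = overhead_clocks + words_per_txn
    return words_per_txn * BYTES_PER_WORD * clock_hz / clocks / 1e6

# One single-word access per microsecond is about 33 clocks per transaction
# (modelled here as 32 clocks of overhead + 1 data clock).
single = throughput_mb_per_s(words_per_txn=1, overhead_clocks=32)

# Hypothetical bus-master burst: 64 words with an assumed 8 clocks of overhead.
burst = throughput_mb_per_s(words_per_txn=64, overhead_clocks=8)

print(f"peak   : {peak_mb_per_s():.0f} MB/s")
print(f"single : {single:.1f} MB/s")
print(f"burst  : {burst:.1f} MB/s")
```

The burst figure is an idealized upper bound for this model; arbitration, retries and other bus traffic are ignored, which is why real systems land well below it.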

--
Failure does not prove something is impossible, failure simply
indicates you are not using the right tools...
nico@nctdevpuntnl (punt=.)
--------------------------------------------------------------

Sink0 · Sep 14, 2010

Hmm, but if I initiate the burst with my card, will the x86 be able to handle it,
or do I need to make use of DMA? And what would be the benefit of using PCI
instead of USB 2.0? If I use a chip like the PLX PCI 9030, will I be able to
achieve high speed on the PCI bus?

Thank you!

Nico Coesel · Sep 14, 2010

DMA is something you should forget when talking about PCI. PCI is
about pushing or pulling blocks of data between a master and a slave.
For some reason a PC cannot pull a block of data from a PCI card so
the PCI card needs to be capable of becoming a master and push data
into the PC's memory.

The driver needs to allocate a piece of fixed non-swappable memory and
get the physical address from the OS. The driver sets this address and
the number of bytes to transfer in the PCI card's registers. After
this, the PCI card can write (push) the data into the PC's memory.
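
The sequence described above can be sketched as a toy model. This is plain Python standing in for the hardware, not driver code: `HostMemory`, `BusMasterCard` and the two register names are hypothetical, and a real Linux driver would obtain the buffer and its bus address from the kernel's DMA API (for example `dma_alloc_coherent`) rather than picking an offset by hand.

```python
class HostMemory:
    """Stand-in for PC RAM, addressed by 'physical' byte offset."""
    def __init__(self, size):
        self.ram = bytearray(size)

class BusMasterCard:
    """Toy card with two registers the driver programs before a transfer."""
    def __init__(self, host):
        self.host = host
        self.reg_dma_addr = 0   # hypothetical register: destination address
        self.reg_dma_len = 0    # hypothetical register: byte count
        self.sample_data = bytes(range(256))  # data the card wants to deliver

    def start_transfer(self):
        # The card is the bus master here: it writes straight into host RAM.
        n = self.reg_dma_len
        start = self.reg_dma_addr
        self.host.ram[start:start + n] = self.sample_data[:n]

def driver_setup(card, buf_addr, buf_len):
    """What the driver does: tell the card where and how much to write."""
    card.reg_dma_addr = buf_addr
    card.reg_dma_len = buf_len

host = HostMemory(4096)
card = BusMasterCard(host)
driver_setup(card, buf_addr=1024, buf_len=64)   # "pinned" buffer at offset 1024
card.start_transfer()
received = bytes(host.ram[1024:1024 + 64])
```

The point of the model is the division of labour: the host only writes two small registers, and the bulk data movement is done by the card.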

Sink0 · Sep 14, 2010

That was a very useful piece of information. But explain something to me. Let us
suppose a situation: the PC writes data to the PCI card. After a while the PCI
card becomes master of the bus and sends a burst of data to the PC. Let's say we
are using Linux. Sending and reading data to/from the PCI card is
straightforward using the memory read and write functions. But with
the PCI card as master, how am I going to read that data on the PC side?
Using the same memory read function does not seem the most intuitive answer,
because I suppose the memory read function would be associated with a PCI
transaction with a particular C/BE pin pattern.

Thank you!!!

Nico Coesel · Sep 14, 2010

The PCI card is supposed to generate an interrupt when the transfer is
finished. The driver which handles the interrupts knows that all the
data is written in the buffer and can start processing the data.
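
As a rough illustration of that completion handshake, the sketch below models the card raising an interrupt once the last data has landed in the host buffer, and a handler that acknowledges it and only then touches the data. The class names and the `status_done` bit are hypothetical; this is a simulation of the control flow, not a kernel interrupt handler.

```python
class Card:
    def __init__(self):
        self.irq_handler = None
        self.status_done = False   # hypothetical "transfer complete" status bit

    def request_irq(self, handler):
        """Driver registers the function to call when the card interrupts."""
        self.irq_handler = handler

    def finish_transfer(self, buffer, payload):
        buffer[:len(payload)] = payload   # last data lands in host memory
        self.status_done = True
        if self.irq_handler:
            self.irq_handler(self)        # raise the interrupt

class Driver:
    def __init__(self, card, buf_size):
        self.buffer = bytearray(buf_size)  # buffer the card writes into
        self.processed = []
        card.request_irq(self.on_irq)

    def on_irq(self, card):
        if not card.status_done:
            return                         # not a completion from this card
        card.status_done = False           # acknowledge the interrupt
        self.processed.append(bytes(self.buffer))  # data is complete: use it

card = Card()
drv = Driver(card, 8)
card.finish_transfer(drv.buffer, b"PCIDATA!")
```

Note that the host never issues a PCI read for the payload; it simply reads its own memory after being told the transfer is complete.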

Sink0 · Sep 14, 2010

Thank you very much. You are helping a lot! I am designing the driver
following the Linux Device Drivers book and a driver I found at OpenCores.
Do you have any other good reference for that kind of information? Have you
ever worked with PCI design?

Thank you!!

Brian Drummond · Sep 15, 2010

On Tue, 14 Sep 2010 07:11:35 -0500, "Sink0"
<sink00@n_o_s_p_a_m.n_o_s_p_a_m.gmail.com> wrote:

Hmm, 60-70 MB/s would be a very reasonable speed. I would be happy with
40-50 MB/s. Do you mean that, to achieve that speed, I would have to use the PCI
master of my PCI card to both read and write data from/to the x86? Or should I
send data from the x86 to the card with the PC as master, and send data from
the PCI board to the x86 with the PCI board as master? I am just curious: what
speed is achievable with a x1 PCI Express bus? Any idea?

Using a Virtex-5 with a x1 PCIe interface (Xilinx ML506 board) using the
straight Xilinx PCIe slave implementation, I found I could write to the board at
25MB/s approximately.

Reading was another matter - effectively, the host PC writes a request to the
board, which then performs a write back to the host. This only gave about
2.5MB/sec.

In both cases the same was true - the host cannot initiate burst transfers.

I rolled my own simple DMA hardware to generate burst transfers, using the
Xilinx PCIe core as a master, and that could read at 150MB/s. I expect to get a
little more when I double-buffer, so that DMA can continue into one buffer while
the second is being transferred to user code.
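
The double-buffering idea can be sketched as a simple ping-pong scheme. This is a hypothetical Python model, not the actual hardware or driver described in the post: `PingPong`, `dma_fill` and `swap` are made-up names, and the "DMA engine" is just a function call.

```python
class PingPong:
    """Two buffers: the card fills one while user code drains the other."""
    def __init__(self, size):
        self.buffers = [bytearray(size), bytearray(size)]
        self.fill_idx = 0          # buffer the (simulated) DMA engine writes

    def dma_fill(self, payload):
        buf = self.buffers[self.fill_idx]
        buf[:len(payload)] = payload

    def swap(self):
        """Called on 'transfer complete': hand the full buffer to the reader."""
        ready = self.buffers[self.fill_idx]
        self.fill_idx ^= 1         # DMA continues into the other buffer
        return ready

pp = PingPong(4)
out = []
for chunk in (b"AAAA", b"BBBB", b"CCCC"):
    pp.dma_fill(chunk)
    ready = pp.swap()
    out.append(bytes(ready))       # reader copies out while the next fill can proceed
```

In this model the copy to user code and the next fill never touch the same buffer, which is what lets the two overlap on real hardware.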

Xilinx's own DMA cores may or may not work - I can't tell. They are, bizarrely,
only available in Verilog, despite multiple independent Webcases requesting VHDL
versions. However, rolling a simplified version in VHDL is quick and easy.

I don't have a figure for DMA speeds into the card, but expect them to be
slightly lower, because the card will have to initiate burst reads. (That means
it requests the host to write bursts back to it, which increases the turnround
time)
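
For scale, the figures above can be compared with the raw capacity of a x1 link. The sketch assumes a first-generation PCIe link (2.5 GT/s per lane with 8b/10b coding); packet headers and flow control are ignored, so the real payload ceiling is lower than the value computed here.

```python
GEN1_TRANSFERS_PER_S = 2.5e9   # PCIe Gen1 signalling rate per lane (assumed Gen1 link)
ENCODING = 8 / 10              # 8b/10b line coding: 8 data bits per 10 line bits

def pcie_gen1_raw_mb_per_s(lanes=1):
    """Raw data rate per direction, before packet overhead."""
    return GEN1_TRANSFERS_PER_S * ENCODING / 8 * lanes / 1e6

raw_x1 = pcie_gen1_raw_mb_per_s(1)
dma_read_fraction = 150 / raw_x1   # the 150MB/s DMA figure quoted above
print(f"raw x1: {raw_x1:.0f} MB/s, DMA read uses {dma_read_fraction:.0%}")
```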

You will find a PCI driver for the Raggedstone PCI card on Enterpoint's site. It
would be slightly rude to use it on someone else's product, but it may serve as
an example for your own driver.

I can strongly recommend adding "Essential Linux Device Drivers" (Venkateswaran)
to the shopping list. While "LDD" 3rd edition is very good, the two books cover
different aspects of driver design and I feel I need both. "ELDD" is also newer;
there is much in "LDD" that is out of date, unless you are still using a 2.4 or
very early 2.6 kernel.

- Brian

H. Peter Anvin · Sep 15, 2010

On 09/13/2010 09:30 AM, Nial Stewart wrote:

I have used the pci32lite in a Spartan 3. What I found was that:

(1) I couldn't get it to do burst reads, because surprisingly, the PC
doesn't do burst reads. At least that was the conclusion I came to
after googling to find out why my linux driver was breaking up burst
requests into individual transactions. If you want to do burst reads,
you need master functionality in the PCI IP. Also, iirc, pci32lite
documentation indicated that it "supports" burst mode by signaling to
the host system that bursts need to be broken into individual
transactions.

AFAIK no x86 based machines will drive burst accesses to plug in
cards.

They can if you mark the memory write combining.

-hpa

Sink0 · Sep 15, 2010

But is write combining reliable? All I have read is that lots of problems
can happen.

Sink

Weng Tianxiang · Sep 16, 2010

Hi,
Here is my experience developing a 64-bit/66MHz PCI core and
multiple products based on that core, using both an Altera
Flex 20K(?) and a Xilinx Virtex II-1000.

The tested data rate from our client's report was 480MB/s. I was not
involved in the testing procedure. Later I developed a new version
that could run at 528MB/s, but it could not be implemented on a board, because
the original board was wired in such a way that it didn't recognize
different enable signals from two different CE.
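
As a quick check on those figures: 528MB/s is the theoretical peak of a 64-bit bus at a nominal 66 MHz, so the 480MB/s client result corresponds to roughly 91% of peak. A minimal sketch of the arithmetic, assuming 1 MB = 10^6 bytes and nominal clock rates:

```python
def pci_peak_mb_per_s(bus_bits, clock_mhz):
    """Peak = bytes per clock * clocks per second (MB/s, 1 MB = 10**6 bytes)."""
    return bus_bits // 8 * clock_mhz

peak_64_66 = pci_peak_mb_per_s(64, 66)   # 64-bit bus at a nominal 66 MHz
peak_32_33 = pci_peak_mb_per_s(32, 33)   # plain 32-bit, 33 MHz PCI for comparison
efficiency = 480 / peak_64_66            # client-reported rate vs. peak
print(peak_64_66, peak_32_33, round(efficiency, 3))
```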

1. The Altera Flex 20K had a big layout issue meeting 66MHz on both the
IRDY and TRDY pins. FRAME and STOP were never a problem in my design.
Each code change needed 1 week of manual layout to make them
meet 66MHz. The time was spent entirely on the IRDY and TRDY pins.

The state machines for Master and Slave were both written in Altera's
ASM(?) language, which makes manual layout possible.

After losing a big customer because of the 1-week manual layout, the
company decided to switch to the Xilinx Virtex II-1000.

2. Xilinx has an invention for generating the IRDY and TRDY signals,
described in US patent 6218864, "Structure and method for generating a
clock enable signal in a PLD":
http://www.google.com/patents/about?id=DnwGAAAAEBAJ&dq=PCI,+IRDY,+TRDY+inassignee:Xilinx&as_drrb_ap=q&as_minm_ap=0&as_miny_ap=&as_maxm_ap=0&as_maxy_ap=&as_drrb_is=q&as_minm_is=0&as_miny_is=&as_maxm_is=0&as_maxy_is=&num=100

After my design was transferred from Altera to Xilinx, the final
running frequency reached 68MHz after one compilation
without much effort, and even one military product that occupies 90% of
the LUT space still runs above 66MHz.

3. Because Xilinx doesn't provide a low-level language similar to
Altera's, both the master and slave state machines had to
be carefully rewritten and kept in an independent module for compilation;
otherwise the design couldn't meet the 66MHz running frequency.

From the two companies' handling of the PCI bus, I found Altera was
always one step behind Xilinx. After implementing the PCI core, it
was very clear that IRDY and TRDY carry a heavy combinational burden.
Altera did not invent anything to improve this, but Xilinx did
find something worth a patent, and it works well, at least in my
case. This invention is good, but not the best. I had a better
one, but it came too late to be useful, because the
PCI bus was already outdated.

4. Xilinx is excellent at developing new technology. It may be quick
to apply for patents on its new technology, but that technology is not
always the best in the ASIC field.

But the two best and most important inventions in the FPGA field are
Xilinx inventions: 1. Using a memory address to get its function. 2.
Introducing the carry-in structure to FPGAs.

Altera had a motto that appeared when one opened its software: What you can
do, we can do better. At first I thought it was a good statement, while
dealing with my manual layout for the PCI bus. After switching to the Xilinx
chip, I changed my mind and thought it was a really shameful motto, compared
with Xilinx's efforts in the same field. In the PCI bus respect, I
thought Altera was a loser.

Weng

Nial Stewart · Sep 16, 2010

In the PCI bus respect, I thought Altera was a loser.

In Altera's defence this must have been a good few years ago, the Flex devices
haven't been 'current' for a while.

And comparing a flex device with a Virtex you're comparing a low cost
device with a much more expensive part!

These days I'd expect _any_ FPGA to fairly easily meet 33MHz 32 Bits and
any current Stratix or Virtex to meet 66MHz 64 Bits.

Nial.

Nial Stewart · Sep 16, 2010

4. The same design using the Flex 20K by Altera got about 56MHz running
frequency, and I had to spend 1 week on manual layout to get it over
66.7MHz. All critical paths were IRDY and TRDY.
5. The same design using the Virtex II by Xilinx usually got 67MHz or
above after the first compilation.

Do you mean Apex20K? That was released in 1998, I can't find the date for
the Virtex II release but it was 2001/2002 I think which makes it the next
generation of devices.

If you're comparing a Flex20K with a VirtexII you're comparing a low cost
device with a top of the range device which again isn't a fair comparison.

Anyway this isn't really helping the OP with his queries.

Nial.

Weng Tianxiang · Sep 16, 2010

1. The chip switch from Flex 20K to Virtex II occurred in 2003. At that
time, I thought both were at the same level of technology. And I heard
that the price of the Virtex II-1000 was even lower than the Flex 20K's at
the time. Maybe there was a behind-the-scenes promotional pricing scheme
by Xilinx to encourage customers to switch to its products.

2. Running frequency is a lifeline: life or death. No matter how good
your design is, when it fails to reach the specified running
frequency, all is dead. When we redesigned for a new military product,
the LUT usage in the Virtex II was above 93%. At that time, one can
imagine that if everything went well except the running frequency, the
product board would have to be redesigned, chips reselected, the
deadline rescheduled, and a big penalty would be pending.
Fortuitously, after several rounds of fine-tuning compilation parameters
using Xilinx ISE 8.x, it passed the 66.7MHz running frequency, but it
never passed 66.7MHz using Xilinx ISE 9.x when a colleague tried
to confirm the running frequency with the latest version.

3. The legacy PCI circuits from both Altera and Xilinx carry over into the
advanced structures of their next generations, but the shortcoming
in Altera's method of handling the PCI bus doesn't go away as you claim. It
costs a lot of routing resources to reach 66MHz. When the LUT space is
nearly used up, the problem will pop up and come after you.

I would like to provide more information about the chip switch.
4. The same design using the Flex 20K by Altera got about 56MHz running
frequency, and I had to spend 1 week on manual layout to get it over
66.7MHz. All critical paths were IRDY and TRDY.
5. The same design using the Virtex II by Xilinx usually got 67MHz or
above after the first compilation.

Weng

rickman · Sep 17, 2010

On Sep 16, 10:25 am, Weng Tianxiang <wtx...@gmail.com> wrote:

1. The chip switch from Flex 20K to Virtex II occurred in 2003. At that
time, I thought both were at the same level of technology. And I heard
that the price of the Virtex II-1000 was even lower than the Flex 20K's at
the time. Maybe there was a behind-the-scenes promotional pricing scheme
by Xilinx to encourage customers to switch to its products.

Chip prices are greatly affected by the process generation in two
ways. One is that newer processes allow more chips off the same size
wafer so the cost per die goes down. So they can charge less for the
newer generations per LUT. But average selling price is very
important to them since this is the real determiner of profit. That's
why they have to keep adding larger and larger parts with high prices
as part of their business model.

The other is that FPGA makers understand the value of design wins
early in the life cycle of a new chip family. So they will go to
great lengths to get those wins when the chips are still warm from the
prototype ovens. Once the chips are shipping in volume they already
know how successful their new line is going to be and they start
promoting the next generation. Xilinx is even bigger at this than the
others. They will cut prices on new families to the bone. I once got
a 50,000 qty price on expected shipments of 1,000 per year on a new
part. A few months later as the design was about to go to production,
I wanted to change to a new package with fewer pins and they would
only match the old price even though the new package would save them
lots of dough.

Between those two effects, I would expect to see nearly a two to one
difference in pricing on parts from different generations if
everything else is equal. In fact, I am surprised the Altera folks
didn't push you into a newer part. It's not just a matter of what is
best for the company, the salesmen get big incentives for design wins
on the newest product families. You did speak with the salesmen from
both companies, right?

2. Running frequency is a lifeline: life or death. No matter how good
your design is, when it fails to reach the specified running
frequency, all is dead. When we redesigned for a new military product,
the LUT usage in the Virtex II was above 93%. At that time, one can
imagine that if everything went well except the running frequency, the
product board would have to be redesigned, chips reselected, the
deadline rescheduled, and a big penalty would be pending.
Fortuitously, after several rounds of fine-tuning compilation parameters
using Xilinx ISE 8.x, it passed the 66.7MHz running frequency, but it
never passed 66.7MHz using Xilinx ISE 9.x when a colleague tried
to confirm the running frequency with the latest version.

As you say, for some designs frequency is paramount. But this is not
only a function of the silicon. The software plays as large a role or
even larger. That is why I seldom look at a family data sheet if I
want to consider speed. If the software gives me an inefficient
implementation, the data sheet is useless. At that time I believe the
older Altera parts were still being designed using MAX+PLUS II. I can
personally verify that this software was a total dog for getting good
speed results. They eventually switched all their parts over to
Quartus which gives much better results. So if you want to make any
general statements comparing two FPGA families, make sure you have
used recent software.

3. The legacy PCI circuits from both Altera and Xilinx carry over into the
advanced structures of their next generations, but the shortcoming
in Altera's method of handling the PCI bus doesn't go away as you claim. It
costs a lot of routing resources to reach 66MHz. When the LUT space is
nearly used up, the problem will pop up and come after you.

Again, this is a software issue as well. Unless you have tested your
design on current chips using current software, the comparison means
nothing.

Regards,

Rick

Sebastien Bourdeauducq · Sep 18, 2010

On Sep 10, 10:49 pm, Gabor <ga...@alacron.com> wrote:

Xilinx cores make use of special features of certain pins on the
device with names like IRDY and TRDY that have some built-in logic to speed up the
combinatorial PCI requirements. I don't think the synthesis tools support these
functions directly.

AFAIK this function is performed by a special logic cell in the FPGA
called PCILOGIC on Spartan 2 and PCILOGICSE on Spartan 3E/6, with
dedicated routing to the I/Os. To make those cells appear in FPGA
Editor (they don't by default, apparently Xilinx don't want you to use
them), enter in the FPGA Editor command line (the white text box at
the bottom of the window):
select site *PCI*
add
I guess you can instantiate those primitives in the HDL and the P&R
tool will nicely use the dedicated I/O routing channels.

However the FPGA's have been getting faster, so you may not need
the extra stunt hardware to meet PCI timing anymore.

Maybe, but then you have to properly use the UCF constraint system,
which is another evil =]

S.

Weng Tianxiang · Sep 18, 2010

"At that time I believe the
older Altera parts were still being designed using MAX+PLUS II. I can
personally verify that this software was a total dog for getting good
speed results. They eventually switched all their parts over to
Quartus which gives much better results."

The claim is false. I used Quartus 4.x at that time.

"Do you mean Apex20K? That was released in 1998, I can't find the date
for the Virtex II release but it was 2001/2002 I think which makes it the
next generation of devices."

Yes. Apex 20K. Virtex II release Document mentioned date: V1.1
December 6, 2000.

At the time both were top products for their companies.

Weng

rickman · Sep 18, 2010

Ok, but the fact remains that you are judging these two companies on
products that are multiple generations old using software that is a
decade old. You are aware that technology advances, right?

Rick
