Is it possible to implement Ethernet on bare metal FPGA, Wit

On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:
Back in the late 80s there was the perception that TCP was
slow, and hence new transport protocols were developed to
mitigate that, e.g. XTP.

In reality, it wasn't TCP per se that was slow. Rather
the implementation, particularly multiple copies of data
as the packet went up the stack, and between network
processor / main processor and between kernel and user
space.

TCP per se *is* slow when frame error rate of underlying layers is not near zero.

Also, there are cases of "interesting" interactions between the Nagle algorithm at the transmitter and the ACK-saving (delayed ACK) algorithm at the receiver that can lead to slowness of certain styles of TCP conversation (send a mid-size block of data, wait for an application-level acknowledge, send the next mid-size block). This is typically resolved by not following the language of the RFCs too literally.
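
The usual fix, on a conventional sockets stack, is to turn Nagle off on the sending side. A minimal sketch using the standard BSD socket option (TCP_NODELAY is portable; the Linux-only TCP_QUICKACK on the receiver is the other half of the workaround, but it has to be re-armed after every recv()):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle so a mid-size block goes out immediately instead of
 * waiting for the ACK of the previous block to come back. */
static int disable_nagle(int sock)
{
    int one = 1;
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}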
 
On 07/02/2019 11:07, already5chosen@yahoo.com wrote:
On Tuesday, February 5, 2019 at 12:12:47 PM UTC+2, David Brown wrote:
On 04/02/2019 21:55, gnuarm.deletethisbit@gmail.com wrote:

I don't know a lot about TCP/IP, but I've been told you can implement it to many different degrees depending on your requirements. I think it had to do with the fact that some aspects are specified rather vaguely, timeouts and who manages the retries, etc. I assume this was not as full an implementation as you might have on a PC. So I wonder if this is an apples to oranges comparison.


That is correct - there are lots of things in IP networking in general,
and TCP/IP on top of that, which can be simplified, limited, or handled
statically. For example, TCP/IP has window size control so that each
end can automatically adjust if there is a part of the network that has
a small MTU (packet size) - that way there will be less fragmentation,
and greater throughput. That is an issue if you have dial-up modems and
similar links - if you have a more modern network, you could simply
assume a larger window size and leave it fixed. There are a good many
such parts of the stack that can be simplified.



Are there any companies selling TCP/IP that they actually list on their web site?


TCP window size and MTU are orthogonal concepts.
Judging by this post, I'd suspect that you know more about TCP than Rick C, but less than Rick H, who sounds like the only one of the three of you who has actually got his hands dirty attempting to implement it.

They are different concepts, yes, but the window size can be reduced to
below the MTU size on small systems to ensure that you don't get
fragmentation and never need to resend more than one low-level packet.
It is not a level of detail that I have needed to work at, though, so I
have no personal experience of that.
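
If you want that behaviour on an ordinary sockets stack rather than a hand-rolled one, a hedged approximation is to clamp the segment size and shrink the receive buffer (which bounds the advertised window), both set before the connection is established; the numbers below are purely illustrative:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Keep segments small and the advertised window tight so a single loss
 * never costs more than one link-layer frame's worth of retransmission. */
static int shrink_tcp_footprint(int sock)
{
    int mss  = 536;    /* maximum segment size, bytes (illustrative)       */
    int rcvb = 2048;   /* small receive buffer -> small advertised window  */

    if (setsockopt(sock, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss)) < 0)
        return -1;
    return setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &rcvb, sizeof(rcvb));
}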
 
On 07/02/19 10:23, already5chosen@yahoo.com wrote:
On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:

Back in the late 80s there was the perception that TCP was slow, and hence
new transport protocols were developed to mitigate that, e.g. XTP.

In reality, it wasn't TCP per se that was slow. Rather the implementation,
particularly multiple copies of data as the packet went up the stack, and
between network processor / main processor and between kernel and user
space.

TCP per se *is* slow when frame error rate of underlying layers is not near
zero.

That's a problem with any transport protocol.

The solution to underlying frame errors is FEC, but that
reduces the bandwidth when there are no errors. Choose
what you optimise for!


Also, there are cases of "interesting" interactions between the Nagle algorithm
at the transmitter and the ACK-saving (delayed ACK) algorithm at the receiver
that can lead to slowness of certain styles of TCP conversation (send a mid-size
block of data, wait for an application-level acknowledge, send the next mid-size
block). This is typically resolved by not following the language of the RFCs too literally.

That sounds like a "corner case". I'd be surprised
if you couldn't find corner cases in all transport
protocols.
 
On Thursday, February 7, 2019 at 10:04:09 PM UTC+2, Tom Gardner wrote:
On 07/02/19 10:23, already5chosen@yahoo.com wrote:
On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:

Back in the late 80s there was the perception that TCP was slow, and hence
new transport protocols were developed to mitigate that, e.g. XTP.

In reality, it wasn't TCP per se that was slow. Rather the implementation,
particularly multiple copies of data as the packet went up the stack, and
between network processor / main processor and between kernel and user
space.

TCP per se *is* slow when frame error rate of underlying layers is not near
zero.

That's a problem with any transport protocol.

TCP is worse than most.
Partly because it's a jack of all trades in terms of latency and bandwidth.
Partly because it's stream-oriented (rather than datagram-oriented), which makes recovery based on selective retransmission far more complicated, and hence less practical.

The solution to underlying frame errors is FEC, but that
reduces the bandwidth when there are no errors. Choose
what you optimise for!


Also, there are cases of "interesting" interactions between the Nagle algorithm
at the transmitter and the ACK-saving (delayed ACK) algorithm at the receiver
that can lead to slowness of certain styles of TCP conversation (send a mid-size
block of data, wait for an application-level acknowledge, send the next mid-size
block). This is typically resolved by not following the language of the RFCs too literally.

That sounds like a "corner case". I'd be surprised
if you couldn't find corner cases in all transport
protocols.

Sure. But it is not a rare corner case. And again, it is far less likely to happen with datagram-oriented reliable transports.
 
On Thu, 07 Feb 2019 20:04:04 +0000, Tom Gardner wrote:

On 07/02/19 10:23, already5chosen@yahoo.com wrote:
On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:

Back in the late 80s there was the perception that TCP was slow, and
hence new transport protocols were developed to mitigate that, e.g.
XTP.

In reality, it wasn't TCP per se that was slow. Rather the
implementation,
particularly multiple copies of data as the packet went up the stack,
and between network processor / main processor and between kernel and
user space.

TCP per se *is* slow when frame error rate of underlying layers is not
near zero.

That's a problem with any transport protocol.

The solution to underlying frame errors is FEC, but that reduces the
bandwidth when there are no errors. Choose what you optimise for!

FEC does reduce bandwidth in some sense, but in all of the Ethernet FEC
implementations I've done, the 64B66B signal is recoded into something
more efficient to make room for the FEC overhead. IOW, the raw bit rate
on the fibre is the same whether FEC is on or off.

Perhaps a more important issue is latency. In my experience these are
block codes, and the entire block must be received before it can be
corrected. The last one I did added about 240ns when FEC was enabled.
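
For anyone wondering how the recoding pays for the parity: the usual 100GBASE-R RS-FEC arrangement (IEEE 802.3 Clause 91) transcodes four 66-bit blocks into one 257-bit block and then adds RS(528,514) parity over 10-bit symbols, and the combined overhead works out to exactly the 66/64 it replaced, which is why the raw line rate doesn't change. A few lines of C to check the arithmetic (the 240ns latency above is a measured figure, not derived here):

#include <stdio.h>

int main(void)
{
    double plain_64b66b = 66.0 / 64.0;    /* PCS overhead with FEC off            */
    double transcode    = 257.0 / 256.0;  /* 4 x 66b blocks -> one 257b block     */
    double rs_fec       = 528.0 / 514.0;  /* RS(528,514) parity, 10-bit symbols   */

    printf("64b/66b alone      : %.6f\n", plain_64b66b);       /* 1.031250 */
    printf("256b/257b + RS-FEC : %.6f\n", transcode * rs_fec);  /* 1.031250 */
    return 0;
}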

Optics modules (e.g. QSFP) that have sufficient margin to work without
FEC are sometimes marketed as "low latency" even though they have the
same latency as the ones that require FEC.

Regards,
Allan
 
On 08/02/19 10:35, Allan Herriman wrote:
On Thu, 07 Feb 2019 20:04:04 +0000, Tom Gardner wrote:

On 07/02/19 10:23, already5chosen@yahoo.com wrote:
On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:

Back in the late 80s there was the perception that TCP was slow, and
hence new transport protocols were developed to mitigate that, e.g.
XTP.

In reality, it wasn't TCP per se that was slow. Rather the
implementation,
particularly multiple copies of data as the packet went up the stack,
and between network processor / main processor and between kernel and
user space.

TCP per se *is* slow when frame error rate of underlying layers is not
near zero.

That's a problem with any transport protocol.

The solution to underlying frame errors is FEC, but that reduces the
bandwidth when there are no errors. Choose what you optimise for!

FEC does reduce bandwidth in some sense, but in all of the Ethernet FEC
implementations I've done, the 64B66B signal is recoded into something
more efficient to make room for the FEC overhead. IOW, the raw bit rate
on the fibre is the same whether FEC is on or off.

Perhaps a more important issue is latency. In my experience these are
block codes, and the entire block must be received before it can be
corrected. The last one I did added about 240ns when FEC was enabled.

Optics modules (e.g. QSFP) that have sufficient margin to work without
FEC are sometimes marketed as "low latency" even though they have the
same latency as the ones that require FEC.

Accepted.

My background with FECs is in radio systems, where the
overhead is worse and block length much longer!
 
On 05/02/2019 04:47, gnuarm.deletethisbit@gmail.com wrote:
On Monday, February 4, 2019 at 11:30:33 PM UTC-5, A.P.Richelieu wrote:
Den 2019-02-04 kl. 07:29, skrev Swapnil Patil:
Hello folks,

Let's say I have a Spartan 6 board only and I want to implement Ethernet communication. So how can it be done?

I don't want to connect any hard or soft core processor.
I have also looked into interfacing the WIZnet W5300 Ethernet controller to the Spartan 6, but I don't want to connect any such controller - just the Spartan 6.
So how can it be done?

It does not have to be a Spartan 6 board. If it is possible to work this out with any other board I would really like to know. Thanks

Netnod has an open source implementation of a 10 Gbit Ethernet MAC
and connects that to an NTP server, all in the FPGA.
It was not a generic UDP/IP stack, so they had some problems
with not being able to handle ICMP messages when I last
looked at the stuff two years ago.

They split up incoming packets outside the FPGA so that all UDP packets
to port 123 went to the FPGA.

So it's not a standalone solution. Still, 10 Gbit/s is impressive. I've designed comms stuff at lower rates, but still fast enough that things couldn't be done in single width; they had to be done in parallel. That gets complicated and big real fast as the speeds increase. But then "big" is a relative term. Yesterday's "big" is today's "fits down in the corner of this chip".

Chips don't get faster so much these days, but they are still getting bigger!


Rick C.

---- Tesla referral code - https://ts.la/richard11209

I've done it - not a full, every-single-RFC-implemented job, but limited
UDP support. The way it worked (initially) was to use Lattice's
tri-speed Ethernet MAC with Marvell Gigabit Phy (and later on a switch).
The FPGA handled UDPs in and out in real time and offloaded any traffic
it didn't understand (like tcp stuff) to an Arm Cortex M4. It needed 32
bit wide SDRAM to keep up with the potential peak data transfer rate.
We did it because the FPGA was acquiring the data and sending it to a PC
(and sometimes getting data from a PC and streaming it out), the FPGA
did some data processing and buffering - to get the data to the PC it
had to use Ethernet, it could have been done (at the time, several years
ago) with a PCI interface to a PC class processor running a full OS, but
this would have used far too much power. The Lattice XP3 FPGA did all
the grunt work and used a couple of watts (might have been as much as
three watts).
The UDP system supported multi fragment messages and used a protocol
which would allow for messages to be sent again if needed.
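
To make the "handle UDP in the FPGA, punt everything else to the Cortex-M4" split concrete, here is a hedged sketch of the classification decision expressed in C (the real thing is FPGA logic, and the UDP port number here is invented; only the standard Ethernet II / IPv4 / UDP field offsets are being illustrated):

#include <stddef.h>
#include <stdint.h>

enum dispatch { HANDLE_IN_FPGA, FORWARD_TO_CPU };

/* Keep IPv4/UDP traffic for a known port in the fast path; everything
 * else (ARP, ICMP, TCP, ...) is forwarded to the companion CPU. */
static enum dispatch classify(const uint8_t *frame, size_t len)
{
    if (len < 14 + 20 + 8)                        /* Eth + min IPv4 + UDP  */
        return FORWARD_TO_CPU;

    uint16_t ethertype = (uint16_t)(frame[12] << 8 | frame[13]);
    if (ethertype != 0x0800)                      /* not IPv4              */
        return FORWARD_TO_CPU;

    size_t ihl = (frame[14] & 0x0F) * 4u;         /* IPv4 header length    */
    if (frame[23] != 17 || len < 14 + ihl + 8)    /* not UDP, or truncated */
        return FORWARD_TO_CPU;

    uint16_t dport = (uint16_t)(frame[14 + ihl + 2] << 8 | frame[14 + ihl + 3]);
    return dport == 5000 ? HANDLE_IN_FPGA : FORWARD_TO_CPU;  /* port is illustrative */
}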

If anyone wants to pay for TCP/IP and all the trimmings I'd be happy to
consider it.


MK

 
On Friday, February 8, 2019 at 3:33:01 PM UTC+2, Michael Kellett wrote:
On 05/02/2019 04:47, gnuarm.deletethisbit@gmail.com wrote:
On Monday, February 4, 2019 at 11:30:33 PM UTC-5, A.P.Richelieu wrote:
Den 2019-02-04 kl. 07:29, skrev Swapnil Patil:
Hello folks,

Let's say I have a Spartan 6 board only and I want to implement Ethernet communication. So how can it be done?

I don't want to connect any hard or soft core processor.
I have also looked into interfacing the WIZnet W5300 Ethernet controller to the Spartan 6, but I don't want to connect any such controller - just the Spartan 6.
So how can it be done?

It does not have to be a Spartan 6 board. If it is possible to work this out with any other board I would really like to know. Thanks

Netnod has an open source implementation of a 10 Gbit Ethernet MAC
and connects that to an NTP server, all in the FPGA.
It was not a generic UDP/IP stack, so they had some problems
with not being able to handle ICMP messages when I last
looked at the stuff two years ago.

They split up incoming packets outside the FPGA so that all UDP packets
to port 123 went to the FPGA.

So it's not a standalone solution. Still, 10 Gbit/s is impressive. I've designed comms stuff at lower rates, but still fast enough that things couldn't be done in single width; they had to be done in parallel. That gets complicated and big real fast as the speeds increase. But then "big" is a relative term. Yesterday's "big" is today's "fits down in the corner of this chip".

Chips don't get faster so much these days, but they are still getting bigger!


Rick C.

---- Tesla referral code - https://ts.la/richard11209


I've done it - not a full, every-single-RFC-implemented job, but limited
UDP support.

To that level, who hasn't done it?
Me, personally, I have lost count of how many times I have done it in the last 15 years.
But only the transmit side. It's not that UDP reception on a pre-configured port would be much harder; I just never had a need for it.
But TCP is a *completely* different story. And then there are the standard application protocols that run on top of TCP.
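
For a sense of scale, that transmit-only case really is small. Here is a hedged sketch of the whole "stack" for a fixed, pre-configured flow: IPv4 and UDP headers built in front of the payload, with the UDP checksum legally left at zero over IPv4. All addresses and ports are invented for illustration; in an FPGA this boils down to a handful of constants plus the header-checksum adder.

#include <stdint.h>
#include <string.h>

/* One's-complement sum of 16-bit big-endian words (IPv4 header checksum). */
static uint16_t ip_checksum(const uint8_t *p, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < n; i += 2)
        sum += (uint32_t)(p[i] << 8 | p[i + 1]);
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Write fixed IPv4 + UDP headers in front of payload_len bytes of data.
 * Returns the number of header bytes written (28). */
static size_t build_udp_headers(uint8_t *buf, uint16_t payload_len)
{
    const uint16_t ip_len  = 20 + 8 + payload_len;
    const uint16_t udp_len = 8 + payload_len;
    static const uint8_t src_ip[4] = {192, 168, 1, 10};   /* illustrative */
    static const uint8_t dst_ip[4] = {192, 168, 1, 1};    /* illustrative */

    memset(buf, 0, 28);
    buf[0] = 0x45;                                  /* IPv4, IHL = 5 words    */
    buf[2] = ip_len >> 8;    buf[3] = ip_len & 0xFF;
    buf[8] = 64;                                    /* TTL                    */
    buf[9] = 17;                                    /* protocol = UDP         */
    memcpy(buf + 12, src_ip, 4);
    memcpy(buf + 16, dst_ip, 4);
    uint16_t csum = ip_checksum(buf, 20);
    buf[10] = csum >> 8;     buf[11] = csum & 0xFF;

    buf[20] = 0x13; buf[21] = 0x88;                 /* source port 5000       */
    buf[22] = 0x13; buf[23] = 0x88;                 /* destination port 5000  */
    buf[24] = udp_len >> 8;  buf[25] = udp_len & 0xFF;
    /* buf[26..27] stay 0: UDP checksum omitted (allowed over IPv4) */
    return 28;
}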

The way it worked (initially) was to use Lattice's
tri-speed Ethernet MAC with Marvell Gigabit Phy (and later on a switch).
The FPGA handled UDPs in and out in real time and offloaded any traffic
it didn't understand (like tcp stuff) to an Arm Cortex M4. It needed 32
bit wide SDRAM to keep up with the potential peak data transfer rate.
We did it because the FPGA was acquiring the data and sending it to a PC
(and sometimes getting data from a PC and streaming it out), the FPGA
did some data processing and buffering - to get the data to the PC it
had to use Ethernet, it could have been done (at the time, several years
ago) with a PCI interface to a PC class processor running a full OS, but
this would have used far too much power. The Lattice XP3 FPGA did all
the grunt work and used a couple of watts (might have been as much as
three watts).
The UDP system supported multi fragment messages and used a protocol
which would allow for messages to be sent again if needed.

If anyone wants to pay for TCP/IP and all the trimmings I'd be happy to
consider it.


MK

 
Tom Gardner wrote:
On 07/02/19 10:23, already5chosen@yahoo.com wrote:
On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:

Back in the late 80s there was the perception that TCP was slow, and hence
new transport protocols were developed to mitigate that, e.g. XTP.

In reality, it wasn't TCP per se that was slow. Rather the implementation,
particularly multiple copies of data as the packet went up the stack, and
between network processor / main processor and between kernel and user space.

TCP per se *is* slow when frame error rate of underlying layers is not near
zero.

That's a problem with any transport protocol.

The solution to underlying frame errors is FEC, but that
reduces the bandwidth when there are no errors. Choose
what you optimise for!


Also, there are cases of "interesting" interactions between the Nagle algorithm
at the transmitter and the ACK-saving (delayed ACK) algorithm at the receiver
that can lead to slowness of certain styles of TCP conversation (send a mid-size
block of data, wait for an application-level acknowledge, send the next mid-size
block). This is typically resolved by not following the language of the RFCs too literally.

That sounds like a "corner case". I'd be surprised
if you couldn't find corner cases in all transport
protocols.

But if you need absolute maximum throughput, it's often advantageous to
move the retransmission mechanism up the software stack. You can
take advantage of local specialized knowledge rather than pay the "TCP
tax".

--
Les Cargill
 
On Monday, February 4, 2019 at 6:29:45 AM UTC, Swapnil Patil wrote:
Hello folks,

Let's say I have a Spartan 6 board only and I want to implement Ethernet communication. So how can it be done?

I don't want to connect any hard or soft core processor.
I have also looked into interfacing the WIZnet W5300 Ethernet controller to the Spartan 6, but I don't want to connect any such controller - just the Spartan 6.
So how can it be done?

It does not have to be a Spartan 6 board. If it is possible to work this out with any other board I would really like to know. Thanks

--------------------------------------------------------------------------------
An indirect solution would be to offload Ethernet to a hard-wired UDP/TCP ASIC.
WIZnet has developed such devices and I have somewhat more than 10k designs in the field that use them.
Unless you require a cheap high-volume solution (requiring development, verification and validation time and money), WIZnet may well be a zero-time and minimal-cost solution.
I have used both the W5300 https://www.wiznet.io/product-item/w5300/
and the W3150A+ https://www.wiznet.io/product-item/w3150a+/
devices.

TCP has the drawback of latency and automatic resends.
In real-time applications my preference is UDP: lost packets are okay, but packets resent by TCP are a waste of bandwidth because those packets are out of date.
My applications have been in heavy-industry machine vision, where fibre is too fragile and RF is not suitable because of line-of-sight and interference issues.
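
The "resent packets are out of date" point usually turns into nothing more than a sequence-number check at the receiver: anything not newer than what has already been consumed is dropped. A small hedged sketch, assuming a 32-bit sequence field is carried at the front of each datagram:

#include <stdbool.h>
#include <stdint.h>

/* Accept a datagram only if it is newer than the last one delivered.
 * Late duplicates (e.g. an upstream retry arriving after fresher data)
 * are discarded.  Serial-number arithmetic handles 32-bit wrap-around. */
static bool accept_frame(uint32_t seq, uint32_t *last_delivered)
{
    if ((int32_t)(seq - *last_delivered) <= 0)
        return false;                 /* stale or duplicate: drop it */
    *last_delivered = seq;
    return true;
}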
 
On 10/02/19 03:04, Les Cargill wrote:
Tom Gardner wrote:
On 07/02/19 10:23, already5chosen@yahoo.com wrote:
On Tuesday, February 5, 2019 at 12:33:18 AM UTC+2, Tom Gardner wrote:

Back in the late 80s there was the perception that TCP was slow, and hence
new transport protocols were developed to mitigate that, e.g. XTP.

In reality, it wasn't TCP per se that was slow. Rather the implementation,
particularly multiple copies of data as the packet went up the stack, and
between network processor / main processor and between kernel and user space.

TCP per se *is* slow when frame error rate of underlying layers is not near
zero.

That's a problem with any transport protocol.

The solution to underlying frame errors is FEC, but that
reduces the bandwidth when there are no errors. Choose
what you optimise for!


Also, there are cases of "interesting" interactions between the Nagle algorithm
at the transmitter and the ACK-saving (delayed ACK) algorithm at the receiver
that can lead to slowness of certain styles of TCP conversation (send a mid-size
block of data, wait for an application-level acknowledge, send the next mid-size
block). This is typically resolved by not following the language of the RFCs too literally.

That sounds like a "corner case". I'd be surprised
if you couldn't find corner cases in all transport
protocols.

But if you need absolute maximum throughput, it's often advantageous to move the
retransmission mechanism up the software stack. You can
take advantage of local specialized knowledge rather than pay the "TCP tax".

The devil is indeed in trading off generality for performance.

There's the old aphorism...
If you know how to optimise, then optimise.
If you don't know how to optimise, then randomise.
 
