EDK : FSL macros defined by Xilinx are wrong

As posted above, Ethereal can capture any frame if the capture and
display filters are cleared... but that assumes the NIC in your
machine actually receives and accepts the packet.

The packet must meet the minimum and maximum size requirements
(try 70 bytes, for example) and carry a proper FCS.
You may be able to configure your Ethernet card to
accept errored packets, which might help troubleshoot the problem.

You didn't say which OS you were using, whether you were
capturing in "promiscuous" mode (which may require admin rights),
or whether you have a direct (cross-over) connection or are going
through a switch or hub.

Have you verified that you can receive the 'same' frame if it
is sent via a different source (not your FPGA)?
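
Since the FCS is the usual culprit with home-grown MACs, here is a minimal
software sketch (my own illustration, not taken from your design; eth_fcs is
just a name I picked) of the CRC-32 used for the Ethernet FCS. Running the
frame bytes from the destination address through the padded payload (not the
preamble/SFD) through it and comparing against what the FPGA appends is a
quick way to rule the FCS in or out. Also remember the minimum frame is 64
bytes including the 4-byte FCS, so short payloads must be padded to 46 bytes
before the CRC is computed, and the FCS goes on the wire least-significant
byte first.

#include <stdint.h>
#include <stddef.h>

/* Reflected CRC-32 as used for the Ethernet FCS: polynomial 0xEDB88320,
   initial value 0xFFFFFFFF, final one's complement. */
static uint32_t eth_fcs(const uint8_t *frame, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= frame[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return ~crc;
}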


"ashwin" <achiluka@gmail.com> wrote in message
news:1131549976.998254.72060@f14g2000cwb.googlegroups.com...
Hello everyone,
I am trying to implement an Ethernet MAC on an FPGA using VHDL. I have a
transmit module which generates the Ethernet frame and transmits it to the
PC through the Ethernet cable.
The Ethernet frame consists of preamble, SFD, destination (MAC),
source (MAC), length, data and CRC. I am using Ethereal to detect this frame.

I definitely think that my CRC is right, but I am not able to detect
any frames in Ethereal.
I understand that by default Ethereal has been set to detect Ethernet
frames of layer 3 or above.

But the Ethernet frame which I am sending is layer 2. How do I change
my options in Ethereal so that I can detect this type of frame?

Ashwin
 
I wrote:
The problem is that you don't save any significant cost by having the
same size package with fewer balls or pins. So if the die size requires
a package 20mm on a side, it may as well have more than 350 balls, even
if some customers don't end up using all of them.
Tobias Weingartner wrote:
There is a cost associated with me trying to acquire the resources and
planning necessary to have a 1000+ pin FPGA mounted, routed and fed. On
the other hand, 144 pins, or even 208 pins in a quad flat pack is pretty
much doable...
Sure, but if the part doesn't fit in the die cavity of the package, it
ain't gonna happen.
 
Hi Fahad,

Yes, but only if you have a floating license and ModelSim SE/LE in your
bundle. Spectrum is not supported under Linux but Precision is. HDL Designer
works OK-ish under Linux, though you might have to tweak the fonts to make it
look nicer. Installing the P&R tools is still a bit hit and miss; if
you have Red Hat WS 3.0 (or White Box/CentOS 3.5) you should be OK.

Hans
www.ht-lab.com

"fad" <fahad.arif@gmail.com> wrote in message
news:1131519282.510124.10500@g44g2000cwa.googlegroups.com...
Does anyone know the process of installing FPGA Advantage on a Linux-based
system? Does FPGA Advantage have Linux support or not?
I also have to install ModelSim and Leonardo Spectrum separately on the
Linux machine. Please advise if anybody has already been through the
process.
Regards,
Fahad
 
I have done successful designs which laid memory out in the application
to avoid having hardware refresh. One such machine simply laid the
instruction and data fetches already executed by the clock/timer routine
out in a straight line. To make this work, you must put the low-order
address lines on the row portion of the address multiplexer. With memories
that have a burst mode this may not always be the highest-performance
option, and CPUs with caches can also cause problems, but in general
for small CPUs this isn't an issue.

You may find that the client is willing to take a slightly slower memory,
with slightly higher software service latencies, and not accept this
tradeoff. It just depends how cost-sensitive the design is.

If the cost of a slightly larger and faster FPGA isn't a budget stopper,
it's probably best not to do this, as it can cause other problems if you
are not careful... like memory randomly disappearing because some software
bug occurred.
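
For what it's worth, here is a minimal sketch of the more conventional
software-refresh approach: a periodic timer routine that touches one
location per row. The base address, row count and rows-per-tick below are
assumptions for illustration, and it only works as intended if the low-order
address bits drive the row address, as noted above:

#include <stdint.h>

#define DRAM_BASE      ((volatile uint8_t *)0x00100000u) /* assumed DRAM window     */
#define NUM_ROWS       2048u                             /* rows in the DRAM device */
#define ROWS_PER_TICK  64u                               /* refreshed per interrupt */

/* Called from a periodic timer interrupt. Each read strobes RAS for one row;
   with the low-order address bits on the row address, consecutive addresses
   hit consecutive rows. Refreshing ROWS_PER_TICK rows per call means all
   NUM_ROWS rows are covered every NUM_ROWS / ROWS_PER_TICK calls, so with
   these numbers the timer must fire at least every 64 ms / 32 = 2 ms. */
void dram_soft_refresh_tick(void)
{
    static uint32_t row;
    for (uint32_t i = 0; i < ROWS_PER_TICK; i++) {
        (void)DRAM_BASE[row];
        row = (row + 1u) % NUM_ROWS;
    }
}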
 
"Subhasri Krishnan" <subhasri.krishnan@gmail.com> wrote

if I read faster than I need to refresh, then I can avoid
refresh altogether, i.e. if the refresh period is 64 ms and I access
the data every, say, 20 ms then I don't have to refresh. Please tell me
if this is true or if I am getting confused.
Asserting RAS causes a row of capacitors to have their charge topped up.

If they are above the voltage sense threshold then they still have at least
some charge, and they are restored to a full charge.

Asserting CAS causes one capacitor to be connected to a column line, and
this either drives charge in/out for writing, or senses it for a read.

Capacitors are not completely discharged when read, of course.

From the points above you can deduce what is going to happen and what needs
to be done.

So long as every row gets strobed at least once every 64 ms, every capacitor
is refreshed. It does not matter whether this is done by a refresh cycle (RAS
only) or by read/write cycles (RAS then CAS).

The original IBM PC had an interrupt routine to do a series of DRAM accesses
to refresh the DRAM. It had no DRAM controller at all.

If you can arrange your system software so that every row is accessed every
refresh period, that should do the trick.

If you are doing a non-PC embedded system, the CPU may be running code from
ROM most of the time. You could try refreshing the rows with those cycles:
i.e. the DRAM gets a RAS during DRAM _and_ ROM cycles, but CAS only for RAM
access. The row address will be whatever the CPU address bus is driven to,
so obviously you have to make the ROM cycles cover every row. For this
reason, it is easier to do if you use the least-significant address bits for
the row address.

Note that the number of accesses needed is roughly the square root of the
DRAM chip size, i.e. one per row. I don't think refreshing 64K DRAM chips is
too bad (256 row accesses), but you might not like doing a 16 Mbit DRAM chip
(2048 or more rows, depending on the organization).
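
To put a number on that, a quick back-of-the-envelope check (the row count
is just an assumed figure for illustration):

#include <stdio.h>

int main(void)
{
    const double refresh_period_ms = 64.0;  /* typical DRAM refresh requirement */
    const unsigned rows = 2048;             /* e.g. a 16 Mbit part              */

    /* Every row must see a RAS strobe at least once per refresh period,
       so on average you need one row access this often: */
    printf("one row strobe every %.1f us on average\n",
           refresh_period_ms * 1000.0 / rows);          /* about 31.3 us */
    return 0;
}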
 
You should be aware that every DRAM is different in the number of rows that
must be accessed and in the maximum period between accesses/refreshes.
 
So if it's a 64 Mb chip (4096 accesses and 64 ms between accesses) and I
can do some kind of sequential reading, then it's better to skip the
refresh? I am looking to push the SDRAM to the limit and to get the highest
bandwidth. Is there anything other than bank interleaving and getting rid of
refresh that can be done to maximize performance? This is my first
controller and any suggestion is greatly appreciated.
 
motty wrote:
Mike--

Seems you are telling me to sample the data on the rising edge. This
is the same clock that the external part is seeing. The external part
changes data on the rising edge. I can't be sure that data is valid
then.
The external part can't change its outputs immediately with the rising
edge of the clock -- there's always some clock-to-out time. RTFDS.
While you're at it, add in some prop delay between the external device
and the FPGA.

Capturing the "previous" data on the rising edge of the clock is
basically how all synchronous systems work.
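
If it helps, the arithmetic behind that is just a budget check; the figures
below are made up for illustration, and the real ones come from the external
part's and the FPGA's datasheets and timing reports:

#include <stdio.h>

int main(void)
{
    /* Illustrative numbers only -- substitute datasheet values. */
    const double t_clk_ns = 20.0;  /* 50 MHz clock period            */
    const double t_co_ns  = 6.0;   /* external part's clock-to-out   */
    const double t_pcb_ns = 1.0;   /* board propagation delay        */
    const double t_su_ns  = 2.0;   /* FPGA input register setup time */

    /* Data launched on one rising edge is safely captured on the next
       rising edge as long as the slack stays positive. */
    double slack = t_clk_ns - (t_co_ns + t_pcb_ns + t_su_ns);
    printf("setup slack at the capturing flip-flop: %.1f ns\n", slack);
    return 0;
}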

-a
 
You might look at other manufacturers' devices, as the timings and setup for
multi-bank accesses can make a huge difference if concurrent reads/writes go
to the same device.
 
g.wall wrote:
has anyone in the dig. design and reconfig. computing community looked
seriously at open source hardware design libraries, working toward a
hardware paradigm similar to that in the open source software community?
Problem 1.

There are ten times as many software designers
as digital hardware designers. The average software guy
is much better at setting up repositories, web sites
and running regression tests than the average hardware guy.
The average hardware guy knows enough HDL to get by
and maybe enough C to turn on a circuit board.
Standard software development processes like source control
and code reuse are much less evolved in the hardware area.

Problem 2.

The average software designer couldn't describe
two gates and a flip-flop in VHDL or Verilog.

-- Mike Treseler
 
Meanwhile, I see more 'opening' of the Cell processor.

The Cell processor architecture does have some interesting uses, and
strong memory bandwidth, which delivers better-than-impressive performance
for its target markets.

Architecturally, its strengths are also some of its worst weaknesses for
building high-end machines that would scale well for applications which
assume distributed memory.

The Cell processor is a next-generation CPU to continue Moore's Law.
The FPGAs which follow to target the same high-performance computing
market will also come with application-specific cores and multiple memory
interfaces to kick butt in the same markets. FPGAs with the same
die size and production volumes will have the same cost. The large FPGAs
today which have similar die sizes are produced in lower volumes at a higher
cost, which currently skews the cost-effectiveness equation toward
traditional CPUs. Missing are good compiler tools and libraries to level
the playing field. Cell will suffer some from that too.
 
I should have noted that the FpgaC project is still looking for additional
developers, and the long-term results of this project are still very open
to change. It would be great to be able to build a comprehensive set of
libraries that allow typical MPI and POSIX-threaded applications to build
and dynamically load/run on multiple FPGA platforms, and to mature
the compiler to handle full traditional C syntax transparently.

I personally would like to see it handle distributed arithmetic transparently,
so that it handles the data pipelining of high-performance applications
well, using dataflow-like strategies. But that is open to the team as a
whole, with input from the user community.
 
air_bits@yahoo.com wrote:
Problem 2.
The average software designer couldn't describe
two gates and a flip-flop in VHDL or Verilog.

does that even matter for "reconfig. computing"?
The OP asked about open source hardware design libraries,
not reconfig. computing.

-- Mike Treseler
 
Mike Treseler <mike_treseler@comcast.net> writes:
Problem 2.

The average software designer couldn't describe
two gates and a flip-flop in VHDL or Verilog.
Problem 3.

The average software designer couldn't describe two gates
and a flip-flop in C (or any other programming language), but
would instead describe something that synthesizes to a large
collection of gates and flip-flops.
 
Subhasri krishnan wrote:

Hey all,
I am designing (trying to design) an SDRAM controller (for a PC133
module) to work as fast as possible, and as I understand from the
datasheet, if I read faster than I need to refresh, then I can avoid
refresh altogether, i.e. if the refresh period is 64 ms and I access
the data every, say, 20 ms then I don't have to refresh. Please tell me
if this is true or if I am getting confused.
Thanks in Advance.



This is true provided you access every single row (well, at least every
row you have data in) within the refresh time. This can be used to advantage
in video frame buffers, for example, as long as the frame time does not
exceed the refresh time. So yes, it can be useful. It doesn't save a lot of
memory bandwidth or time, but it can substantially simplify the DRAM
controller in your design.
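
As a quick illustration of the frame-buffer case (the numbers below are
assumptions, not from any particular design):

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
    const double frame_time_ms     = 1000.0 / 60.0;  /* 60 Hz scan, ~16.7 ms     */
    const double refresh_period_ms = 64.0;            /* DRAM refresh interval    */
    const bool   scan_covers_every_row = true;        /* depends on buffer layout */

    /* Explicit refresh can be omitted only if the raster scan walks every
       DRAM row and completes within the refresh interval. */
    bool refresh_free = scan_covers_every_row &&
                        (frame_time_ms <= refresh_period_ms);
    printf("explicit refresh needed: %s\n", refresh_free ? "no" : "yes");
    return 0;
}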


--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
 
Hi Anthony,

The data bus signals' tri-state FFs are not placed in the IOBs for
timing reasons.
Even when address stepping is used, those FFs still rely on several
unregistered signals, and when larger devices are used, the unregistered
signals (i.e., FRAME#, IRDY#, etc.) have to travel a longer distance,
making it harder to meet PCI's stringent setup-time requirement
(3 ns for 66 MHz PCI and 7 ns for 33 MHz PCI).
Instead, keeping the tri-state FFs out of the IOBs allows them to be placed
near the unregistered signals, making it easier to meet setup time.
Once those unregistered signals go through LUTs and get captured by an FF,
they become registered, and a registered signal has much more timing margin
(a full clock period: 15 ns for 66 MHz PCI and 30 ns for 33 MHz PCI).


Kevin Brace


Anthony Ellis wrote:
Hi Kevin,

I can't figure out your explanation. Even if you wanted to step (in clock cycles), the I/O enable could still be in the IOB! Using an internal FF, with defined placement and routing, gives control of skew within the same cycle - if you wanted it!

Anthony.

--
Brace Design Solutions
Xilinx (TM) LogiCORE (TM) PCI compatible BDS XPCI PCI IP core available
for as little as $100 for non-commercial, non-profit, personal use.
http://www.bracedesignsolutions.com

Xilinx and LogiCORE are registered trademarks of Xilinx, Inc.
 
Problem 3.

The average software designer couldn't describe two gates
and a flip-flop in C (or any other programming language), but
would instead describe something that synthesizes to a large
collection of gates and flip-flops.
In TMCC/FpgaC (and Celoxica, and a number of other C-like HDL tools)
what you just asked for is pretty easy, and comments otherwise are
pretty egocentric bigotry that just isn't justified.

int a:1, b:1, c:1, d:1;      // four one-bit values, possibly mapped to input pins
int singlebit:1;             // describes a single register, possibly an output pin
singlebit = (a&b) | (c&d);   // combinatorial sum of products for ab+cd

I can train most kids older than about 6-10 to understand this process
and the steps to produce it. It doesn't take an EE degree to understand
or implement.

So beating your chest here is pretty childish, at best.
 
There is a small setup overhead for the main, but for example
this certainly does NOT synthesize "to a large collection of
gates and flip-flops" as you so errantly assert:

main()
{
    int a:1, b:1, c:1, d:1;
#pragma inputport (a);
#pragma inputport (b);
#pragma inputport (c);
#pragma inputport (d);

    int sum_of_products:1;
#pragma outputport (sum_of_products);

    while(1) {
        sum_of_products = (a&b) | (c&d);
    }
}

Produces the following default output (fpgac -S example.c) as
example.xnf:
LCANET, 4
PWR, 1, VCC
PWR, 0, GND
PROG, fpgac, 4.1, "Thu Nov 10 19:42:27 2005"
PART, xcv2000ebg560-8
SYM, CLK-AA, BUFGS
PIN, I, I, CLKin
PIN, O, O, CLK
END
SYM, FFin-0_1_0Running, INV
PIN, I, I, 0_1_0Zero
PIN, O, O, FFin-0_1_0Running
END
SYM, 0_1_0Running, DFF
PIN, D, I, FFin-0_1_0Running
PIN, C, I, CLK
PIN, CE, I, VCC
PIN, Q, O, 0_1_0Running
END
SYM, FFin-0_1_0Zero, BUF
PIN, I, I, 0_1_0Zero
PIN, O, O, FFin-0_1_0Zero
END
SYM, 0_1_0Zero, DFF
PIN, D, I, FFin-0_1_0Zero
PIN, C, I, CLK
PIN, CE, I, VCC
PIN, Q, O, 0_1_0Zero
END
SYM, 0_4__a, IBUF
PIN, I, I, a
PIN, O, O, 0_4__a
END
EXT, a, I
SYM, 0_4__b, IBUF
PIN, I, I, b
PIN, O, O, 0_4__b
END
EXT, b, I
SYM, 0_4__c, IBUF
PIN, I, I, c
PIN, O, O, 0_4__c
END
EXT, c, I
SYM, 0_4__d, IBUF
PIN, I, I, d
PIN, O, O, 0_4__d
END
EXT, d, I
SYM, 0_10__sum_of_products-OBUF, OBUF
PIN, I, I, 0_10__sum_of_products
PIN, O, O, sum_of_products
END
EXT, sum_of_products, O
SYM, FFin-0_10__sum_of_products, BUF
PIN, I, I, T0_15L49_0_10__sum_of_products
PIN, O, O, FFin-0_10__sum_of_products
END
SYM, 0_10__sum_of_products, DFF
PIN, D, I, FFin-0_10__sum_of_products
PIN, C, I, CLK
PIN, CE, I, 0_13_L21looptop
PIN, Q, O, 0_10__sum_of_products
END
SYM, FFin-0_13_L21looptop, EQN, EQN=((~I1)+(I0))
PIN, I1, I, 0_1_0Running
PIN, I0, I, 0_13_L21looptop
PIN, O, O, FFin-0_13_L21looptop
END
SYM, 0_13_L21looptop, DFF
PIN, D, I, FFin-0_13_L21looptop
PIN, C, I, CLK
PIN, CE, I, VCC
PIN, Q, O, 0_13_L21looptop
END
SYM, SYMT0_15L49_0_10__sum_of_products, EQN, EQN=((I0*I1)+(I2*I3))
PIN, I3, I, 0_4__a
PIN, I2, I, 0_4__b
PIN, I1, I, 0_4__c
PIN, I0, I, 0_4__d
PIN, O, O, T0_15L49_0_10__sum_of_products
END
EOF
 
Go back and read the first line of the first post, and you will clearly see
the author included reconfigurable computing in the discussion.
 
