EDK : FSL macros defined by Xilinx are wrong

air_bits@yahoo.com wrote:

For those that haven't looked at this stuff, it's the next generation HLL
FPGA environment, two steps above C with a cute GUI based system level
abstraction tool .... very cool :)

http://www.nallatech.com/mediaLibrary/images/english/4063.pdf

Yes. All of the next-gen websites are cute.
Why is a working code example so hard to find?

-- Mike Treseler
 
Mike Treseler wrote:
Yes. All of the next-gen websites are cute.
Why is a working code example so hard to find?
You can always ask the various sites, or some user. Robin seems to
be using and happy with the DIME stuff, email him for some samples.
Have you tried talking with the company?

Impulse C offers a full-featured 30 day trial, and they are pretty cool to
talk with, and have done a good job of productizing Streams C.

Streams C is free for non-commercial use, and is available from
http://www.streams-c.lanl.gov/

It looks like the SA-C team at Colostate (Wim Bohm) doesn't
intend to make it publicly available, except to companies funding
their research projects.

ASH by the CMU guys isn't likely to get an open source release either,
and is likely to end up licensed to someone for a revenue stream
from what I was told by one person last year ... but I haven't seen
even that yet. Mihai Budiu appears to now be at Microsoft,
and publishing papers from there on the technology, so maybe
Microsoft will be licensing the technology, or working from Mihai's
development independently or in partnership with CMU. The papers
have been very cool, but until it's publicly available or a product
it's hard to judge just how useful it will be for others. The ASH team
offered training at a conference earlier this year, and may do more.

Celoxica isn't quite as easy to get a demo copy from, but some Xilinx
reps seem to have a copy, and they were offering training seminars with
Xilinx across the country.

FpgaC has some examples in the download image, and is free to run
your own tests with, and has no restrictions against commercial use.

It seems pretty easy to get working examples simply by downloading
or asking the sales guys ... who didn't respond to your asking?
 
GaLaKtIkUs™ wrote:
But why were these changes done ? is it a sacrifice made by Xilinx to
allow adding XtremeDSP and other hard cores inside the Virtex-4 ?
Or is it simply because a few people use the CLBs as memory or shift
registers?

I'd say it's more because even if you do use the slice as memory & shift
register, it's unlikely that you need ALL of them to be capable of that ...

So by only making half of them with this capability, you don't lose
much but I'd guess you win quite some space & complexity.


Sylvain
 
Either way the reason Xilinx did this does not help someone that is porting
this code to a higher Virtex device. They are stuck with this error with no
understanding or explanation of how to proceed fixing their code to make it
work. This is not good support!! Why can't Xilinx make this clear to
engineers upgrading their devices.

-Andrew


"GaLaKtIkUs™" <taileb.mehdi@gmail.com> wrote in message
news:1131255526.842676.30480@g44g2000cwa.googlegroups.com...
But why were these changes done ? is it a sacrifice made by Xilinx to
allow adding XtremeDSP and other hard cores inside the Virtex-4 ?
Or is it simply because a few people use the CLBs as memory or shift
registers?
 
"Andrew Lohbihler" <xyz.interactive@rogers.com> wrote in message
news:DLqdnfJwZNJ8wvPenZ2dnUVZ_sOdnZ2d@rogers.com...
Either way the reason Xilinx did this does not help someone that is porting
this code to a higher Virtex device. They are stuck with this error with no
understanding or explanation of how to proceed fixing their code to make it
work. This is not good support!! Why can't Xilinx make this clear to
engineers upgrading their devices.

-Andrew

As already said, the RPM stuff just is and will be 'family' specific.

And as only half of the slices are SLICEM, the RPM logic
must account for that and fix functions that need SLICEM function
at horizontal offset 2. This is relatively easy to do by calculating
the X values accordingly. The need to do this for an existing V2
design sure is an additional PITA thing.
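
The column arithmetic can be sketched roughly as below. This is only an illustration in Python, not something the Xilinx tools provide: the Element record, the needs_slicem flag and both function names are made up for the example, and it assumes SLICEM columns sit at every second X offset so that all memory elements of one RPM must share the same column parity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One placed primitive inside an RPM (hypothetical record, not a Xilinx type)."""
    name: str
    x: int              # relative RLOC column
    y: int              # relative RLOC row
    needs_slicem: bool  # True for distributed RAM / SRL16 style primitives


def memory_column_conflicts(elements):
    """Return the memory elements whose column parity differs from the first one.

    Assumption: SLICEM and SLICEL columns alternate, so every memory element
    in the RPM must sit in a column of the same parity (all even or all odd).
    """
    mem = [e for e in elements if e.needs_slicem]
    if not mem:
        return []
    parity = mem[0].x % 2
    return [e for e in mem if e.x % 2 != parity]


def spread_columns(elements):
    """Naive fix: double every X offset so all elements share one parity.

    This wastes columns and only shows the arithmetic; a real port would
    move just the offending elements.
    """
    return [Element(e.name, 2 * e.x, e.y, e.needs_slicem) for e in elements]


rpm = [
    Element("srl_a", 0, 0, True),
    Element("srl_b", 1, 0, True),   # adjacent column: parity differs from srl_a
    Element("lut_c", 1, 1, False),
]
conflicts = memory_column_conflicts(rpm)
fixed = spread_columns(rpm)
print([e.name for e in conflicts])
print([e.name for e in memory_column_conflicts(fixed)])
```

A check like this only finds the offending elements; the new X values still have to be written back wherever the RLOCs were specified in the first place.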

This 'feature' that only 50% of the slices are now 'good'
old slices (in terms of Virtex slices) is definitely fully clear
from Xilinx datasheets. Well, there are other small things
that are harder to know, like that new devices only support
LVDS output when VCCIO is 2.5V (no 3.3V support)
and that lots of LVDS pins in Virtex4 are 'input' only.
Both the latter 'new features' did come as a surprise
to me - while I was testing an already ready made PCB.

--
BTW: Both Actel and Lattice have also a similar
to SLICEM/SLICEL difference between memory
capable and logic only slices - it does reduce
the die size.

Antti
 
Antti Lukats wrote:

BTW: Both Actel and Lattice have also a similar
to SLICEM/SLICEL difference between memory
capable and logic only slices - it does reduce
the die size.
Interesting.
The much-hyped SRL16 isn't free after all.
Perhaps Xilinx is discovering that some
designers use a lot of generic rtl code.

-- Mike Treseler
 
air_bits@yahoo.com wrote:
The reality is that forms of parallelism emerge when using C as
an HLL for FPGAs. The first is that the compiler is free to parallelize
statements as much as can be done. This alone is typically enough
to bring the performance of a 300MHz fpga clock cycle near the
performance of a several GHz RISC/CISC CPU for code bodies that
have a significant inner loop. Second, explicit parallelism is available
by replicating these inner loops by creating threads with the same
code body and using established MPI code structures and libraries.
An interesting point with this, of course, is that it's just splitting
the work less - instead of going

C -> object code -> Processor (with out of order execution),

this would seem to be a case where the management of out-of-order
execution type things is done statically at compile time, rather than
dynamically by the processor.

It could be interesting to see how far this could go -
Compile to code+processor, where the processor architecture is
implemented by the compiler subject to the requirements of the design.

My 2c,
Jeremy
 
Mike Treseler wrote:
Antti Lukats wrote:

BTW: Both Actel and Lattice have also a similar
to SLICEM/SLICEL difference between memory
capable and logic only slices - it does reduce
the die size.


Interesting.
The much-hyped SRL16 isn't free after all.
Well, half of them being "simple" slices isn't much
of a cost; there is almost always a need for simple
stuff that will fit nicely in a simple LUT4 ...

I wonder, in other brands of FPGA, what if you need
to delay a 16-bit bus by 16 clock cycles? Do you use
simple registers? That would be 256 registers just
for that ...
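
To put numbers on that comparison, here is a small back-of-the-envelope sketch in Python. The function name and the srl_depth parameter are made up for the illustration; the only fact it relies on is that one SRL16 can hold up to 16 delay stages of a single bit.

```python
import math


def delay_line_cost(bus_width, delay_cycles, srl_depth=16):
    """Compare flip-flop and shift-register-LUT counts for a bus delay line."""
    # One flip-flop per bit per cycle of delay.
    flip_flops = bus_width * delay_cycles
    # One shift-register LUT per bit for every srl_depth cycles of delay.
    srl_luts = bus_width * math.ceil(delay_cycles / srl_depth)
    return flip_flops, srl_luts


ffs, srls = delay_line_cost(16, 16)
print(ffs, srls)  # 256 flip-flops versus 16 SRL16 LUTs
```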


Perhaps Xilinx is discovering that some
designers use a lot of generic rtl code.
They're inferred quite well for the common cases,
no need to instantiate them explicitly.


Sylvain
 
Jeremy Stringer wrote:
It could be interesting to see how far this could go -
Compile to code+processor, where the processor architecture is
implemented by the compiler subject to the requirements of the design.
I think that is being done already, tho at the simple end of the scale,
it does prove it is possible.
IIRC, it involved compiling the design twice. Once to generate the
Core+Codes, and again to remove unused portions of the core.
It can introduce other problems - if the CPU changes every time, that
complicates things more, and what looks like a few lines of code, might
enable a new block of the CPU, and have an unexpected hit on both %
Usage, and speed.

Cores themselves are not too large these days, the bigger bottleneck
is on chip code memory.

-jg
 
Andrew Lohbihler wrote:
Either way the reason Xilinx did this does not help someone that is porting
this code to a higher Virtex device. They are stuck with this error with no
understanding or explanation of how to proceed fixing their code to make it
work. This is not good support!! Why can't Xilinx make this clear to
engineers upgrading their devices.

-Andrew
The error was EXPLICIT !! How much MORE do you want? I've included the
error below.

The only reason for confusion in this instance is that the person
receiving the error doesn't know the target silicon well enough to
understand the issues. Once explained in plain language, the person
receiving the error was still unable to interpret the solution. (I
answered the question twice in this thread)


The error was:

ERROR:pack:1142 - A problem was encountered updating the component types
within the following shape:
The RPM "uChip0_Cal1_XA1_TA0/hset"

A problem was encountered trying to change the type of the component
containing the following symbols to "SLICEL".

<LUT, Shift, and FLOP instances showed up here>

The component is already of type "SLICEM". The setting of the
component type is necessary since it is an odd number of columns away
from a component which already has a type of "SLICEM". This second
component contains the following symbols:

<Two shift and 2 Flop instances showed up here>

This architecture has two types of components, SLICELs and SLICEMs, in
alternating columns. Only SLICEMs can contain RAM symbols.

________________________

Now, do you still want to gripe about Xilinx leaving the designer
without any idea why his placement failed ?!

Personally, I appreciate the level of support that *is* in the tools.
It doesn't take much to be able to figure out what's going on if you
choose to look past your own keyboard.
 
Jim Granville wrote:
IIRC, it involved compiling the design twice. Once to generate the
Core+Codes, and again to remove unused portions of the core.
It can introduce other problems - if the CPU changes every time, that
That's interesting :) ... whose tools are doing that?

The other extreme are Sarah's HarPE tools which even optimize away
pretty much the whole core into logic.
 
Andrew Lohbihler wrote:

John, I understand what you are saying and appreciate the follow-through on
my error message, but given your solution:

So 1) find all your memory elements, and 2) move them around so they're
either all in even RLOC columns or they're all in odd RLOC columns.

I don't understand how to implement this in ISE. How do you set the
constraint in ISE to ensure that the components end up in SLICEMs? Or
possibly, how do you do that manually, or at the silicon level using ISE
tools?

-Andrew



Andrew,

You are going to have to find out where the placement is specified. It
may be in the source, in a UCF file or in the floorplanner. If you are
using RPM macros from a 3rd party, you may have to either recompile
those macros with fixes or contact the 3rd party for a modified macro
suitable for V4. Basically, you have to modify the placement wherever
it was specified.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
 
Antti Lukats wrote:
5) Some nice new (compared to other Spartan FPGAs)
package combinations like:
*Largest fabric in VQ100
*Largest fabric in chip-scale package
*Largest fabric in FT256
*Largest fabric in non-BGA package
Where did you see that? Looking at
http://www.xilinx.com/products/silicon_solutions/fpgas/product_tables.htm#Spartan3E
it seems only the two smallest models are available in VQ100 and TQ144
packages. The smallest package for the largest fabric is the FG320,
right? The datasheet has the same information.

--- Jecel
 
Jecel wrote:
Antti Lukats wrote:

5) Some nice new (compared to other Spartan FPGAs)
package combinations like:
*Largest fabric in VQ100
*Largest fabric in chip-scale package
*Largest fabric in FT256
*Largest fabric in non-BGA package


Where did you see that? Looking at
http://www.xilinx.com/products/silicon_solutions/fpgas/product_tables.htm#Spartan3E
it seems only the two smallest models are available in VQ100 and TQ144
packages. The smallest package for the largest fabric is the FG320,
right? The datasheet has the same information.
I think Antti meant that the 3e has the largest resource available in
(any) QFP, not that the biggest s3e die goes into a QFP.
ie it is more a LUT/package measure.

-jg
 
"Jim Granville" <no.spam@designtools.co.nz> wrote in message
news:436edf24$1@clear.net.nz...
I think Antti meant that the 3e has the largest resource available in
(any) QFP, not that the biggest s3e die goes into a QFP.
ie it is more a LUT/package measure.

-jg
Yes, Jim.

Also, it has the largest fabric in the SAME package compared to S3 or other
Xilinx FPGAs.

As an example, VQ100 is a really nice package, very thin, and the largest LUT
count you can get in VQ100 is with the S3e, etc.

The world's largest non-BGA FPGA is the Actel PA3-3000E, I think.

Antti
 
My apologies for not responding to this: my computer crashed big-time, &
is only now on the air again.
I see several others took up the thread: I trust you got satisfactory
advice from them.

Emtech wrote:
Hi David
This is for an RGB LED demo display application.

1. There will be mixing of colours done at say 3ms intervals for each colour
to stay within the 10ms and take advantage of persistence of vision.
2. Bit accuracy in the duty cycle is not very important since the PWM is
only for brightness control.
3. The outputs must at least be synchronized to the colour mixing intervals,
i.e. 3ms intervals. In other words, the PWM will further divide the 3ms
intervals to control brightness.
4. These will only be used in an RGB LED display application hence the only
real importance is the 10ms refresh limit.
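
As a rough behavioural sketch of what those requirements imply, the Python below models a plain counter-compare PWM and works out the counter clock needed to fit one 8-bit PWM period into a 3ms colour slot. All names are made up for the illustration, and a counter-compare scheme is only one assumed way of building the PWM; it is not taken from the original posts.

```python
def pwm_clock_hz(slot_seconds, bits=8):
    """Minimum counter clock so one full PWM period fits in one colour slot."""
    return (2 ** bits) / slot_seconds


def pwm_output(duty, counter):
    """Counter-compare PWM: the output is high while the counter is below duty."""
    return counter < duty


def pwm_period(duty, bits=8):
    """One full PWM period as a list of 0/1 samples."""
    return [int(pwm_output(duty, c)) for c in range(2 ** bits)]


period = pwm_period(64)
print(sum(period), len(period))    # high samples, total samples
print(round(pwm_clock_hz(0.003)))  # counter clock in Hz for a 3 ms slot
```

In hardware, each of the 24 to 32 channels would need only its own 8-bit duty register and comparator, with one free-running counter shared by all of them, which also keeps the outputs synchronised to the slot.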

Thank you for your input.
Peter.

"David Brooks" <davebXXX@iinet.net.au> wrote in message
news:435d7280$0$8621$5a62ac22@per-qv1-newsreader-01.iinet.net.au...

Can you further tell us:
1. What pulse repetition frequency you want
2. How many bits accuracy in the duty cycle
3. Must the outputs be synchronised?
4. Are they to drive model-control servos? (Those often respond not to the
average energy in the signal, but to the actual width. You can have a very
long interval between pulses, & still have them work).

Emtech wrote:

I have an application where I need to implement 24 or up to 32 PWM
outputs (8-bit) and am considering using a small CPLD to handle the PWMs
instead of doing it all in software.
This does add a CPLD to the design, but frees the micro to do other
things.

Any recommendations on the CPLD & CPLD size without completing the VHDL
first?
 
Tobias Weingartner wrote:
In article <dkmrjm$cg8$1@online.de>, Antti Lukats wrote:
as example VQ100 is really nice package very thin, so largest LUTs you get
in VQ100 is S3e. etc..

I realize that there are people out there that need the 1000+ pin packages
that large-scale FPGAs offer... but I do wish that 2-5 million "gate" FPGAs
would come in VQ100/144 packages. Personally, I'd love to have the capacity,
but I really don't need (or want) the complexity and raw bandwidth of having
to deal with several hundred (or a thousand) pins...
Just doesn't work that way, unfortunately. The large fabric requires a
large chip package to contain it. If you were to reduce the number of
pinouts, then it would actually require an even larger chip package, as
you would now have to add additional multiplexers etc. to control the
routing to the pins.

It would certainly be nice to have the additional logic in a smaller
chip, but sorry to say this will only happen with geometry scaling, such
as transition to 90nm and possibly to 65nm in the near future.
 
Tony Acquah wrote:
Hello Everyone,
I am working on designing a digital modulator that will implement QAM
on FPGA.

The FPGA will be programmed in VHDL to modulate an input
signal (maybe from a function generator)

A D/A converter will convert the digital signal to an analog one that
can be transmitted.

Anyway, what I need now is algorithm and/or VHDL code that will carry
out the QAM on FPGA.

Any help is greatly appreciated

Anthony
Which form of QAM do you want to use?
This could be as simple as a look up table for most small inputs, I'd
probably use it for 16-QAM and up to 256-QAM. Which is probably all
you're going to target anyways.

If this is the only thing that you'll have in the chip it's probably
worth adding in some FEC coding as well. This would allow you to use
TCM, and hence provide slightly better performance under noisy conditions.

I note that you want to use a function generator to create the input
waveforms. This would generally be a serial train of digital pulses,
and perhaps a clock signal (or the clocking may be intrinsic to the
digital pulses ala Manchester encoding). The serial to parallel
converter is simply a shift register, which is clocked in x number of
times, then read out in parallel, then clocked in an additional x times
etc. There is probably a standard block for this that would be faster
in most cases than a manual implementation.
There is likely a standard block for the ROM to use as the QAM lookup
table. And if you wanted to implement TCM then I'm sure there's some
code around for a Viterbi algorithm, and the convolutional generator.
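
The serial to parallel step can be modelled in a few lines of Python before committing it to VHDL. This is only a behavioural sketch: the function name is made up, and it assumes the first bit shifted in ends up as the most significant bit of each word, which the description above does not specify.

```python
def serial_to_parallel(bits, width):
    """Shift bits in one at a time and emit a word every `width` clocks.

    Assumption: the first bit received becomes the MSB of the word.
    A trailing partial word is dropped.
    """
    words = []
    shift_reg = 0
    count = 0
    mask = (1 << width) - 1
    for bit in bits:
        # Shift left by one and insert the new bit at the LSB.
        shift_reg = ((shift_reg << 1) | (bit & 1)) & mask
        count += 1
        if count == width:
            words.append(shift_reg)  # parallel read-out
            count = 0
    return words


stream = [1, 0, 1, 1, 0, 0, 0, 1]
print(serial_to_parallel(stream, 4))
```

Each emitted word would then be the address into the QAM lookup ROM.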
 
Bevan Weiss wrote:
Just doesn't work that way, unfortunately. The large fabric requires a
large chip package to contain it. If you were to reduce the number of
pinouts, then it would actually require an even larger chip package,
as you would now have to add additional multiplexers etc. to control
the routing to the pins.
No, it wouldn't need any extra multiplexers. They would just not bond
out as many of the pads to pins. They already do that to offer several
package options for each FPGA.

The problem is that you don't save any significant cost by having the
same size package with fewer balls or pins. So if the die size requires
a package 20mm on a side, it may as well have more than 350 balls, even
if some customers don't end up using all of them.
 
The coder is just a remapping of the input I and Q bits to the IQ
plane. It is a straightforward mapping which usually does not need a
table. For example, QAM64 uses 6-bit symbols. 3 bits each specify I
and Q independently. Those 3 bits take on values +/-1, +/-3, +/-5 and
+/-7. There will also need to be a Nyquist filter to limit the spectral
footprint and a modulator. If desired, there may also be a
convolutional encoder preceding the symbol conversion to I and Q.
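
A minimal sketch of such a table-free mapper, in Python for illustration. The function name is made up, and two choices are assumptions rather than anything stated above: the upper 3 bits select I and the lower 3 select Q, and a plain binary-to-level rule (level = 2*n - 7) is used, whereas real systems often use Gray coding instead.

```python
def qam64_map(symbol):
    """Map a 6-bit symbol to an (I, Q) pair with levels in {-7, -5, ..., +7}.

    Assumptions: upper 3 bits -> I, lower 3 bits -> Q, and a plain
    binary-to-level rule (level = 2*n - 7) rather than Gray coding.
    """
    if not 0 <= symbol < 64:
        raise ValueError("symbol must fit in 6 bits")
    i_bits = (symbol >> 3) & 0b111
    q_bits = symbol & 0b111
    return 2 * i_bits - 7, 2 * q_bits - 7


print(qam64_map(0b000111))
constellation = {qam64_map(s) for s in range(64)}
print(len(constellation))  # number of distinct constellation points
```

In hardware the same rule is just a shift and a constant subtract per axis, which is why no lookup table is needed.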

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
 
