EDK : FSL macros defined by Xilinx are wrong

rickman · Apr 21, 2006

Hal Murray wrote:

Why does it matter to this discussion?

Xilinx isn't stupid.

You are assuming facts that are not in evidence. ;^)

austin · Apr 21, 2006

Rick,

-snip-

I am not the one that you attacked. When I initially said I find your
post insulting, I didn't mean it insulted me.

OK. I understand. You were (attempting) to point out that I may be out
of line.

Thanks. Feel free to email me directly with such comments. I
appreciate them (really). Posts to the group such as yours can be
interpreted differently (wrongly in this case by me).

I meant that it appeared to be insulting the person you were responding
to. What you call gentle poking was really a way of demeaning his
comments and attacking him on a personal level without offering a single
factual response to his statements.

Do you not see what I mean?

Most certainly.

-snip again-

You may not take disrespect, but you are quick to dish it out. Go back
to the post from Metal and your reply and tell me how you expect
someone reading it to consider that a respectful post.

I do not disagree with you: the gentle poking is sometimes angry
prodding. It is also open to interpretation. The story of getting the
donkey's attention comes to mind. Where one draws a line is subjective.

As I said, if you feel the need to alert me, go ahead directly. People
do. I am certainly not perfect. I can always improve.

-snip some more-

I don't see how you calling "foul" has anything to do with your reply
to Metal and how it appeared disrespectful. You weren't calling foul,
you were dismissing his statements as being unimportant enough to even
deserve a real reply as well as appearing to insult him on a more
personal level.

Just look at all the rambling, irrelevant statements you have made in
this reply. I think it is clear that you consider your own opinions
far more valuable than anyone's comments on how you are perceived.

I will chalk this up to your feeling that I am missing the intent of
your original post. I've explained why I missed it, and I have told you
how others let me know their personal opinions of my posts. That I
used a 2X4 to get the donkey's attention (getting back to the analogy
above) may seem extreme to you. Perhaps it was.

A posting to me directly with constructive criticism is completely
different than one to a newsgroup which can (and does) appear to be a
public attack.

In fact, any posting in a public forum with a negative tone is potentially
looked at as a direct attack. Think about that for a moment.

Enough said. Take my comments or leave them. I don't think I can add
anything further that might be useful.

I have taken them. And I see your point now. It was useful. It
can be even more useful in the future. Thanks for taking the time to
try to get through to me. In future, just email me directly: when that
happens, I know the intent is not to publicly achieve some unknown goal,
but to tell me directly what you think (there are more donkeys than you
think out there in need of 2X4s).

You may notice that sometimes I will retract, or apologize, seemingly
out of the blue. These are times I have either reconsidered the post I
made myself, or received comments from others.

Again, if I can't admit I am sometimes wrong, then I am a real fool, not
someone who is just one occasionally.

Austin

John_H · Apr 21, 2006

J Silverman wrote:

Hi All,

Ok, you all kinda convinced me on this. After three days of searching for support software and not finding any, I started looking for another way of using a more modern FPGA. I was originally looking at the original Spartan (as they came in PLCC84 packaging) but cannot find any for sale in small quantities. So I started looking for ways to use a current FPGA in a breadboard, and I came across this site:

http://www.beldynsys.com/quadpacks.htm

They have a bunch of boards that will take SMT components and give you the ability to stick them in a breadboard. I was looking at the Q100-80 one and am wondering if that will work for a Spartan3 in VQ100 packaging. I'm not sure if the difference between VTQFP and TQFP is great enough to cause problems when trying to solder it on the board.

Thanks, J Silverman

Or just plug a development board onto your prototype breadboard. Are
you using wire-wrap?

Subhasri krishnan · Apr 21, 2006

Hi,
Yes, I will try using the synchronous FIFO (one of my last options). Are
sync FIFOs usually used as a buffer between 25 MHz and 100 MHz clock
domains? I think I can use some counters and FIFO flags to generate
triggers. You mentioned that the boundaries have to be synchronized.
What are the boundaries that you are talking about? Please do explain
this.
By the way, my clocks are from a DCM (25 MHz and 100 MHz generated from
an external 50 MHz oscillator).
Thanks
Subhasri.K

Mike Treseler wrote:

Subhasri krishnan wrote:

I am having problems debugging my memory controller. My initial idea is
to capture a frame and display the same frame continuously. But I see
problems in the pattern captured (some spots with colours that aren't
expected to be there). When I simulate the controller it works as
expected. I understand that the timing has to be right and I modified
everything so that it's all right.

Verify that by running static timing at 100MHz.

I think that the problem lies with an asynchronous FIFO used to buffer
data at 25 MHz.

Consider syncing that interface to 100 MHz
and using a sync FIFO.

Also,
can I use 2 clocks in the same module?

Not without working out synchronization at the boundaries.

-- Mike Treseler

Peter Alfke · Apr 21, 2006

I suggest an approach where everything runs off the same clock (100
MHz?). The synchronous FIFO then becomes very simple, just a state
machine that controls the two BlockRAM ports, one for writing, one for
reading.
All the mysterious trickery of asynchronous FIFO flag control
disappears, since everything happens on common clock edges.
In general, using multiple clocks should always be the tool of last
resort. Synchronous, single-clock designs are far easier to create and
especially to debug.
Peter Alfke, Xilinx Applications.
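
For anyone following along, here is a rough behavioral sketch of the single-clock FIFO described above: one write pointer, one read pointer, and flags that all change on the same clock edge. It is a Python model, not HDL. The class name `SyncFifo`, its `clock()` method and the depth are made up for illustration, and the list `mem` only stands in for the BlockRAM. The loop at the end assumes the slow side has already been brought onto the fast clock as a write enable that is active one cycle in four (25 MHz data on a 100 MHz clock).

```python
class SyncFifo:
    """Behavioral model of a single-clock FIFO over a dual-port memory."""

    def __init__(self, depth):
        self.depth = depth
        self.mem = [0] * depth   # stands in for the BlockRAM
        self.wr_ptr = 0          # address on the write port
        self.rd_ptr = 0          # address on the read port
        self.count = 0           # words currently stored

    @property
    def empty(self):
        return self.count == 0

    @property
    def full(self):
        return self.count == self.depth

    def clock(self, wr_en=False, wr_data=0, rd_en=False):
        """Advance one clock edge; return the word read, or None."""
        # Flags are evaluated from the state before the edge.
        do_rd = rd_en and not self.empty
        do_wr = wr_en and not self.full
        rd_data = None
        if do_rd:
            rd_data = self.mem[self.rd_ptr]
            self.rd_ptr = (self.rd_ptr + 1) % self.depth
        if do_wr:
            self.mem[self.wr_ptr] = wr_data
            self.wr_ptr = (self.wr_ptr + 1) % self.depth
        self.count += int(do_wr) - int(do_rd)
        return rd_data


# One write every fourth clock, read whenever data is available.
fifo = SyncFifo(depth=4)
out = []
for cycle in range(16):
    wr = (cycle % 4 == 0)
    word = fifo.clock(wr_en=wr, wr_data=cycle, rd_en=True)
    if word is not None:
        out.append(word)
```

Because both ports step on the same edge, full and empty come from a single counter, and no flag ever has to cross a clock domain.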

leaf · Apr 21, 2006

John_H, it's not that I'm saying "after the single-cycle FRAME# signal
is an idle state". What I mean is: how can we follow the state machine
design (as described in Appendix B of PCI Spec 3.0) to accommodate a
Configuration Read/Write....

---
young_leaf

John Williams · Apr 21, 2006

Hi Raymond,

Raymond wrote:

I found a possible answer to my problem at the Answer Database.
I should unzip the opb_mdm_v2_01_a.zip to the pcores folder in my
project, update
the coreversion in the MHS file from 2.00.a to 2.01.a and make the
project use/point to these
files instead of the default ones.

But how do I make the project point to these files instead of the
default ones?

The local pcores directory will be searched first - anything in there
will override the defaults. You'll want to do a "make clean" in the
project directory to force a complete rebuild, after making a change
like this.

Regards,

John
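
A rough illustration of the "local pcores first" rule, as a Python sketch. This is not how the EDK tools are implemented, and the directory names are placeholders. It only shows first-match-wins lookup over an ordered search list. The function `resolve_core` and the folder naming (`opb_mdm_v2_01_a`, taken from the zip name mentioned above) are assumptions for the example.

```python
import tempfile
from pathlib import Path


def resolve_core(name, version, search_dirs):
    """Return the first directory that holds the requested core version."""
    folder = f"{name}_v{version.replace('.', '_')}"
    for base in search_dirs:
        candidate = Path(base) / folder
        if candidate.is_dir():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project" / "pcores"   # local pcores (placeholder path)
    default = Path(tmp) / "install" / "pcores"   # tool defaults (placeholder path)
    (project / "opb_mdm_v2_01_a").mkdir(parents=True)
    (default / "opb_mdm_v2_01_a").mkdir(parents=True)
    (default / "opb_mdm_v2_00_a").mkdir(parents=True)

    # The project directory is listed first, so its copy shadows the default.
    new = resolve_core("opb_mdm", "2.01.a", [project, default])
    old = resolve_core("opb_mdm", "2.00.a", [project, default])
    new_is_local = new.parent == project
    old_is_default = old.parent == default
```

In this sketch the version asked for in the MHS file decides the folder name, and the order of the search list decides which copy wins.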

John McCaskill · Apr 21, 2006

Georgios Pouiklis wrote:

Hi,

I'm looking for a V4FX devel. board, with both optical (e.g. SFP)MGTs and ddr2 memory (especially one with DIMM modules, if two ddr2 busses are available even better).

Can someone suggest something, I haven't found one incorporating both features.

Thanks, George

Hello George,

Take a look at this:

http://www.fastertechnology.com/

http://www.fastertechnology.com/extras/pics/p6a/

The P6 is a V4FX based PCI card with four SFP connectors connected to
the MGTs. It has one DDR2 SODIMM connector, and four SATA/SAS
connectors (we have not yet created any SATA/SAS IP for them, but plan
to in the future).

The V4FX is configured by a Spartan 3E from a MiniSD card, and has
access to the card after configuration to load software from it.

We built our first run of cards with the V4FX60. The same PCB will
also support the V4FX40 and the V4FX100, and we will support those some
time in the future.

We support customers creating FPGA designs for this card via EDK. We
supply an EDK repository that has the board definition file and IO
cores for this card. We also have a growing library of DSP cores.

I am about to leave on a trip, so I will only be in sporadic contact
for several weeks, but you can contact Charles Camp for more
information. His email is cjcamp @ our domain name. My email is
jhmccaskill @ ... I will only be able to check it on occasion. Our
phone number is also on the web site.

Regards,

John McCaskill

Andy Peters · Apr 21, 2006

Leow Yuan Yeow wrote:

Hi, for a program such as
case state is
  when S0 =>
    A <= B + C;
  when S1 =>
    Z <= X + Y;
does it mean that 2 adders are generated, or will the synthesis recognize
the adder can be shared?
Or do I have to specifically write a multiplexor for the adder? Thanks!

Are you optimizing for speed or area?

-a

leaf · Apr 21, 2006

Does it mean that Configuration cycles are not part of the state
machine (IDLE, B_BUSY, etc.)?
I guess I was not able to recognize it.

Anyway, if that's the case, that solves it partially; if not, I don't
know...

My Hit is the OR of the bar_hits (I have BAR0 and BAR1), asynchronous to
the PCI clock:

-- Hit Detection --------------------------
match <= ( (Pci_Ad_Reg(31 downto 2) = BAR_0)
OR
(Pci_Ad_Reg(31 downto 4) = BAR_1) );

bar_hit_reg(0) <= '1' when ( match and ( cmdReg(1) = '1' ) ) else '0';
bar_hit_reg(1) <= '1' when ( match and ( cmdReg(0) = '1' ) ) else '0';
Hit <= ( bar_hit_reg(0) OR bar_hit_reg(1) ) ;
--------------------------------------------------

Pci_Ad_Reg is Pci_Ad registered on the rising edge of Pci_Clk.
Is this advisable? Should I also register IRDY# and/or FRAME#?

---
Leaf
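
To make the decode above easier to poke at, here is a small Python model of the posted logic as written. It is only a sketch: `bar_match` and `hit` are illustrative names, the BARs are passed as full 32-bit values with the low bits ignored (the VHDL compares against 30-bit and 28-bit vectors), and the example addresses are invented.

```python
def bar_match(addr, bar, low_bit):
    """True when addr[31:low_bit] equals bar[31:low_bit]."""
    mask = (0xFFFFFFFF >> low_bit) << low_bit
    return (addr & mask) == (bar & mask)


def hit(addr, bar0, bar1, cmd_reg):
    """Mirror of the posted decode: OR of both compares, gated by cmdReg bits."""
    match = bar_match(addr, bar0, 2) or bar_match(addr, bar1, 4)
    bar_hit0 = match and bool(cmd_reg & 0b10)   # cmdReg(1)
    bar_hit1 = match and bool(cmd_reg & 0b01)   # cmdReg(0)
    return bar_hit0 or bar_hit1


BAR0, BAR1 = 0x10000000, 0x20000000             # made-up base addresses
in_bar0 = hit(0x10000003, BAR0, BAR1, 0b10)     # low two bits ignored
outside = hit(0x10000004, BAR0, BAR1, 0b11)     # bit 2 differs from BAR0
in_bar1 = hit(0x2000000C, BAR0, BAR1, 0b01)     # low four bits ignored
disabled = hit(0x10000000, BAR0, BAR1, 0b00)    # both enables clear
```

One thing the model makes visible: because `match` is shared, either enable bit lets an address in either BAR produce a hit. If each BAR is meant to be gated by its own enable, the two compares would need to be kept separate.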

Mike Treseler · Apr 21, 2006

Leow Yuan Yeow wrote:

Hi, for a program such as
case state is
  when S0 =>
    A <= B + C;
  when S1 =>
    Z <= X + Y;
does it mean that 2 adders are generated, or will the synthesis recognize
the adder can be shared?
Or do I have to specifically write a multiplexor for the adder? Thanks!

There are no guarantees either way.
But it doesn't really matter.
There are not really any "adder" primitives
inside the fpga. Only gates and flops.

All synthesis guarantees is a netlist that
simulates the same as the source code.
There is no guarantee that the RTL or
technology schematic output
will look like I expect. But it will work.

If I code for an input mux with
one adder, I just might get two adders
and an output mux, if that better
matches the constraint settings or the whim
of the synthesis algorithm.
Or I might get just what I expect.
Or I might get something completely different.

Luckily, synthesis does a better
job, on the average, of packing gates
into an arbitrary device than I do.

-- Mike Treseler
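
The point that both structures "simulate the same" can be checked with a quick Python model. This is a sketch under assumptions: the function names and the 8-bit width are invented, and each function returns the next values of A and Z for one clock in the given state. It says nothing about which structure a particular synthesis tool will pick.

```python
from itertools import product

WIDTH = 8                       # assumed operand width
MASK = (1 << WIDTH) - 1


def two_adders(state, a, z, b, c, x, y):
    """One adder per branch; the state decides which register loads."""
    sum_bc = (b + c) & MASK
    sum_xy = (x + y) & MASK
    return (sum_bc, z) if state == "S0" else (a, sum_xy)


def shared_adder(state, a, z, b, c, x, y):
    """Input multiplexers feed a single adder."""
    lhs, rhs = (b, c) if state == "S0" else (x, y)
    total = (lhs + rhs) & MASK
    return (total, z) if state == "S0" else (a, total)


# Compare the two structures over corner-case operands in both states.
samples = (0, 1, 0x7F, 0xFF)
all_equal = all(
    two_adders(s, 5, 9, b, c, x, y) == shared_adder(s, 5, 9, b, c, x, y)
    for s in ("S0", "S1")
    for b, c, x, y in product(samples, repeat=4)
)
```

Either form gives the same register updates, which is why the choice between them is left to the constraints and the tool.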

Apr 21, 2006

Jim Granville wrote:

Why not take them a sound business plan, I'm sure they would listen ?

I was told once they have some aversion to becoming a systems company,
along with some NIH factors that might make a new kid on the block a
little unwelcome if waving a $3-5B business plan in the air.

I have at times been looking for an RC startup as a senior architect
and/or CTO, plus considering seeking funding based on my own work. I'd
still like to build the multi petaflop system I proposed to several
firms last year, using a large number of XC4VLX200's and RM9000's. Then
spin a few wafers with different programmable architecture to push past
an exaflop by decade end. In the short term I have a few student boards
to build, and finish my proof of concept work.

I've already said more here than I would have planned, but maybe that's
good, as Xilinx's competitors have something to consider about doing
this business right. Peter can keep pushing, and I might even level
their playing field a little more. They might even want to shut me up
by giving me the briefcase full of XC4VLX200's to do the proof of
concept machine right, so I can go sell petaflop RC super computers
with Xilinx defect managed parts instead of A-Team parts. Or maybe
there is an A-Team that is really interested in becoming a $5B company
this decade.

Jim Granville · Apr 21, 2006

Pablo Bleyer Kocik wrote:

Hello people.

As I announced some days ago, I updated the PacoBlaze3 core
[http://bleyer.org/pacoblaze/] now with a wide ALU that supports an 8x8
multiply instruction ('mul') and 16-bit add/sub operations ('addw',
'addwcy', 'subw', 'subwcy'). The new extension core is called
PacoBlaze3M. It could be useful performing small DSP functions and math
subroutines when there is a spare hardware multiplier block.

The implementation scheme modifies the PicoBlaze register model
dividing it in odd/even (high/low) sections with a multiplexing layer.
16-bit writes are performed on both odd/even registers. The multiply
operation accepts any two arbitrary registers and the wide add/sub
instructions operate on contiguous 16-bit "extended" registers.

Eg: (KCAsm code)

---8<---

test_mul: ; mul example
load s0, $ca ; s0 = 0xca
load s2, $fe ; s2 = 0xfe
mul s0, s2 ; {s1,s0} = 0xca * 0xfe = 0xc86c

test_addw: ; addw example
load s1, $ca ; s1 = 0xca ; mix cafe...
load s0, $fe ; s0 = 0xfe

load s3, $be ; s3 = 0xbe ; ...with beef
load s2, $ef ; s2 = 0xef

addw s2, s0 ; {C,s3,s2} = 0xbeef + 0xcafe = 0x189ed ; yes, you got 189'ed P

--->8---

I am having a bit of trouble intercepting the adder carry in a carry
chain with ISE using behavioral code. I am currently using two muxed
adders (one 8-bit, one 16-bit) for the addsub module instead of the
ideal high/low 8-bit adders with full and half carries. Any ideas on
how to implement this in ISE?
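
As a reference for the arithmetic only (it does not answer the ISE inference question), here is a Python sketch of the "ideal" arrangement: two 8-bit adders chained through a half carry, plus the 8x8 multiply, checked against the two examples above. The function names are invented, registers s0..sF are modelled as list indices 0..15, and the destination is assumed to be the even register of a pair.

```python
def add8(a, b, carry_in=0):
    """8-bit adder: returns (sum, carry_out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def addw(regs, dst, src):
    """16-bit add on register pairs: {dst+1,dst} += {src+1,src}; returns carry."""
    regs[dst], half = add8(regs[dst], regs[src])                     # low bytes
    regs[dst + 1], carry = add8(regs[dst + 1], regs[src + 1], half)  # high bytes
    return carry


def mul(regs, dst, src):
    """8x8 multiply: {dst+1,dst} = regs[dst] * regs[src]."""
    result = regs[dst] * regs[src]
    regs[dst], regs[dst + 1] = result & 0xFF, result >> 8


regs = [0] * 16

# test_mul: s0 = 0xca, s2 = 0xfe, mul s0, s2
regs[0], regs[2] = 0xCA, 0xFE
mul(regs, 0, 2)
mul_result = (regs[1] << 8) | regs[0]

# test_addw: {s1,s0} = 0xcafe, {s3,s2} = 0xbeef, addw s2, s0
regs[1], regs[0] = 0xCA, 0xFE
regs[3], regs[2] = 0xBE, 0xEF
carry = addw(regs, 2, 0)
addw_result = (carry << 16) | (regs[3] << 8) | regs[2]
```

With these inputs `mul_result` is 0xc86c and `addw_result` is 0x189ed, matching the comments in the listing.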

I will focus now on adding better documentation and some verification
scripts. I also have a small language in the works (sarKCAsm --how
original) that is a macro assembler with operations to code in
Pico/PacoBlaze using commands like s0 = s1+s2, s4 += s5, etc. I will
release that as soon as I finish teaching myself ANTLR.

Enjoy & rejoice ;o)

Sounds impressive.
You have seen the AS Assembler, and the Mico8 from Lattice ?

FWIR the Mico8 is very similar to PicoBlaze ( as expected, both are
tiny FPGA targeted CPUs ), but I think with a larger jump and call reach
(but simpler RET options).
If you are loading on features, the call-lengths might need attention ?

Have you tried targeting this to a lattice device ?

-jg

Pablo Bleyer Kocik · Apr 21, 2006

Jim Granville wrote:

Sounds impressive.
You have seen the AS Assembler, and the Mico8 from Lattice ?

Yes, I am very much aware of Mico8 and I have used AS in several
projects in the past. I know that it supports PicoBlaze (and Mico8
now). But what I want to do now is a small version of a language like
HLA or terse for PicoBlaze. Something simple and readable that is easy
to modify like the current KCAsm (hey, adding the mul and add/sub
instructions took less than one minute. ;o)

Here is what sarKCAsm is currently looking like (currently a JavaCC
implementation, but I am swapping to ANTLR now because it has better
support for trees).

---8<---

s0 = $ca ; load
s1 = s0 + $fe ; same as s1 = s0, s1 += $fe
func($be, $ef) ; function call, s0 = $be, s1 = $ef

s3 = 16

loop:
func(s0, s1)
s0 == $55 ; compare
done Z? ; conditional jump
s3 -= 1
done Z?
loop ; unconditional jump

done:
done

func(s0: s0, s1): ; result + clobber list
s0 <- $0 ; read from port 0
s0 ^= s1 ; xor
s1 << C ; sla
# ; return

--->8---

FWIR the Mico8 is very similar to PicoBlaze ( as expected, both are
tiny FPGA targeted CPUs ), but I think with a larger jump and call reach
(but simpler RET options).
If you are loading on features, the call-lengths might need attention ?

For now the limits of the PicoBlaze model have been within my needs
(IIRC, mico8 has the same 10-bit jumps/calls as PB3 and it is very
isomorphic to it). My main drive to create PacoBlaze was to get the
most versatile processor that I could use as a peripheral controller in
my projects (eg motor control, bus controller, PWM generator, audio
co-processor, specifically in the JBRD of my Javabotics project,
http://bleyer.org/javabotics/). It isn't difficult to extend the memory
model of PicoBlaze using PacoBlaze, though.

Have you tried targeting this to a lattice device ?

Not yet. I plan to synthesize the core using different tools that I
may have access to, but that is not in my list of priorities.

Cheers.

-- /"Naturally, there's got to be some
PabloBleyerKocik / limit, for I don't expect to live
pablo / forever, but I do intend to hang on
@bleyer.org / as long as possible." -- Isaac Asimov

Brannon · Apr 21, 2006

I've got a Dini DN8000K10 in hand that seems to work quite well and
has the features you were looking for.

ziggy · Apr 21, 2006

In article <1142888577.488377.237030@t31g2000cwb.googlegroups.com>,
"Pablo Bleyer Kocik" <pablobleyer@hotmail.com> wrote:

Hello people.

As I announced some days ago, I updated the PacoBlaze3 core
[http://bleyer.org/pacoblaze/] now with a wide ALU that supports an 8x8
multiply instruction ('mul') and 16-bit add/sub operations ('addw',
'addwcy', 'subw', 'subwcy'). The new extension core is called
PacoBlaze3M. It could be useful performing small DSP functions and math
subroutines when there is a spare hardware multiplier block.

Cool, though I have not had time to even get 2.0 running yet.. (
life got in the way of fun stuff )

Jim Granville · Apr 21, 2006

Pablo Bleyer Kocik wrote:

Jim Granville wrote:

Sounds impressive.
You have seen the AS Assembler, and the Mico8 from Lattice ?

Yes, I am very much aware of Mico8 and I have used AS in several
projects in the past. I know that it supports PicoBlaze (and Mico8
now). But what I want to do now is a small version of a language like
HLA or terse for PicoBlaze.

I realised that; - just checking you knew of them

Something simple and readable that is easy
to modify like the current KCAsm (hey, adding the mul and add/sub
instructions took less than one minute. ;o)

Good targets.

Here is what sarKCAsm is currently looking like (currently a JavaCC
implementation, but I am swapping to ANTLR now because it has better
support for trees).

---8<---

s0 = $ca ; load
s1 = s0 + $fe ; same as s1 = s0, s1 += $fe
func($be, $ef) ; function call, s0 = $be, s1 = $ef

s3 = 16

loop:
func(s0, s1)
s0 == $55 ; compare
done Z? ; conditional jump
s3 -= 1
done Z?
loop ; unconditional jump

done:
done

func(s0: s0, s1): ; result + clobber list
s0 <- $0 ; read from port 0
s0 ^= s1 ; xor
s1 << C ; sla
# ; return

Will you also do boolean (Flag) functions ?

General comments: ( feel free to ignore... )

The expression clarity makes good sense, and I also like languages that
can accept flexible constants: viz $55 or 0x55 or 55H, or 2#01010101 or
16#55, or 2#01_0101_01.
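
A sketch of what accepting those constant styles could look like, in Python rather than in the assembler's own implementation language. `parse_constant` is an invented name, and the rules (underscores ignored, `base#digits`, `$`, `0x`, trailing `H`, otherwise decimal) are just one reading of the list above.

```python
def parse_constant(text):
    """Parse $55, 0x55, 55H, 16#55 or 2#0101_0101 style literals to an int."""
    s = text.strip().replace("_", "")      # underscores are digit separators
    if "#" in s:                           # base#digits
        base, digits = s.split("#", 1)
        return int(digits, int(base))
    if s.startswith("$"):                  # $55
        return int(s[1:], 16)
    if s.lower().startswith("0x"):         # 0x55
        return int(s[2:], 16)
    if s[-1:] in ("h", "H"):               # 55H
        return int(s[:-1], 16)
    return int(s, 10)                      # plain decimal


forms = ["$55", "0x55", "55H", "2#01010101", "16#55", "2#01_0101_01"]
values = [parse_constant(f) for f in forms]
```

All six spellings come out as the same value, 0x55.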

I've also seen XOR AND OR NOT etc keywords supported, as well as the
terse C equivalents. ( which are a real throwback to when source size
mattered ).

but I'm not sure about labels in the leftmost code-column - that makes
code harder to scan, and indent etc, and not as clear in a syntax
highlighted editor....

ie If you have to add a comment, then the language is probably not clear
enough....

# for return ? => why? - why not return, or RET or IFnZ RET
label then condition ? => most languages are IF_Z THEN or if_nZ DestAddr
Label for Loop jmp ? => REPEAT Label, or LOOP label

If a 12yr old kid can read the source, and not need a raft of prior
knowledge, then that's a good test of any language

-jg

Jim Granville · Apr 21, 2006

Pablo Bleyer Kocik wrote:

For now the limits of the PicoBlaze model have been within my needs
(IIRC, mico8 has the same 10-bit jumps/calls as PB3 and it is very
isomorphic to it).

I think I recall the Mico8 had more obvious expansion space in the
opcodes - but either way, this is the sort of expansion that is nice to
allow for early-on.

With more smarts, users _are_ going to need larger address space

The assembler should accept either size, and warn on the
smaller/larger ceiling, based on a target/build family define.

-jg
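
A minimal sketch of the ceiling check suggested above, in Python. The table is an assumption: `pb3` uses the 10-bit figure quoted earlier in the thread, and `wide` is a hypothetical larger family added only to show the lookup. `check_target` is an invented name.

```python
# Address width per target family; "wide" is hypothetical.
ADDRESS_BITS = {"pb3": 10, "wide": 12}


def check_target(address, family):
    """Return a warning string if address exceeds the family ceiling, else None."""
    limit = 1 << ADDRESS_BITS[family]
    if address >= limit:
        return f"address {address:#x} exceeds {family} ceiling of {limit} words"
    return None


ok_small = check_target(0x3FF, "pb3")    # last address that fits in 10 bits
too_big = check_target(0x400, "pb3")     # one past the 10-bit ceiling
ok_wide = check_target(0x400, "wide")    # fits once the family is changed
```

The same source then assembles for either target, and only the family define decides whether a jump or call is flagged.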

Apr 21, 2006

Jim Granville wrote:

Ray Andraka wrote:
fpga_toys@yahoo.com wrote:
The parts all have non-volatile storage for configuration.

I think John was meaning store the info in the ConfigFlashMemory.
Thus the read-erase-replace steps.
.. but, you STILL have to get this info into the FIRST design somehow....

Thanks Jim ... that is EXACTLY what I did say. It doesn't matter if the
configuration storage is on an 18V04, platform flash card, or a disk
drive.

Jim Granville · Apr 21, 2006

Ray Andraka wrote:
<snip>

In my experience, FPGAs can
do roughly 100x the performance of similar generation microprocessors,
give or take an order of magnitude depending on the exact application
and provided the FPGA design is done well. It is very easy to lose the
advantage by sub-optimal design. If I had a dollar for every time I've
gotten remarks that 100x performance is not possible, or that so and so
did an FPGA design expecting only 10x and it turned out slower than a
microprocessor because it wouldn't meet timing etc, I'd be retired.

How does an FPGA compare with something like the cell processor ?

I'd have thought that for reconfig computing, something like an
array of CELLS, with FPGA bridge fabric, would be a more productive
target for RC.
FPGAs are great at distributed fabric, but not that good at memory
bandwidth, especially at bandwidth/$.
DSP tasks can target FPGAs OK, because the datasets are relatively small.
Wasn't it Seymour Cray who found that IO and Memory bandwidths
were the key, not the raw CPU grunt ?

-jg
