EDK : FSL macros defined by Xilinx are wrong

ISE Service Pack - Service Pack #3 is the latest Service Pack for
Xilinx ISE 7.1i. Download the available file to ensure that your
Xilinx software is up to date.
 
"elcielo" <kyson@paran-dot-com.no-spam.invalid> wrote in message
news:QeSdnWH_a6kpGlLfRVn_vg@giganews.com...
ISE Service Pack - Service Pack #3 is the latest Service Pack for
Xilinx ISE 7.1i. Download the available file to ensure that your
Xilinx software is up to date.
The issue was actually observable with 7.1 SP2 and Spartan3 as well, and
Xilinx's response was that it was most likely not addressed in SP3,
as it was a rather rare case:

* 32 bit counters (n number of them) all clocked from different clocks, use
n global clocks
* JTAG BSCAN, some outputs auto promoted to use BUFG by 7.1 SP2

As ISE inserted a BUFG on DRCK1 from BSCAN, the total number of global clocks
exceeded 8, and because the 32 bit counters as RPMs do not fit into a single
quadrant, the resulting design did not route (there is a limit on the number
of global clocks that can enter a single quadrant).

To my surprise the issue has been solved in SP3: the designs no longer fail
in routing. But without manually setting 'buffer_type' the automatic
allocation of the BUFGs is wrong, so I still need to manually 'disable' some
of the BUFG insertion.

In SP3, 8 clocks are selected to be global (at random?) and the rest go on
local routing. When the 'false' clocks are set to buffer_type=none, the
actual clocks are detected OK.

Antti
 
Peter,

I understand that FT256 is a more complicated package than TQ144. But
why does it (the FT256 package) have a much heavier price impact on the
XC3S400 than on the XC3S200? Can I expect the same for Spartan3E?

Andrew,

"Who knows how Xilinx do their pricing,"
Yes, but I don't feel good when something looks illogical to me!

Luiz Carlos
 
http://www.justfuckinggoogleit.com/search.pl?query=edif

First link.

Cheers,
Jon
 
Sounds great.

Instead of a simple filter, undersampling is enough, since it is in fact
a crude LP filter. Is this correct?

If I were actually using a filter, what specs would you use? Cutoff at 1ms,
3dB point at .5ms?

Thanks Ben
 
Analyze the requirements.
You must suppress the longest bounce, but you also want to react to the
fastest possible legitimate operation. Establish these two times (say
10 ms for the longest bounce, and 100 ms for the fastest human action) and
put the cut-off somewhere in the middle.
Peter Alfke
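
A rough behavioral sketch of the advice above, in Python rather than HDL. The geometric mean is only one way to pick "somewhere in the middle", and the 1 ms sample period, the function names and the counter-style debouncer are all assumptions made for illustration, not something stated in the post.

```python
import math

def pick_debounce_time_ms(longest_bounce_ms=10.0, fastest_action_ms=100.0):
    """One possible 'middle': the geometric mean of the two bounds."""
    return math.sqrt(longest_bounce_ms * fastest_action_ms)

def debounce(samples, stable_count):
    """Counter-based debouncer (assumed scheme): the output only follows the
    input after the input has held the new value for `stable_count`
    consecutive samples."""
    out = []
    state = samples[0] if samples else 0
    run = 0  # consecutive samples that disagree with the current state
    for s in samples:
        if s == state:
            run = 0
        else:
            run += 1
            if run >= stable_count:
                state = s
                run = 0
        out.append(state)
    return out

# Assuming one sample per millisecond:
n = round(pick_debounce_time_ms())          # about 32 samples
press = [0] * 5 + [1, 0] * 5 + [1] * 100 + [0] * 50
filtered = debounce(press, n)
```

With these numbers the 10 ms of bouncing never holds long enough to flip the output, while the 100 ms press does.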
 
Congratulations. Please send the C compiler for PicoBlaze, with the manual,
to fahadislam2002@hotmail.com. Thanks.
 
Sven wrote:
Hi,

I am trying to build a dynamic reconfiguration application using the ICAP
and the embedded PowerPC. By reverse engineering I found a lot of the
bit-combinations in the configuration frames which handle the routing
in the switch-boxes. The reverse engineering process takes a lot of
time. Can anybody give me the remaining bit-combinations in the
frames?

Thanks, Sven

I've tried doing that as well. It's really hard to get the whole picture
with reverse engineering. I gave up after a few months. Try to follow
the approach in XAPP260 to do modular reconfiguration.
 
Hi Junaid,

I assume you're asking what this report means.

Wirelength results (all in units of 1 clb segments):
Total wirelength: 24 Average net length: 2.40000
Maximum net length: 5
This gives the amount of wire used to route your design, where each unit of
wire is a "wire segment" that spans 1 clb. You have 10 nets in your design,
and the routing of said nets spans 24 wire segments, giving an average of
2.4 wire segments per net. The longest routed net was 5 wire segments.

Wirelength results in terms of physical segments:
Total wiring segments used: 24 Av. wire segments per net: 2.40000
Maximum segments used by a net: 5
This gives the wiring in terms of the number of pre-fabricated wires used
between logic blocks (clbs/LABs). In the example report you've posted, you
must be using a routing architecture where every wire is only one logic
block long, since this summary is the same as the summary above it. If you
had longer wires (say wires that spanned 4 clbs before they went through a
programmable switch), then the number of "physical segment wires" would be
smaller (6) than the wires in terms of 1-logic-block-long segments (24).

Hope this helps,

Vaughn
Altera (but with my academic hat on)
[v b e t z (at) altera.com]
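
The arithmetic in the explanation above can be checked with a small Python sketch. The per-net lengths are made up (chosen only so that 10 nets total 24 unit segments with a maximum of 5), and the function names are hypothetical. The second function gives a best-case count, where every longer wire is fully used; a real router may use more.

```python
import math

def wirelength_summary(net_lengths):
    """Return (total, average, maximum) wirelength in 1-clb segments."""
    total = sum(net_lengths)
    return total, total / len(net_lengths), max(net_lengths)

def min_physical_segments(total_unit_length, wire_span):
    """Best-case number of pre-fabricated wires needed when each wire spans
    `wire_span` clbs (assumes no wire is left partly unused)."""
    return math.ceil(total_unit_length / wire_span)

# Hypothetical per-net lengths: 10 nets, 24 unit segments in total.
nets = [5, 4, 3, 3, 2, 2, 2, 1, 1, 1]
total, average, longest = wirelength_summary(nets)
span1 = min_physical_segments(total, 1)  # every wire is 1 clb long
span4 = min_physical_segments(total, 4)  # every wire spans 4 clbs
```

With length-1 wires both summaries agree at 24, and with length-4 wires the best case drops to 6, matching the numbers in the post.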

"junaid" <k.najeeb@gmail.com> wrote in message
news:1120580001.595173.137780@g49g2000cwa.googlegroups.com...
I need more information wirelength results given by VPR
for ex:

Wirelength results (all in units of 1 clb segments):
Total wirelength: 24 Average net length: 2.40000
Maximum net length: 5

Wirelength results in terms of physical segments:
Total wiring segments used: 24 Av. wire segments per net: 2.40000
Maximum segments used by a net: 5

Kindly help me

Thanks in advance


junaid
 
Dear Sir,

Thanks a lot for your kind reply and for clearing up my doubts.
I hope for your help in the future too, and I request you, sir, to keep
contributing more in this direction. I'll stop here.
 
Yeah, I understand this. But I can't wrap my head around how to code it.

Do you do like this:
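if rising_edge(clk) then
  -- '&' and '+' have equal precedence in VHDL, so parenthesize each operand
  partial_sum1_2bit <= ('0' & bit0) + ('0' & bit1);
  partial_sum2_2bit <= ('0' & bit2) + ('0' & bit3);  -- next pair of bits
  partial_sum1_3bit <= ('0' & partial_sum1_2bit) + ('0' & partial_sum2_2bit);
  -- and so on
end if;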

And then there is the question of how this all synthesizes: for me, probably
at 27MHz, optimized for area, not speed. I could use a little insight from
someone who's done this before.

Brad Smallridge
b r a d @ a i v i s i o n . c o m
 
I also don't understand what you mean by "having your tool
retime them". I don't have Precision or any advanced tools here.
 
64 is 0+63+1
63 is 31+31+1
31 is 15+15+1
15 is 7+7+1
7 is 3+3+1
simple recursion

a few adder rows should be pretty quick and way less resources than
BlockRam, takes about 6 levels of small adders
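
A behavioral Python sketch of that recursion, just to check the level count. It models each step as "two half-width counts plus one extra bit" (the extra bit playing the role of a carry-in) and tracks how many adder levels are stacked up. The function name and the way even widths are handled are assumptions for illustration, not the poster's code.

```python
def count_ones(bits):
    """Return (number of ones, adder levels used) for a list of 0/1 bits."""
    n = len(bits)
    if n == 0:
        return 0, 0
    if n == 1:
        return bits[0], 0          # a single bit needs no adder
    if n % 2 == 0:
        # Even width (e.g. 64 = 63 + 1): peel one bit off and add it last.
        s, depth = count_ones(bits[:-1])
        return s + bits[-1], depth + 1
    # Odd width (e.g. 63 = 31 + 31 + 1): two halves plus one carry-in bit.
    half = (n - 1) // 2
    a, depth_a = count_ones(bits[:half])
    b, depth_b = count_ones(bits[half:2 * half])
    return a + b + bits[-1], max(depth_a, depth_b) + 1

word = [1 if (i * 7) % 3 == 0 else 0 for i in range(64)]
ones, levels = count_ones(word)
```

For 64 inputs this model gives 6 levels (5 for the 63-bit tree, plus one for the last bit), in line with the "about 6 levels" estimate.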
 
JD,

V4 is just like V2, V2P, V2P-X, and S3: it has traditional CMOS output
structures that have the diodes to ground and Vcco as part of the nfet
and pfet devices themselves.

"hot swap" means many different things to many people -

1. The most strict: insertion and removal of a device from a parallel
bus must not affect data being sent/received by others on the bus.
This is really tough. Even if the diodes aren't there (such as in a
competitor's part) there is still the power on/off of the IO and its
intrinsic capacitive loading (however small). At slow speeds this
works if the diodes are not present, but at high speeds the secondary
factors become primary, and even the "hot swap" part that claims full
compliance fails to meet the requirement of no glitches whatsoever.

2. Less strict: insertion and removal which uses a stepped or
sequenced connector. This is achievable. Our app notes detail these
solutions. They apply to V4 equally as to V2 or V2P. By sequencing
the connections, one can overcome the diode issue of clamping, and the
potential glitching issue by control of the pins prior to their mating.
Again, some engineering is required, but it does work.

3. Common: insertion and removal on a parallel bus that uses a
protocol to recognize insertion, and back off and retry (or ignore).
Nice, because you do nothing, and the system is designed to work even
if there are glitches.

4. Self-powering: since the diode to Vcco can be forward biased, and
the IO bank in the V4 needs 8 mA to power ON completely, the IO bank
can be powered from the wide parallel bus itself. A number of
customers figured this out (with our help), and their system backplanes
work this way. No glitches as the bus uses a very strong driver on
transmitting cards (which all together end up powering ON the IO banks
of inserted cards without glitching -- they are guaranteed to power on
tri-state before configuration).

5. MGTs, LVDS, or other point to point: here "hotswap" just means
that no damage is done when you insert/remove. And, no damage is done
to the MGTs on V2, V2P, or V4. Data isn't the issue (when the board is
unplugged, there is no point to point link!).

Hope this helps,

Austin
 
If I were to do it in Verilog, I might use
always @(posedge Clk27M)
TotalOnes <= in[0]+in[1]+in[2]+in[3]+in[4]+in[5]+in[6]+... and continue
typing until I reach +in[63];

The synthesizer MAY produce superb results.
If it grinds, split it into groups - 8 groups of 8 or 4 groups of 16 - and
add *those* values together as a multiple-value addition.
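
The grouping idea, modelled in Python only to check the bit widths (the HDL itself would still be a few wide additions). The function name and the default group size are assumptions for illustration.

```python
def popcount_grouped(bits, group_size=8):
    """Sum the bits in groups first, then add the group sums together."""
    groups = [bits[i:i + group_size] for i in range(0, len(bits), group_size)]
    partial = [sum(g) for g in groups]   # each partial sum is 0..group_size
    return partial, sum(partial)

partial, total = popcount_grouped([1] * 64)
width_partial = max(partial).bit_length()   # bits needed per group sum
width_total = total.bit_length()            # bits needed for the final sum
```

In the worst case (all 64 inputs high) each group of 8 sums to 8, which needs 4 bits, and the total of 64 needs 7 bits.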


"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote in message
news:11dr26q40k9876e@corp.supernews.com...
Yeah, I understand this. But I can't wrap my head around how to code it.

Do you do like this:
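if rising_edge(clk) then
  -- '&' and '+' have equal precedence in VHDL, so parenthesize each operand
  partial_sum1_2bit <= ('0' & bit0) + ('0' & bit1);
  partial_sum2_2bit <= ('0' & bit2) + ('0' & bit3);  -- next pair of bits
  partial_sum1_3bit <= ('0' & partial_sum1_2bit) + ('0' & partial_sum2_2bit);
  -- and so on
end if;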

And then there is the question of how this all synthesizes: for me, probably
at 27MHz, optimized for area, not speed. I could use a little insight from
someone who's done this before.

Brad Smallridge
b r a d @ a i v i s i o n . c o m
 
I would like to switch to Verilog, but not on this project.

If I were to do it in Verilog, I might use
always @(posedge Clk27M)
TotalOnes <= in[0]+in[1]+in[2]+in[3]+in[4]+in[5]+in[6]+... and continue
typing until I reach +in[63];
 
Vladislav, I agree. And the nicest thing is that you can fold two
BlockRAMs into one, by using the two ports independently. So one
BlockRAM takes care of 24 inputs and generates two sets of 4 bits each.
That means you need only 3 BlockRAMs for up to 72 inputs. (plus a few
CLBs to combine the outputs, unless you want to use two more BlockRAMs
to do that) 5 BlockRAMs total gives a 2-clock latency.
It all depends what you are after, speed or cost.
Peter Alfke
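
A Python model of the lookup-table approach described above, to make the numbers concrete. It assumes each BlockRAM port is used as a 4096 x 4 ROM (12 address inputs, 4-bit count out), which is what "24 inputs and two sets of 4 bits" per dual-port RAM implies. The names are hypothetical and the final addition stands in for the CLBs (or extra BlockRAMs) that combine the outputs.

```python
# ROM contents: the number of ones in each 12-bit address (0..12 fits in 4 bits).
ROM = [bin(addr).count("1") for addr in range(4096)]

def popcount72(bits):
    """Count ones in up to 72 bits using six 12-input lookups
    (two ports on each of three RAMs in the assumed arrangement)."""
    if len(bits) > 72:
        raise ValueError("at most 72 inputs")
    padded = list(bits) + [0] * (72 - len(bits))
    partial = []
    for i in range(0, 72, 12):
        addr = 0
        for b in padded[i:i + 12]:
            addr = (addr << 1) | b      # pack 12 inputs into one address
        partial.append(ROM[addr])       # one port read gives a 4-bit count
    return partial, sum(partial)

partial, total = popcount72([1, 1, 0, 1] * 16)   # 64 inputs, 48 of them high
```

Unused address inputs are tied to 0 here, so 64 inputs fit in the same 3-RAM arrangement as 72.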
 
Hi Peter,

Vladislav, I agree. And the nicest thing is that you can fold two
BlockRAMs into one, by using the two ports independently. So one
BlockRAM takes care of 24 inputs and generates two sets of 4 bits each.
That means you need only 3 BlockRAMs for up to 72 inputs. (plus a few
CLBs to combine the outputs, unless you want to use two more BlockRAMs
to do that) 5 BlockRAMs total gives a 2-clock latency.
It all depends what you are after, speed or cost.
I personally feel that using blockrams is a bit wasteful - I coded something
up in VHDL that used 144LEs in an Altera Cyclone 1, slowest speed grade,
running at 115MHz with two clocks of latency as well. No idea how big that
would be in a Spartan - my guess is that it would be similar.

Then again, if there's no LUTs left, and there's some leftover BRAMs, then
sure this is a great solution.

BTW: Peter, would you (plural) mind if I downloaded a WebPack so I can
compare?

Best regards,


Ben
 
Ben Twijnstra wrote:

Ben, you are correct, IF you need the block RAMs elsewhere in your
design, or if they are not located conveniently with respect to the
logic this is related to. Using LUTs, it can be done in 5 layers of
logic, which even without pipelining but with floorplanning will run
pretty quickly. If you pipeline it on every layer, it might even
out-perform the BRAM, but only if you are very careful about the
placement.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759
 
I wasn't suggesting you should switch to verilog, just the code that I
showed is Verilog but the concept should translate directly to VHDL. Add 64
1-bit values in a single VHDL line. If the synthesizer doesn't do a good
job, have eight lines of eight values each then add those 8 4-bit results in
one line to get your 7-bit result.

"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote in message
news:11dt99klhsiu2eb@corp.supernews.com...
I would like to switch to Verilog, but not on this project.

If I were to do it in Verilog, I might use
always @(posedge Clk27M)
TotalOnes <= in[0]+in[1]+in[2]+in[3]+in[4]+in[5]+in[6]+... and continue
typing until I reach +in[63];
 
