Nios II Going Live...

john jakson wrote:
<Snip>
I suspect that the several ASIC MT cpus that have recently come along
for the wireless set could well have the best int response, especially
one that runs 8 threads at 250MHz (or was it 400MHz), because the threads
run all the time, each on every 8th cycle. And these cpus don't have
context to swap since they have N contexts in ram.
The Ubicom part claims 0 or 1 cycles.

Technically Transputers don't have interrupts; that's too low a level
at which to look at them. But they do service events with an incredibly
quick response for a variety of reasons, though that was at 25MHz and
15 years ago.

Now the R3 cpu, also being a multithreaded (MT) cpu (and also now
running baby code BTW in the C model), could designate 1 of its 16 threads
to poll some HW and take the event home. That would mean about 20-50
cycles of computation might pass before Pn noticed it had to do some
work. If Pn can find a way to stay active in the IX engine without
branching (which causes a process swap, round-robin style) then it could
notice an event in <4 cycles. I don't think I will add support for an
always-stay-active process. Now when the process that does service an
interrupt does get its turn, it will have no registers to swap, but it
may have to take some cache misses while its working set is reloaded, but
that's transparent to MT. If it pans out at 250MHz in V2Pro it may or
may not have the fastest int response. It will however have the most
throughput of any FPGA cpu, bordering on 1.3clock Freq from the sim
traces. It loves branches and transfers and swapping; it's the nature
of the MT beastie.
In a hard-timesliced CPU, I can see two schemes for handling
interrupts, which would need slightly different hardware (no problem in an
FPGA-CPU :)
It is a CPU structure that would seem to fit well into FPGA resources.

- First scheme allows any/(first) available free timeslot to an
interrupt thread. This allows good granularity, but does not give the
smallest possible INT response.

- Other scheme is careful to leave every second time-slot free, for
possible INT. INT response/context sw is MUCH faster (1-2 clocks), but
cost is that other threads cannot have more than 50% of the CPU.
With time-sliced CPUs threads have zero time-crosstalk, but the peak
CPU usage for any single thread is lower.

In most embedded applications, bounding the slowest path, and reducing
jitter, can matter more than fastest-possible-speed over a short
distance numbers.

-jg
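
A rough way to compare the two schemes above is to model one rotation of the slot schedule and measure the worst-case wait for a slot the interrupt thread may use. This is only a sketch under assumptions that are not in the thread: an 8-slot rotation, one clock per slot, and an interrupt that can be served no earlier than the slot after it arrives. The function and variable names are made up for illustration.

```python
def worst_case_latency(slots):
    """Worst-case clocks from an interrupt arriving to the first slot
    that may run the interrupt thread.

    `slots` is one rotation of the schedule; True marks a slot the
    interrupt thread is allowed to use (assumed model, one clock per slot).
    """
    n = len(slots)
    if not any(slots):
        raise ValueError("no slot available to the interrupt thread")
    worst = 0
    for arrival in range(n):
        # The event arrives during slot `arrival`; the earliest it can
        # be served is the following slot.
        wait = 1
        while not slots[(arrival + wait) % n]:
            wait += 1
        worst = max(worst, wait)
    return worst


def background_share(slots):
    """Fraction of slots left to the ordinary (non-interrupt) threads."""
    return slots.count(False) / len(slots)


# Scheme 1 (assumed load): seven slots busy with ordinary threads, one free.
first_free = [False] * 7 + [True]
# Scheme 2: every second slot is kept free for a possible interrupt.
alternate = [False, True] * 4

print(worst_case_latency(first_free), background_share(first_free))
print(worst_case_latency(alternate), background_share(alternate))
```

With these assumed numbers the first scheme can wait up to 8 clocks but leaves 87.5% of the slots to ordinary threads, while the second waits at most 2 clocks and caps ordinary threads at 50%, which is the trade-off described in the post.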
 
lowest interrupt latency of any soft processor core
Interesting. Friends, what *is* this vaunted MicroBlaze interrupt latency
(in cycles or ns)? Is there some special mechanism, or is it simply clean
living?

(By the way, interrupt servicing (interrupt and return from interrupt)
completes in as few as 6 cycles on the good old xr16 soft processor core.)

Thanks,
Jan Gray
Gray Research LLC
 
Multiple embedded processors?

I built a Stratix design with 6 Nios' on it a long time ago. The
design from start to finish took less than half a day. Could have been
more if I had the space on the FPGA.

-- Pete

Processors, plural.

I'm still right.

Austin
 
Tim wrote:

and why are there so many transputer people in fpgaland?
May I rephrase it, and make a question out of it?
Are there any transputer look-alikes WORKING on an FPGA?
 
"E.S." wrote:
Tim wrote:

and why are there so many transputer people in fpgaland?

May I rephrase it, and make a question out of it?
Are there any transputer look-alikes WORKING on an FPGA?
Interesting question. I am giving some consideration to inserting my
own cpu into an FPGA to be programmed in Forth. I remember the
Transputer architecture and instruction set as being well suited to
implementing a stack language as well as being rather minimal. But I
can't find my copy of the instruction set reference manual. Anyone know
where I can find a copy?


--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX
 
rickman wrote:

Interesting question. I am giving some consideration to inserting my
own cpu into an FPGA to be programmed in Forth. I remember the
Transputer architecture and instruction set as being well suited to
implementing a stack language as well as being rather minimal. But I
can't find my copy of the instruction set reference manual. Anyone know
where I can find a copy?
It's a stack machine, but its model of how to use the stack is
fundamentally different from Forth's. The stack is very shallow and must
be *empty* at every branch! It's designed for expression evaluation in
Algol-like languages, nothing more.

The Transputer's intended strength is very fine-grained multitasking:
since it has almost no execution context to save, context switching is
very fast. They were also fairly radiation tolerant, so they were used
for some space applications. I'm sitting here watching telemetry from
HETE-2, which has four T805 CPU's on board.

In general, I would not recommend the Transputer architecture for most
applications. It basically combines the disadvantages of register and
stack machines. Its ratio of instruction/data accesses is very high
compared to other architectures, leading to serious inefficiency when
running ordinary serial code.

-jpd
 
"E.S." <emu@ecubics.com> wrote in message news:<ZlLrc.4206$zs2.1848@fe39.usenetserver.com>...
Tim wrote:

and why are there so many transputer people in fpgaland?

May I rephrase it, and make a question out of it?
Are there any transputer look-alikes WORKING on an FPGA?
Not yet. Somebody else over in the Tp NG said they were going to clone
the thing wholesale for code compatibility but didn't say how or when
they would (a student, I think).

Apart from me, I don't think anyone else has the inclination to try;
most people probably have a job, and the Tp has had enough trashing to
make a lot of people stay away. And the complicated UK/US thing is
there too.


The MT core I am working on will likely get Tp scheduling and message
passing and more, but it takes time. The core is only now running
simple threaded codes. Today it hit 100M cpu cycles on 16 threads (the
same trivial code on all), but it doesn't do much of interest yet until
more opcodes are implemented. It's only been 3+ months since I started. I
have to go back and bring the Verilog back into sync with the C RTL code
too. There are some big issues up ahead like the cache & TLB design;
sometimes MT helps a lot, maybe not always.


Now I hope you're not asking for a precise clone because, as you must
know, FPGAs don't clone anything well for which they were not
originally intended.

Anyway stick around to see posts on progress:)

regards

johnjakson_usa_com
 
rickman <spamgoeshere4@yahoo.com> wrote in message news:<40AFC172.C49E27FA@yahoo.com>...
"E.S." wrote:

Tim wrote:

and why are there so many transputer people in fpgaland?

May I rephrase it, and make a question out of it?
Are there any transputer look-alikes WORKING on an FPGA?

Interesting question. I am giving some consideration to inserting my
own cpu into an FPGA to be programmed in Forth. I remember the
Transputer architecture and instruction set as being well suited to
implementing a stack language as well as being rather minimal. But I
can't find my copy of the instruction set reference manual. Anyone know
where I can find a copy?


--

The Transputer instruction set is easy enough to find on the web; use
Google to find a few portals with all the links, like classic old
comps, wotug, etc.

In the c.s.transputer NG look for Ram and his home page and links, docs,
even OS stuff. The ISA also has a compiler writer's guide to explain
it.

I never looked at it myself, never will; there are too many things I
never liked about it: the byte encoding, and mostly that there are just
so many instructions for what is supposed to be a simple cpu. Later
on the whole kitchen sink fell into it, graphics rendering and so on.
Even related codes have odd hex codings, like they did a,b,c, x,y,z
then realized later to put in d, but right after z.

regards

johnjakson_usa_com
 
rickman <spamgoeshere4@yahoo.com> wrote in message news:<40AFC172.C49E27FA@yahoo.com>...
"E.S." wrote:

Tim wrote:

and why are there so many transputer people in fpgaland?

May I rephrase it, and make a question out of it?
Are there any transputer look-alikes WORKING on an FPGA?

Interesting question. I am giving some consideration to inserting my
own cpu into an FPGA to be programmed in Forth. I remember the
Transputer architecture and instruction set as being well suited to
implementing a stack language as well as being rather minimal. But I
can't find my copy of the instruction set reference manual. Anyone know
where I can find a copy?

I just realised, you probably don't need to look at the Transputer
because it has no sp type stack that you'd need for Forth, Pascal, C
etc. It does have a HW stack for evaluating expressions just like an HP RPN
calc, but it is only 3 regs deep and tied into the scheduler for
switching processes when it's empty, IIRC.

The reason of course is the same as for HDLs: do Verilog or VHDL allow you
to write code that is stack based or implies recursion? No. Occam,
being parallel, is more like an HDL; think HW and not sequential stack scopes.
For the same reason I have no sp either. In fact, although Occam allows
functions, IIRC it forbade recursion because the compiler had to know in
advance how to lay out memory, much as you'd floorplan cells.
Functions can still work without an sp, but other memory management
techniques are called for.

A google for Transputer & Forth should draw a blank I'd guess but I
could be wrong.

Yes it does look like they did it, well I wonder what they did?

regards

johnjakson_usa_com
 
john jakson wrote:

A google for Transputer & Forth should draw a blank I'd guess but I
could be wrong.

Yes it does look like they did it, well I wonder what they did?
It could be done, but it wouldn't be efficient. You'd have to implement
a separate software number stack.

-jpd
 
In comp.lang.forth John Doty <jpd@whispertel.losetheh.net> wrote:
john jakson wrote:

A google for Transputer & Forth should draw a blank I'd guess but I
could be wrong.

Yes it does look like they did it, well I wonder what they did?

It could be done, but it wouldn't be efficient.
I did it, and it was pretty fast. The top of the stack was cached in
the internal register stack and flushed to the workspace when full and
in a few other cases.

Andrew.
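
The caching approach described above can be sketched roughly as follows. This is not the actual Transputer Forth: the class name, the one-cell-at-a-time spill policy, and the use of a Python list to stand in for the workspace are all assumptions made for illustration, and the "few other cases" that also triggered a flush are not modelled.

```python
class CachedStack:
    """Data stack whose top cells live in a small register stack.

    Hypothetical model: `regs` stands in for the 3-deep hardware
    evaluation stack, `memory` for the workspace in RAM.
    """

    def __init__(self, depth=3):
        self.depth = depth
        self.regs = []     # cached top of stack; last item is the top
        self.memory = []   # spilled cells
        self.spills = 0    # how many cells had to be flushed

    def push(self, value):
        if len(self.regs) == self.depth:
            # Register stack is full: flush the deepest cached cell.
            self.memory.append(self.regs.pop(0))
            self.spills += 1
        self.regs.append(value)

    def pop(self):
        if not self.regs:
            if not self.memory:
                raise IndexError("stack underflow")
            # Cache is empty: refill one cell from memory.
            self.regs.append(self.memory.pop())
        return self.regs.pop()

    def add(self):
        """Forth `+`: replace the top two cells with their sum."""
        b = self.pop()
        a = self.pop()
        self.push(a + b)


s = CachedStack()
for v in (1, 2, 3, 4):   # the fourth push forces one spill
    s.push(v)
s.add()
s.add()
s.add()
print(s.pop(), s.spills)
```

In this model short expressions never touch memory at all, which is one plausible reading of why the result was "pretty fast"; deeper stacks pay one memory access per spilled or refilled cell.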
 
In article <adb3971c.0405222015.4821eb9@posting.google.com>,
john jakson <johnjakson@yahoo.com> wrote:

<SNIP>
I just realised, you probably don't need to look at the Transputer
because it has no sp type stack that you'd need for Forth, Pascal, C
etc. It does have a HW stack for eval expressions just like a HP RPN
calc but it is only 3 reg deep and tied into the scheduler for
switching processes when it's empty, IIRC.
It is not the stack (or a stack). These are registers at the lowest
level, almost a RISC way to write microcode explicitly.
The registers at the next-to-lowest level number 16, and you can add two
registers and put the result back in a third with 3 single-byte instructions.
You can use a couple of those for stack pointers, without exhausting
resources, like in a Pentium.

<SNIP>
A google for Transputer & Forth should draw a blank I'd guess but I
could be wrong.
Of course you are wrong.
You should know by now, that there is a Forth for every processor
that is over 6 months old.

tforth is the precursor of iforth, and it is still available from DFW.
It is a solid piece of work, if I may say so myself.
There are more.

Yes it does look like they did it, well I wonder what they did?

regards

johnjakson_usa_com
--
Albert van der Horst,Oranjestr 8,3511 RA UTRECHT,THE NETHERLANDS
One man-hour to invent,
One man-week to implement,
One lawyer-year to patent.
 
andrew29@littlepinkcloud.invalid wrote in message news:<10b0pl4kcqiot99@news.supernews.com>...
In comp.lang.forth John Doty <jpd@whispertel.losetheh.net> wrote:
john jakson wrote:

A google for Transputer & Forth should draw a blank I'd guess but I
could be wrong.

Yes it does look like they did it, well I wonder what they did?

It could be done, but it wouldn't be efficient.

I did it, and it was pretty fast. The top of the stack was cached in
the internal register stack and flushed to the workspace when full and
in a few other cases.

Andrew.
Well, people will find a way to do whatever crazy thing they have in mind :)

regards

johnjakson_usa_com
 
Peter,

I was referring to hard IP uP, not soft cores.

Austin

Peter Sommerfeld wrote:
Multiple embedded processors?

I built a Stratix design with 6 Nios' on it a long time ago. The
design from start to finish took less than half a day. Could have been
more if I had the space on the FPGA.

-- Pete


Processors, plural.

I'm still right.

Austin
 
albert@spenarnc.xs4all.nl (Albert van der Horst) wrote in message news:<Hy5wHL.1nz.1.spenarn@spenarnc.xs4all.nl>...
In article <adb3971c.0405222015.4821eb9@posting.google.com>,
john jakson <johnjakson@yahoo.com> wrote:

SNIP
I just realised, you probably don't need to look at the Transputer
because it has no sp type stack that you'd need for Forth, Pascal, C
etc. It does have a HW stack for eval expressions just like a HP RPN
calc but it is only 3 reg deep and tied into the scheduler for
switching processes when it's empty, IIRC.

It is not the stack (or a stack). These are registers at the lowest
level, almost a RISC way to write microcode explicitly.
The registers at the next-to-lowest level number 16, and you can add two
registers and put the result back in a third with 3 single-byte instructions.
You can use a couple of those for stack pointers, without exhausting
resources, like in a Pentium.

SNIP

A google for Transputer & Forth should draw a blank I'd guess but I
could be wrong.

Of course you are wrong.
You should know by now, that there is a Forth for every processor
that is over 6 months old.
Of course I checked on the next line before signing off, and saw tforth and others.

tforth is the precursor of iforth, and it is still available from DFW.
It is a solid piece of work, if I may say so myself.
There are more.

Yes it does look like they did it, well I wonder what they did?

regards

johnjakson_usa_com
 
Jesse Kempa wrote:

As an example, the user can debug many (we have tested up to
8) processorS (plural) simultaneously via a single JTAG connection and
a nice IDE environment.
This is where we are today, but it just doesn't play to the strength of
an FPGA. It's a bit like having one (or 8) block RAM. The FPGA really
gets rolling when we can make parallel use of a shed-load of resources.
Just as nothing can beat the bandwidth of a big FPGA with all the block
RAMs going in parallel, nothing will be able to touch an FPGA with lots
of application-tailored CPUs.

Not great for evaluating spreadsheets, but pretty good in other domains.

I once had a project (paper only) where each processor's instruction
stream was scanned for the opcodes used, then the corresponding FPGA
processor implementation was modified to match the usage.
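
The scan-then-tailor idea above might look something like this in outline. Everything concrete here is hypothetical: the assembly syntax, the mnemonics, and the `UNIT_OF` table mapping opcodes to functional units are invented for the sketch, and a real flow would feed the result into the HDL configuration of the core rather than print it.

```python
from collections import Counter

# Hypothetical mapping from mnemonic to the functional unit implementing it.
UNIT_OF = {
    "add": "alu", "sub": "alu",
    "mul": "multiplier",
    "shl": "shifter",
    "ld": "load_store", "st": "load_store",
    "br": "branch",
}


def opcodes_used(program):
    """Count how often each mnemonic appears in an assembly-like listing."""
    counts = Counter()
    for line in program.splitlines():
        code = line.split(";", 1)[0].strip()   # drop trailing comments
        if code:
            counts[code.split()[0].lower()] += 1
    return counts


def units_to_keep(counts, unit_of):
    """Functional units needed by the opcodes that actually occur."""
    return {unit_of[op] for op in counts}


program = """
    ld  r1, 0(r2)   ; fetch operand
    add r1, r1, r3
    st  r1, 0(r2)
    br  loop
"""

counts = opcodes_used(program)
kept = units_to_keep(counts, UNIT_OF)
dropped = set(UNIT_OF.values()) - kept
print(sorted(kept), sorted(dropped))
```

For this toy program the multiplier and shifter are never used, so a tailored core for this processor could leave them out.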
 
In article <c8tdmh$8o2$1$8300dec7@news.demon.co.uk>,
Tim <tim@rockylogic.com.nooospam.com> wrote:
Jesse Kempa wrote:

As an example, the user can debug many (we have tested up to
8) processorS (plural) simultaneously via a single JTAG connection and
a nice IDE environment.

This is where we are today, but it just doesn't play to the strength of
an FPGA. It's a bit like having one (or 8) block RAM. The FPGA really
gets rolling when we can make parallel use of a shed-load of resources.
Just as nothing can beat the bandwidth of a big FPGA with all the block
RAMs going in parallel, nothing will be able to touch an FPGA with lots
of application-tailored CPUs.

Not great for evaluating spreadsheets, but pretty good in other domains.
Surely spreadsheets are pretty much infinitely parallel once you've
split up the dependency graph for the cells among the various
processors ... word processing is the bit I have more trouble thinking
how to divide among a myriad processors, not that anyone types fast
enough for that to matter :)

Tom
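
One way to see how much parallelism a spreadsheet's dependency graph offers is to group cells into levels, where each level depends only on earlier ones. The sketch below assumes the dependencies are already extracted into a dictionary; the cell names and the `parallel_levels` helper are made up for illustration, and every referenced cell is assumed to appear as a key.

```python
def parallel_levels(deps):
    """Group cells into levels that can each be evaluated in parallel.

    `deps` maps a cell name to the cells its formula reads. Every cell
    in a level depends only on cells in earlier levels.
    """
    remaining = {cell: set(d) for cell, d in deps.items()}
    done = set()
    levels = []
    while remaining:
        ready = sorted(c for c, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("circular reference")
        levels.append(ready)
        done.update(ready)
        for c in ready:
            del remaining[c]
    return levels


deps = {
    "A1": [],
    "A2": [],
    "B1": ["A1", "A2"],
    "B2": ["A2"],
    "C1": ["B1", "B2"],
}
print(parallel_levels(deps))
```

The width of each level is the number of processors that can usefully work at once, and the number of levels is the length of the longest dependency chain, so a sheet made of many independent columns divides up far better than one long running total.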
 
"john jakson" <johnjakson@yahoo.com> wrote in message
news:adb3971c.0405211754.52bb304c@posting.google.com...

3) Its creators are British.


Perhaps I am doomed to fail on all 3 counts.

Anyway I may be a US citizen before this thing gets polished and can
deny the last rule as everything important has to seem to be invented
or reinvented in the US- (sadly).

Since my math isn't so great maybe I can deny the 2nd rule too:).
^^^^

John, looks like you're most of the way there ;-)

I still don't understand why Americans shorten mathematics
to 'math'.


Nial.
 
"Nial Stewart" <nial@nialstewartdevelopments.co.uk> wrote in message news:<40b36e96$0$4587$db0fefd9@news.zen.co.uk>...
"john jakson" <johnjakson@yahoo.com> wrote in message
news:adb3971c.0405211754.52bb304c@posting.google.com...

3) Its creators are British.


Perhaps I am doomed to fail on all 3 counts.

Anyway I may be a US citizen before this thing gets polished and can
deny the last rule as everything important has to seem to be invented
or reinvented in the US- (sadly).

Since my math isn't so great maybe I can deny the 2nd rule too:).
^^^^

John, looks like you're most of the way there ;-)

I still don't understand why Americans shorten mathematics
to 'math'.


Nial.
I don't know either, but I think it's because I don't ever hear the
term arithmetic used at kindergarten level like we did in the UK, so math
got pushed down to cover that and never got explained as being a more
serious term when they grow out of it. And where did all the u's go
too :)

regards

johnjakson_usa_com
 
