Dual-stack (Forth) processors

jzakiya@mail.com (Jabari Zakiya) wrote in message news:<a6fa4973.0402181055.27856c3@posting.google.com>...
Corrections:

The RTX 2000 had two 16-bit 256 element deep stacks (Return & Data),
a 2-4 cycle interrupt response time, and a bit-multiply instruction which
could perform a complete general purpose multiply in 16 cycles. It was
rated at 8 MHz (but they could easily run at 10 MHz [which meant it took
a 20 MHz clock], at least at room temperatures).
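
A multiply-step instruction of that kind retires one multiplier bit per
cycle, so sixteen steps produce the full 16x16 product. Here is a rough
C sketch of the shift-and-add idea (my own illustration, not the actual
RTX datapath):

#include <stdint.h>

/* 16x16 -> 32-bit multiply, one conditional add-and-shift per step,
   mimicking a 16-cycle multiply-step loop. Illustrative only. */
uint32_t mul16x16(uint16_t a, uint16_t b)
{
    uint32_t acc = 0;
    uint32_t addend = a;          /* multiplicand, shifted left each step */
    for (int i = 0; i < 16; i++) {
        if (b & 1u)
            acc += addend;        /* add if this multiplier bit is set */
        addend <<= 1;
        b >>= 1;                  /* expose the next multiplier bit */
    }
    return acc;
}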

The RTX 2010 had all of the above, plus a one-cycle hardware 16-bit
multiply, a one-cycle 16-bit multiply/accumulate, and a one-cycle
32-bit barrel shift. This was the version that Harris/Intersil based
the radhard version upon, which NASA and APL (Applied Physics Lab in
Columbia, MD) used for their space missions. They both still have a
stash left, the last I heard.

The RTX 2001 was a watered down version which was basically the 2000,
but with only 64 element deep stacks. It was intended (according to
Harris) to be a cheaper/faster alternative to the 2000, but like the
Celeron vs the Pentium, if you can get the real thing at basically the
same price, why use the neutered version? Plus, the reduction of the
stacks from 256 elements to 64 elements greatly reduced the ability
to do multi-tasking and stack switching.

I used the RTX 2000/2010 extensively when I worked at NASA GSFC
(Goddard Space Flight Center in Greenbelt, MD) from 1979-1994.
I had two RTX boards. One was a rather expensive six-layer board
with a Meg of SRAM and a shared memory interface to a PC ISA
bus. It was from Silicon Composers. The other was one of the cheap
European Indelko Forthkits, with RTX-cmForth, that I got from Dr.
Ting. I had no experience with the 2010. I didn't remember that
the 2001 had smaller stacks than the 2000 but I seemed to recall that
the 2000 had a single cycle multiply and the 2001 had only the
multiply step instruction. I no longer have the boards or the
manuals and I don't think that Dr. Koopman's book goes into the
details of what made the various models of RTX-20xx different.

It was a long time ago, so I might have been confused about bit
level details after all of these years. I spent a lot more years
working with P21, I21 and F21 and have a much better memory of
the bit level details there; it was also more recent.

I hope this helps set the history straight with regards to the differences
between the RTX versions. Too bad Harris didn't know how to market them.

Jabari Zakiya
Harris seemed to try first marketing it as a Forth chip, then, failing
at that, as a good realtime computer for use with C. I have often
heard that it was too bad that they didn't know how to market it
properly. Still, I don't know if anyone really knows what they
should-could-would have done to market it more successfully. They
simply decided that they could easily market the 80C286 that they
could make on the same fab line. It also helps date those chips:
Novix vs the 8088/8086, and RTX vs the 80286. The realtime response,
(relatively) fast interrupt handling, and deterministic timing
were where they won most easily, but they weren't 'backward
compatible' with PC software like the Intel compatible chips,
so they were swimming upstream in their marketing efforts.

Best Wishes
 
Maybe

http://www.microcore.org
http://jpb.forth.free.fr/

will be of interest to you

Roman

"Davka" <mygarbagepail@hotmail.com> wrote in message
news:T%XXb.70$pM3.121810@news.uswest.net...
Is there a community that is actively involved in discussing and/or
developing FPGA-based Forth chips, or more generally, stack
machines?
 
Jeff Fox wrote:
Harris seemed to try first marketing it as a Forth chip, then, failing
at that, as a good realtime computer for use with C. I have often
heard that it was too bad that they didn't know how to market it
properly. Still, I don't know if anyone really knows what they
should-could-would have done to market it more successfully. They
simply decided that they could easily market the 80C286 that they
could make on the same fab line. It also helps date those chips:
Novix vs the 8088/8086, and RTX vs the 80286. The realtime response,
(relatively) fast interrupt handling, and deterministic timing
were where they won most easily, but they weren't 'backward
compatible' with PC software like the Intel compatible chips,
so they were swimming upstream in their marketing efforts.
I can't say for sure exactly what kind of marketing would have helped
the RTX succeed. But I can tell you that any effort to pit it against
the x86 line was misdirected. The x86 parts were not really embedded
chips and I don't recall them being used as such very often. My memory
may be failing me on this, since this was long before there were chips
aimed at the embedded market. But the Z80 and 8085 would have been the
main competition for an embedded processor. The x86 line used too much
board space and cost too much for most apps.

It is likely that Harris did not understand, as you do, the
significant factors in embedded, realtime work. More than once a
vendor has needed to educate the engineering community about the
features that make its products a better way to go.

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX
 
Sander Vesik wrote:
In comp.arch Martin Schoeberl <martin.schoeberl@chello.at> wrote:
An instruction (except nop type) needs either read or write access to
the stack ram. Access to local variables, also residing in the stack,
needs simultaneous read and write access. As an example, ld0 loads the
memory word pointed to by vp onto TOS:
stack[vp+0] => A
A => B
B => stack[sp+1]
sp+1 => sp

This configuration fits perfectly with the block rams with one read and
one write port that are common in FPGAs. A standard RISC CPU needs
three data ports (two read and one write) to implement the register
file in a ram, and usually one more pipeline stage for the ALU result
to avoid adding the memory access time to the ALU delay time. And for
single-cycle execution you need a lot of muxes for data forwarding.

yes, the block rams (with registers implemented in those) make FPGAs
have an interesting tradeoff -
* you can have a large number of registers with no penalty
* you are very limited in the number of read/write ports, and
adding more does not scale *AT ALL*
The older Xilinx parts have small LUT based rams that provide a true
3-port memory. But how they implemented it shows that you can always
make a 3-port memory from a pair of two-port memories. They tied the
two write ports together so that the two RAMs were always written with
the same data. But the read ports were kept separate, allowing any two
words to be read at the same time.
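
As a concrete illustration of that trick, here is a small behavioural
model in C (my own sketch, not vendor code) of a two-read, one-write
register file built from a pair of dual-port RAMs:

#include <stdint.h>

#define DEPTH 16

/* Two dual-port RAMs with their write ports tied together. Both banks
   always hold identical contents, so each read port can be served
   from its own bank - a 3-port memory made of 2-port parts. */
typedef struct {
    uint16_t bank_a[DEPTH];
    uint16_t bank_b[DEPTH];
} regfile3p;

/* The single write port: the same data goes into both banks. */
static void rf_write(regfile3p *rf, unsigned addr, uint16_t data)
{
    rf->bank_a[addr] = data;
    rf->bank_b[addr] = data;
}

/* Two independent read ports: any two words at the same time. */
static void rf_read2(const regfile3p *rf, unsigned ra, unsigned rb,
                     uint16_t *da, uint16_t *db)
{
    *da = rf->bank_a[ra];
    *db = rf->bank_b[rb];
}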


In summary: in my opinion a stack architecture is a perfect choice for
the limited hardware resources in an FPGA.

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX
 
"Sander Vesik" <sander@haldjas.folklore.ee> schrieb im Newsbeitrag
news:1077208019.903795@haldjas.folklore.ee...
In comp.arch Martin Schoeberl <martin.schoeberl@chello.at> wrote:
"Sander Vesik" <sander@haldjas.folklore.ee> schrieb im Newsbeitrag
news:1077074083.882919@haldjas.folklore.ee...
In comp.arch Martin Schoeberl <martin.schoeberl@chello.at
wrote:
Is there a community that is actively involved in discussing
and/or
developing FPGA-based Forth chips, or more generally, stack
machines?


Tha Java Virtual Machine is stack based. There are some
projects
to
build a 'real' Java machine. You can find more information
about a
solution in an FPGA (with VHDL source) at:
http://www.jopdesign.com/

It is sucessfully implemented in Altera ACEX 1K50, Cyclone
(EP1C6)
and
Xilinx Spartan2.

It would be intresting to see results for a version that cached
the
top of the stackand used a more realistic memory interface

Hallo Sander,

In this design the stack is cached in a multi-level hierarchy:

TOS and TOS-1 are implemented as registers A and B. The next level of
the stack is local memory that is connected as follows: data in is
connected to A and B, the output of the memory to the input of
register B.
Every arithmetic/logical operation is performed with A and B as source
and A as destination. All load operations (local variables, internal
registers, external memory and periphery) result in the value loaded
into A. Therefore no write-back pipeline stage is necessary. A is also
the source for store operations. Register B is never accessed directly.
It is read as an implicit operand or for stack spill on push
instructions, and written during stack spill and fill.
This configuration allows the following operations in a single
pipeline stage:
ALU operation
write back result
fill from or spill to the stack memory

The dataflow for an ALU operation is:
A op B => A
stack[sp] => B
sp-1 => sp

and for a 'load' operation:
data => A
A => B
B => stack[sp+1]
sp+1 => sp
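
To make those register transfers concrete, here is a small behavioural
C model of the dataflow above (my own sketch against the description
here, not the JOP sources; the function names and the 32-bit width are
my assumptions):

#include <stdint.h>

#define STACK_DEPTH 128

/* A = TOS, B = TOS-1, deeper elements live in the on-chip stack RAM.
   Each function models the transfers of one single-cycle instruction;
   in hardware they happen in parallel, so in C the spill must be
   written out before B is overwritten. Initialization not shown. */
typedef struct {
    uint32_t a, b;                /* TOS and TOS-1 registers */
    uint32_t stack[STACK_DEPTH];  /* one read port, one write port */
    int sp;                       /* points at TOS-2 */
} stack_cache;

/* ALU op:  A op B => A,  stack[sp] => B,  sp-1 => sp  (stack fill) */
static void alu_add(stack_cache *s)
{
    s->a = s->a + s->b;           /* A op B => A */
    s->b = s->stack[s->sp];       /* fill B through the read port */
    s->sp -= 1;
}

/* Load:  data => A,  A => B,  B => stack[sp+1],  sp+1 => sp  (spill) */
static void load(stack_cache *s, uint32_t data)
{
    s->stack[s->sp + 1] = s->b;   /* spill B through the write port */
    s->b = s->a;
    s->a = data;                  /* loaded value lands in A */
    s->sp += 1;
}

Note that each instruction touches the stack RAM through only one
port, which is the point made next.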

An instruction (except nop type) needs either read or write access to
the stack ram. Access to local variables, also residing in the stack,
needs simultaneous read and write access. As an example, ld0 loads the
memory word pointed to by vp onto TOS:
stack[vp+0] => A
A => B
B => stack[sp+1]
sp+1 => sp

This configuration fits perfectly with the block rams with one read and
one write port that are common in FPGAs. A standard RISC CPU needs
three data ports (two read and one write) to implement the register
file in a ram, and usually one more pipeline stage for the ALU result
to avoid adding the memory access time to the ALU delay time. And for
single-cycle execution you need a lot of muxes for data forwarding.

yes, the block rams (with registers implemented in those) make FPGAs
have an interesting tradeoff -
* you can have a large number of registers with no penalty
* you are very limited in the number of read/write ports, and
adding more does not scale *AT ALL*

As a RISC processor needs two read ports and only one write port, it
is easy (with a little waste) to achieve this with the typical FPGA
memories. You have to double the memory and write to both blocks; then
you can use two independent read ports.
But there are still some minor problems: current block rams don't
allow read during write or unregistered access. This adds more
pipeline stages (with more data forwarding) to the design. With only
stack spill/fill you can calculate the stack address early in the
pipeline, and then you don't need extra pipeline stages in a stack
based design.
And with a 'large' on-chip stack it can be seen as a very elegant and
simple (no tag rams) data cache.

In summary: in my opinion a stack architecture is a perfect choice
for the limited hardware resources in an FPGA.

About the 'more realistic memory interface':

I don't see the problem. The main memory interface is a separate block
and currently there are three different implementations for different
boards: a low cost version with slow 8 bit ram, a 32 bit interface for
fast async ram, and Ed Anuff added a 16 bit interface for the Xilinx
version on a BurchED board. Feel free to implement your interface of
choice (SDRAM, ...).

I see - it's just that the benchmark numbers used only the simple
8-bit interface.

I really should update this web page (it is soo old...).

Sorry for the long mail, but I could not resist 'defending' my design
;-)

Martin



--
Sander

+++ Out of cheese error +++
 
In comp.arch Martin Schoeberl <martin.schoeberl@chello.at> wrote:
[Martin's description of the JOP stack cache and memory interface,
quoted in full in the message above, snipped]

It would be interesting to see results for a version that cached the
top of the stack and used a more realistic memory interface.

yes, the block rams (with registers implemented in those) make FPGAs
have an interesting tradeoff -
* you can have a large number of registers with no penalty
* you are very limited in the number of read/write ports, and
adding more does not scale *AT ALL*

I see - it's just that the benchmark numbers used only the simple
8-bit interface.
--
Sander

+++ Out of cheese error +++
 
fox@ultratechnology.com (Jeff Fox) wrote in message news:<4fbeeb5a.0402182345.3a2f3fa0@posting.google.com>...
[Jeff Fox's message, quoted in full at the top of the thread, snipped]
I know it was some time ago now, but only the RTX 2010 had the
hardware one-cycle multiply/accumulate, etc. instructions. The RTX
2000/1 had the one-bit 16-cycle multiply/divide instructions.

Someone recently posted (Jan or Feb 04) the urls to get the pdfs of
the RTX 2000 manual and the RTX 2010 Intersil users guide. I have both
of them. If you can't find them on c.l.f. let me know and I'll email
them to you.

I also used the Silicon Composers boards, with the RTX 2000s. I used
them primarily to test/compile code for our embedded systems, and to
play around with for other projects. A good thing about the RTX
2000/10 was that they were pin-for-pin compatible, and I could hand
code the extra 2010 instructions with the SC development software.
Piece of cake.

The 2000/10 were way ahead of their time (as was the original Novix).
You also got 32-to-16 bit square roots, one-cycle streaming memory
instructions, the ability to partition memory into USER segments which
could be accessed faster than regular ram space, stack partitioning,
3 counter/timers, and fast external interrupt response, and the clock
could be completely turned off and back on again without losing
instructions, to run on as little power as you could get away with.
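
A square root like that is typically computed one result bit per step,
sixteen steps for a 32-to-16 bit root. Here is a C sketch of the
classic bit-serial algorithm (my own illustration, not the RTX
microcode):

#include <stdint.h>

/* 32-bit -> 16-bit integer square root, one result bit per
   iteration (16 iterations), illustrating the step-per-bit idea. */
uint16_t isqrt32(uint32_t x)
{
    uint32_t root = 0;
    uint32_t bit = 1UL << 30;     /* highest even power of two */

    while (bit != 0) {
        if (x >= root + bit) {
            x -= root + bit;      /* accept this result bit */
            root = (root >> 1) + bit;
        } else {
            root >>= 1;           /* reject it */
        }
        bit >>= 2;
    }
    return (uint16_t)root;        /* floor(sqrt(x)) */
}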

If only somebody like Motorola had taken hold of these chips. With
their fabrication capabilities and cpu marketing savvy, they could
have really done something with these designs. And think about just a
500 MHz RTX type chip now. Even then, the RTX performed at multiple
MIPS times its operating frequency (because of instruction
compression - multiple Forth words being performed by one RTX
instruction).

But of course, I'm just engaging in ex post facto wishful thinking.

Yours

Jabari
 
Jabari,

What's annoying about bottom posting?

Bottom-posting-gone-wild with a full thread quoted in the top of the
message. Like yours.

At least with top posting I can very quickly click through a bunch of
messages and read the new post. With bottom-posting-gone-wild you have to
scroll to the bottom of every darn message to find the new text. Not only a
waste of bandwidth but annoying as hell. See below for an example.

With top posting, if you are reading through a thread, you can quickly
navigate up/down the thread and read it without further calisthenics.

If you want to bottom post (which is appropriate when you need paragraphs to
be in context, or to address specific statements) at least take the time to
clip and snip the relevant parts of the message you are replying to.


--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Martin Euredjian

To send private email:
0_0_0_0_@pacbell.net
where
"0_0_0_0_" = "martineu"




"Jabari Zakiya" <jzakiya@mail.com> wrote in message
news:a6fa4973.0402191126.5e7326ba@posting.google.com...
fox@ultratechnology.com (Jeff Fox) wrote in message
news:<4fbeeb5a.0402182345.3a2f3fa0@posting.google.com>...
jzakiya@mail.com (Jabari Zakiya) wrote in message
news:<a6fa4973.0402181055.27856c3@posting.google.com>...
Corrections:

The RTX 2000 had two 16-bit 256 element deep stacks (Return & Data),
a 2-4 cycle interrupt response time, and a bit-mutiply instruction
which
could perform a complete general purpose multiply in 16-cycles. It was
rated a 8 MHz (but they could easily run at 10 MHz [which meant it
took
a 20 MHz clock] at least at room temperatures).

The RTX 2010 had all of the above, plus a one-cycle hardware 16-bit
multiply, a one-cycle 16-bit multiply/accumulate, and a one-cycle
32-bit barrel shift. This was the version that Harris/Intersil based
the radhard version upon, which NASA and APL (Applied Physics Lab in
Columbia, MD) used for its space missions. They both still have a
stash
left, the last that I heard.

The RTX 2001 was a watered down version which was basically the 2000,
but with only 64 element deep stacks. It was intended (according to
Harris) to be a cheaper/faster alternative to the 2000, but like the
Celeron vs the Pentium, if you can get the real thing at basically the
same price, why use the neutered version? Plus, the reduction of
stacks
from 256 elements to 64 element greatly reduced the ability to do
multi-tasking and stack switching.

I used the RTX 2000/2010 extensively when I worked at NASA GSFC
Goddard Space Flight Center in Greenbelt, MD) from 1979-1994.

I had two RTX boards. One was a rather expensive board six layer
board with a Meg of SRAM and a shared memory interface to a PC ISA
bus. It was from Silicon Composers. The other was one of the cheap
European Indelko Forthkits, with RTX-cmForth, that I got from Dr.
Ting. I had no experience with the 2010. I didn't remember that
the 2001 had smaller stacks than the 2000 but I seemed to recall that
the 2000 had a single cycle multiply and the 2001 had only the
multiply step instruction. I no longer have the boards or the
manuals and I don't think that Dr. Koopman's book goes into the
details of what made the various models of RTX-20xx different.

It was a long time ago, so I might have been confused about bit
level details after all of these years. I spent a lot more years
working with P21, I21 and F21 and have a much better memory of
the bit level details there, it was also more recent.

I hope this helps set the history straight with regards to the
differences
between the RTX versions. Too bad Harris didn't know how to market
them.

Jabari Zakiya

Harris seemed to try first marketing it as Forth chip, then failing
at that as a good realtime computer for use with C. I have often
heard that it was too bad that they didn't know how to market it
properly. Still I don't know if anyone really knows what they
should-could-would have done to market it more successfully. They
simply decided that they could easily market 80C286 that they
could make on the same fab line. It also helps date those chips,
Novix vs 8088, 8086 and RTX vs 80286. The realtime response,
fast interrupt handling (relatively) and deterministic timing
were where they won most easily, but they weren't 'backward
compatible' with PC software like the Intel compatible chips
so they were swimming upstream in their marketing efforts.

Best Wishes

I know it was some time ago now, but only the RTX 2010 had the
hardware
one-cylce mutiply/accumulate, etc instructions. The RTX 2000/1 had the
one-bit 16-cycle multiply/divide instructions.

Someone recently posted (Jan or Feb 04) the urls to get the pdfs of
the
RTX 2000 manual and RTX 2010 Intersil users guide. I have both of
them.
If you can't find them on c.l.f. let me know and I'll email them to
you.

I also used the Silicon Composer boards, with the RTX 2000s. I used
them primarily to test/compile code for our embedded systems, and to
play around with for other projects. A good thing about the RTX
2000/10
were they were pin-for-pin compatible, and I could hand code the extra
2010 instructions with the SC development software. Piece of cake.

The 2000/10 were way ahead of their times (as was the original Novix).
Not only could you also do 32-to-16 bit squareroots, one-cycle
streaming
memory instructions, partition memory into USER segments which could
be accessed faster than regular ram space, stack partitioning, 3
counter
timers, fast external interrupt response, the clock could be
completely
turned off, and back on again without losing instructions, to run on
as
littel power as you could get away with.

If only somebody like Motorola had taken hold of these chips. With
their
fabrication capabilities, and cpu marketing savy, they couldl have
really
done something with these desigsn. And think about just a 500 MHz RTX
type
chip know. Even then, the RTX performed at mulitple MIPs times its
operating
frequency (because of instructions compression - multiple Forth words
being performed by one RTX instruction)

But of course, I'm just engaging in ex post factor wishful thinking.

Yours

Jabari

Imagine if you had to scroll down to here on every new message!!!
 
Martin Euredjian wrote:

[Martin's message, quoted in full above, snipped]
When I reply to a top-posted message with a sig -- like yours --
everything below the sig is gone, as here. The _)(*&^%$#@!! news program
snips it all silently.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
 
Martin Euredjian wrote:

[Martin's message, quoted in full above, snipped]
Of course, I can restore what the )(*&^%$#@!! browser snipped by
copying and pasting, but I lose a level of quote indentation. Here's a
hint to ease your pain: <ctrl>+<end> brings you right to the end in many
news readers.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

[Martin's full message and the quoted thread, restored below the sig
by copy-and-paste, snipped]
 
rickman <spamgoeshere4@yahoo.com> wrote in message news:<4034D754.C3D2264@yahoo.com>...

I can't say for sure exactly what kind of marketing would have helped
the RTX succeed. But I can tell you that any effort to pit it against
the x86 line was misdirected. The x86 parts were not really embedded
chips and I don't recall them being used as such very often. My memory
may be failing me on this, since this was long before there were chips
aimed at the embedded market. But the Z80 and 8085 would have been the
main competition for an embedded processor. The x86 line used too much
board space and cost too much for most apps.
I never used the 80186, but I have heard from people who did use
it in embedded work, and later the 386e was clearly aimed at and
used for some embedded work (though not by me).
Technically, comparing the RTX to the 8085 or Z80 seemed compelling
but didn't seem to account for much. (Rocket scientists excepted.)

It is likely that Harris did not understand, as you do, the
significant factors in embedded, realtime work. More than once a
vendor has needed to educate the engineering community about the
features that make its products a better way to go.
While you might be right, as we can only guess about such things,
I think that since we know that at least Philip Koopman was
working there, they did have people with a deep understanding
of the significant factors in embedded, realtime work. What I
see as the bigger issue is not whether they understood such things
but whether their intended customers understood such things. I
suspect that they didn't, and that Harris was not able to educate
them with their marketing effort. I think that is the real and
bigger problem. It was not that they didn't understand, but that
they were up against and overwhelmed by pervasive marketing
information that was counter to their message.

Perhaps if they had spent billions on marketing to even the
playing field they would-could-should have sold more chips,
but it would have been a big gamble and one that was unlikely
to return sufficient profit to justify it. Instead they
could just switch the fab line over to 80C286 and ride the
marketing wave created by so many other companies as long
as it lasted.

I often hear that Harris just didn't understand what they
should have done but have yet to hear a reasonable suggestion
of what they would-could-should have done to successfully
market the RTX instead of 286. I keep asking people and
have yet to have a marketing expert give a good answer.
Maybe there is one and I would like to know it if there is.

Best Wishes
 
Jeff Fox wrote:

...

I never used the 80186, but I have heard from people who did use
it in embedded work, and later the 386e was clearly aimed at and
used for some embedded work (though not by me).
Technically, comparing the RTX to the 8085 or Z80 seemed compelling
but didn't seem to account for much. (Rocket scientists excepted.)
...

The 80186 was indeed designed for embedded work, including some on-chip
peripherals. I forget what about it made it awkward to use compared to
the Z-80 and 6809, but I remember feeling that way. Probably my lack of
familiarity bordering on ignorance.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
 
Martin Euredjian wrote:
[Martin's full message, with the whole thread quoted inside it,
snipped]
--- mixed top and bottom posting fixed ---

Personally, I don't care one way or another how people post; I'm not
interested in getting into the top/bottom posting wars. But mixed
posting is even worse. I know you were trying to make a point, but it
is of no value. People will do what they want to do no matter how many
people tell them to stop.

Besides, it is only a single key stroke to go to the bottom of a page.
Is that really a big problem?

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX
 
Jerry Avins wrote:
Jeff Fox wrote:

[Jeff's comments on the 80186, quoted above, snipped]

The 80186 was indeed designed for embedded work, including some on-chip
peripherals. I forget what about it made it awkward to use compared to
the Z-80 and 6809, but I remember feeling that way. Probably my lack of
familiarity bordering on ignorance.
I don't think it was awkward compared to any of the 8 bitters. But it
was not *PC* compatible because the IO map was different. I guess back
then everyone either wanted a lower priced PC equivalent or they wanted
more MIPs and the 186 did neither.

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX
 
"Jerry Avins" <jya@ieee.org> wrote in message
news:4035159f$0$3078$61fed72c@news.rcn.com...
When I reply to a top-posted message with a sig -- like yours --
everything below the sig is gone, as here. The _)(*&^%$#@!! news program
snips it all silently.

Jerry
What newsreader? Sounds like a bug, or at least something that should have
a switch to defeat it.
 
rickman <spamgoeshere4@yahoo.com> writes:

[nested quotes about the 80186 snipped]

I don't think it was awkward compared to any of the 8 bitters.
Are you guys kidding? It was that frickin' "segmented architecture" that
was a pain in the ?##. I believe both the Z80 and 6809 were flat memory
spaces, weren't they? Give me an 8085 any day of the 80's.

--RY


--
% Randy Yates % "My Shangri-la has gone away, fading like
%% Fuquay-Varina, NC % the Beatles on 'Hey Jude'"
%%% 919-577-9882 %
%%%% <yates@ieee.org> % 'Shangri-La', *A New World Record*, ELO
http://home.earthlink.net/~yatescr
 
Randy Yates wrote:

rickman <spamgoeshere4@yahoo.com> writes:
...

I don't think it was awkward compared to any of the 8 bitters.


Are you guys kidding? It was that frickin' "segmented architecture" that
was a pain in the ?##. I believe both the Z80 and 6809 were flat memory
spaces, weren't they? Give me an 8085 any day of the 80's.

--RY
Segmented architecture was a royal pain, but each segment was 64K, same
as the 8 bitters. Segments were better than external hardware-supported
banks. The galling part was that it didn't have to be that way.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
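
For readers who never fought with it: real mode on the 8086/80186
formed a 20-bit physical address as segment*16 + offset, so the
segment registers gave you more than 64K of total address space while
any one segment still spanned only 64K, and different segment:offset
pairs could alias the same byte. A small C illustration (mine, not
from the thread):

#include <stdint.h>
#include <stdio.h>

/* 8086 real-mode address formation: shift the 16-bit segment left
   four bits and add the 16-bit offset, yielding 20 bits. */
static uint32_t phys_addr(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

int main(void)
{
    /* Two different segment:offset pairs naming the same byte. */
    printf("%05lX\n", (unsigned long)phys_addr(0x1234, 0x0010)); /* 12350 */
    printf("%05lX\n", (unsigned long)phys_addr(0x1235, 0x0000)); /* 12350 */
    return 0;
}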
 
On Fri, 20 Feb 2004 00:34:58 +0000, Randy Yates wrote:

rickman <spamgoeshere4@yahoo.com> writes:

[nested quotes about the 80186 snipped]

I don't think it was awkward compared to any of the 8 bitters.

Are you guys kidding? It was that frickin' "segmented architecture" that
was a pain in the ?##. I believe both the Z80 and 6809 were flat memory
spaces, weren't they? Give me an 8085 any day of the 80's.
Well, if you were happy with 64k, then the segmentation wasn't a
problem: you could be in "tiny" mode all the time (which is how the
CP/M converters worked, I believe). The x86 had some more instructions
and better addressing modes than either the 8085 or Z80. Probably not
nicer than the 6809, though (but I only read about the latter: never
got to actually play with one).

Idle curiosity: why pick the 8085 over the Z80, in that time frame?

Cheers,

--
Andrew
 
Jerry Avins wrote:

When I reply to a top-posted message with a sig -- like yours --
everything below the sig is gone, as here. The _)(*&^%$#@!! news program
snips it all silently.
Really? That's amazing. Why would they do that?
Maybe you need to try a different reader. I'm using Outlook Express,
and the Web directly with a browser when travelling.


--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Martin Euredjian

To send private email:
0_0_0_0_@pacbell.net
where
"0_0_0_0_" = "martineu"
 
Jon Harris wrote:

"Jerry Avins" <jya@ieee.org> wrote in message
news:4035159f$0$3078$61fed72c@news.rcn.com...

When I reply to a top-posted message with a sig -- like yours --
everything below the sig is gone, as here. The _)(*&^%$#@!! news program
snips it all silently.

Jerry


What newsreader? Sounds like a bug, or at least something that should have
a switch to defeat it.
If there's a switch, I'd like to know it. Netscape 7.1. It has other
bugs too.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
 
