increment or decrement one of sixteen 16-bit registers

Tim Wescott

I've been geeking out on the COSMAC 1802 lately -- it was the first
processor that I owned all just for me, and that I wrote programs for (in
machine code -- not assembly).

One of the features of this chip is that while the usual ALU is 8-bit and
centered around memory fetches and the accumulator (which they call the
'D' register), there's a 16 x 16-bit register file. Any one of these
registers can be incremented or decremented, either as an explicit
instruction or as part of a fetch (basically, you can use any one of them
as an index, and you can "fetch and increment").

How would you do this most effectively today? How might it have been
done back in the mid-1970s when RCA made the chip? Would it make a
difference if you were working with a CPLD, FPGA, or some ASIC where you
were determined to minimize chip area?

I'm assuming that the original had one selectable increment/decrement
unit that wrote back numbers to the registers, but I could see them
implementing each register as a loadable counter -- I just don't have a
good idea of what might use the least real estate.
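
For concreteness, here's a rough behavioral sketch in Python -- not RTL,
and the names are mine rather than RCA's -- of a 16 x 16-bit register
array with one shared increment/decrement unit and a "fetch and
increment" access, roughly what the 1802's "load and advance" style
instructions do:

class RegArray:
    def __init__(self):
        self.r = [0] * 16            # sixteen 16-bit scratchpad registers

    def bump(self, n, step):         # the one shared +/-1 unit, written back
        self.r[n] = (self.r[n] + step) & 0xFFFF

def fetch_and_increment(regs, mem, n):
    """Use R(n) as an index: read the byte it points to, then R(n) += 1."""
    byte = mem[regs.r[n]]
    regs.bump(n, +1)                 # post-increment, like "load and advance"
    return byte

# Walk a small "memory" with R(5) as the pointer:
mem = bytes([0x10, 0x20, 0x30])
regs = RegArray()
print([fetch_and_increment(regs, mem, 5) for _ in range(3)])   # [16, 32, 48]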

Thanks.

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

I'm looking for work -- see my website!
 
On 5/10/2017 5:42 PM, Tim Wescott wrote:
I'm assuming that the original had one selectable increment/decrement
unit that wrote back numbers to the registers, but I could see them
implementing each register as a loadable counter -- I just don't have a
good idea of what might use the least real estate.

A counter is a register plus an adder (though it only needs half adders
at each bit), so of course a loadable counter will take up more logic
than a plain register.
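
To make the half-adder point concrete, here's a minimal Python sketch of
a 16-bit increment done as a ripple of half adders (decrement is the same
idea with a borrow rippling up instead):

def increment16(x):
    carry = 1                        # the "+1" enters as the initial carry-in
    result = 0
    for i in range(16):
        a = (x >> i) & 1
        result |= (a ^ carry) << i   # half-adder sum
        carry = a & carry            # half-adder carry out
    return result                    # wraps 0xFFFF -> 0x0000, carry dropped

assert increment16(0x00FF) == 0x0100
assert increment16(0xFFFF) == 0x0000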

Depending on what functions can be done while the register is
incrementing, they may use the ALU for all arithmetic operations. Most
of the earlier processors conserved logic by time sequencing operations
within an instruction. That's why some instructions take so many cycles
to complete: the chip is shuffling data around internally.

If you provide some instructions with their descriptions and the cycle
counts I bet I can tell you how much is done sequentially and how much
is done in parallel.

--

Rick C
 
On Wed, 10 May 2017 18:56:36 -0400, rickman wrote:

Depending on what functions can be done while the register is
incrementing, they may use the ALU for all arithmetic operations. Most
of the earlier processors conserved logic by time sequencing operations
within an instruction. That's why some instructions take so many cycles
to complete: the chip is shuffling data around internally.

If you provide some instructions with their descriptions and the cycle
counts I bet I can tell you how much is done sequentially and how much
is done in parallel.

There's a surprisingly large ecosystem of users for the processor -- I
think because it was a popular, dirt-cheap hobby system, and now there
are all these experienced digital-heads playing with their old toys. There's
even an "Olduino" project that marries a 1802 board with Arduino
shields.

The 1802 is how I got into doing deep-embedded systems (you can run an RC
servo! With a counter! In Software!!!). So I understand the enthusiasm
because I share it.

Here's the Whole Damned User's Manual:

http://datasheets.chipdb.org/RCA/MPM-201B_CDP1802_Users_Manual_Nov77.pdf

All instructions take 16 or 24 clock cycles, executing as a fixed sequence
of two or three machine cycles (_everything_ happens on 8-clock boundaries).
A typical instruction would load the byte pointed to by register N into D,
then increment register N.

I think you may be right about using the ALU for incrementing registers
-- they don't show it that way in their logical diagram, but I just now
realized that they never increment or decrement a register AND do an
arithmetic operation in the same instruction.
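
If that's right, one way it could work -- pure speculation on my part, not
anything out of the manual -- is to run each 16-bit bump through the shared
8-bit ALU in two passes, low byte then high byte with the carry. A Python
sketch of the idea:

def alu8_add(a, b, cin):
    total = a + b + cin
    return total & 0xFF, (total >> 8) & 1     # (8-bit sum, carry out)

def bump16_via_alu8(reg16, step):             # step is +1 or -1
    addend = 0x0001 if step > 0 else 0xFFFF   # -1 as two's complement
    lo, c = alu8_add(reg16 & 0xFF, addend & 0xFF, 0)
    hi, _ = alu8_add(reg16 >> 8, (addend >> 8) & 0xFF, c)
    return (hi << 8) | lo

assert bump16_via_alu8(0x12FF, +1) == 0x1300
assert bump16_via_alu8(0x1200, -1) == 0x11FF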

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

I'm looking for work -- see my website!
 
On Wednesday, 5/10/2017 7:20 PM, Tim Wescott wrote:
All instructions take 16 or 24 clock cycles, executing as a fixed sequence
of two or three machine cycles (_everything_ happens on 8-clock boundaries).

No surprise on the multiple of 8 cycles. The 1802 was a
one-bit serial processor. Its ALU was therefore really
small. A bit more logic for all the sequencing, but
overall it had a very small footprint in gates.
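
For illustration of what a one-bit serial ALU looks like: it reduces to a
single full adder plus a carry flip-flop, fed one bit per clock, LSB
first. A toy Python sketch:

def serial_add(a, b, width=8):
    carry = 0                        # the single carry flip-flop
    result = 0
    for i in range(width):           # one clock per bit, LSB first
        x = (a >> i) & 1
        y = (b >> i) & 1
        result |= (x ^ y ^ carry) << i          # full-adder sum
        carry = (x & y) | (carry & (x ^ y))     # full-adder carry
    return result

assert serial_add(0x5A, 0xA5) == 0xFF
assert serial_add(0xFF, 0x01) == 0x00    # final carry out is dropped here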

--
Gabor
 
On Wed, 10 May 2017 22:28:34 -0400, Gabor wrote:

No surprise on the multiple of 8 cycles. The 1802 was a one-bit serial
processor. Its ALU was therefore really small. A bit more logic for
all the sequencing, but overall it had a very small footprint in gates.

How did they manage the 16-bit register increment and decrement, then?

--

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

I'm looking for work -- see my website!
 
On 5/10/2017 9:28 PM, Gabor wrote:
No surprise on the multiple of 8 cycles. The 1802 was a
one-bit serial processor. Its ALU was therefore really
small. A bit more logic for all the sequencing, but
overall it had a very small footprint in gates.
I believe you are incorrect; several RCA manuals show the ALU as being
8 bits wide. In the early '70s CMOS logic was slow; as manufacturing
improved, many of the chips could run at 8 MHz, but they were sold as
2 MHz parts.

Do you have a link that shows the ALU is serial instead of 8-bit parallel?

--
Cecil - k5nwa
 
On 5/11/2017 12:59 AM, Cecil Bayona wrote:
I believe you are incorrect; several RCA manuals show the ALU as being
8 bits wide. In the early '70s CMOS logic was slow; as manufacturing
improved, many of the chips could run at 8 MHz, but they were sold as
2 MHz parts.

Do you have a link that shows the ALU is serial instead of 8-bit parallel?

I have looked at serializing adders and multipliers. The control logic
is large enough that it largely offsets the logic savings of a bit-serial
arithmetic unit versus an 8-bit unit. Even for a multiplier, a fully
bit-serial unit is not much smaller than a word-wide add/shift design. Any
time you have bit-wide logic, the registers need multiplexers, which are
not much different from adders.

--

Rick C
 
On 11/05/17 00:20, Tim Wescott wrote:
The 1802 is how I got into doing deep-embedded systems (you can run an RC
servo! With a counter! In Software!!!). So I understand the enthusiasm
because I share it.

You might like the XMOS processors for *hard* real-time systems.
A wide range is available on Digikey.

They are multicore, with "FPGA-like" I/O (I/O occurs on specified
clock cycles), and xC, their CSP/Occam/Transputer-style event-based
programming model.

Loop and function times are guaranteed by the development
environment, based on its examining the binary file.

I've just started playing with them, and have already managed
to use a single core as a "software frequency counter" that
counts the transitions in a 50 Mb/s serial data stream. Replicate
that in another core and you have the basis of a frequency ratio
meter.
 
On Thursday, May 11, 2017 at 7:17:08 AM UTC+2, rickman wrote:
Any time you have bit-wide logic, the registers need multiplexers, which are
not much different from adders.

when you have cycles to spare you can just shift
 
On 5/11/2017 11:44 AM, lasselangwadtchristensen@gmail.com wrote:
when you have cycles to spare you can just shift

What does that have to do with anything?

--

Rick C
 
On Thursday, May 11, 2017 at 8:08:30 PM UTC+2, rickman wrote:
What does that have to do with anything?

you said you needed multiplexers
 
All instructions take 16 or 24 clock cycles, executing as a fixed sequence
of two or three machine cycles (_everything_ happens on 8-clock boundaries).

24 cycles? Holy smokes. I remember most of the 6502 instructions being 2-3 cycles.
 
On 5/11/2017 2:27 PM, lasselangwadtchristensen@gmail.com wrote:
you said you needed multiplexers

Do you understand how shifting happens? It uses multiplexers to switch
between loading and shifting.
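
A toy model of that point -- each bit of a loadable shift register sits
behind a mux that picks what its flip-flop captures on the next clock
(hold, the parallel-load bit, or its neighbor):

def shift_reg_next(state, mode, load_value=0, serial_in=0, width=8):
    """One clock of a width-bit register; 'mode' stands in for the mux select."""
    mask = (1 << width) - 1
    if mode == "load":               # mux selects the parallel-load inputs
        return load_value & mask
    if mode == "shift":              # mux selects each bit's neighbor
        return ((state >> 1) | (serial_in << (width - 1))) & mask
    return state                     # "hold": mux feeds each bit back to itself

# Load 0xB4, then shift right twice with zeros coming in at the top:
s = shift_reg_next(0, "load", load_value=0xB4)
s = shift_reg_next(s, "shift")
s = shift_reg_next(s, "shift")
assert s == 0x2D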

--

Rick C
 
On 5/11/2017 5:55 PM, Kevin Neilson wrote:

24 cycles? Holy smokes. I remember most of the 6502 instructions being 2-3 cycles.

No one ever said the 1802 was fast. If you want slow, you should have
seen the 1801! lol ;)

--

Rick C
 
On Friday, May 12, 2017 at 12:20:45 AM UTC+2, rickman wrote:
Do you understand how shifting happens? It uses multiplexers to switch
between loading and shifting.

you could also do load by shifting
 
On 5/11/2017 6:46 PM, lasselangwadtchristensen@gmail.com wrote:
you could also do load by shifting

Only if the entire CPU were 100% bit serial. I seriously doubt that is
the case with the 1802.

--

Rick C
 
On Thursday, May 11, 2017 at 7:21:33 PM UTC-3, rickman wrote:
On 5/11/2017 5:55 PM, Kevin Neilson wrote:
24 cycles? Holy smokes. I remember most of the 6502 instructions
being 2-3 cycles.

No one ever said the 1802 was fast. If you want slow, you should have
seen the 1801! lol ;)

Indeed, many early microprocessors looked a lot more impressive until you saw how many clock cycles each instruction took.

But it is important to remember that there were two different clock styles and it is complicated to compare them directly.

The 6502, 6800 and ARM2 used two non-overlapping clocks. This required two pins and a more complicated external circuit but simplified the internal circuit. In a 1 MHz 6502, for example, you have four different times in which things happen in each microsecond: when clock 1 is high, when both are low, when clock 2 is high, and when both are low again.

Many processors had a single clock pin, which allowed you to use a simple oscillator externally. But to have the same functionality as the 1 MHz 6502, this single clock would have to be 4 MHz so you could do four things in each microsecond. This was the case for the 68000, for example. The Z80 only needed to do three things.
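
A toy way to state that equivalence (nothing chip-specific, just counting
the slots in one microsecond):

def two_phase_slots():
    # within one 1 us period of a 1 MHz two-phase clock
    return ["phi1 high", "both low", "phi2 high", "both low"]

def single_clock_slots(mhz=4):
    # one event per cycle of a faster single clock, in the same 1 us
    return [f"cycle {i} of the {mhz} MHz clock" for i in range(mhz)]

assert len(two_phase_slots()) == len(single_clock_slots(4)) == 4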

-- Jecel
 
Indeed, many early microprocessors looked a lot more impressive until you saw how many clock cycles each instruction took.

But it is important to remember that there were two different clock styles and it is complicated to compare them directly.

I do remember reading some marketing on the 6502 that asserted that the 6502 could do more at 1 MHz than the other duplicitous companies which had faster processors but did little per cycle. Thus began the MHz wars. (When you buy a MacBook now, do they even advertise the clock rate?) I remember the big deal they made out of the "zero page" instructions, which saved a fetch cycle when treating the first page (256 bytes) of RAM as registers.
 
On 5/13/2017 4:07 PM, Jecel wrote:
The 6502, 6800 and ARM2 used two non-overlapping clocks. This required two pins and a more complicated external circuit but simplified the internal circuit. In a 1 MHz 6502, for example, you have four different times in which things happen in each microsecond: when clock 1 is high, when both are low, when clock 2 is high, and when both are low again.

Many processors had a single clock pin, which allowed you to use a simple oscillator externally. But to have the same functionality as the 1 MHz 6502, this single clock would have to be 4 MHz so you could do four things in each microsecond. This was the case for the 68000, for example. The Z80 only needed to do three things.

I think the single vs. multiple clock issue was more of an evolutionary
thing. The early processors (including the 8080) required multiple
phases on the supplied clocks. After some time the newer processors hid
that clock generation internally and allowed the user to supply just a
single clock phase. Heck, I recall my TMS9900 had four non-overlapping
clock phases and came in a huge 64-pin package. I still have that board
in the basement.

--

Rick C
 
On Sat, 13 May 2017 13:07:40 -0700, Jecel wrote:

But it is important to remember that there were two different clock
styles and it is complicated to compare them directly.

At least the internal timing of the 1802 shows some things happening on
half-clock boundaries. I'm not sure whether this reflects a requirement
for a 50% duty cycle clock, however.



--
www.wescottdesign.com
 
