MISC - Stack Based vs. Register Based

In article <kjtjne$nfu$1@dont-email.me>, rickman <gnuarm@gmail.com> wrote:
On 4/7/2013 5:59 PM, Brian Davis wrote:
rickman wrote:

So the main memory and stacks could be accessed for reading and writing
in the same clock cycle, read/modify/write. You can't do that with today's
block RAMs, they are totally synchronous.

I had the same problem when I first moved my XC4000 based RISC
over to the newer parts with registered Block RAM.

I ended up using opposite edge clocking, with a dual port BRAM,
to get what appears to be single cycle access on the data and
instruction ports.

As this approach uses the same clock, the constraints are painless;
but you now have half a clock for address -> BRAM setup, and half
for the BRAM data <-> core data setup. The latter can cause
some timing issues if the core is configured with a byte lane mux
so as to support 8/16/32 bit {sign extending} loads.

Yes, that was one way to solve the problem. The other I considered was
to separate the read and write on the two ports. Then the read would be
triggered from the address that was at the input to the address
register... from the previous cycle. So the read would *always* be done
and the data presented whether you used it or not. I'm not sure how
much power this would waste, but the timing impact would be small.

I looked at making the register block RAM part of the main memory
address space. This would require a minimum of three clock cycles in a
machine cycle, read address or data from register, use address to read
or write data from/to memory and then write data to register. If it
helps timing, the memory write can be done at the same time as the
register write. I'm not crazy about this approach, but I'm considering
how useful it would be to have direct address capability of the multiple
register banks.

Some of the comments about register vs. stacks and what I have seen of
the J1 has made me think about a hybrid approach using stacks in memory,
but with offset access, so items further down in the stack can be
operands, not just TOS and NOS. This has potential for saving stack
operations. The J1 has a two bit field controlling the stack pointer, I
assume that is +1 to -2 or 1 push to 2 pop. The author claims this
provides some ability to combine Forth functions into one instruction,
but doesn't provide details. I guess the compiler code would have to be
examined to find out what combinations would be useful.
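One way to read a two bit stack pointer field with that "+1 to -2" range is as a 2-bit two's-complement number. The Python sketch below shows that reading only; the function name `sp_delta` is hypothetical, and the J1 source should be checked before relying on this encoding.

```python
def sp_delta(field):
    """Interpret a 2-bit field as a signed stack pointer adjustment."""
    field &= 0b11                      # keep only the two field bits
    # Bit 1 set means negative in two's complement: subtract 2**2.
    return field - 4 if field & 0b10 else field


# The four encodings cover no change, one push, two pops, and one pop.
deltas = {code: sp_delta(code) for code in range(4)}
```

Under this assumption the encodings 0, 1, 2, 3 give 0, +1, -2, -1.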
This is the approach we took with the FIETS chip, about 1980, emulated
on an Osborne CP/M computer, never built. The emulation could run a Forth
and it benefited from reaching 8 deep into both the return and the data
stack. It would still be interesting to build using a modern FPGA.
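As a small illustration of offset access into a stack, the Python sketch below lets an operation name an operand by its depth instead of shuffling it to the top first. The class `DeepStack`, the function `add_from_depth`, and the way the 8-deep limit is enforced are assumptions for illustration, not a description of the FIETS design.

```python
class DeepStack:
    """Stack whose operations may read operands below TOS and NOS."""

    MAX_REACH = 8  # assumed limit, taken from the "8 deep" remark

    def __init__(self):
        self.items = []

    def push(self, value):
        self.items.append(value)

    def peek(self, depth):
        # depth 0 is TOS, depth 1 is NOS, and so on.
        if not 0 <= depth < self.MAX_REACH:
            raise IndexError("operand is beyond the reachable depth")
        return self.items[-1 - depth]


def add_from_depth(stack, depth):
    """Replace TOS with TOS + item at `depth`, leaving the rest in place."""
    stack.items[-1] = stack.peek(0) + stack.peek(depth)


s = DeepStack()
for v in (5, 7, 1):
    s.push(v)
add_from_depth(s, 2)  # adds the 5 two cells below TOS to the 1 on top
```

With only TOS and NOS as operands, the same result would need extra stack manipulation before the add.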

The compiler end is not my strong suit, but I suppose I could figure out
how to take advantage of features like this.

--

Rick
Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
 
"Albert van der Horst" <albert@spenarnc.xs4all.nl> wrote in
message news:515e262b$0$26895$e4fe514c@dreader37.news.xs4all.nl...

The existence of an XLAT instruction (to name an example)
OTOH does virtually nothing to make the life of an
assembler programmer better.
Why do you say that?

It seems good for 256 byte (or less) lookup tables, 8-bit
character translation, simple decompression algorithms, etc. You
can even use it for multiple tables at once, e.g., using XCHG to
swap BX. It's definitely difficult for a compiler implementer to
determine when to use such a CISC instruction.
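For anyone unfamiliar with XLAT: it replaces AL with the byte at offset AL in a table pointed to by BX. Here is a minimal Python model of that behaviour, applied to 8-bit character translation. The helper names `xlat` and `translate` and the upper-casing table are illustrative only.

```python
def xlat(table, al):
    """Model of XLAT: return the table byte indexed by the 8-bit value al."""
    return table[al & 0xFF]


def translate(data, table):
    """Apply the lookup to every byte, as a loop around XLAT would."""
    return bytes(xlat(table, b) for b in data)


# 256-entry table mapping ASCII lower case to upper case, all else unchanged.
upper = bytes(b - 32 if ord("a") <= b <= ord("z") else b for b in range(256))

result = translate(b"forth j1", upper)
```

Swapping in a different 256-byte table changes the translation without touching the loop, which is the multiple-table trick described above.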


Rod Pemberton
 
