Home grown CPU core legal?

Followup to: <6Uasb.123195$mZ5.829826@attbi_s54>
By author: "Glen Herrmannsfeldt" <gah@ugcs.caltech.edu>
In newsgroup: comp.arch.fpga
The PDP-11 has a nice simple 16 bit architecture, not including the optional
instructions. (FIS and EIS for example.)
The PDP-11 is still very much a CISC architecture... I think it would
require a lot more logic than necessary.

Below are my design notes for my hacked-up architecture, currently
called "NanoRISC."

I have no way to know how this is turning out. My current goal is to
make sure it implements in < 1000 LEs on Cyclone, without using
blockRAM for the register file. Fundamentally it's a personal
research hack project.

-hpa



NanoRISC goals
- Minimal hardware consumption
- Technology independent
- Free licensing

-> 16-bit addressing, data width, instruction word
-> Single issue in-order RISC
-> Short pipeline (probably 3 stages)
-> Deterministic timing (1 cycle/insn, taken branch 2 cycles?)
-> Separate ports for I and D to take advantage of dual-port RAM

0000 NNNN NNNN NNNN - IMM (supplies upper 12 bits of q or Is field)
0001 0000 SSSS DDDD - JMP Rd,Rs (PC <- Rd, Rd <- Rs)
0001 CCCC TTTT TTTT - BR cc,PC+t (cc != 0)
001I PPPP SSSS DDDD - ALU Rd,Rs/Is (P = operation, I = immediate)
01WB QQQQ BBBB RRRR - LD/ST Rr,[Rb+q] (W=ST/LD# B=16/8#)
1TTT TTTT TTTT TTTT - CALL PC+t (PC <- PC+2, r15 <- PC, PC <- PC+t)
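
A minimal Python sketch of how the encoding table above could be decoded, purely for illustration. The function name `decode` and the field names in the returned dictionaries are hypothetical, and two readings are assumptions: that `W=ST/LD#` means W=1 selects a store, and that `B=16/8#` means B=1 selects a 16-bit access. The `t` and `q` fields are returned raw, since the notes do not say whether they are signed.

```python
def decode(word):
    """Classify one 16-bit instruction word and split out its fields."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction word must fit in 16 bits")
    if word & 0x8000:                       # 1TTT TTTT TTTT TTTT
        return ("CALL", {"t": word & 0x7FFF})
    if word & 0x4000:                       # 01WB QQQQ BBBB RRRR
        return ("LDST", {
            "store": bool(word & 0x2000),   # assumed: W=1 is ST, W=0 is LD
            "word16": bool(word & 0x1000),  # assumed: B=1 is 16-bit, B=0 is 8-bit
            "q": (word >> 8) & 0xF,
            "rb": (word >> 4) & 0xF,
            "rr": word & 0xF,
        })
    if word & 0x2000:                       # 001I PPPP SSSS DDDD
        return ("ALU", {
            "imm": bool(word & 0x1000),
            "op": (word >> 8) & 0xF,
            "rs_or_is": (word >> 4) & 0xF,
            "rd": word & 0xF,
        })
    if word & 0x1000:                       # 0001 .... .... ....
        cc = (word >> 8) & 0xF
        if cc == 0:                         # 0001 0000 SSSS DDDD
            return ("JMP", {"rs": (word >> 4) & 0xF, "rd": word & 0xF})
        return ("BR", {"cc": cc, "t": word & 0xFF})
    return ("IMM", {"n": word & 0xFFF})     # 0000 NNNN NNNN NNNN
```

Testing the top bits first mirrors the table: every pattern is identified by its leading bits, and JMP is simply the BR pattern with a condition field of zero.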


ALU opcodes

0000 UNARY
1000 ROR
1001 ROL
1010 RCR
1011 RCL
1100 SHR
1101 SHL
1110 SAR
1111 SXL Shift left insert 1
[...more...]

0001 MOV

0010 CMP
0011 TST

0100 ANDN
0101 OR
0110 XOR
0111 AND

1000 ADD
1001 ADC
1010 SUB
1011 SBC
1100 SUBR
1101 SBCR

1110 WRSR
1111 RDSR
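
As a rough illustration of the add/subtract group above, here is a Python sketch of a 16-bit adder that produces the four flags. The helper names `add16` and `sub16` are hypothetical. The carry convention for subtraction (carry set means no borrow) is an assumption, chosen because it matches "below = ~C" in the condition table; the notes themselves do not spell it out.

```python
MASK = 0xFFFF

def add16(a, b, carry_in=0):
    """Return (result, flags) for a + b + carry_in on 16-bit values."""
    total = a + b + carry_in
    result = total & MASK
    flags = {
        "N": bool(result & 0x8000),
        "Z": result == 0,
        # signed overflow: both operands share a sign that the result lacks
        "V": bool((~(a ^ b) & (a ^ result)) & 0x8000),
        "C": total > MASK,
    }
    return result, flags

def sub16(a, b, carry_in=1):
    """a - b computed as a + ~b + carry_in (assumed: carry set = no borrow)."""
    return add16(a, ~b & MASK, carry_in)
```

Under these assumptions ADC and SBC would pass the current C flag as `carry_in`, and SUBR/SBCR would swap the operands, e.g. `sub16(b, a)`.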

Condition codes

3 N = negative
2 Z = zero
1 V = overflow
0 C = carry

#e - Z
#b - ~C
#a - C & ~Z
#l - V
#g - ~V & ~Z
#s - N

+ negations

always - negation of code 0000
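
The condition table can be sketched in Python as below. The flag expressions are copied exactly as posted, without attempting to verify them. The names `CONDITIONS` and `condition_met`, the `negate` parameter, and the `never` entry (one reading of "code 0000") are hypothetical; the numeric code assigned to each condition is not given in the notes, so the sketch uses names instead.

```python
# Each entry maps a condition name to a test on the N, Z, V, C flags.
CONDITIONS = {
    "never": lambda f: False,                   # assumed meaning of code 0000
    "e": lambda f: f["Z"],
    "b": lambda f: not f["C"],
    "a": lambda f: f["C"] and not f["Z"],
    "l": lambda f: f["V"],
    "g": lambda f: not f["V"] and not f["Z"],
    "s": lambda f: f["N"],
}

def condition_met(name, flags, negate=False):
    """Evaluate a named condition; negate=True selects its negated form."""
    taken = bool(CONDITIONS[name](flags))
    return (not taken) if negate else taken
```

With this reading, "always" is simply `condition_met("never", flags, negate=True)`.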
--
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
If you send me mail in HTML format I will assume it's spam.
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64
 
"H. Peter Anvin" wrote:
Followup to: <6Uasb.123195$mZ5.829826@attbi_s54>
By author: "Glen Herrmannsfeldt" <gah@ugcs.caltech.edu>
In newsgroup: comp.arch.fpga

The PDP-11 has a nice simple 16 bit architecture, not including the optional
instructions. (FIS and EIS for example.)


The PDP-11 is still very much a CISC architecture... I think it would
require a lot more logic than necessary.

Below are my design notes for my hacked-up architecture, currently
called "NanoRISC."

I have no way to know how this is turning out. My current goal is to
make sure it implements in < 1000 LEs on Cyclone, without using
blockRAM for the register file. Fundamentally it's a personal
research hack project.
Aren't there already several open source FPGA CPUs available? Anyone
have a few links handy?

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design URL http://www.arius.com
4 King Ave 301-682-7772 Voice
Frederick, MD 21701-3110 301-682-7666 FAX
 
You should try www.opencores.org

Erez.

"rickman" <spamgoeshere4@yahoo.com> wrote in message
news:3FB1478D.8C19CC98@yahoo.com...
"H. Peter Anvin" wrote:


Aren't there already several open source FPGA CPUs available? Anyone
have a few links handy?

 
Your self-imposed limit of "1000 LEs without using BlockRAM for the
register file" will put you at a distinct disadvantage against
MicroBlaze which can use LUT-RAMs and SRL16s, something Altera does not have.
:-( or :) depending on your affiliation.
Peter Alfke, Xilinx

 
H. Peter Anvin wrote:
Followup to: <6Uasb.123195$mZ5.829826@attbi_s54>
By author: "Glen Herrmannsfeldt" <gah@ugcs.caltech.edu>
In newsgroup: comp.arch.fpga

The PDP-11 has a nice simple 16 bit architecture, not including the optional
instructions. (FIS and EIS for example.)


The PDP-11 is still very much a CISC architecture... I think it would
require a lot more logic than necessary.

Below are my design notes for my hacked-up architecture, currently
called "NanoRISC."

I have no way to know how this is turning out. My current goal is to
make sure it implements in < 1000 LEs on Cyclone, without using
blockRAM for the register file. Fundamentally it's a personal
research hack project.

-hpa

NanoRISC goals
- Minimal hardware consumption
- Technology independent
- Free licensing

-> 16-bit addressing, data width, instruction word
In doing a 'clean slate' FPGA small core, there is merit in choosing
an opcode width that matches the FPGA Block RAM / Multiplier widths.
( eg I've seen 9 bit opcodes used )
Did you look at that ?

-jg
 
Followup to: <3FB15315.5D9F933E@xilinx.com>
By author: Peter Alfke <peter@xilinx.com>
In newsgroup: comp.arch.fpga
Your self-imposed limit of "1000 LEs without using BlockRAM for the
register file" will put you at a distinct disadvantage against
MicroBlaze which can use LUT-RAMs and SRL16s, something Altera does not have.
:-( or :) depending on your affiliation.
Peter Alfke, Xilinx
Since my affiliation is "neither" (I just happen to own a Cyclone
board since that was the biggest FPGA I could get with free tools) I
guess it's more of a :-| than either of those :^)

Unless Xilinx' tools are complete crap, which I'd find unlikely, I
would expect that the tools would infer the use of LUT-RAMs for the
register file if synthesized for a Xilinx part. It's all part of "no
vendor lock-in."

Also, this is mostly a project I'm doing for fun. If it happens to be
useful at some point in the future, so much the better, if not, I've
still achieved my goal of grokking FPGA synthesis better.

-hpa
 
"Erez Birenzwig" <erez_birenzwig@hotmail.com> writes:

You should try www.opencores.org
Or

http://www.fpgacpu.org/links.html


Petter

--
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
 
Followup to: <3FB1539C.160F@designtools.co.nz>
By author: jim.granville@designtools.co.nz
In newsgroup: comp.arch.fpga
In doing a 'clean slate' FPGA small core, there is merit in choosing
an opcode width that matches the FPGA Block RAM / Multiplier widths.
( eg I've seen 9 bit opcodes used )
Did you look at that ?
Some vendors have 9/18-bit blockRAMs, some don't. I'm trying to be as
generic as possible. It also makes it easier to port tools like
gas/binutils/gcc.

-hpa


 
H. Peter Anvin wrote:
Followup to: <3FB1539C.160F@designtools.co.nz>
By author: jim.granville@designtools.co.nz
In newsgroup: comp.arch.fpga

In doing a 'clean slate' FPGA small core, there is merit in choosing
an opcode width that matches the FPGA Block RAM / Multiplier widths.
( eg I've seen 9 bit opcodes used )
Did you look at that ?


Some vendors have 9/18-bit blockRAMs, some don't. I'm trying to be as
generic as possible. It also makes it easier to port tools like
gas/binutils/gcc.
& Off chip memory is also easier....

As FPGAs get ever cheaper, and Block RAM gets larger, and
factoring in relative speeds, there is scope to define a
CPU that takes a coarse approach to cache, like :

- reserves a BlockRAM (or 2) for CODE for SW interrupt loops,
and Cache-locked code
This gives very fast responses, and lowers RFI and total power
(minimum off-chip bus/external memory activity)

- uses another Block RAM for code cache, where it is allowed to pause
while it loads from slower memory. Dual Port RAM would allow a FIFO
style load.
External memory could be WORD, BYTE or even serial ( FPGA_Stamp :)

- Other Block RAMS are standard DATA rams, including fast context
register switching for interrupts / param passing.

Design ends up with a single CPU, but two distinct areas of FAST and
SLOW code and data.
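
One way to picture the FAST/SLOW split described above is a small cycle-count model, sketched here in Python. Everything numeric is a hypothetical placeholder (`FAST_WORDS`, `LINE_WORDS`, `CACHE_LINES`, `MISS_PENALTY`), and the direct-mapped organisation is an assumption made for simplicity, not something the post specifies.

```python
FAST_WORDS = 256      # hypothetical size of the locked code BlockRAM, in words
LINE_WORDS = 8        # hypothetical cache line length, in words
CACHE_LINES = 32      # hypothetical number of lines in the code-cache BlockRAM
MISS_PENALTY = 20     # hypothetical stall cycles to refill one line

def fetch_cycles(trace):
    """Return total cycles needed to fetch the word addresses in trace."""
    tags = [None] * CACHE_LINES          # direct-mapped: one tag per line
    cycles = 0
    for addr in trace:
        cycles += 1                      # every fetch costs one cycle
        if addr < FAST_WORDS:
            continue                     # FAST region: never stalls
        line = addr // LINE_WORDS
        index = line % CACHE_LINES
        if tags[index] != line:          # miss: CPU pauses while the line loads
            tags[index] = line
            cycles += MISS_PENALTY
    return cycles
```

Code fetched from the locked region has fully deterministic timing, while code in the cached region only pays the stall when a line has to be brought in from slower memory.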

Does anyone know of work using this HW focus on FPGA cores ?

- jg
 
H. Peter Anvin wrote:
I have no way to know how this is turning out. My current goal is to
make sure it implements in < 1000 LEs on Cyclone, without using
blockRAM for the register file.
Isn't some form of BlockRAM a de facto standard on all
'consider for new design' FPGAs - so not using that would
restrict your options ?

-jg
 
Jim Granville wrote:
H. Peter Anvin wrote:

I have no way to know how this is turning out. My current goal is to
make sure it implements in < 1000 LEs on Cyclone, without using
blockRAM for the register file.

Isn't some form of BlockRAM a de facto standard on all
'consider for new design' FPGAs - so not using that would
restrict your options ?

-jg
So are hardware multipliers these days. I believe all the latest chips
have them as well as multi-standard IOs.

--

Rick "rickman" Collins

 
I am convinced that a generic version would inevitably be inferior in
performance and/or price, compared to the dedicated one. I know that Ken
and Göran used many Xilinx-specific features when they designed
PicoBlaze and MicroBlaze. And I assume that the Altera guys were
operating in a comparable way when they designed Nios.
The generic ones will be the "worst of both worlds", unless you really
believe in clairvoyant synthesis.

Peter Alfke, Xilinx
 
I am convinced that a generic version would inevitably be inferior in
performance and/or price, compared to the dedicated one.
But, there's nothing like rolling your own for fun/educational
purposes and then having something useful at the end of it all. I
think it would also be easier to add new features and enhancements to
your own design since the code was developed in your own way of
thinking and coding style. In the end, maybe it will just be "yet
another RISC core" on opencores.org, but at least it's yours and
you'll have a good understanding of its capabilities...even if all it
can do is blink an LED!

BP
 
Followup to: <3FB15BB3.5B87@designtools.co.nz>
By author: jim.granville@designtools.co.nz
In newsgroup: comp.arch.fpga
H. Peter Anvin wrote:

I have no way to know how this is turning out. My current goal is to
make sure it implements in < 1000 LEs on Cyclone, without using
blockRAM for the register file.

Isn't some form of BlockRAM a de facto standard on all
'consider for new design' FPGAs - so not using that would
restrict your options ?
Some form thereof, yes, but I tend to run out of blockram a lot faster
than running out of LUTs. Note that it's not that I'm saying you
couldn't use it, I'm saying I want to be at < 1000 LE without using
blockram. About 300-400 of that would be replaceable with a blockram.

-hpa
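
For a sense of where a figure like 300-400 LEs could come from, here is a back-of-envelope Python sketch. It assumes 16 registers (from the 4-bit register fields in the posted encoding) of 16 bits each, and one register bit per logic element; the names `regfile_storage_les` and `FLOPS_PER_LE` are hypothetical, and the split of the remainder into muxes and decode is an inference, not something stated in the thread.

```python
REG_FIELD_BITS = 4    # width of the SSSS / DDDD register fields as posted
DATA_WIDTH = 16       # 16-bit data width, from the stated goals
FLOPS_PER_LE = 1      # assumption: one register bit per logic element

def regfile_storage_les(reg_field_bits=REG_FIELD_BITS, data_width=DATA_WIDTH):
    """Logic elements needed just to hold the register bits."""
    num_regs = 2 ** reg_field_bits
    return (num_regs * data_width) // FLOPS_PER_LE

storage = regfile_storage_les()          # 16 registers x 16 bits
# What is left of the quoted 300-400 LEs for read muxes and write decode.
remaining = (300 - storage, 400 - storage)
```

Under these assumptions the storage alone accounts for 256 LEs, which is at least consistent with the quoted range once read multiplexers and write decoding are added.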


 
Followup to: <3FB16CEB.C46327DE@xilinx.com>
By author: Peter Alfke <peter@xilinx.com>
In newsgroup: comp.arch.fpga
I am convinced that a generic version would inevitably be inferior in
performance and/or price, compared to the dedicated one. I know that Ken
and Göran used many Xilinx-specific features when they designed
PicoBlaze and MicroBlaze. And I assume that the Altera guys were
operating in a comparable way when they designed Nios.
The generic ones will be the "worst of both worlds", unless you really
believe in clairvoyant synthesis.
Of course. But it would have the advantage that it could run on
either.

-hpa


 
Peter Alfke wrote:
I am convinced that a generic version would inevitably be inferior in
performance and/or price, compared to the dedicated one. I know that Ken
and Göran used many Xilinx-specific features when they designed
PicoBlaze and MicroBlaze. And I assume that the Altera guys were
operating in a comparable way when they designed Nios.
The generic ones will be the "worst of both worlds", unless you really
believe in clairvoyant synthesis.

Peter Alfke, Xilinx
"Clairvoyant Synthesis", now that sounds like a good product! Is there
a startup somewhere working on that?

I can see product announcements touting the new FPGA CS that eliminates
the need for product planners, designers and even testing as it would
already know that the design was ready for production! No specs to
write, no coding to compile and even simulation could be skipped. Just
think of the design you want and out pops a bit file. Boy, what will
they think of next?

--

Rick "rickman" Collins

 
"Bruce P." wrote:
I am convinced that a generic version would inevitably be inferior in
performance and/or price, compared to the dedicated one.

But, there's nothing like rolling your own for fun/educational
purposes and then having something useful at the end of it all. I
think it would also be easier to add new features and enhancements to
your own design since the code was developed in your own way of
thinking and coding style. In the end, maybe it will just be "yet
another RISC core" on opencores.org, but at least it's yours and
you'll have a good understanding of its capabilities...even if all it
can do is blink an LED!

BP
When I was in school we worked on a paper design of a microcoded
processor as a teaching tool. We had homework on it and had to design
new features on our exams. I even had a question about it on my Masters
comprehensive exam. I approached my professor about designing a
simulation of it to run on the Univac mainframe for the undergrad
students to learn from. But I guess I was ahead of my time as he did
not see the value in that. Or maybe he had the foresight to see the
complications it might create :)

I guess this is a pretty common thing at Universities now. All they
have to do is get you an FPGA design package with a simulator.


--

Rick "rickman" Collins

 
Hello Jim,

For the most part, I agree with your legal analysis regarding cloning
cores. The primary issues have to do with whether or not an
architecture is protected by patent and if not, how you market or use
the resulting clone so as not to infringe existing copyrights,
trademarks, trade dress or confidentiality/license agreements.

But in my view, this is not the main issue with cloning an existing
architecture. The main issue has to do with how you are going to
debug it once you get it into an FPGA wherein it is completely
embedded with no address or data lines coming out.

Accordingly, I'd like to take this opportunity to simply state that
I've posted developmental versions of both my 8051 and 6805
microcontroller cores for free downloading at www.quickcores.com. The
8051 is in original Verilog RTL format and includes on-chip JTAG
real-time monitoring and debug logic, including a 144-channel trace
buffer.

I've successfully synthesized them using Synplify and Quartus II web
edition.

If anyone would like to start a thread about how the JTAG real-time
monitor works, I'd be happy to engage.

Regards,

Jerry D. Harthcock
QuickCores
p.s., I also have the 9-bit RISC which I'd be happy to post if anyone
is interested.

Jim Granville <jim.granville@designtools.co.nz> wrote in message news:<3FAFFAE7.2939@designtools.co.nz>...
Symon wrote:

Hi Goran,
So, playing devil's advocate, Xilinx wouldn't mind if the clean room
Microblaze was targeted at their competitors' devices? Or do you think that
no one would do this because Microblaze only efficiently fits the Xilinx
devices? Or the competitors have their own solutions for their parts?
I wonder....

Xilinx's protection does not come from attack on the clean room clone,
but rather from the protection of the Microblaze name, and tool flows.
So, anyone would be free to create an opcode compatible core,
if they wished, but not to use the brand, nor the Xilinx tool flows.

Older uC cores are easier to copy (any patents lapsed), and their tools
are widely available. Things like 80C51, 6502, Z8, and even 8085....
( someone must have done a 8048 core ? :)

-jg
 
In article <borkg7$p06$1@cesium.transmeta.com>,
H. Peter Anvin <hpa@zytor.com> wrote:
Some vendors have 9/18-bit blockRAMs, some don't. I'm trying to be as
generic as possible. It also makes it easier to port tools like
gas/binutils/gcc.
Both Brand A and Brand X have midsize (8-16 bit wide + parity, with
128 addresses in that range) memories, and any other viable FPGA will
as well.

Thus it is safe to have a parameterized cache and register file which
instantiates the correct size memories, as part of your design, and
still remain vendor neutral. You WANT to use these devices for both
register file and memory.
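
A rough Python sketch of what such parameterization might compute: given a target's memory primitive, how many are needed for a register file. The class `MemPrimitive`, the function `primitives_needed`, and the example sizes are hypothetical. It also assumes each primitive offers one read port and one write port, so the memory is replicated once per read port.

```python
from dataclasses import dataclass
from math import ceil

@dataclass(frozen=True)
class MemPrimitive:
    width: int   # data bits per word
    depth: int   # number of addresses

def primitives_needed(prim, width, depth, read_ports):
    """Count primitives for a width x depth memory with read_ports readers."""
    # Tile primitives side by side for width and stack them for depth.
    per_copy = ceil(width / prim.width) * ceil(depth / prim.depth)
    # Assumed: one read port per primitive, so one full copy per read port.
    return per_copy * read_ports
```

For example, a 16 x 16-bit register file with two read ports on a hypothetical 128 x 16 primitive would be `primitives_needed(MemPrimitive(16, 128), 16, 16, 2)`, i.e. two blocks, each mostly empty.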

What Brand A is missing is the SRL16/LUT-as-RAM feature, which gives
very small memories (16-64x1b). On Brand X, all the BlockRAMs (midsized
memories) are the same size, while Brand A's memories come in
different sizes.


--
Nicholas C. Weaver nweaver@cs.berkeley.edu
 
In article <3FB16CEB.C46327DE@xilinx.com>,
Peter Alfke <peter@xilinx.com> wrote:
I am convinced that a generic version would inevitably be inferior in
performance and/or price, compared to the dedicated one. I know that Ken
and Göran used many Xilinx-specific features when they designed
PicoBlaze and MicroBlaze. And I assume that the Altera guys were
operating in a comparable way when they designed Nios.
The generic ones will be the "worst of both worlds", unless you really
believe in clairvoyant synthesis.
I think generic will be inferior, but not THAT inferior, given the
register files and caches can and should be done in the "everyone has"
BlockRAMs.

But in order to make it generic, these structures will probably need
target-specific parameters and options (dual ported or not, size
range) which are instantiated.

Also, the other big disadvantage in the generic version is going to be
a lack of placement. Placement is good for 10-30% performance
increases.
--
Nicholas C. Weaver nweaver@cs.berkeley.edu
 
