Using an FPGA to drive the 80386 CPU on a real motherboard

On 13/05/16 14:20, Rick C. Hodgin wrote:
On Friday, May 13, 2016 at 3:16:58 AM UTC-4, David Brown wrote:
[snip]

The type and manner of seeds we sow reap their harvest in due season, David.
I advise that you quickly change seeds.

What on earth does that mean? If you are trying to say that you did not
like something I wrote, then it would make a lot more sense to quote the
part you didn't like, and then explain why you didn't like it. If I
have written something that is incorrect, or written in a way that is
causing upset or insult, then I would rather hear the actual complaint.
That way I can either justify or defend my viewpoint, or correct it, or
apologise, as appropriate. Suggesting that I "change my seeds" doesn't
help anyone. I can only assume that it is some weird religious
reference, and we all know how little respect anyone has for your
peculiar ideas in that area.

At a guess, your issue is my point that companies that provide
individual quotations tailored for individual customers, rather than
publicly viewable standardised price lists, prefer to keep these
quotations private. This is common practice - though I fully understand
if you don't like it. In particular, if the quotation is restricted as
confidential information, you cannot, as you did, publish it and ask
for opinions. But it is also quite possible - if surprising - that this
supplier does not consider the quotation to be privileged information
and is happy to see it published. My comment is therefore to encourage
you to check the situation, and be sure you are not breaking any
agreements or contracts with what you post, as that would surely
jeopardise your relationships with this company.
 
On 13/05/16 13:48, Rick C. Hodgin wrote:
>[snip]

I disagree with your third point.
 
On 13/05/16 14:48, Rick C. Hodgin wrote:
On Friday, May 13, 2016 at 8:42:44 AM UTC-4, David Brown wrote:
[snip]

Have you read the New Testament as an adult? Have you considered the
possibility that you may be on the wrong path in life (not with your
career or worldly things, but with eternal things)?

I've read perhaps 30% of the NT as an adult, and yes, I have considered
my "path". I don't believe in anything supernatural, and therefore am
not religious. There is plenty of good advice in the NT, especially the
teachings of Jesus (much less so with Paul, who goes directly against
Jesus in a lot of his philosophy). There is no need to introduce a
"god" in order to appreciate that "love thy neighbour" is a good basis
for how to live your life.

There are some questions you need to ask yourself, and seek answers
on. If you're willing to do this, then you will inherently know what
I mean. But if you are unwilling, you will never understand anything
I write toward those ends.

I have asked these sorts of questions, and considered them critically,
in many aspects. And herein lies a key difference between us - you have
not considered the answers, or viewed them critically. Somewhere along
the line, you have picked up the idea that you have found /the/ answer
in a particular interpretation of Christianity. Now you no longer look
for answers, or take time to think over questions properly - when faced
with a question, you assume you already know /the/ answer, and then you
change the question to fit that answer.

I believe in seeking answers, and searching for "truth" - but I do so in
the world around me, and the people around me, rather than in imagined
ideas or books written by ignorant people from a different time and a
different culture, aimed at controlling their fellow tribesmen.

David, you're an extremely knowledgeable man. Don't miss out on this
teaching. Learn more than you know today. It will benefit all areas
of your life.

I have seen what your religious ideas have done to you (as far as you
have written and acted in Usenet groups). No, thank you very much, it
does not appeal in the slightest.

I /have/ seen people who have become happier people, or "better" people,
as a result of religious beliefs - but that is because their religion
fills a social or psychological gap in their lives. Like most people, I
don't have such gaps that need filling with a religion. (Like everyone
else, I do have gaps or flaws - just not ones that would be fixed by a
religion.)

I will never be a high-level basketball player, because I am short and
round rather than tall and thin. Barring psychological illness or brain
damage, I will never be religious (at least, not in the way you are)
because I have an open mind and think rationally and critically - I will
not accept any supernatural concepts such as "god" without /real/ proof.
That runs contrary to the "leap of faith" required for believing in a
god, or gods, with no proof - merely a trust in what other people write
or say about them.
 
On 5/13/2016 9:31 AM, David Brown wrote:
[snip]

It is bad enough that one person is polluting the group with totally
off-topic discussions. Do we have to make it a group effort?

Why don't you exchange private email rather than public posts? Then the
rest of us don't need to see what is essentially a private conversation.

--

Rick C
 
On 13/05/16 15:49, rickman wrote:

It is bad enough that one person is polluting the group with totally
off-topic discussions. Do we have to make it a group effort?

Why don't you exchange private email rather than public posts? Then the
rest of us don't need to see what is essentially a private conversation.

Fair enough - sorry about that. I don't usually reply much to Rick's
non-technical posts.

I don't plan on taking this any further (in the group, or in email).
 
On Sun, 01 May 2016 23:13:44 -0400, rickman wrote:

On 5/1/2016 2:24 PM, Aleksandar Kuktin wrote:

[snip]

Right now, with what I'm currently working on, I have a blessing in
that I don't have to run the circuit very fast, so I can get away with
three-gate-deep logic, maybe even four gates deep. But if I were going
for breakneck speeds, I would be constrained to logic two LUT4 gates
deep. Only so many features can be crammed into a design made with
that. :)

Maybe I don't know what you mean by fast and not fast. Got some
numbers?

I consider 100MHz to be fast, 10MHz to be slow. On a clock-independent
level, I consider logic 2 LUTs deep to be fast and logic more than 4
LUTs deep to be slow.

If you add a bit to the word or address size, you are not just
doubling the CPU's capabilities, you are also doubling the number,
size and scope of problems you have to deal with.

??? My CPU design did not specify the data size, only the instruction
size. I didn't have a problem adjusting the data size to suit my
application.

I suppose you can parameterize the data size, and later change the
definition of the parameter to suit.

Bad wording - I meant "parameterize" in the sense of "use language
features of verilog", assuming one uses verilog. :)

Not just parameters, but the instruction format doesn't care. Literals
are built up in 7-bit chunks from an 8-bit instruction, or 8-bit chunks
with a 9-bit instruction, since many FPGAs have memories and multipliers
in multiples of 9 bits wide. The data path has no restrictions on width.

For the most part, I will acknowledge and defer - you have more
experience than I do. But I'd say that "width" does relate to "speed"
and "complexity" - the thesis I'm pushing - because wide enough bit
chunks require the addition of extra gates at the sides, and at some
point require additional gates in depth to integrate/consume the wider
output.

Illustration: let's assume we're making a comparator, but using only LUTs
(so no carry chain/mux magic). If we compare two 2-bit numbers, we can
use a single LUT4 to produce an output, with the whole construct being 1
gate deep. If we increase the width to 3 or 4 bits, we now need two LUT4s
to compare the numbers. But those two gates have two outputs. To further
"compress" it to a single output, we need a third LUT4 *in series* with
the other two. So the whole thing is now 2x slower and 3x bigger than it
used to be. Now, taking into account the consumers of the comparator's
output, we can optimize. If the output used to go into a LUT with one
spare input, we don't have to add the extra series gate; we can route
the extra side gate to the spare input. But that increases the
complexity, may require a (buggy) optimizing synthesizer/place&route,
and may tie you down if you ever need to change something that would
affect the "spare" input that got pressed into the optimization.
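
To make the 4-bit case concrete, here is a minimal verilog sketch of
the LUT-only version (module and signal names invented):

    // 4-bit equality from LUT4-sized pieces only, ignoring the carry
    // chain, as described above: two slice compares in parallel, then
    // a third LUT in series - 2 gates deep, 3 gates big.
    module eq4_luts (
        input  wire [3:0] a,
        input  wire [3:0] b,
        output wire       equal
    );
        wire eq_lo = (a[1:0] == b[1:0]);  // one LUT4 (4 inputs)
        wire eq_hi = (a[3:2] == b[3:2]);  // second LUT4, in parallel
        assign equal = eq_lo & eq_hi;     // third LUT, in series
    endmodule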

This would be much easier if I knew for sure what the standard
terminology was. Self-taught and all that..

Adding a single bit to a round number can throw the synthesis results
way out of optimum. Adding one more can make the gate chain too long to
fit into a clock cycle. Changing the clock period can be impossible
because the design could have several other interlocking clocks. And on
and on.

Now you are way outside the issues of CPU design. Now you are in the
design of your application.

The way I usually do things, I have batches of verilog that get
synthesized, placed and routed in one go. I don't floorplan. With that
setup, a change in one module can have nasty consequences somewhere
unrelated. I had this happen when I added a larger (16-bit) counter in
one part, which made the whole thing run too slow. The solution was
chopping the counter into a series of smaller (4 x 4-bit) counters and
manually carrying the carry between them.
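
Roughly along these lines - a hypothetical reconstruction of the idea,
not my original code:

    // 16-bit counter chopped into four 4-bit counters, with the carry
    // "carried by hand" between the chunks. The p&r tool can then
    // place each 4-bit chunk independently instead of packing one
    // long 16-bit carry chain.
    module split_counter16 (
        input  wire        clk,
        output wire [15:0] count
    );
        reg  [3:0] s0 = 4'd0, s1 = 4'd0, s2 = 4'd0, s3 = 4'd0;
        wire c0 = (s0 == 4'hF);        // stage 0 about to wrap
        wire c1 = c0 & (s1 == 4'hF);   // stages 0-1 about to wrap
        wire c2 = c1 & (s2 == 4'hF);   // stages 0-2 about to wrap
        always @(posedge clk) begin
            s0 <= s0 + 4'd1;
            if (c0) s1 <= s1 + 4'd1;
            if (c1) s2 <= s2 + 4'd1;
            if (c2) s3 <= s3 + 4'd1;
        end
        assign count = {s3, s2, s1, s0};
    endmodule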

I infer the problem was that the big counter was taking up prime real
estate in the chip that the (big and clunky) home-grown CPU needed.
Chopping it up allowed the p&r tool to spread the counter over a wider
area, allowing the CPU bits to be closer together and run faster. The
chip was over 95% full at the time.

For example - I discovered that the synthesis tool I used (Lattice's
synthesizer) would produce a sub-optimal result if a unit - say a
module - had even a single odd-sized register. Changing the register
sizes to even numbers made synthesis much better, even if it did
throw away a bit.

What was suboptimal about a register? What sort of unit?

I remember that increasing a register from 2 bits to 3 bits killed the
maximum frequency the circuit could run at. It became at least 2x
slower. If I remember correctly, further increasing it to 4 bits fixed
most of the problems. It was bizarre.

If I look hard enough, I may even be able to find the code. I seem to
remember it was in a routing module of some kind. Or some other glue
module that would connect the CPU core to the SRAM. That was on a MachXO2.

Exactly what is your dream computer?

A device whose design can fit in my head, that is transparent and
serviceable on all levels, free(-as-in-freedom), secure and usable for
real-world tasks. I should probably put "usable" at the top of the
list. :)

Right now, that would mean an FPGA-implemented FOSH SoC that is
self-contained. That is, one which can regenerate its own images and
binaries, by itself (so you don't need a second computer for that).

I thought there was already a CPU design like that, RISC-V. Does that
not fit your description? Check out this page for ideas...

http://www.lowrisc.org/

I am aware of lowRISC, of Milkymist, of some soft x86 chip, and of the
Ben NanoNote and the Novena (which doesn't fully fit the bill).

I was originally going to use Milkymist, with maybe some peripherals
swapped, but I didn't like the way they implemented the DDR controller
(there wasn't a lot of it). Furthermore, lm32 - the system's CPU - had
problems. Its load/store module would block the instruction pipeline,
and didn't look like it could be easily converted into a non-blocking
implementation. Also, on a cache miss, the CPU's cache would block
everything, fill the entire cache line (which could be several words
long), and only then release the block. I felt it would be easier to
take a simpler CPU and extend it with the requisite functionality.

A warning: the following paragraph lists the *exact* causes of my
frustration with lowRISC, and may cause you to get frustrated in turn,
especially if you have a stake in the project. :)

I took a look (back in October or November 2015) at lowRISC, but two
(well, three) things put me off. First, the code is hard to find. Just
now, as I was fact-checking my post, I had to click dozens of times on
the website to find a link to GitHub. And, once there, it still took
another dozen clicks and some fudging to find at least some of the
actual verilog. I still don't know where to find the code for the CPU,
for example. Milkymist had no such problems. Neither did the repository
for the Nyuzi CPU, or the AEMB/AEMB2 repo. The second problem was that
the website mostly lists a lot of ... well ... fluff, but is remarkably
short on meaty details. How about a list of the peripherals? It doesn't
even have a page on Wikipedia! The wiki magic could have saved it.

The third problem it shares with OpenRISC: the benchmarks show it
running at about 40MHz or less in the rough class of FPGAs I was likely
to implement it in. Meanwhile, Lattice was promising 85MHz for its
lm32. The final nail in the coffin for me and OpenRISC was this document:
http://iosrjournals.org/iosr-jvlsi/papers/vol2-issue4/G0244346.pdf
I am aware they list OpenRISC as running at 185MHz. It is also the place
where I discovered AEMB.
 
On 5/16/2016 9:18 AM, Aleksandar Kuktin wrote:
[snip]

This would be much easier if I knew for sure what the standard
terminology was. Self-taught and all that..

It's not so much an issue of terminology, but of technology. A
comparator in an FPGA would use a carry chain. Yes, this results in a
delay that increases linearly with data width, but in general the delay
is so short that for any data size up to 64 bits it won't significantly
impact the speed. So unless you are going for very large data paths,
this is not a major factor in your CPU speed.
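
Written behaviourally, the tools will infer the carry chain for you. A
minimal sketch (not from any particular design - names are mine):

    // FPGA synthesis maps a magnitude compare written like this onto
    // the dedicated carry chain rather than a tree of general LUTs.
    module cmp #(
        parameter WIDTH = 32
    ) (
        input  wire [WIDTH-1:0] a,
        input  wire [WIDTH-1:0] b,
        output wire             a_lt_b
    );
        assign a_lt_b = (a < b);
    endmodule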


[snip]
What was suboptimal about a register? What sort of unit?

I remember that increasing a register from 2 bits to 3 bits killed the
maximum frequency the circuit could run at. It became at least 2x
slower. If I remember correctly, further increasing it to 4 bits fixed
most of the problems. It was bizarre.

All the bits in a register run in parallel, so length doesn't directly
impact the speed. The only factor of register length that would impact
speed is the length of the routing that connects the registers. You
would need to look at the timing report to see what was causing your
routing delays. Trying to analyze it a priori really isn't practical.


If I look hard enough, I may even be able to find the code. I seem to
remember it was in a routing module of some kind. Or some other glue
module that would connect the CPU core to the SRAM. That was on a MachXO2.

In any given design there can always be issues where a small change in
design causes a huge change in results. This is due to the chaotic
behavior of the tools when a design starts to push the density or speed
of the device.

--

Rick C
 
