Wishbone interface, FPGA newbie and advice

  • Thread starter Christopher Fairbairn

Hi,

I am just starting out with FPGAs in my designs. Up to this point I've
mostly been involved with microcontroller-based designs, but I can see
where FPGAs would help with various tasks I run into, hence wanting to
learn how to use them properly.

I've bought a small development board and have gradually implemented
various designs for making beeps and sirens on a speaker, and
decoders such as a binary to 7-segment decoder. I term them "simple"
designs since they are designs that can be "completely wrong" but at the
same time still "work" (i.e. yes, it does what I set out to do, but the
design techniques might not be the best, or it wouldn't work if I had
a little more clock skew, etc.).

I am at the stage where I think I am almost confident enough to design an
FPGA into a hobby project of my own and develop the right VHDL source to
make it "tick"...

To this end I've started attempting to simulate/develop a 68HC11 to Wishbone
interface (the idea being to graft an FPGA onto the databus of an existing
HC11 product) and I'd like some advice on what I've got so far. Not knowing
much about data busses typically utilised within FPGA designs etc, Wishbone
looked like a good idea, especially when sites such as www.opencores.org
appear to support it for their freely available cores.

My VHDL source for a wishbone master which has an HC11 data bus interface on
the other end is shown below. It's a slightly "confused" master, in that
it's not as separated out as the Wishbone specification details, i.e. it's
got elements of an InterCon and SysCon module thrown in as well... Sorry
about the poor coding standard (the Wishbone/HC11 bus signals are named
using different conventions etc), I just thought I'd "throw it out there"
rather than worrying about tidying it up first...

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity hc11_wb_master is
  port (
    -- Wishbone bus interface
    WB_CLK_O : out std_logic; -- Clock
    WB_RST_O : out std_logic; -- Reset
    WB_ADR_O : out std_logic_vector(1 downto 0);
    WB_DAT_I : in std_logic_vector(7 downto 0);
    WB_DAT_O : out std_logic_vector(7 downto 0);
    WB_WE_O : out std_logic; -- Write enable
    WB_STB_O : out std_logic; -- Strobe
    -- HC11 bus interface
    e_clk : in std_logic;
    reset : in std_logic;
    cs : in std_logic; -- chip select for FPGA (HC11's CS2)
    rw : in std_logic; -- read/write* strobe
    addr : in std_logic_vector(10 downto 0);
    data : inout std_logic_vector(7 downto 0)
  );
end hc11_wb_master;

architecture Behavioral of hc11_wb_master is
  signal write_cycle : std_logic;
begin
  WB_CLK_O <= not e_clk; -- HC11 clocks on the opposite edge to Wishbone

  -- Asserted if the HC11 bus cycle is a memory write
  write_cycle <= not(rw);

  process (e_clk)
  begin
    if (falling_edge(e_clk)) then
      WB_RST_O <= not reset; -- HC11 is active low reset

      data <= "ZZZZZZZZ";

      if (reset = '0') then
        -- Reset condition - hold wishbone bus inactive
        WB_STB_O <= '0';
      else
        -- Translate the various control signals from HC11 to wishbone
        WB_WE_O <= write_cycle;
        WB_ADR_O <= addr(1 downto 0);
        WB_STB_O <= cs;

        -- Transfer data in the correct direction...
        if (cs = '1') then
          if (write_cycle = '1') then
            WB_DAT_O <= data;
          else
            data <= WB_DAT_I;
          end if;
        end if;
      end if;
    end if;
  end process;
end Behavioral;


Now there are various things that with my present dismal knowledge I am not
sure about. It seems it simulates ok..., I have a testbench where I've
attached it to a UART module from www.opencores.org and I can see in the
resultant simulation a write to the UART and the resultant activity on the
UART's TX pin etc etc...

However I have questions...

For example the line

WB_CLK_O <= not e_clk;

I utilised to invert the polarity of the clock between the two busses. My
idea is to clock the wishbone bus via the HC11's E-clock to keep the
wishbone bus transactions synchronous to the HC11's bus operations (seems
logical to me). The problem is the HC11 reads/writes on the falling edge,
while the wishbone interface does it on the rising edge... But isn't this
inversion I've done inherently evil or "bad design"?

I have read discussions about gated clocks causing problems
with respect to routing and propagation delays. Isn't this the same sort of
issue? Everything hanging off the wishbone interface is going to be clocked
via WB_CLK_O which, although not "gated", goes through a minimal
amount of logic from the actual clock signal entering the FPGA. Perhaps
it's not a concern? Perhaps the synthesis tools are smart enough that I
don't have to worry about it...

I am also interested with respect to how I've attempted to implement the
tri-stateable interface for the 8bit data port to the HC11's bus. Am I
heading in the right direction in this respect?

Another thing that's going to show how completely clueless I am... is with
respect to setup/hold timing. The logic I've implemented above (to my
knowledge at least) will latch the wishbone data onto the HC11's databus on
the falling edge of the E-clock (assuming the HC11 initiates a read cycle of
course)... but this is also the same time that I'm latching the incoming
address signals, and obviously there won't be time for them to propagate
through the logic, cause the wishbone slave to put the correct data onto
the databus, and have it latched... What am I doing wrong here, and how
would I go about starting to make changes to correct this? I assume the
only reason it's simulated fine so far is that I've done my simulation
before the synthesis stage, and hence there's no "timing/routing" related
info in the simulation. Am I correct in my understanding here?

Now, knowing me, the code I placed above is completely wrong for this and many
other reasons... but that is why I would appreciate any advice you could
provide me. I'm doing this purely as a learning exercise, so the
more avenues I find for further learning the better :)

One thing this project has highlighted is the need for me to learn more
about writing better testbenches... I totally agree with the comments I've
seen in this newsgroup about newbies starting out with a simulator and no
physical hardware. I sure wish I had started out that way :)

My aim with this project is to eventually get an HC11 talking to some
peripherals implemented within an FPGA that are wishbone slaves. The
peripherals are rather basic themselves, mainly boiling down to a couple of
high speed counters at this stage.

Thanks,
Christopher Fairbairn.

PS: Can anyone suggest a good book on simulation, in particular with
relation to VHDL? I've identified this as an area where I'm particularly
weak, and as an area where improvements would make progress a bit smoother.
For example, how do you go about reading in stimulus from a file so I don't
have to hardcode bus transactions in my testbench, and can instead
simply read them in from a file containing "read 0x8000, write 0x7000" or
the equivalent...
 
Hi!

To this end I've started attempting to simulate/develop a 68HC11 to Wishbone
interface (the idea being to graft an FPGA onto the databus of an existing
HC11 product) and I'd like some advice on what I've got so far. Not knowing
much about data busses typically utilised within FPGA designs etc, Wishbone
looked like a good idea, especially when sites such as www.opencores.org
appear to support it for their freely available cores.

I've done a similar project before: I have an async master to WB interface.
It's (as far as I can tell) working but it's not an easy task to do.

My VHDL source for a wishbone master which has an HC11 data bus interface on
the other end is shown below. It's a slightly "confused" master, in that
it's not as separated out as the Wishbone specification details, i.e. it's
got elements of an InterCon and SysCon module thrown in as well... Sorry
about the poor coding standard (the Wishbone/HC11 bus signals are named
using different conventions etc), I just thought I'd "throw it out there"
rather than worrying about tidying it up first...

entity hc11_wb_master is
  port (
    -- Wishbone bus interface
    WB_CLK_O : out std_logic; -- Clock
    WB_RST_O : out std_logic; -- Reset
    WB_ADR_O : out std_logic_vector(1 downto 0);
    WB_DAT_I : in std_logic_vector(7 downto 0);
    WB_DAT_O : out std_logic_vector(7 downto 0);
    WB_WE_O : out std_logic; -- Write enable
    WB_STB_O : out std_logic; -- Strobe
    -- HC11 bus interface
    e_clk : in std_logic;
    reset : in std_logic;
    cs : in std_logic; -- chip select for FPGA (HC11's CS2)
    rw : in std_logic; -- read/write* strobe
    addr : in std_logic_vector(10 downto 0);
    data : inout std_logic_vector(7 downto 0)
  );
end hc11_wb_master;

architecture Behavioral of hc11_wb_master is
  signal write_cycle : std_logic;
begin
  WB_CLK_O <= not e_clk; -- HC11 clocks on the opposite edge to Wishbone

  -- Asserted if the HC11 bus cycle is a memory write
  write_cycle <= not(rw);

  process (e_clk)
  begin
    if (falling_edge(e_clk)) then
      WB_RST_O <= not reset; -- HC11 is active low reset

      data <= "ZZZZZZZZ";

      if (reset = '0') then
        -- Reset condition - hold wishbone bus inactive
        WB_STB_O <= '0';
      else
        -- Translate the various control signals from HC11 to wishbone
        WB_WE_O <= write_cycle;
        WB_ADR_O <= addr(1 downto 0);
        WB_STB_O <= cs;

        -- Transfer data in the correct direction...
        if (cs = '1') then
          if (write_cycle = '1') then
            WB_DAT_O <= data;
          else
            data <= WB_DAT_I;
          end if;
        end if;
      end if;
    end if;
  end process;
end Behavioral;

There are some problems with this design:
- You don't handle error and retry requests from the WB side and don't
generate WB_CYC_O.
- There's no wait-state generation. You don't detect any wait-state requests
from the WB side and don't generate wait-states for your async master
(HC11). That can cause problems if you communicate with slow devices (for
example a FIFO which, when full, generates waits).
- More importantly, your logic is all wrong. The WB bus is synchronous and
works like this: a cycle starts when the master asserts WB_CYC_O (which you
don't generate to begin with) and ends when the target asserts WB_ACK_I (or
WB_ERR_I or WB_RTY_I). After that, at the next clock, if WB_CYC_O remains
active, a new cycle starts. So your logic can generate multiple writes or
reads to/from the same location depending on the timing. That can cause
serious problems with transmit or receive FIFOs even in your case of the
serial controller. (If not, then that's a bug in the serial controller...)
Even more problematic is that this 'feature' combined with the lack of
proper wait-state handling can cause invalid data to be written to, or
read from, any location that is not zero-wait-state.
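
The handshake described in the last point can be sketched roughly as follows. This is only an illustration of a classic single Wishbone cycle, not either of the designs posted in this thread: the entity name wb_single_cycle and the req/busy ports are hypothetical. CYC and STB are raised once per request and dropped on the clock after the slave answers, so one request produces exactly one bus cycle.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch: one Wishbone classic cycle per request pulse.
entity wb_single_cycle is
  port (
    clk_i : in  std_logic;
    rst_i : in  std_logic;
    req   : in  std_logic;  -- one-clock request pulse (assumed, not in the thread)
    busy  : out std_logic;  -- high while a cycle is outstanding
    cyc_o : out std_logic;
    stb_o : out std_logic;
    ack_i : in  std_logic;
    err_i : in  std_logic;
    rty_i : in  std_logic
  );
end entity wb_single_cycle;

architecture rtl of wb_single_cycle is
  signal active : std_logic := '0';
begin
  process (clk_i)
  begin
    if rising_edge(clk_i) then
      if rst_i = '1' then
        active <= '0';
      elsif active = '1' then
        -- Any of ACK, ERR or RTY terminates the cycle.
        if (ack_i or err_i or rty_i) = '1' then
          active <= '0';
        end if;
      elsif req = '1' then
        -- Start a new cycle only when none is outstanding.
        active <= '1';
      end if;
    end if;
  end process;

  cyc_o <= active;
  stb_o <= active;
  busy  <= active;
end architecture rtl;
```

A request that arrives while busy is high is simply ignored in this sketch; a real bridge would have to hold it off or queue it.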

I'm sorry, I've been down that road too, so I know :-(...

Now there are various things that with my present dismal knowledge I am not
sure about. It seems it simulates ok..., I have a testbench where I've
attached it to a UART module from www.opencores.org and I can see in the
resultant simulation a write to the UART and the resultant activity on the
UART's TX pin etc etc...

However I have questions...

For example the line

WB_CLK_O <= not e_clk;

I utilised to invert the polarity of the clock between the two busses. My
idea is to clock the wishbone bus via the HC11's E-clock to keep the
wishbone bus transactions synchronous to the HC11's bus operations (seems
logical to me). The problem is the HC11 reads/writes on the falling edge,
while the wishbone interface does it on the rising edge... But isn't this
inversion I've done inherently evil or "bad design"?

I have read discussions about gated clocks causing problems
with respect to routing and propagation delays. Isn't this the same sort of
issue? Everything hanging off the wishbone interface is going to be clocked
via WB_CLK_O which, although not "gated", goes through a minimal
amount of logic from the actual clock signal entering the FPGA. Perhaps
it's not a concern? Perhaps the synthesis tools are smart enough that I
don't have to worry about it...

I don't think it is an issue.

I am also interested with respect to how I've attempted to implement the
tri-stateable interface for the 8bit data port to the HC11's bus. Am I
heading in the right direction in this respect?
Tri-state seems to be OK.

Another thing that's going to show how completely clueless I am... is with
respect to setup/hold timing. The logic I've implemented above (to my
knowledge at least) will latch the wishbone data onto the HC11's databus on
the falling edge of the E-clock (assuming the HC11 initiates a read cycle of
course)... but this is also the same time that I'm latching the incoming
address signals, and obviously there won't be time for them to propagate
through the logic, cause the wishbone slave to put the correct data onto
the databus, and have it latched... What am I doing wrong here, and how
would I go about starting to make changes to correct this? I assume the
only reason it's simulated fine so far is that I've done my simulation
before the synthesis stage, and hence there's no "timing/routing" related
info in the simulation. Am I correct in my understanding here?

Setup-hold times are device (and place-and-route) specific, so I can't
answer that without knowing more about your target architecture. The FPGA
and uP datasheet should answer most of your questions. In general FPGAs are
much faster than an HC11 so you might have setup problems on the HC11 side
but others should work fine.

I'll paste my circuit here for reference. Note that it does not use the
CPU's clock to sync up the WB part, so it can run at much higher clock speeds
(in my case 70MHz). That can help meet the timing (with proper wait-state
handling of course) but can cause all kinds of metastability issues, which I'm
not sure I've addressed properly either. Please note that I'm not a
professional either; I'm not claiming my design to be nice or flawless. It
at least worked in real HW... Any comments are welcome.

Andras Tantos

================================================================

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity wb_async_master is
generic (
dat_width: positive := 16;
adr_width: positive := 20;
ab_rd_delay: positive := 1
);
port (
wb_clk_i: in std_logic;
wb_rst_i: in std_logic := '0';

-- interface to wb slave devices
wb_adr_o: out std_logic_vector (adr_width-1 downto 0);
wb_sel_o: out std_logic_vector ((dat_width/8)-1 downto 0);
wb_dat_i: in std_logic_vector (dat_width-1 downto 0);
wb_dat_o: out std_logic_vector (dat_width-1 downto 0);
wb_cyc_o: out std_logic;
wb_ack_i: in std_logic;
wb_err_i: in std_logic := '-';
wb_rty_i: in std_logic := '-';
wb_we_o: out std_logic;
wb_stb_o: out std_logic;

-- interface to the asyncronous master device
ab_dat: inout std_logic_vector (dat_width-1 downto 0) := (others =>
'Z');
ab_adr: in std_logic_vector (adr_width-1 downto 0) := (others =>
'U');
ab_rd_n: in std_logic := '1';
ab_wr_n: in std_logic := '1';
ab_ce_n: in std_logic := '1';
ab_byteen_n: in std_logic_vector ((dat_width/8)-1 downto 0);
ab_wait_n: out std_logic; -- wait-state request 'open-drain' output
ab_waiths: out std_logic -- handshake-type totem-pole output

);
end wb_async_master;

architecture xilinx of wb_async_master is
constant ab_wr_delay: positive := 2;
-- delay lines for rd/wr edge detection
signal rd_delay_rst: std_logic;
signal rd_delay: std_logic_vector(ab_rd_delay downto 0);
signal wr_delay: std_logic_vector(ab_wr_delay downto 0);
-- one-cycle long pulses upon rd/wr edges
signal ab_wr_pulse: std_logic;
signal ab_rd_pulse: std_logic;
-- one-cycle long pulse to latch address for writes
signal ab_wr_latch_pulse: std_logic;
-- WB data input register
signal wb_dat_reg: std_logic_vector (dat_width-1 downto 0);
-- internal copies of WB signals for feedback
signal wb_cyc_l: std_logic;
signal wb_we_l: std_logic;
-- Comb. logic for active cycles
signal ab_rd: std_logic;
signal ab_wr: std_logic;
signal ab_active: std_logic;
-- internal copies of wait signals for feedback
signal ab_wait_n_rst: std_logic;
signal ab_wait_n_l: std_logic;
signal ab_waiths_l: std_logic;
signal ab_wait_n_l_delayed: std_logic;
signal ab_waiths_l_delayed: std_logic;
-- active when WB slave terminates the cycle (for any reason)
signal wb_ack: std_logic;
-- signals a scheduled or commencing posted write
signal write_in_progress: std_logic;
begin
ab_rd <= (not ab_ce_n) and (not ab_rd_n) and ab_wr_n;
ab_wr <= (not ab_ce_n) and (not ab_wr_n) and ab_rd_n;
ab_active <= not ab_ce_n;

wb_ack <= wb_cyc_l and (wb_ack_i or wb_err_i or wb_rty_i);

write_in_progress_gen: process
begin
if (wb_rst_i = '1') then
write_in_progress <= '0';
end if;
wait until wb_clk_i'EVENT and wb_clk_i = '1';
if ab_wr = '0' and wr_delay(wr_delay'HIGH) = '1' then
write_in_progress <= '1';
end if;
if wb_ack = '1' then
write_in_progress <= '0';
end if;
end process;

-- Registers addr/data lines.
reg_bus_lines: process
begin
if (wb_rst_i = '1') then
wb_adr_o <= (others => '-');
wb_sel_o <= (others => '-');
wb_dat_o <= (others => '-');
wb_dat_reg <= (others => '0');
end if;
wait until wb_clk_i'EVENT and wb_clk_i = '1';
-- Store and synchronize data and address lines if no (posted) write
-- is in progress and there is an active asynchronous bus cycle.
-- We store addresses for reads at the same time we sample the data
-- so setup and hold times are the same.
if (ab_wr = '1' or ab_rd_pulse = '1') and (write_in_progress = '0'
or wb_ack = '1') then
wb_adr_o <= ab_adr;
for i in wb_sel_o'RANGE loop
wb_sel_o(i) <= not ab_byteen_n(i);
end loop;
end if;
if (ab_wr = '1') and (write_in_progress = '0' or wb_ack = '1') then
wb_dat_o <= ab_dat;
end if;

-- en-register data input at the end of a read cycle
if wb_ack = '1' then
if wb_we_l = '0' then
-- read cycle completed, store the result
wb_dat_reg <= wb_dat_i;
end if;
end if;
end process;

-- Registers asycn bus control lines for sync edge detection.
async_bus_wr_ctrl : process(wb_rst_i,wb_clk_i)
begin
if (wb_rst_i = '1') then
wr_delay <= (others => '0');
elsif wb_clk_i'EVENT and wb_clk_i = '0' then
-- delayed signals will be used in edge-detection
for i in wr_delay'HIGH downto 1 loop
wr_delay(i) <= wr_delay(i-1);-- and ab_rd;
end loop;
wr_delay(0) <= ab_wr;
end if;
end process;

rd_delay_rst <= wb_rst_i or not ab_rd;
async_bus_rd_ctrl : process(rd_delay_rst,wb_clk_i)
begin
if (rd_delay_rst = '1') then
rd_delay <= (others => '0');
-- Post-layout simulation shows glitches on the output that violate
-- setup times.
-- Clock on the other edge to solve this issue
elsif wb_clk_i'EVENT and wb_clk_i = '0' then
-- a sync-reset shift-register to delay read signal
for i in rd_delay'HIGH downto 1 loop
rd_delay(i) <= rd_delay(i-1) and ab_rd;
end loop;
if (wb_cyc_l = '1') then
rd_delay(0) <= rd_delay(0);
else
rd_delay(0) <= ab_rd and not write_in_progress;
end if;
end if;
end process;
-- will be one for one cycle at the proper end of the async cycle
ab_wr_pulse <= wr_delay(wr_delay'HIGH) and not
wr_delay(wr_delay'HIGH-1);
ab_wr_latch_pulse <= not wr_delay(wr_delay'HIGH) and
wr_delay(wr_delay'HIGH-1);
ab_rd_pulse <= not rd_delay(rd_delay'HIGH) and
rd_delay(rd_delay'HIGH-1);

-- Generates WishBone control signals
wb_ctrl_gen: process
begin
if (wb_rst_i = '1') then
wb_stb_o <= '0';
wb_cyc_l <= '0';
wb_we_l <= '0';
end if;
wait until wb_clk_i'EVENT and wb_clk_i = '1';
if wb_ack = '1' and ab_wr_pulse = '0' and ab_rd_pulse = '0' then
wb_stb_o <= '0';
wb_cyc_l <= '0';
wb_we_l <= '0';
end if;

if ab_wr_pulse = '1' or ab_rd_pulse = '1' then
wb_stb_o <= '1';
wb_cyc_l <= '1';
wb_we_l <= ab_wr_pulse;
end if;
end process;

-- Generate asyncronous wait signal
ab_wait_n_rst <= wb_rst_i or not ab_active;
a_wait_n_gen: process(ab_wait_n_rst, wb_clk_i)
begin
if (ab_wait_n_rst = '1') then
ab_wait_n_l <= '1';
elsif wb_clk_i'EVENT and wb_clk_i = '1' then
-- At the beginning of a read cycle, move wait low
if ab_wait_n_l = '1' and ab_rd = '1' and rd_delay(0) = '0' then
ab_wait_n_l <= '0';
end if;
-- At the beginning of any cycle, if the ss-master part is busy, wait
if (ab_wait_n_l = '1' and (ab_rd = '1' or ab_wr = '1')) and
(wb_cyc_l = '1')
then
ab_wait_n_l <= '0';
end if;
-- At the end of an ss-master cycle, remove wait
if wb_ack = '1' and (
(wb_we_l = '1' and ab_rd = '0') or -- no pending read
wb_we_l = '0') -- was a read operation
then
ab_wait_n_l <= '1';
end if;
end if;
end process;

-- Generate handshake-type wait signal
a_waiths_gen: process(wb_rst_i,wb_clk_i)
begin
if (wb_rst_i = '1') then
ab_waiths_l <= '0';
elsif wb_clk_i'EVENT and wb_clk_i = '1' then
-- Write handling
if wb_cyc_l = '0' and ab_wr = '1' then
ab_waiths_l <= '1';
end if;
if wb_ack = '1' and ab_waiths_l = '1' then
ab_waiths_l <= '0';
end if;

-- Read handling
if wb_ack = '1' and ab_rd = '1' then
ab_waiths_l <= '1';
end if;

if wb_cyc_l = '0' and ab_rd = '0' and ab_wr = '0' and
wr_delay(wr_delay'HIGH) = '0'
then
ab_waiths_l <= '0';
end if;
end if;
end process;

-- connect local signals to external pins
wb_cyc_o <= wb_cyc_l;
wb_we_o <= wb_we_l;
ab_dat <= wb_dat_reg when ab_rd = '1' else (others => 'Z');

-- On post-layout simulation it turned out that the data is not stable
-- upon the rising edge of these wait signals. So we delay the rising edge
-- by one-half clock
delay_wait: process(wb_clk_i)
begin
if wb_clk_i'EVENT and wb_clk_i = '0' then
ab_wait_n_l_delayed <= ab_wait_n_l;
ab_waiths_l_delayed <= ab_waiths_l;
end if;
end process;
ab_wait_n <= ab_wait_n_l and ab_wait_n_l_delayed;
ab_waiths <= ab_waiths_l and ab_waiths_l_delayed;
end xilinx;
 
Christopher Fairbairn wrote:

For example the line

WB_CLK_O <= not e_clk;

The problem is the HC11 reads/writes on the falling edge,
while the wishbone interface does it on the rising edge... But isn't this
inversion I've done inherently evil or "bad design"?
Consider synchronizing the interface to a faster fpga clock and generate
your own synchronous read and write strobes in just the right places.
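
One way to read this suggestion is sketched below: sample the E clock with a faster FPGA clock through a two-flop synchronizer and turn its edges into one-tick strobes. The entity name e_edge_detect and all port names are hypothetical, and the sketch assumes the FPGA clock is several times faster than E.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch: bring the HC11 E clock into a faster FPGA clock
-- domain and derive one-tick strobes from its edges.
entity e_edge_detect is
  port (
    clk    : in  std_logic;  -- fast FPGA clock (assumed several times faster than E)
    e_clk  : in  std_logic;  -- HC11 E clock, asynchronous to clk
    e_rise : out std_logic;  -- one-clk-wide pulse following a rising edge of E
    e_fall : out std_logic   -- one-clk-wide pulse following a falling edge of E
  );
end entity e_edge_detect;

architecture rtl of e_edge_detect is
  -- Bits 0 and 1 form a two-flop synchronizer; bit 2 keeps the previous
  -- synchronized value for edge detection.
  signal e_sync : std_logic_vector(2 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      e_sync <= e_sync(1 downto 0) & e_clk;
    end if;
  end process;

  -- Old value in bit 2, new value in bit 1.
  e_rise <= '1' when e_sync(2 downto 1) = "01" else '0';
  e_fall <= '1' when e_sync(2 downto 1) = "10" else '0';
end architecture rtl;
```

The strobes appear two to three clk periods after the real edge, so that latency has to be checked against the HC11's hold times before using e_fall to capture write data.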

Perhaps the synthesis tools are smart enough that I
don't have to worry about it...
They aren't.

I am also interested with respect to how I've attempted to implement the
tri-stateable interface for the 8bit data port to the HC11's bus. Am I
heading in the right direction in this respect?
Looks reasonable for a first cut.
You have to compare your sim waveforms to the HC11 and Wishbone data sheets.
Consider making an oe signal to drive data Z between cycles.
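
A minimal sketch of such an oe signal, assuming an active-high chip select as in the posted design and a read value that is already available inside the FPGA. The entity name hc11_data_drive and the rd_data port are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch: drive the shared data bus only during a selected read.
entity hc11_data_drive is
  port (
    e_clk   : in    std_logic;
    cs      : in    std_logic;  -- active-high select, as in the posted design
    rw      : in    std_logic;  -- '1' = read, '0' = write
    rd_data : in    std_logic_vector(7 downto 0);  -- value to return on a read
    data    : inout std_logic_vector(7 downto 0)
  );
end entity hc11_data_drive;

architecture rtl of hc11_data_drive is
  signal oe : std_logic;
begin
  -- Output enable: selected, read cycle, second half of the bus cycle (E high).
  oe   <= cs and rw and e_clk;
  data <= rd_data when oe = '1' else (others => 'Z');
end architecture rtl;
```

Because oe is combinational, the bus is released as soon as E falls or the select goes away, rather than one clock edge later as in the clocked process of the original post.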

I don't know the interfaces, but
WB_STB_O <= cs;
is suspect.

Normally cs lasts for multiple ticks,
the write strobe is one tick, somewhere in the middle.

One thing this project has highlighted is the need for me to learn more
about writing better testbenches... I totally agree with the comments I've
seen in this newsgroup about newbies starting out with a simulator and no
physical hardware. I sure wish I had started out that way :)
It's not too late.

PS: Can anyone suggest a good book on simulation, in particular with
relation to VHDL? I've identified this as an area where I'm particularly
weak, and as an area where improvements would make progress a bit smoother.
You already know how to run the simulator, so get a copy of
Ashenden's guide to VHDL as a language reference, and get busy.
Consider adopting a synchronous testbench style:
http://groups.google.com/groups?q=oe_demo

For example, how do you go about reading in stimulus from a file so I don't
have to hardcode bus transactions in my testbench, and can instead
simply read them in from a file containing "read 0x8000, write 0x7000" or
the equivalent...
Use VHDL procedures to do this.
Using an intermediate text file makes it harder, not easier.
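
As a rough illustration of the procedure approach, the testbench skeleton below wraps one bus cycle in a procedure so each transaction becomes a single call. Everything here is an assumption made for the sketch: the names tb_bus_procs, bus_write and bus_read, the 250 ns half period, and the simplified timing (no setup or hold times are modelled, and no device under test is attached).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tb_bus_procs is
end entity tb_bus_procs;

architecture sim of tb_bus_procs is
  constant E_HALF : time := 250 ns;  -- illustrative half period, not from a datasheet
  signal e_clk : std_logic := '0';
  signal cs    : std_logic := '0';
  signal rw    : std_logic := '1';
  signal addr  : std_logic_vector(10 downto 0) := (others => '0');
  signal data  : std_logic_vector(7 downto 0);
begin
  stimulus : process
    -- One write cycle: address phase while E is low, data while E is high.
    procedure bus_write (a : in std_logic_vector(10 downto 0);
                         d : in std_logic_vector(7 downto 0)) is
    begin
      addr <= a; rw <= '0'; cs <= '1';
      wait for E_HALF;
      e_clk <= '1'; data <= d;
      wait for E_HALF;
      e_clk <= '0';
      -- Release the bus at the end of the cycle (hold time is not modelled).
      cs <= '0'; rw <= '1'; data <= (others => 'Z');
    end procedure bus_write;

    -- One read cycle: the testbench releases the bus and samples it.
    procedure bus_read (a : in  std_logic_vector(10 downto 0);
                        d : out std_logic_vector(7 downto 0)) is
    begin
      addr <= a; rw <= '1'; cs <= '1';
      data <= (others => 'Z');
      wait for E_HALF;
      e_clk <= '1';
      wait for E_HALF;
      d := data;  -- sample just before E falls
      e_clk <= '0'; cs <= '0';
    end procedure bus_read;

    variable result : std_logic_vector(7 downto 0);
  begin
    bus_write("00000000001", x"55");
    bus_read ("00000000001", result);
    wait;  -- end of stimulus
  end process stimulus;
end architecture sim;
```

With the device under test instantiated on these signals, the body of the stimulus process reads almost like the "read 0x8000, write 0x7000" list, without a text file in between.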

-- Mike Treseler
 
Mike Treseler wrote:

Consider synchronizing the interface to a faster fpga clock and generate
your own synchronous read and write strobes in just the right places.
The reason why I started with the idea of using the HC11 microcontroller's
E-clock to clock the wishbone interface was I imagined it would bring
simplicity, as I wouldn't need to deal with the interfacing between the two
different clock domains (i.e. the CPU's crystal derived clock and the FPGA's
clock).

In my circuit I don't have to worry about the HC11 going to a low power
state (where the e-clock signal would stop) as I'm not intending to use
such features.

Having said that, it appears that things might be more flexible if I
drop that requirement, use a faster clock within the FPGA and deal with
the fact that the two busses are now asynchronous to each other.

I don't know the interfaces, but
WB_STB_O <= cs;
is suspect.

Normally cs lasts for multiple ticks,
the write strobe is one tick, somewhere in the middle.
Ok, well if I'm reading the HC11's waveforms correctly, at least on that bus
it's slightly different.

It has a single R/W* signal. A logic zero on this signal indicates that the
present bus cycle is a write operation, and it can be held low for
consecutive bus cycles in cases where double-bytes are being written. The
R/W* is specified as being valid whenever the ADDRESS bus contains a valid
address (which is almost the entire bus cycle), so it's more a "write
request" than a "write strobe".

With the programmable chip-selects there is a choice on when they are
asserted. Programming a register with the correct value will mean that the
chipselect will be asserted as soon as the address is placed upon the bus.
Changing that value can change the length of the chipselect strobe to make
it only occur for the second half of the bus cycle (when e-clock is high
meaning device should place data onto bus).

I know 100% that I'm not understanding basic bus interfacing at the moment.
I entered this project thinking (for the HC11 at least) that I could
basically be on the look out for a particular clock edge and then simply
look at the Read/Write signal and the chipselect to see if the transaction
was meant for me...

And I think it's almost like that... for example, the HC11
datasheet indicates that on the rising edge of its E-clock (a quarter of its
crystal oscillator speed) the read/write strobe, the address and the
chipselect (if programmed correctly) will all be valid and hence I could
latch them into the FPGA...

But that's where I get stuck and my knowledge starts to run out. For a read
operation (HC11 reading a register from the FPGA) I think I'm fine... I can
detect the signals such as the read/write strobe on the rising edge of the
clock and output my data, then as soon as the E-clock goes low I can
tristate the bus again (and this makes sense as the HC11 reference manual
has the following sentence in it: "The E-clock can be used to enable
external devices to drive data onto the data bus during the second half of
the bus cycle (E clock high)"). So I could have something like

hc11_data <= wb_data_o when (hc11_eclk = '1') else "ZZZZZZZZ";

where hc11_eclk is effectively the oe signal mentioned by another poster.

However, the waveforms where the HC11 is writing to a register within the
FPGA are more confusing for me... When the E-clock goes high the data on the
databus isn't valid yet; however, it's guaranteed to be valid for at least
31ns after the E-clock's falling edge.

So it appears I can latch the address and read/write type signals on the
rising edge of e-clock (and place my data on the bus in case of a read
operation), but I have to wait until the falling edge before I can capture
the data the HC11 has presented for a write operation.

So does that mean I need to clock my module on both edges of the e-clock
signal? I thought that's discouraged as much as possible within FPGA
designs?

Or is this just pointing out how much more horribly confused I have become?
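
For the write direction on its own, a single falling-edge clock may be enough. The sketch below is a write-only 8-bit register, under the assumption (to be checked against the HC11 timing diagrams) that chip select, R/W* and the write data are all still valid at the falling edge of E. The entity name hc11_out_reg and the q port are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch: an 8-bit write-only register on the HC11 bus.
entity hc11_out_reg is
  port (
    e_clk : in  std_logic;
    reset : in  std_logic;  -- active low, as on the HC11
    cs    : in  std_logic;  -- active-high select
    rw    : in  std_logic;  -- '0' = write
    data  : in  std_logic_vector(7 downto 0);
    q     : out std_logic_vector(7 downto 0)
  );
end entity hc11_out_reg;

architecture rtl of hc11_out_reg is
begin
  process (e_clk, reset)
  begin
    if reset = '0' then
      q <= (others => '0');
    elsif falling_edge(e_clk) then
      -- Assumes cs, rw and data are all still valid at the falling edge of E.
      if cs = '1' and rw = '0' then
        q <= data;
      end if;
    end if;
  end process;
end architecture rtl;
```

In this arrangement nothing is clocked on the rising edge: the read path can be a purely combinational tri-state enable, so only one clock edge is used for storage.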

Looks reasonable for a first cut.
You have to compare your sim waveforms to the HC11 and wishbone data
sheets. Consider making an oe signal to drive data Z between cycles.
Reading the various replies to my initial posting and looking at the various
datasheets etc has made me appreciate exactly how little I actually know
about this... or even digital logic in general when it comes to sequential
designs.

So at least in the mean time I've decided to concentrate on an even smaller
subcomponent of the desired goal.

Instead of attempting to simulate the entire HC11 to Wishbone bus interface
I'm going to concentrate on getting a simple HC11 bus interface designed
and simulated properly. I.e. attempting to basically simulate a 74HC series
8bit latch hanging off the HC11's databus... something I can do well with
"real" hardware :)

I think at the moment I have enough issues with properly reading
the waveform timing diagrams in the HC11 reference manual without thinking
about the complete design. Especially when you throw in considerations such as
how I'm going to deal with a slave wanting to extend a bus
cycle... something which isn't as simple as asserting a signal on the HC11's
bus...

Once I get that far, then I can start worrying about interpreting the
waveforms in the wishbone spec and dealing with the "translation" of
signals between the two busses.

PS: Can anyone suggest a good book on Simulation, in particular with
relation to VHDL? I've identified this as an area where I'm particular
weak, and as an area where improvements would make progress a bit
smoother.

You already know how to run the simulator,so get a copy of
Ashenden's guide to vhdl as a language reference, and get busy.
Consider adopting a synchronous testbench style:
http://groups.google.com/groups?q=oe_demo
Thank you for this reference. The example testbench in that thread was sort
of what I was aiming for when I talked about reading stimulus from a file.
At present my test bench is physically hardcoded to perform the individual
bus cycles, i.e. I have something along the lines of

cs <= '0';
wait for ABCns;
we <= '1';
wait for XYZns;
.... blah blah blah ...

And I wanted a more "automated" way of doing it. Using a similar technique
to that used in the testbench in that thread, and using an array containing
the list of desired bus operations and iterating over it could just be the
ticket I'm looking for...
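
The array idea could look something like the sketch below: a constant table of bus operations and a loop that plays them out. The type and signal names (op_kind, bus_op, OPS, tb_op_table) are hypothetical, the timing is simplified to two half periods per cycle, and the data byte in the table is made up; the two addresses simply echo the "read 0x8000, write 0x7000" example from earlier in the thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tb_op_table is
end entity tb_op_table;

architecture sim of tb_op_table is
  type op_kind is (OP_READ, OP_WRITE);
  type bus_op is record
    kind : op_kind;
    adr  : std_logic_vector(15 downto 0);
    dat  : std_logic_vector(7 downto 0);  -- ignored for reads
  end record;
  type bus_op_array is array (natural range <>) of bus_op;

  -- Table of operations to perform, in order.
  constant OPS : bus_op_array := (
    (OP_READ,  x"8000", x"00"),
    (OP_WRITE, x"7000", x"A5")
  );

  constant E_HALF : time := 250 ns;  -- illustrative, not from a datasheet
  signal e_clk : std_logic := '0';
  signal rw    : std_logic := '1';
  signal addr  : std_logic_vector(15 downto 0) := (others => '0');
  signal data  : std_logic_vector(7 downto 0) := (others => 'Z');
begin
  stimulus : process
  begin
    for i in OPS'range loop
      -- Address phase (E low)
      addr <= OPS(i).adr;
      if OPS(i).kind = OP_WRITE then
        rw <= '0';
      else
        rw <= '1';
      end if;
      wait for E_HALF;
      -- Data phase (E high)
      e_clk <= '1';
      if OPS(i).kind = OP_WRITE then
        data <= OPS(i).dat;
      end if;
      wait for E_HALF;
      e_clk <= '0';
      data  <= (others => 'Z');
    end loop;
    wait;  -- all operations done
  end process stimulus;
end architecture sim;
```

Adding a test case then means adding one line to the OPS table rather than another block of hand-written waits.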

Thanks for the help,
Christopher Fairbairn
 
Hi,

Andras Tantos wrote:
- There's no wait-state generation. You don't detect any wait-state
requests from the WB side and don't generate wait-states for your async
master (HC11). That can cause problems if you communicate with slow
devices (for example a FIFO which, when full, generates waits).
At the moment the three initial wishbone slaves I'm intending to interface
to are all pretty primitive and have their ACK_O simply tied to their STB_I
input as allowed by the wishbone spec if they don't require any waitstates.
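
For illustration, a zero-wait-state slave of that kind might look like the sketch below: a single 8-bit register whose acknowledge is just the strobe. The entity name wb_reg_slave is hypothetical and the port list is trimmed to the minimum (no address, CYC or SEL inputs).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch: zero-wait-state Wishbone slave holding one 8-bit register.
entity wb_reg_slave is
  port (
    clk_i : in  std_logic;
    rst_i : in  std_logic;
    stb_i : in  std_logic;
    we_i  : in  std_logic;
    dat_i : in  std_logic_vector(7 downto 0);
    dat_o : out std_logic_vector(7 downto 0);
    ack_o : out std_logic
  );
end entity wb_reg_slave;

architecture rtl of wb_reg_slave is
  signal reg : std_logic_vector(7 downto 0) := (others => '0');
begin
  -- No wait states: acknowledge as soon as the strobe is seen.
  ack_o <= stb_i;
  dat_o <= reg;

  process (clk_i)
  begin
    if rising_edge(clk_i) then
      if rst_i = '1' then
        reg <= (others => '0');
      elsif stb_i = '1' and we_i = '1' then
        reg <= dat_i;
      end if;
    end if;
  end process;
end architecture rtl;
```

If the master holds STB high for several clocks, this register is simply rewritten with the same value each time, which is harmless here but would not be for a FIFO.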

As such I shouldn't need wait states, but it is something I've been conscious
of and keeping in the back of my mind... knowing full well that
Murphy's law will present a nice wishbone slave that I desire to use at
some stage in the future which will require waitstates... I'm just not
exactly sure how to deal with it, considering I can't stall the HC11 bus.

I'll paste my circuit here for reference. Note that it does not use the
CPU's clock to sync up the WB part, so it can run at much higher clock
speeds (in my case 70MHz). That can help meet the timing (with proper
wait-state handling of course) but can cause all kinds of metastability
issues, which I'm not sure I've addressed properly either. Please note that
I'm not a professional either; I'm not claiming my design to be nice or
flawless. It at least worked in real HW... Any comments are welcome.
As part of my research and investigations today I discovered an asynchronous
wishbone master on www.opencores.org which appears to be developed by
yourself (http://www.opencores.org/cores/wb_tk/wb_async_master.shtml).

Is this correct? It appears that the version on the opencores website is a
lot simpler than the version you presented in your posting. Reading
through the source for the one on opencores.org has cleared a lot of things
up for me and "turned a couple of lightbulbs on" in my mind... What are the
main differences between the two? I'm having difficulty following the one
in the newsgroup posting while I can pretty much follow the one on
www.opencores.org.

The testbench support code in the http://www.opencores.org/cores/wb_tk
project has also helped me out. The use of a VHDL function to wrap up the
inner workings of a bus cycle should stop me from duplicating all those
lines of code in my testbench for every read/write I perform on the bus...
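The two testbench ideas mentioned in this thread, wrapping a bus cycle in a subprogram and iterating over an array of desired operations, could be sketched roughly as below. A procedure is used rather than a function because VHDL functions cannot contain wait statements. Everything here (the record type, the signal names, the one-register slave, the Wishbone-side stimulus instead of the HC11 bus) is an assumption for illustration and is not taken from the wb_tk testbench.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tb_bus_ops is
end entity tb_bus_ops;

architecture sim of tb_bus_ops is
  -- One entry per desired bus operation (hypothetical record layout).
  type bus_op_t is record
    write : boolean;
    adr   : std_logic_vector(7 downto 0);
    dat   : std_logic_vector(7 downto 0);  -- data to write, or data expected on read
  end record;
  type bus_op_array_t is array (natural range <>) of bus_op_t;

  constant OPS : bus_op_array_t := (
    (write => true,  adr => x"10", dat => x"A5"),
    (write => false, adr => x"10", dat => x"A5")
  );

  signal clk               : std_logic := '0';
  signal done              : boolean   := false;
  signal stb, we, ack      : std_logic := '0';
  signal adr, dat_w, dat_r : std_logic_vector(7 downto 0) := (others => '0');
  signal reg               : std_logic_vector(7 downto 0) := (others => '0');

  -- Performs one single read or write cycle and checks read data.
  procedure bus_cycle (
    constant op     : in  bus_op_t;
    signal   clk    : in  std_logic;
    signal   ack    : in  std_logic;
    signal   dat_r  : in  std_logic_vector(7 downto 0);
    signal   stb    : out std_logic;
    signal   we     : out std_logic;
    signal   adr    : out std_logic_vector(7 downto 0);
    signal   dat_w  : out std_logic_vector(7 downto 0)
  ) is
  begin
    wait until rising_edge(clk);
    adr <= op.adr;
    if op.write then
      we    <= '1';
      dat_w <= op.dat;
    else
      we <= '0';
    end if;
    stb <= '1';
    -- Stay in the cycle until the slave acknowledges on a clock edge.
    loop
      wait until rising_edge(clk);
      exit when ack = '1';
    end loop;
    if not op.write then
      assert dat_r = op.dat report "read data mismatch" severity error;
    end if;
    stb <= '0';
    we  <= '0';
  end procedure bus_cycle;
begin
  -- Free-running clock that stops once the stimulus has finished.
  clk_gen : process
  begin
    while not done loop
      clk <= '0'; wait for 10 ns;
      clk <= '1'; wait for 10 ns;
    end loop;
    wait;
  end process clk_gen;

  -- Trivial stand-in slave: one register, address ignored, no wait states.
  ack   <= stb;
  dat_r <= reg;
  slave : process (clk)
  begin
    if rising_edge(clk) then
      if stb = '1' and we = '1' then
        reg <= dat_w;
      end if;
    end if;
  end process slave;

  -- Walk the list of operations instead of hand-writing each cycle.
  stim : process
  begin
    for i in OPS'range loop
      bus_cycle(OPS(i), clk, ack, dat_r, stb, we, adr, dat_w);
    end loop;
    report "all bus operations completed";
    done <= true;
    wait;
  end process stim;
end architecture sim;
```

Adding another test case then only means adding another line to OPS.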

Thanks,
Christopher Fairbairn.
 
Hi!

> As such I shouldn't need wait states, but it is something I've been
> conscious of and keeping in the back of my mind... knowing full well that
> Murphy's law will present a nice wishbone slave that I desire to use at
> some stage in the future which will require waitstates... I'm just not
> exactly sure how to deal with it, considering I can't stall the HC11 bus.
As long as you are aware of the limitations, it's OK.

> As part of my research and investigations today I discovered an
> asynchronous wishbone master on www.opencores.org which appears to be
> developed by yourself
> (http://www.opencores.org/cores/wb_tk/wb_async_master.shtml).
Yeah, I know ;-).

> Is this correct? It appears that the version on the opencores website is a
> lot simpler than the version you presented in your posting. Reading
> through the source for the one on opencores.org has cleared a lot of
> things up for me and "turned a couple of lightbulbs on" in my mind. What
> are the main differences between the two? I'm having difficulty following
> the one in the newsgroup posting while I can pretty much follow the one on
> www.opencores.org.
The difference is that the one I've shown here actually works. I didn't have
time to update the cores on OpenCores, but as I started using them I've
found a lot of problems, and in the case of this core for example, I had to
completely re-write the whole thing. The old one pretty much goes along the
lines of your implementation, and as such, has the same fundamental
problems.

I know the presented core is a bit confusing, so here are some of the ideas
I've used:

You have to make sure you generate exactly one WB cycle for each async bus
cycle. That requirement makes the handling of read and write cycles
different. For a write cycle, you have to make sure you use the right data
from the async bus in the write cycle on the WB bus. In the general case,
the write data is valid at the rising edge of the (negated) control signals
of the async bus. This means that you have to delay the write cycle on the
WB side until the write has finished on the async side. For reads,
however, you have to start the WB cycle in parallel with the async cycle to
make sure you have the right data available at the end of the cycle. Also,
in the general case, at the beginning of the read cycle you might not have a
valid address on the address bus, so you might have to wait some time before
starting the read operation. All in all, you need delayed writes and
parallel reads.
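To make the "delayed write" half concrete, here is a minimal sketch, assuming an active-low async write strobe and 8-bit address and data. It is not the core posted earlier in this thread: all names are hypothetical, reads are left out, and it deliberately ignores a new async cycle arriving while the WB write is still in progress.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical write-only bridge; names and widths are assumptions.
entity async_wr_to_wb is
  port (
    -- Async side
    a_wr_n : in  std_logic;                     -- active-low write strobe
    a_adr  : in  std_logic_vector(7 downto 0);
    a_dat  : in  std_logic_vector(7 downto 0);
    -- Wishbone master side
    clk_i  : in  std_logic;
    rst_i  : in  std_logic;
    ack_i  : in  std_logic;
    cyc_o  : out std_logic;
    stb_o  : out std_logic;
    we_o   : out std_logic;
    adr_o  : out std_logic_vector(7 downto 0);
    dat_o  : out std_logic_vector(7 downto 0)
  );
end entity async_wr_to_wb;

architecture rtl of async_wr_to_wb is
  signal adr_hold : std_logic_vector(7 downto 0) := (others => '0');
  signal dat_hold : std_logic_vector(7 downto 0) := (others => '0');
  -- Bit 0 and 1 form a two-flop synchronizer, bit 2 is kept for edge detection.
  signal wr_sync  : std_logic_vector(2 downto 0) := (others => '1');
  signal busy     : std_logic := '0';
begin
  -- Capture address and data where they are valid: at the end of the
  -- async write, i.e. the rising edge of the active-low strobe.
  capture : process (a_wr_n)
  begin
    if rising_edge(a_wr_n) then
      adr_hold <= a_adr;
      dat_hold <= a_dat;
    end if;
  end process capture;

  -- In the WB clock domain, start one write cycle per finished async write.
  wb_side : process (clk_i)
  begin
    if rising_edge(clk_i) then
      if rst_i = '1' then
        wr_sync <= (others => '1');
        busy    <= '0';
      else
        wr_sync <= wr_sync(1 downto 0) & a_wr_n;
        if busy = '0' then
          -- Synchronized strobe went from low to high: async write is done.
          if wr_sync(2) = '0' and wr_sync(1) = '1' then
            busy <= '1';
          end if;
        elsif ack_i = '1' then
          busy <= '0';
        end if;
      end if;
    end if;
  end process wb_side;

  cyc_o <= busy;
  stb_o <= busy;
  we_o  <= busy;
  adr_o <= adr_hold;
  dat_o <= dat_hold;
end architecture rtl;
```

Because the held address and data are captured a couple of WB clocks before busy goes high, they are stable when the cycle starts. A strobe edge that arrives while busy is still set is simply missed in this sketch.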

This requirement makes the interface quite complicated, because writes happen
after the fact. What if the async side initiates another cycle while you're
performing that delayed write? You'll have to wait until the current WB bus
activity ends, and start the operation only afterwards. The way you handle
this wait is, again, a bit different for reads and writes, but this fact
alone makes correct async-side wait-state generation a must.

Another thing to consider is what happens if the async master does not honor
your wait states and ends the cycle prematurely. That's an error, of course,
but you at least have to recover from it somehow.

And finally I've added two different types of wait-state generation: one is
a handshake type, like the one used in, for example, EPP printer port
communication, and the other is the normal open-collector type wait signal
used in most uC busses.

> The testbench support code in the http://www.opencores.org/cores/wb_tk
> project has also helped me out. The use of a VHDL function to wrap up the
> inner workings of a bus cycle should stop me from duplicating all those
> lines of code in my testbench for every read/write I perform on the bus...
Just a side-note: I had test-benches for the old core, and it looked OK to
me. The moment I added it to real HW, the problems started. Creating
good test-benches is really hard. You can test only for what you've thought
about, and chances are you got those things right in the design. Real life,
however, tests your design in its own way, and at the end of the day, that's
the test that should pass, not (only) your test-bench.

Regards,
Andras Tantos
 
