J
Jonathan Bromley
Guest
After all the recent brouhaha about signed arithmetic
and multiplication, I decided to try a little test.
I remember once getting badly burnt because I needed
an 18x18->36 UNSIGNED multiplier in a Xilinx device,
and of course the native 18x18 multipliers are SIGNED,
so I got a bunch of unexpected extra logic. So
I thought I'd try to work out for myself what it
takes to get 18x18 UNSIGNED multiplication if you
have SIGNED multipliers to work with. Apologies to
those for whom this is familiar or child's play.
With apologies to rickman, I stuck with my
preferred way of thinking about twos complement
signed numbers: the MSB represents -(2**(n-1)),
all other bits are positive. So let's consider
an 18-bit number to be in two pieces:
- an unsigned 17-bit value, call it "u"
- a 1-bit MSB, call it "s", whose value is...
- for signed, either 0 or -(2**17),
- for unsigned, either 0 or +(2**17)
So we have two unsigned numbers A, B each composed
of these two components, and we wish to multiply them
together using unsigned arithmetic to get a 36-bit
unsigned result:
uns_result = (sA+uA)*(sB+uB)
= sA*sB + sA*uB + sB*uA + uA*uB
But unfortunately we are stuck with a SIGNED multiplier
which treats these numbers as though they were:
sgn_result = (-sA+uA)*(-sB+uB)
= sA*sB - sA*uB - sB*uA + uA*uB
Woo-hoo! It's nearly the same! So we see that
uns_result = sgn_result + 2*(sA*uB + sB*uA)
But that is SO easy to do! Remember that sA and sB
are single-bit values, at the MSB end (bit 17) of the
operands. Let's do a little rearrangement:
sA = (2**17)*msbA, sB = (2**17)*msbB
so we have
uns_result = sgn_result + (2**18)*(msbA*uB + msbB*uA)
We can calculate the sum (msbA*uB + msbB*uA) in parallel
with the multiplication, and after the multiplication
we can add that value (left-shifted by 18 bits) to the
signed product, and we have the correct unsigned product.
OK, so here's the payoff. I coded this rather carefully
in VHDL to force a synthesis tool to build the
conditional adder (msbA*uB + msbB*uA) in only one LUT
per bit:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity umult18x18 is
port
( clk : in std_logic
; a, b : in unsigned(17 downto 0)
; y : out unsigned(35 downto 0)
);
end;
architecture JB of umult18x18 is
signal ar, br: unsigned(a'range);
begin
process (clk)
variable sy, fixup: unsigned(35 downto 0);
variable sbXua, saXuB: unsigned(17 downto 0);
begin
if rising_edge(clk) then
-- Input registers
ar <= a;
br <= b;
-- Compute signed product
sy := unsigned(signed(ar) * signed(br));
-- Compute conditional operands to adder
if ar(17) = '1' then
saXub := ('0' & br(16 downto 0));
else
saXub := (others => '0');
end if;
if br(17) = '1' then
sbXua := ('0' & ar(16 downto 0));
else
sbXua := (others => '0');
end if;
-- Compute sum of conditional operands and shift it left by 18
fixup := (sbXua + saXuB) & (17 downto 0 => '0');
-- Output register is corrected product
y <= fixup + sy;
end if;
end process;
end;
and, guess what..... at least one synth tool gives significantly
better timing results on my code than on a simple A*B operation.
So the synth vendor hasn't worked out the neatest way to get
unsigned multiply from a signed multiplier block.
In fairness, the difference wasn't outrageously large. But
it was disappointing that the synth tool's built-in solution
to a standard problem was measurably sub-optimal.
Maybe hand-optimized arithmetic is not yet dead...
--
Jonathan Bromley, Consultant
DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services
Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com
The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
and multiplication, I decided to try a little test.
I remember once getting badly burnt because I needed
an 18x18->36 UNSIGNED multiplier in a Xilinx device,
and of course the native 18x18 multipliers are SIGNED,
so I got a bunch of unexpected extra logic. So
I thought I'd try to work out for myself what it
takes to get 18x18 UNSIGNED multiplication if you
have SIGNED multipliers to work with. Apologies to
those for whom this is familiar or child's play.
With apologies to rickman, I stuck with my
preferred way of thinking about twos complement
signed numbers: the MSB represents -(2**(n-1)),
all other bits are positive. So let's consider
an 18-bit number to be in two pieces:
- an unsigned 17-bit value, call it "u"
- a 1-bit MSB, call it "s", whose value is...
- for signed, either 0 or -(2**17),
- for unsigned, either 0 or +(2**17)
So we have two unsigned numbers A, B each composed
of these two components, and we wish to multiply them
together using unsigned arithmetic to get a 36-bit
unsigned result:
uns_result = (sA+uA)*(sB+uB)
= sA*sB + sA*uB + sB*uA + uA*uB
But unfortunately we are stuck with a SIGNED multiplier
which treats these numbers as though they were:
sgn_result = (-sA+uA)*(-sB+uB)
= sA*sB - sA*uB - sB*uA + uA*uB
Woo-hoo! It's nearly the same! So we see that
uns_result = sgn_result + 2*(sA*uB + sB*uA)
But that is SO easy to do! Remember that sA and sB
are single-bit values, at the MSB end (bit 17) of the
operands. Let's do a little rearrangement:
sA = (2**17)*msbA, sB = (2**17)*msbB
so we have
uns_result = sgn_result + (2**18)*(msbA*uB + msbB*uA)
We can calculate the sum (msbA*uB + msbB*uA) in parallel
with the multiplication, and after the multiplication
we can add that value (left-shifted by 18 bits) to the
signed product, and we have the correct unsigned product.
OK, so here's the payoff. I coded this rather carefully
in VHDL to force a synthesis tool to build the
conditional adder (msbA*uB + msbB*uA) in only one LUT
per bit:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity umult18x18 is
port
( clk : in std_logic
; a, b : in unsigned(17 downto 0)
; y : out unsigned(35 downto 0)
);
end;
architecture JB of umult18x18 is
signal ar, br: unsigned(a'range);
begin
process (clk)
variable sy, fixup: unsigned(35 downto 0);
variable sbXua, saXuB: unsigned(17 downto 0);
begin
if rising_edge(clk) then
-- Input registers
ar <= a;
br <= b;
-- Compute signed product
sy := unsigned(signed(ar) * signed(br));
-- Compute conditional operands to adder
if ar(17) = '1' then
saXub := ('0' & br(16 downto 0));
else
saXub := (others => '0');
end if;
if br(17) = '1' then
sbXua := ('0' & ar(16 downto 0));
else
sbXua := (others => '0');
end if;
-- Compute sum of conditional operands and shift it left by 18
fixup := (sbXua + saXuB) & (17 downto 0 => '0');
-- Output register is corrected product
y <= fixup + sy;
end if;
end process;
end;
and, guess what..... at least one synth tool gives significantly
better timing results on my code than on a simple A*B operation.
So the synth vendor hasn't worked out the neatest way to get
unsigned multiply from a signed multiplier block.
In fairness, the difference wasn't outrageously large. But
it was disappointing that the synth tool's built-in solution
to a standard problem was measurably sub-optimal.
Maybe hand-optimized arithmetic is not yet dead...
--
Jonathan Bromley, Consultant
DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services
Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
jonathan.bromley@MYCOMPANY.com
http://www.MYCOMPANY.com
The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.