24 bit signed multiplier

Nemesis

Guest
I'm trying to write a signed multiplier in VHDL.
I wrote the code and synthesized it with ISE 6.3, but it is too slow
for what I need: the synthesis report says the maximum clock speed
is 84 MHz on an xc2vp50-5 target, and I need something close to 128 MHz.
I also found some odd things. The report says that XST inferred
4 MULT18x18s blocks, but I would have expected 2 blocks to be
sufficient.
Moreover, the clock pin of the MULT18x18s block is unconnected (or at
least it seems so in the "RTL Schematic View"), so why didn't it
use a simple MULT18x18 (asynchronous)?

Here is the VHDL code I wrote, please let me know if you have ideas
to improve the speed.

**********************multiplier.vhd*******************************
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity multiplier is
  port (
    A     : in  std_logic_vector(23 downto 0);
    B     : in  std_logic_vector(23 downto 0);
    CLK   : in  std_logic;
    RESET : in  std_logic;
    MULT  : out std_logic_vector(47 downto 0)
  );
end multiplier;

architecture Behavioral of multiplier is
  signal A_signed : signed (A'high downto 0);
  signal B_signed : signed (B'high downto 0);
begin
  ----------------------------------------------------------------
  process (CLK, RESET, A, B)
    --variable A_signed : signed (A'high downto 0);
    --variable B_signed : signed (B'high downto 0);
    variable MULT_signed : signed (A'high+B'high+1 downto 0);
  begin
    if RESET = '1' then
      MULT <= (others => '0');
    elsif rising_edge(CLK) then
      A_signed    <= signed(A);
      B_signed    <= signed(B);
      MULT_signed := A_signed * B_signed;
      MULT        <= std_logic_vector(MULT_signed);
    end if;
  end process;
  ----------------------------------------------------------------
end Behavioral;
**********************multiplier.vhd*******************************
 
Think about it. In any multiplication, if you split the multiplicand and
the multiplier into two parts each, you end up with four partial products
to compute. That's why you have 4 multipliers. The timing is not what you
want because there is no pipelining: the multiplication is done in a
single clock cycle. If you instead spread it over, say, 3 clock cycles,
you will drastically improve the frequency. You can specify this
when instantiating the block multipliers.
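
To make the count of four concrete, here is one possible split. The 17-bit/7-bit boundary is an assumption; the thread does not say how XST actually partitions the operands. It is a natural choice because the MULT18x18 takes 18-bit signed operands, so an unsigned low part can be at most 17 bits wide. A_H and B_H are the signed upper 7 bits, and A_L and B_L are the unsigned lower 17 bits.

```latex
\begin{aligned}
A &= A_H \cdot 2^{17} + A_L, \qquad B = B_H \cdot 2^{17} + B_L,
  \qquad 0 \le A_L, B_L < 2^{17} \\
A \cdot B &= A_H B_H \cdot 2^{34}
          + \left(A_H B_L + A_L B_H\right) \cdot 2^{17}
          + A_L B_L
\end{aligned}
```

Each of the four products fits in one 18x18 signed multiplier, and the shifted results are combined with adders, which also sit on the critical path when nothing is pipelined.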
 
Neo wrote:

Think about it. In any multiplication, if you split the multiplicand and
the multiplier into two parts each, you end up with four partial products
to compute. That's why you have 4 multipliers.
Thanks, now it's clear.

The timing is not what you want
because there is no pipelining: the multiplication is done in a
single clock cycle. If you instead spread it over, say, 3 clock cycles,
you will drastically improve the frequency. You can specify this
when instantiating the block multipliers.
That's not true: even though I didn't specify any pipelining, XST
automatically created a multiplier with 3 pipeline stages. Now I'm running
tests with the Multiplier Core. With the basic settings I get the
same timing (84 MHz); if I set the core to use "Maximum Pipeline" I
reach 149 MHz.
If I also deselect "Asynchronous Clear", the speed goes up further
(but only in the "Maximum Pipeline" case): I get 234 MHz.
I also saw that if I use LUTs instead of MULT18x18 the speed is higher.
Is there a way to obtain a LUT-based multiplier without using the
Core?
I'd like to keep the VHDL code for portability reasons.
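
One option that may help with the LUT question is the XST synthesis attribute mult_style, which asks XST to build an inferred multiplier from LUTs rather than block multipliers. The sketch below assumes your XST version supports the attribute and the "lut" value (check the XST User Guide for your release). The entity and signal names (multiplier_lut, a_reg, b_reg, p_reg) are made up for illustration. The code itself is still plain VHDL, and other tools should simply ignore the attribute, usually with a warning.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity name; not from the original post.
entity multiplier_lut is
  port (
    CLK  : in  std_logic;
    A    : in  std_logic_vector(23 downto 0);
    B    : in  std_logic_vector(23 downto 0);
    MULT : out std_logic_vector(47 downto 0)
  );
end multiplier_lut;

architecture rtl of multiplier_lut is
  signal a_reg : signed(23 downto 0);
  signal b_reg : signed(23 downto 0);
  signal p_reg : signed(47 downto 0);

  -- XST-specific attribute (assumption: supported by your XST version).
  -- "lut" requests a LUT-based multiplier, "block" the MULT18x18 blocks.
  attribute mult_style : string;
  attribute mult_style of p_reg : signal is "lut";
begin
  -- No asynchronous clear: every register is a plain clocked flip-flop.
  process (CLK)
  begin
    if rising_edge(CLK) then
      a_reg <= signed(A);        -- input register stage
      b_reg <= signed(B);
      p_reg <= a_reg * b_reg;    -- 24 x 24 signed product, 48 bits wide
    end if;
  end process;

  MULT <= std_logic_vector(p_reg);
end rtl;
```

The same setting is normally available as a global XST option as well, which keeps the source free of vendor attributes at the cost of applying to every multiplier in the design.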
 
Alvin Andries wrote:

Hi,

Just a note: the asynchronous clear of your multiplication will prevent the
use of the pipelined Xilinx multipliers, because their register has only a
synchronous reset.
I also tried with a synchronous reset and without any reset, but
nothing changed.
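
For reference, a minimal sketch of the synchronous-reset form discussed above, with one extra register after the product so the synthesizer has something to move into the multiplier. This is not the original poster's code. The names (multiplier_pipe, a_reg, b_reg, p1, p2) are hypothetical, and whether XST actually absorbs or rebalances these registers depends on the tool version and on options such as register balancing.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity name; not from the original post.
entity multiplier_pipe is
  port (
    CLK   : in  std_logic;
    RESET : in  std_logic;  -- synchronous, active high
    A     : in  std_logic_vector(23 downto 0);
    B     : in  std_logic_vector(23 downto 0);
    MULT  : out std_logic_vector(47 downto 0)
  );
end multiplier_pipe;

architecture rtl of multiplier_pipe is
  signal a_reg : signed(23 downto 0);
  signal b_reg : signed(23 downto 0);
  signal p1    : signed(47 downto 0);
  signal p2    : signed(47 downto 0);
begin
  -- Only CLK is in the sensitivity list: RESET is sampled on the clock edge.
  process (CLK)
  begin
    if rising_edge(CLK) then
      if RESET = '1' then
        a_reg <= (others => '0');
        b_reg <= (others => '0');
        p1    <= (others => '0');
        p2    <= (others => '0');
      else
        a_reg <= signed(A);       -- stage 1: register the inputs
        b_reg <= signed(B);
        p1    <= a_reg * b_reg;   -- stage 2: register the product
        p2    <= p1;              -- stage 3: extra register for retiming
      end if;
    end if;
  end process;

  MULT <= std_logic_vector(p2);
end rtl;
```

The result appears three clock edges after the inputs are sampled, so anything downstream has to account for that latency.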
 
