Dividing by 48

ALuPin@web.de

Hi,

I have a signal (integer). How can I describe synthesizable code
for dividing that signal by 48? The result (ls_rowaddr) should
be a whole number, i.e. an integer.

SIGNAL ls_pos : integer RANGE 0 TO 8191;
SIGNAL ls_rowaddr : integer RANGE 0 TO 191;

PROCESS(Reset, Clk)
BEGIN
IF Reset='1' THEN
ls_pos <= 0;

ELSIF rising_edge(Clk) THEN
IF load='1' THEN
ls_pos <= LoadAddr;
END IF;

END IF;
END PROCESS;

-- synthesis ???
PROCESS(ls_pos)
BEGIN
ls_rowaddr <= ls_pos / 48;
END PROCESS;

How can 32 (2^5) and 16 (2^4) be combined (48 = 32 + 16)?

Thank you for your comments



Rgds
André
 
Factor 48 into 16 * 3.
You can first shift right by four places and then divide by three
(I suspect division by 3 is much faster than division by 48 and
requires less logic).

PROCESS(clk)
  variable unsigned_ls_pos : unsigned(12 downto 0);
  variable shifted_ls_pos : unsigned(8 downto 0);  -- 13 bits minus the 4 dropped bits
BEGIN
  if rising_edge(clk) then
    unsigned_ls_pos := to_unsigned(ls_pos, 13);  -- convert to unsigned
    shifted_ls_pos := unsigned_ls_pos(12 downto 4);  -- chop off bottom 4 bits, same as x/16
    ls_rowaddr <= to_integer(shifted_ls_pos / to_unsigned(3, 2));  -- divide by 3
  END IF;
end process;

Use ieee.numeric_std.all as the package for arithmetic.
Don't use std_logic_unsigned or std_logic_arith;
these will conflict (they are bad packages anyway and are gradually
being phased out).

If you have Quartus or Xilinx ISE, this code should synthesize
(I would expect Synopsys or Synplify to handle it as well).



 
wallge wrote:
factor 48 into (16 * 3)
you can first bit shift to the right by four places then
divide by three using division (division by 3 is much faster than by 48
and requires less logic i suspect)
Assuming the divide by 48 is to be rounded down, the divide by 3 can be
replaced by a multiplication by 171 followed by a division by 512 (shift
right by 9). For the 9-bit value left after the shift by four, this
yields the desired result.
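
The 171/512 claim can be checked exhaustively in simulation. Below is a minimal self-checking sketch; the testbench name tb_div3_check is made up for illustration, and it assumes the operand is the 9-bit value ls_pos(12 downto 4), i.e. 0 to 511.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical testbench name; not from the thread.
entity tb_div3_check is
end entity tb_div3_check;

architecture sim of tb_div3_check is
begin
  check : process
    variable x    : unsigned(8 downto 0);   -- models ls_pos(12 downto 4)
    variable prod : unsigned(16 downto 0);  -- 9 bits * 8 bits = 17 bits
    variable q    : unsigned(7 downto 0);
  begin
    for i in 0 to 511 loop
      x    := to_unsigned(i, 9);
      prod := x * to_unsigned(171, 8);      -- multiply by 171
      q    := prod(16 downto 9);            -- shift right by 9 (divide by 512)
      assert to_integer(q) = i / 3
        report "mismatch at " & integer'image(i) severity failure;
    end loop;
    report "x*171/512 equals x/3 for all 9-bit x";
    wait;
  end process check;
end architecture sim;
```

The approximation error is x/1536, which stays below 1/3 only while x < 512, so the trick should not be reused for wider operands without rechecking.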
 
Hi wallge,

I get the following error message with SynplifyPro 8.6.2:


Right argument must evaluate to a constant integer power of 2


library ieee;

use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

ENTITY divider IS
PORT( Reset : IN std_logic;
Clk : IN std_logic;
DataIn : IN std_logic_vector(12 DOWNTO 0);
DataOut : OUT std_logic_vector(12 DOWNTO 0)
);
END divider;

ARCHITECTURE rtl OF divider IS


SIGNAL ls_data_out : unsigned(12 DOWNTO 0);

BEGIN

DataOut <= std_logic_vector(ls_data_out);

PROCESS(Reset, Clk)
variable v_data : unsigned(12 DOWNTO 0);
variable v_data_shift : unsigned(8 DOWNTO 0);
BEGIN
IF Reset='1' THEN
ls_data_out <= (OTHERS => '0');

ELSIF rising_edge(Clk) THEN
v_data := unsigned(DataIn);
v_data_shift := unsigned(v_data(12 DOWNTO 4));

ls_data_out <= resize(v_data_shift / to_unsigned(3,2), ls_data_out'length);

END IF;
END PROCESS;

END rtl;
 
Hmm...
Judging by the error message, Synplify only handles division when the
divisor is a constant power of 2.
If you have Quartus, you can instantiate a megafunction divider
component to do the division,
and even choose how many pipeline stages the divider has.

I am pretty sure that Xilinx ISE has the same kind of thing. I forget
the name of the tool off the top of my head, but I know Xilinx also has
customizable arithmetic blocks within ISE that can be instantiated as
components in your VHDL design.




 
For division by a constant, instead multiply by the reciprocal of the
divisor (the reciprocal is also a constant). You'll want to scale by a
power of two, which is to say you move the position of the implied radix
point. 1/48 = 0.0208333. To make that an integer, scale it by a power
of 2 that gives an appropriate amount of precision for your task. For
example, you might scale it by 2^16 so your reciprocal is 1365*2^-16.
Then when you multiply the dividend, you wind up with a product that is
also weighted 2^-16, so you need to right shift it 16 places to restore
the scaling of the dividend. The multiply in this case is a repeating
pattern of bits (0x555), which can be done with a tree of adders rather
than a multiplier if you do not have full multipliers available.
 
Mr Andraka,

thank you for your suggestion.

The multiply in this case is a repeating
pattern of bits (0x555), which can be done with a tree of adders rather
than a multiplier if you do not have full multipliers available.
Can you elaborate on your last comment? How would those adders
be combined?

Rgds
André
 
Do you mean the following?

Multiplying by 1365 = 0x555 implies

C * 1365 = C * (2^10 + 2^8 + 2^6 + 2^4 + 2^2 + 2^0)

         = (C * 2^10) + (C * 2^8) + ...

where the multiplications can be achieved with shift-left operations.

Rgds
André
 
Hi again,

I have found out that my concept has a fault.
Functional simulation and timing simulation both show
that my result is off by one:

For example, I get 6000/48 = 124 (instead of 125).

Is there some kind of rounding error I did not think of?

Here is the code and the corresponding testbench:

library ieee;

use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

ENTITY divider48 IS
PORT ( Reset : IN std_logic;
Clk : IN std_logic;
Data2DivBy48 : IN std_logic_vector (12 downto 0);
Data2DivBy48Valid : IN std_logic;
DataDividedOut : OUT std_logic_vector (12 downto 0);
DataDividedValidOut : OUT std_logic
);
END divider48;


ARCHITECTURE rtl OF divider48 IS

SIGNAL ls_shift_data_left10 : unsigned(22 DOWNTO 0);
SIGNAL ls_shift_data_left8 : unsigned(22 DOWNTO 0);
SIGNAL ls_shift_data_left6 : unsigned(22 DOWNTO 0);
SIGNAL ls_shift_data_left4 : unsigned(22 DOWNTO 0);
SIGNAL ls_shift_data_left2 : unsigned(22 DOWNTO 0);
SIGNAL ls_shift_data_left0 : unsigned(22 DOWNTO 0);

SIGNAL ls_sum1, ls_sum2, ls_sum3 : unsigned(22 DOWNTO 0);
SIGNAL ls_sum4, ls_sum5 : unsigned(22 DOWNTO 0);

SIGNAL ls_sum123_valid : std_logic;
SIGNAL ls_sum4_valid : std_logic;
SIGNAL ls_sum5_valid : std_logic;

SIGNAL ls_shift_data_right16 : unsigned(22 DOWNTO 0);
SIGNAL ls_shift_right : std_logic;

BEGIN

DataDividedOut <= std_logic_vector(ls_shift_data_right16(12 DOWNTO 0));


ls_shift_data_left0 <= ("0000000000" & unsigned(Data2DivBy48));

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
SL10_reg: PROCESS(Reset, Clk)
BEGIN
IF Reset='1' THEN
ls_shift_data_left10 <= (OTHERS => '0');

ELSIF rising_edge(Clk) THEN

IF Data2DivBy48Valid='1' THEN
ls_shift_data_left10(22 DOWNTO 10) <= unsigned(Data2DivBy48);
ls_shift_data_left10(9 DOWNTO 0) <= (OTHERS => '0');
END IF;

END IF;
END PROCESS SL10_reg;

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
SL8_reg: PROCESS(Reset, Clk)
BEGIN
IF Reset='1' THEN
ls_shift_data_left8 <= (OTHERS => '0');

ELSIF rising_edge(Clk) THEN

IF Data2DivBy48Valid='1' THEN
ls_shift_data_left8(22 DOWNTO 21) <= (OTHERS => '0');
ls_shift_data_left8(20 DOWNTO 8) <= unsigned(Data2DivBy48);
ls_shift_data_left8(7 DOWNTO 0) <= (OTHERS => '0');
END IF;

END IF;
END PROCESS SL8_reg;

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
SL6_reg: PROCESS(Reset, Clk)
BEGIN
IF Reset='1' THEN
ls_shift_data_left6 <= (OTHERS => '0');

ELSIF rising_edge(Clk) THEN

IF Data2DivBy48Valid='1' THEN
ls_shift_data_left6(22 DOWNTO 19) <= (OTHERS => '0');
ls_shift_data_left6(18 DOWNTO 6) <= unsigned(Data2DivBy48);
ls_shift_data_left6(5 DOWNTO 0) <= (OTHERS => '0');
END IF;

END IF;
END PROCESS SL6_reg;

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
SL4_reg: PROCESS(Reset, Clk)
BEGIN
IF Reset='1' THEN
ls_shift_data_left4 <= (OTHERS => '0');

ELSIF rising_edge(Clk) THEN

IF Data2DivBy48Valid='1' THEN
ls_shift_data_left4(22 DOWNTO 17) <= (OTHERS => '0');
ls_shift_data_left4(16 DOWNTO 4) <= unsigned(Data2DivBy48);
ls_shift_data_left4(3 DOWNTO 0) <= (OTHERS => '0');
END IF;

END IF;
END PROCESS SL4_reg;

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
SL2_reg: PROCESS(Reset, Clk)
BEGIN
IF Reset='1' THEN
ls_shift_data_left2 <= (OTHERS => '0');

ELSIF rising_edge(Clk) THEN

IF Data2DivBy48Valid='1' THEN
ls_shift_data_left2(22 DOWNTO 15) <= (OTHERS => '0');
ls_shift_data_left2(14 DOWNTO 2) <= unsigned(Data2DivBy48);
ls_shift_data_left2(1 DOWNTO 0) <= (OTHERS => '0');
END IF;

END IF;
END PROCESS SL2_reg;

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
SUM_reg: PROCESS(Reset, Clk)
BEGIN
IF Reset='1' THEN
ls_sum123_valid <= '0';
ls_sum4_valid <= '0';
ls_sum5_valid <= '0';
ls_sum1 <= (OTHERS => '0');
ls_sum2 <= (OTHERS => '0');
ls_sum3 <= (OTHERS => '0');
ls_sum4 <= (OTHERS => '0');
ls_sum5 <= (OTHERS => '0');

ELSIF rising_edge(Clk) THEN
ls_sum123_valid <= Data2DivBy48Valid;
ls_sum4_valid <= ls_sum123_valid;
ls_sum5_valid <= ls_sum4_valid;

IF ls_sum123_valid='1' THEN
ls_sum1 <= ls_shift_data_left10 + ls_shift_data_left8;
ls_sum2 <= ls_shift_data_left6 + ls_shift_data_left4;
ls_sum3 <= ls_shift_data_left2 + ls_shift_data_left0;
END IF;

IF ls_sum4_valid='1' THEN
ls_sum4 <= ls_sum1 + ls_sum2;
END IF;

IF ls_sum5_valid='1' THEN
ls_sum5 <= ls_sum4 + ls_sum3;
END IF;

END IF;
END PROCESS SUM_reg;

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
SR16_reg: PROCESS(Reset, Clk)
BEGIN
IF Reset='1' THEN
ls_shift_data_right16 <= (OTHERS => '0');
ls_shift_right <= '0';
DataDividedValidOut <= '0';

ELSIF rising_edge(Clk) THEN
ls_shift_right <= ls_sum5_valid;
DataDividedValidOut <= '0';

IF ls_shift_right='1' THEN
DataDividedValidOut <= '1';
ls_shift_data_right16(22 DOWNTO 7) <= (OTHERS => '0');
ls_shift_data_right16(6 DOWNTO 0) <= ls_sum5(22 DOWNTO 16);
END IF;

END IF;
END PROCESS SR16_reg;

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
END rtl;






library ieee;

use ieee.std_logic_1164.all;
--use ieee.numeric_std.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;

ENTITY tb_divider48 IS
END tb_divider48;

ARCHITECTURE testbench OF tb_divider48 IS

COMPONENT divider48
PORT( Reset : IN std_logic;
Clk : IN std_logic;
Data2DivBy48 : IN std_logic_vector(12 DOWNTO 0);
Data2DivBy48Valid : IN std_logic;
DataDividedOut : OUT std_logic_vector(12 DOWNTO 0);
DataDividedValidOut : OUT std_logic
);
END COMPONENT;


SIGNAL t_Reset : std_logic;
SIGNAL t_Clk : std_logic;
SIGNAL t_Clkstim : std_logic;
SIGNAL t_Data2DivBy48 : std_logic_vector(12 DOWNTO 0);
SIGNAL t_Data2DivBy48Valid : std_logic;
SIGNAL t_DataDividedOut : std_logic_vector(12 DOWNTO 0);
SIGNAL t_DataDividedValidOut : std_logic;

BEGIN

UUT : divider48
PORT MAP ( Reset => t_Reset,
Clk => t_Clk,
Data2DivBy48 => t_Data2DivBy48,
Data2DivBy48Valid => t_Data2DivBy48Valid,
DataDividedOut => t_DataDividedOut,
DataDividedValidOut => t_DataDividedValidOut
);

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
CLOCK_gen: PROCESS
BEGIN
t_Clk <= '1'; WAIT FOR 3.75 ns;
t_Clk <= '0'; WAIT FOR 3.75 ns;
END PROCESS CLOCK_gen;

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
CLOCKstim_gen: PROCESS
BEGIN
t_Clkstim <= '0'; WAIT FOR 3.75 ns;
t_Clkstim <= '1'; WAIT FOR 3.75 ns;
END PROCESS CLOCKstim_gen;
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
MAIN_gen: PROCESS
BEGIN
t_Data2DivBy48 <= (OTHERS => '0');
t_Data2DivBy48Valid <= '0';
t_Reset <= '1';

FOR i IN 0 TO 13 LOOP
WAIT UNTIL rising_edge(t_Clkstim);
END LOOP;

t_Reset <= '0';

WAIT UNTIL rising_edge(t_Clkstim);
t_Data2DivBy48Valid <= '1';
WAIT UNTIL rising_edge(t_Clkstim);
t_Data2DivBy48Valid <= '0';

FOR i IN 0 TO 170 LOOP
WAIT UNTIL t_DataDividedValidOut='1';
WAIT UNTIL rising_edge(t_Clkstim);
t_Data2DivBy48 <= t_Data2DivBy48 + 48;
t_Data2DivBy48Valid <= '1';
WAIT UNTIL rising_edge(t_Clkstim);
t_Data2DivBy48Valid <= '0';
END LOOP;

WAIT;
END PROCESS MAIN_gen;
-----------------------------------------------------------------------------

END testbench;
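
A likely cause of the offset by one: 65536/48 = 1365.33, so the constant 1365 is the reciprocal rounded down. For an exact multiple x = 48*n the product is 65520*n, just below n*65536, and the truncating shift by 16 returns n-1. The sketch below checks this with plain integer arithmetic in simulation. The testbench name is made up, and the constant pair 2731/2^17 (the reciprocal rounded up, with one more fraction bit) is an alternative assumed here for comparison, not something proposed in the thread.

```vhdl
-- Hypothetical testbench name; integer arithmetic only, for simulation.
entity tb_recip48_check is
end entity tb_recip48_check;

architecture sim of tb_recip48_check is
begin
  check : process
    variable bad : natural := 0;
  begin
    -- The case from the post: 6000 * 1365 = 8190000, just below 125 * 65536.
    assert (6000 * 1365) / 65536 = 124
      report "unexpected result for 6000" severity failure;

    for x in 0 to 8191 loop
      -- Truncated reciprocal, as in the posted design: count the wrong results.
      if (x * 1365) / 65536 /= x / 48 then
        bad := bad + 1;
      end if;
      -- Rounded-up reciprocal with 17 fraction bits (assumed alternative).
      assert (x * 2731) / 131072 = x / 48
        report "2731/2**17 mismatch at " & integer'image(x) severity failure;
    end loop;

    report "1365/2**16 is wrong for " & integer'image(bad) & " of 8192 inputs";
    wait;
  end process check;
end architecture sim;
```

Separately, the 23-bit sums in the posted design can hold 1365*x only up to x = 6145, and the 7-bit slice ls_sum5(22 DOWNTO 16) tops out at 127, so inputs near the top of the 13-bit range would also need wider signals.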
 
ALuPin@web.de wrote:
Mr Andraka,

thank you for your suggestion.


The multiply in this case is a repeating
pattern of bits (0x555), which can be done with a tree of adders rather
than a multiplier if you do not have full multipliers available.


Can you elaborate on your last comment? How would those adders
be combined?

Rgds
André

x is the input

a = x + (x << 2) = x * 5
b = a + (a << 4) = a * 0x11 = x * 0x55
y = a + (b << 4) = a + b * 0x10 = x * 0x555
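
In VHDL the three additions above might look roughly like the following. This is only a sketch: the package and function names are made up, and the 13-bit input width is taken from ls_pos in the original question.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical package name; not from the thread.
package mul555_pkg is
  function mul_x555 (x : unsigned(12 downto 0)) return unsigned;
end package mul555_pkg;

package body mul555_pkg is
  function mul_x555 (x : unsigned(12 downto 0)) return unsigned is
    -- 24 bits, because 8191 * 1365 = 11180715 does not fit in 23 bits
    variable xw, a, b, y : unsigned(23 downto 0);
  begin
    xw := resize(x, 24);
    a  := xw + shift_left(xw, 2);  -- a = x * 5
    b  := a  + shift_left(a, 4);   -- b = a * 0x11 = x * 0x55
    y  := a  + shift_left(b, 4);   -- y = x * 0x5 + x * 0x550 = x * 0x555
    return y;
  end function mul_x555;
end package body mul555_pkg;
```

This uses three adders instead of the five needed when the six shifted copies are summed directly. The product still needs the right shift by 16, and whether 0x555 is the right constant is a separate rounding question.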
 
If you are targeting an FPGA, here's a different approach that you may be
able to use, depending on your current implementation and target device:

If you have Block Memory to spare, you could implement the division as a
reasonably-sized lookup table (I'll use Xilinx's BRAMs in my example - I'm
not as familiar with the Altera (or other) equivalent, but I'm sure the
basic idea can be transferred over).

An earlier poster noted that, instead of divide by 48, it would be easier to
shift right by 4 (divide by 16), then you just have to figure out how to
divide by 3.
You have 8192 possible values of ls_pos, so you'll need to represent
171 values of ls_rowaddr (0 to 170, since 8191/48 rounds down to 170; why
was 191 selected as the integer range?). So, you'll need an 8-bit output.
ls_pos is a 13-bit vector. Dividing by 16 will leave you with 9 bits, so
your address will be 9 bits wide.

You can create a memory with the following basic port assignments:
address => ls_pos(12 downto 4); -- 2^13 = 8192; (12 downto 4) divides by 16
data_out => ls_rowaddr; -- ls_rowaddr is std_logic_vector(7 downto 0)

Simply pre-initialize the memory contents with the correct values (i.e.
M[0]=x00, M[1]=x00, M[2]=x00, M[3]=x01, ..., M[422]=x8C, ..., M[511]=xAA).
The data returned at any address is floor(address/3). Since the address is
floor(ls_pos/16), the final result works out to data=floor(ls_pos/48),
which is exactly what you want.

As for the implementation, Xilinx block RAMs have a capacity of 18 Kb. You
have 2^9 = 512 addresses and 8-bit data, so you need a capacity of 4 Kb. You
can fit the entire thing in one block RAM.

If you are targeting an FPGA, this approach will be much faster than using
an actual multiplier or divider, and because the block memories come free
with the chip, you'll hardly use any other resources. If you're targeting
an ASIC, then I'm a bit out of my league, but you might be able to use an
equivalent memory structure.
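
A rough sketch of such a table in plain VHDL follows. The entity name, the port types and the registered read are assumptions made for illustration; whether a given tool maps the constant array onto a block RAM or onto LUTs depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity name; ports chosen for illustration.
entity div48_rom is
  port (
    clk        : in  std_logic;
    ls_pos     : in  unsigned(12 downto 0);
    ls_rowaddr : out unsigned(7 downto 0));
end entity div48_rom;

architecture rtl of div48_rom is
  type rom_t is array (0 to 511) of unsigned(7 downto 0);

  -- Fill the table at elaboration time: M[a] = floor(a/3).
  function init_rom return rom_t is
    variable m : rom_t;
  begin
    for a in rom_t'range loop
      m(a) := to_unsigned(a / 3, 8);
    end loop;
    return m;
  end function init_rom;

  constant rom : rom_t := init_rom;
begin
  -- Registered read, so the table can map onto a synchronous memory.
  lookup : process (clk)
  begin
    if rising_edge(clk) then
      -- Dropping the bottom 4 bits is the divide by 16.
      ls_rowaddr <= rom(to_integer(ls_pos(12 downto 4)));
    end if;
  end process lookup;
end architecture rtl;
```

Generating the contents with a function avoids typing 512 initial values by hand; the result appears one clock after ls_pos changes.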
 
ALuPin@web.de wrote:

I have a signal (integer). How can I describe synthesizable code
for dividing that signal by 48 ? Result (ls_rowaddr) should
be whole-number that is integer.
This is an interesting problem. You could use my general entity
for dividing two arbitrary numbers:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity binary_division is
generic(
bits: positive := 8);
port(
dividend: in unsigned(bits-1 downto 0);
divisor: in unsigned(bits-1 downto 0);
quotient: out unsigned(bits-1 downto 0);
remainder: out unsigned(bits-1 downto 0);
division_by_zero_error: out boolean);
end entity binary_division;

architecture rtl of binary_division is
begin
divide: process(dividend, divisor)
variable temp_dividend: unsigned(bits-1 downto 0);
variable temp_divisor: unsigned(bits-1 downto 0);
variable temp_quotient: unsigned(bits-1 downto 0);
variable align_count: natural range 0 to bits;
begin
-- init
temp_dividend := dividend;
temp_divisor := divisor;
temp_quotient := (others => '0');
if temp_divisor = 0 then
division_by_zero_error <= true;
quotient <= (others => '0');
remainder <= (others => '0');
else
division_by_zero_error <= false;
if temp_divisor > temp_dividend then
quotient <= (others => '0');
remainder <= dividend;
else
-- left align
align_count := 0;
for i in 1 to bits-1 loop
exit when temp_divisor(bits-1) = '1';
temp_divisor := shift_left(temp_divisor, 1);
align_count := align_count + 1;
end loop;
-- divide
for i in 0 to bits-1 loop
if temp_divisor > temp_dividend then
temp_quotient := temp_quotient(bits-2 downto 0) & '0';
else
temp_quotient := temp_quotient(bits-2 downto 0) & '1';
temp_dividend := temp_dividend - temp_divisor;
end if;
temp_divisor := shift_right(temp_divisor, 1);
exit when align_count = 0;
align_count := align_count - 1;
end loop;
quotient <= temp_quotient;
remainder <= temp_dividend;
end if;
end if;
end process;
end architecture rtl;

But this uses a lot of LUTs: 5% of a Cyclone I (about 300 logic elements)
for 8 bits, and the worst-case timing between input and output is 66.6 ns
(tweaking the project settings a bit reduces it to 53 ns).

For 16 bits you can use it for benchmarking synthesis tools, but I
would not recommend it in real designs, because it uses 21% (about 1,300
logic elements) and the worst-case timing is 122 ns. For 32 bits Quartus II
says "Current module quartus_fit ended unexpectedly", maybe because
it needs more logic elements than are available.

There are faster algorithms ( http://en.wikipedia.org/wiki/Division_(digital) ),
but if you don't need high parallel speed and you have a clock, you
can serialize the algorithm, which shouldn't need very many logic elements.

The testbench:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity binary_division_test is
end entity binary_division_test;

architecture rtl of binary_division_test is

constant bits: positive := 8;

signal dividend: unsigned(bits-1 downto 0);
signal divisor: unsigned(bits-1 downto 0);
signal quotient: unsigned(bits-1 downto 0);
signal remainder: unsigned(bits-1 downto 0);
signal division_by_zero_error: boolean;

begin
binary_division_inst: entity work.binary_division
generic map(
bits => bits)
port map(
dividend => dividend,
divisor => divisor,
quotient => quotient,
remainder => remainder,
division_by_zero_error => division_by_zero_error);

test_divide: process
variable count: positive;
begin
dividend <= to_unsigned(0, bits);
divisor <= to_unsigned(0, bits);
wait for 1 ps;
count := 1;
for i in 1 to bits loop
count := 2*count;
end loop;
count := count - 1;
for i in 0 to count loop
for j in 0 to count loop
dividend <= to_unsigned(i, bits);
divisor <= to_unsigned(j, bits);
wait for 1 ps;
if j = 0 then
assert quotient = 0 report "quotient error" severity failure;
assert remainder = 0 report "remainder error" severity failure;
assert division_by_zero_error report "division_by_zero_error error" severity failure;
else
assert quotient = i / j report "quotient error" severity failure;
assert remainder = i - i / j * j report "remainder error" severity failure;
assert not division_by_zero_error report "division_by_zero_error error" severity failure;
end if;
end loop;
end loop;
wait for 1 ps;
assert false report "No failure, simulation was successful." severity failure;
end process;
end architecture rtl;
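
The serialized variant mentioned above could be sketched roughly as follows: one quotient bit per clock, using a shift-and-subtract (restoring) scheme. This is not the code measured above; the entity name and the start/done handshake are assumptions for illustration, and a divisor of zero is not detected (the quotient then comes out as all ones).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; names and handshake are assumptions.
entity serial_division is
  generic (
    bits : positive := 13);
  port (
    clk       : in  std_logic;
    start     : in  std_logic;
    dividend  : in  unsigned(bits-1 downto 0);
    divisor   : in  unsigned(bits-1 downto 0);
    quotient  : out unsigned(bits-1 downto 0);
    remainder : out unsigned(bits-1 downto 0);
    done      : out std_logic);
end entity serial_division;

architecture rtl of serial_division is
  signal q     : unsigned(bits-1 downto 0) := (others => '0');
  signal r     : unsigned(bits downto 0)   := (others => '0');  -- one guard bit
  signal d     : unsigned(bits downto 0)   := (others => '0');
  signal count : natural range 0 to bits   := 0;
begin
  step : process (clk)
    variable r_next : unsigned(bits downto 0);
  begin
    if rising_edge(clk) then
      done <= '0';
      if start = '1' then
        q     <= dividend;          -- dividend bits are shifted out of q
        r     <= (others => '0');
        d     <= resize(divisor, bits + 1);
        count <= bits;
      elsif count /= 0 then
        -- Shift the next dividend bit into the partial remainder.
        r_next := r(bits-1 downto 0) & q(bits-1);
        if r_next >= d then
          r <= r_next - d;
          q <= q(bits-2 downto 0) & '1';
        else
          r <= r_next;
          q <= q(bits-2 downto 0) & '0';
        end if;
        if count = 1 then
          done <= '1';              -- last bit is being registered now
        end if;
        count <= count - 1;
      end if;
    end if;
  end process step;

  quotient  <= q;
  remainder <= r(bits-1 downto 0);
end architecture rtl;
```

The result is valid when done is '1', that is, bits clock cycles after the cycle in which start was sampled. Only one subtractor and one comparator are needed, at the cost of that latency.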

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
 
Hi Dave,

What about the order of dividing by 16 and by 3? Where would you
place the lookup table: after dividing by 16 or before?

If you place the lookup table first and then divide by 16, the RAM
needed would increase. I think the way you have proposed would lead to
a rounding error again. Not sure ...

Rgds
André

 
I have a signal (integer). How can I describe synthesizable code
for dividing that signal by 48 ? Result (ls_rowaddr) should
be whole-number that is integer.
From the original post, it seems to me that you just want the whole number
part of the division - the equivalent of, say, the following line of C:
ls_rowaddr = int(ls_pos/48)
You're not interested in the remainder because you need the result to be the
row of ls_pos, and nothing more. Suppose you want to (inefficiently) code
this as a big if-then-else statement:
if (ls_pos < 48) then ls_rowaddr = 0;
elseif (ls_pos < 48*2) then ls_rowaddr = 1;
elseif (ls_pos < 48*3) then ls_rowaddr = 2;
elseif (ls_pos < 48*4) then ls_rowaddr = 3;
elseif (ls_pos < 48*5) then ls_rowaddr = 4;
.....
elseif (ls_pos < 48*169) then ls_rowaddr = 168;
elseif (ls_pos < 48*170) then ls_rowaddr = 169;
else ls_rowaddr = 170;

I am working on the assumption that this would result in the desired value
of ls_rowaddr. Correct me if I'm wrong.

Now, we can shift both sides right by 4 (48 >> 4 = 3). Because every
threshold is a multiple of 16, each inequality still holds even though
ls_pos>>4 discards the remainder of ls_pos/16. Try to make up some corner
cases if you think there will be a rounding error...
if ((ls_pos>>4) < 3) then ls_rowaddr = 0;
elseif ((ls_pos>>4) < 3*2) then ls_rowaddr = 1;
elseif ((ls_pos>>4) < 3*3) then ls_rowaddr = 2;
elseif ((ls_pos>>4) < 3*4) then ls_rowaddr = 3;
elseif ((ls_pos>>4) < 3*5) then ls_rowaddr = 4;
.....
elseif ((ls_pos>>4) < 3*169) then ls_rowaddr = 168;
elseif ((ls_pos>>4) < 3*170) then ls_rowaddr = 169;
else ls_rowaddr = 170;

The worst-case for a rounding error would occur at the end of the range of
ls_pos. Try a few values...
ls_pos = 48*169 = 8112.
With full precision, we'd expect ls_rowaddr=int(8112/48)=169.
8112>>4 = 507
(507 > 3*169) && (507 < 3*170), so ls_rowaddr = 169.

ls_pos = 48*169+47 = 8159:
With full precision, we'd expect ls_rowaddr =
int(8159/48)=int(169.97916)=169.
8159>>4 = 509.
(509 > 3*169) && (509 < 3*170), so ls_rowaddr = 169.

You're going to want to divide by 16 before the RAM, or, as you noted, the
RAM will increase in size. There won't really be any "rounding errors"
because you're not really performing any division - you're simply figuring
out which ls_rowaddr a value of ls_pos will map to. If you want to be sure,
run all 8192 possible values of ls_pos through a spreadsheet or VHDL
simulator, and be sure that all the resulting values of ls_rowaddr are
correct.
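
A simulator run over all 8192 values can be as small as the following sketch (the testbench name is made up; it uses plain integer arithmetic, where / 16 models the 4-bit right shift and / 3 models the table lookup):

```vhdl
-- Hypothetical testbench name; integer arithmetic only, for simulation.
entity tb_div16_then_3 is
end entity tb_div16_then_3;

architecture sim of tb_div16_then_3 is
begin
  check : process
  begin
    for ls_pos in 0 to 8191 loop
      -- Shift first, then look up floor(address/3); compare with ls_pos/48.
      assert (ls_pos / 16) / 3 = ls_pos / 48
        report "mismatch at ls_pos = " & integer'image(ls_pos)
        severity failure;
    end loop;
    report "shift-then-divide matches ls_pos/48 for all 8192 inputs";
    wait;
  end process check;
end architecture sim;
```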
 
My typo - the first inequality in my previous post should have read:

(507 >= 3*169) && (507 < 3*170), so ls_rowaddr = 169.


 
Frank Buss wrote:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity binary_division is
generic(
bits: positive := 8);
port(
dividend: in unsigned(bits-1 downto 0);
divisor: in unsigned(bits-1 downto 0);
quotient: out unsigned(bits-1 downto 0);
remainder: out unsigned(bits-1 downto 0);
division_by_zero_error: out boolean);
end entity binary_division;
BTW: If you don't use a port for the divisor, but instead define

constant divisor: unsigned(bits-1 downto 0) := to_unsigned(48, bits);

with bits=13, Quartus can optimize it down to 94 logic elements, which is
2% of my Cyclone. The timing analyzer reports a worst-case input-to-output
time of 28.1 ns, so it could be clocked at about 35 MHz.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
 
Hi Dave,

From the original post, it seems to me that you just want the whole number
part of the division - the equivalent of, say, the following line of C:
ls_rowaddr = int(ls_pos/48)
Yes, correct!

Trying different things out ...

Rgds
André
 
