avoiding division

  • Thread starter nisheethg@gmail.com
  • Start date
N

nisheethg@gmail.com

Guest
Hi
I got to implement (A*x + B*y)/(x+y).
A and B can take value from 0 to 255. x and y are non-negative
integer i.e. 0 and above.
(x+y) can be max. of 8.
For example if (x+y) is 7. and x is 2
(A*2 + B*5)/7

Is there any way to avoid division by scaling using left right
shift or somethin...??

thanks,
 
<nisheethg@gmail.com> wrote in message
news:1169350387.805503.116830@a75g2000cwd.googlegroups.com...
Hi
I got to implement (A*x + B*y)/(x+y).
A and B can take value from 0 to 255. x and y are non-negative
integer i.e. 0 and above.
(x+y) can be max. of 8.
For example if (x+y) is 7. and x is 2
(A*2 + B*5)/7

Is there any way to avoid division by scaling using left right
shift or somethin...??

thanks,
Google for restoring and non-restoring dividers, they only require shift and
addition/subtraction. If you want a faster converging method then google for
Newton-Raphson and Goldschmidt, they do require multiplication.

Hans
www.ht-lab.com
 
<nisheethg@gmail.com> wrote in message
news:1169350387.805503.116830@a75g2000cwd.googlegroups.com...
Hi
I got to implement (A*x + B*y)/(x+y).
A and B can take value from 0 to 255. x and y are non-negative
integer i.e. 0 and above.
(x+y) can be max. of 8.
For example if (x+y) is 7. and x is 2
(A*2 + B*5)/7
So x, y, and x+y are extremely small integers (0-8), about 3/4 bits, right?

Why not simply do a reciprocal lookup for (x+y), which is a ROM with 3-input
address, then multiply?

Or perhaps even better, re-arrange your formula to give:

A * (x/(x+y)) + B * (y/(x+y))

Given x and y you can have a lookup table for (x/(x+y)) and (y/(x+y)), just
6-8 address bits. If you want a multi-cycle resource shared implementation
you only need one multiplier, one adder and one lookup ROM. You'll have to
decide (i.e. work out) what precision you need for the lookup though.

Good luck,

-Ben-
 
Ben Jones wrote:
nisheethg@gmail.com> wrote in message
news:1169350387.805503.116830@a75g2000cwd.googlegroups.com...

Hi
I got to implement (A*x + B*y)/(x+y).
A and B can take value from 0 to 255. x and y are non-negative
integer i.e. 0 and above.
(x+y) can be max. of 8.
For example if (x+y) is 7. and x is 2
(A*2 + B*5)/7


So x, y, and x+y are extremely small integers (0-8), about 3/4 bits, right?

Why not simply do a reciprocal lookup for (x+y), which is a ROM with 3-input
address, then multiply?

Or perhaps even better, re-arrange your formula to give:

A * (x/(x+y)) + B * (y/(x+y))

Given x and y you can have a lookup table for (x/(x+y)) and (y/(x+y)), just
6-8 address bits. If you want a multi-cycle resource shared implementation
you only need one multiplier, one adder and one lookup ROM. You'll have to
decide (i.e. work out) what precision you need for the lookup though.
While I couldn't follow Ben's proposal completely I have a possible
solution based on his LUT idea reduced to 4 address bits :)

If
1 <= x+y <= 8 and
0 <= x <= 8 and
0 <= y <= 8

there are a number of unique combinations for (a + b) with
a=min(x,y) and
b=max(x,y)

a+b: 8 7 6 5 4 3 2 1
-------------------------------------------------
1: 0+8 0+7 0+6 0+5 0+4 0+3 0+2 0+1
2: 1+7 1+6 1+5 1+4 1+3 1+2 1+1
3: 2+6 2+5 2+4 2+3 2+2
4: 3+5 3+4 3+3
5: 4+4

The number of combinations for the LUT can be further reduced:

(.) for the first row which which always results into (0, b)
(*) for the last row when the sum (a+b) is even we get
always a==b with the result of (0.5, 0.5)
(-) (1+3) can be mapped to (2+6) with the same result
(-) (1+2) can be mapped to (2+4) with the same result

thus we obtain the reduced table with 10 entries

a+b: 8 7 6 5 4 3 2 1
-------------------------------------------------
1: . . . . . . . .
2: 1+7 1+6 1+5 1+4 - - *
3: 2+6 2+5 2+4 2+3 *
4: 3+5 3+4 *
5: *

which can be mapped into a 4 bit address range.

Now I assume we have access to an 4 bit ROM based LUT with 16 possible
entries. By trial and error we find that leftshifting a for 2 postitions
and extending the bitpositions of a[1] = a[0] with '1' and a exor
operation of a and b will give 10 unique address positions

so for the LUT
a is [1, 2, 3] and
b is [3, 4, 5, 6, 7]

a_shifted and bitwise ored with 3 yields to:
a_shift_filled is [0x7, 0xb, 0xf]

A little trial and error and then we map and exor a_shift_filled and b
for the addresses and get

a+b => a_shift_filled xor b = address_for_LUT
---------------------------------------------
1+7 => 0x7^7 = 0
1+6 => 0x7^6 = 1
1+5 => 0x7^5 = 2
1+4 => 0x7^4 = 3
2+6 => 0xb^6 = 13
2+5 => 0xb^5 = 14
2+4 => 0xb^4 = 15
2+3 => 0xb^3 = 8
3+5 => 0xf^5 = 10
3+4 => 0xf^4 = 11

the address mapping for the LUT. The results c_a and c_b have to be
stored into the LUT, for (1+7) at address 0 and (2+5) at address 14 etc.

In the following the algorithm implemented in a working and verified
Python program:

------ BEGIN: division_example.py ----------------

#!/usr/bin/env python

na = "NA"

LUT = [ # address_for_LUT
(1./8., 7./8.), # 0
(1./7., 6./7.), # 1
(1./6., 5./6.), # 2
(1./5., 4./5.), # 3
( na , na ), # 4
( na , na ), # 5
( na , na ), # 6
( na , na ), # 7
(2./5., 3./5.), # 8
( na , na ), # 9
(3./8., 5./8.), # 10
(3./7., 4./7.), # 11
( na , na ), # 12
(2./8., 6./8.), # 13 also 1/4
(2./7., 5./7.), # 14
(2./6., 4./6.), # 15 also 1/3
]

def calculate(x, y):
a = min(x,y)
b = max(x,y)

if x>y:
is_xy_swapped = True

else:
is_xy_swapped = False

x_plus_y = x + y

if x_plus_y == 0:
c_a = ZeroDivisionError
c_b = ZeroDivisionError
is_calculated=True

elif a == 0:
c_a = 0
c_b = 1
is_calculated=True

elif (x_plus_y % 2) == 0: # is_even(x_plus_y)
if a == b:
c_a = c_b = 0.5
is_calculated = True

else:
is_calculated = False

else:
is_calculated = False

if not is_calculated:
if x_plus_y == 3: # (1/3)
a_equivalent = 2
b_equivalent = 4

elif x_plus_y == 4: # (1/4)
a_equivalent = 2
b_equivalent = 6

else:
a_equivalent = a
b_equivalent = b

address_for_LUT = ((a_equivalent<<2)+3)^b_equivalent
c_a, c_b = LUT[address_for_LUT]

if is_xy_swapped:
c_x = c_b
c_y = c_a

else:
c_x = c_a
c_y = c_b

return c_x, c_y

------ BEGIN: division_example.py ----------------

regards

wolfgang
 
"Wolfgang Grafen" <andreas.grafen@gmx.de> wrote in message
news:45B550A8.3000100@gmx.de...
Ben Jones wrote:
nisheethg@gmail.com> wrote in message
news:1169350387.805503.116830@a75g2000cwd.googlegroups.com...

Hi
I got to implement (A*x + B*y)/(x+y).
A and B can take value from 0 to 255. x and y are non-negative
integer i.e. 0 and above.
(x+y) can be max. of 8.
For example if (x+y) is 7. and x is 2
(A*2 + B*5)/7


So x, y, and x+y are extremely small integers (0-8), about 3/4 bits,
right?

Why not simply do a reciprocal lookup for (x+y), which is a ROM with
3-input address, then multiply?

Or perhaps even better, re-arrange your formula to give:

A * (x/(x+y)) + B * (y/(x+y))

Given x and y you can have a lookup table for (x/(x+y)) and (y/(x+y)),
just 6-8 address bits. If you want a multi-cycle resource shared
implementation you only need one multiplier, one adder and one lookup
ROM. You'll have to decide (i.e. work out) what precision you need for
the lookup though.


While I couldn't follow Ben's proposal completely I have a possible
solution based on his LUT idea reduced to 4 address bits :)

...snip..

wow... I hope this was not a homework assignment :)

Hans
www.ht-lab.com
 

Welcome to EDABoard.com

Sponsor

Back
Top