exponential function

blue39 · Jan 16, 2013

I need to find the value of e^(-x/y), where y is a constant and x keeps changing.

Does anyone know how to compute this on Xilinx FPGAs? Is there a core available for it? An algorithm would also do.

Thanks in advance.

Andy · Jan 16, 2013

As always, the choice of method depends on what X is: complex or real,
fixed or floating point, its range and resolution, the bandwidth, and
the allowable latency to get the answer (and what is the answer's
kind/range/resolution?).

Also, is Y REALLY a constant, or is it a parameter that changes very
infrequently, but you don't want to have to re-synthesize, place &
route to change it?

Without knowing the above, and since you only have one real-time
input, I would hesitantly suggest a look-up table, and depending on
range/resolution of the input and the result, linear interpolation
between table entries. Depending on the target architecture (whether
block ROMs are available), you may be able to populate the table with
a VHDL initialization function for the constant array that is the
table.
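
A look-up table with linear interpolation can be modelled in software before committing to HDL. The sketch below is only an illustration: the input range (T_MAX), the 8/8 split of the 16-bit input into address and interpolation bits, and the unsigned 16-bit output scaling are all assumptions, since none of them were given at this point in the thread.

```python
import math

ADDR_BITS = 8    # assumed: table has 2**ADDR_BITS segments
FRAC_BITS = 8    # assumed: low input bits used for interpolation
OUT_BITS = 16    # assumed: output is an unsigned 16-bit fraction
T_MAX = 8.0      # assumed input range: t in [0, T_MAX)

def build_table():
    """exp(-t) sampled at the segment boundaries, quantized to OUT_BITS."""
    n = 1 << ADDR_BITS
    scale = (1 << OUT_BITS) - 1
    # n + 1 entries so the last segment has a right-hand endpoint
    return [round(math.exp(-T_MAX * i / n) * scale) for i in range(n + 1)]

TABLE = build_table()

def exp_neg_lut(u):
    """u is a 16-bit unsigned code for t = T_MAX * u / 2**16."""
    idx = u >> FRAC_BITS                  # upper bits address the table
    frac = u & ((1 << FRAC_BITS) - 1)     # lower bits position within the segment
    y0, y1 = TABLE[idx], TABLE[idx + 1]
    # y0 + (y1 - y0) * frac / 2**FRAC_BITS, using integer arithmetic only
    return y0 + (((y1 - y0) * frac) >> FRAC_BITS)
```

In VHDL the same table would be a constant array filled by an initialization function, as described above; the model is mainly useful for checking how many table entries a given accuracy target needs.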

Andy

GaborSzakacs · Jan 16, 2013

Also think about re-writing this equation as 2^(-x/Z), where Z is
another constant = Y * ln(2). Then the integer portion of x/Z
(use a multiplier, i.e. x * (1/Z)) gives the binary point
position, and you only need a look-up on the fractional portion
of x/Z. This method can give good precision over a wide input
range.
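
The base-2 rewrite above can be sketched as a small software model. This is a sketch under stated assumptions, not a hardware design: the 8-bit fractional address, the unsigned 16-bit output scaling, and the name exp_neg are illustrative choices, and 1/Z is computed in floating point here, whereas in an FPGA it would be a fixed-point constant updated whenever Y changes.

```python
import math

FRAC_BITS = 8    # assumed: fractional bits of x/Z used as the table address
OUT_BITS = 16    # assumed: output is an unsigned 16-bit fraction

# 2**(-f) for f = i / 2**FRAC_BITS; every entry lies in (0.5, 1]
POW2_TABLE = [round(2.0 ** (-i / (1 << FRAC_BITS)) * ((1 << OUT_BITS) - 1))
              for i in range(1 << FRAC_BITS)]

def exp_neg(x, y):
    """Approximate exp(-x / y) for x >= 0 and y > 0, scaled by 2**OUT_BITS - 1."""
    inv_z = 1.0 / (y * math.log(2.0))        # 1/Z, recomputed only when y changes
    t = int(x * inv_z * (1 << FRAC_BITS))    # x/Z as fixed point, truncated
    n = t >> FRAC_BITS                       # integer part: right-shift amount
    f = t & ((1 << FRAC_BITS) - 1)           # fractional part: table address
    return POW2_TABLE[f] >> n                # 2**(-(n + f)) = 2**(-f) >> n
```

With only 8 fractional bits and no interpolation, the truncation of x/Z limits the relative accuracy to roughly 2^(1/256) - 1, about 0.3%; more address bits or interpolation on the fractional part would tighten that.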

-- Gabor

blue39 · Jan 17, 2013

Hi guys,

In e^(x/y), y = 0.007568 and x is a fraction (0.nnnnnnn). x and y will each be 16 bits wide, and I want the output to be 16 bits wide as well. Y is not a constant in the pure sense, as it will change very infrequently.

I am using a Virtex-6 SX series FPGA to have more DSP slices.

One more thing: I want this operation to finish in at most 2 clock cycles, at a clock frequency between 300 and 400 MHz.

I guess we can forget about y = 0.007568 for a moment, as I'm thinking of computing (x/y) prior to the e^ operation.

Now, how can I find the value of e^(0.fraction) with a latency of 2 cycles or less at 300-400 MHz?

-BLue

GaborSzakacs · Jan 17, 2013

That really depends on the hardware you're targeting. In a Xilinx
FPGA, I'd say your only hope is to use a look-up table (block RAM),
and then if you need interpolation, you could use a multiplier.
But getting the interpolation to run at 300+ MHz might be tough, since
after the block RAM you only have one more cycle to meet your 2-clock
latency. Do you really need such a low latency, or do you mean
that you need to start a new operation at least every two cycles?
There's a big difference, and it would certainly be easy to run
pipelined, even with a new operation starting on every clock, as long
as you can live with a few cycles of latency (Xilinx DSP48 multipliers
run fastest with 3-cycle latency, +1 for BRAM = 4 cycles).

If you're targeting an ASIC, then there are probably a lot more
options.

-- Gabor

rickman · Jan 19, 2013

I thought your other response was more interesting. Potentially the
exponent can be broken into smaller pieces using much smaller lookup
tables, efficiently implemented in block RAM. If the value of the
exponent is broken into two 8-bit pieces and a value returned from two
separate 256-word, 16-bit lookup tables, these values can be multiplied
to get an exact result (up to the rounding of the table entries):
e^(a+b) = e^a * e^b. This should be faster and
simpler than an interpolation.
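
The two-table idea above can be checked with a short bit-level model. This is a sketch under assumptions: the exponent is taken as a 16-bit unsigned fraction t = u / 2^16, the sign is negative to match the original e^(-x/y) question, and the table scaling by 2^16 and the name exp_neg_split are illustrative choices.

```python
import math

ONE = 1 << 16    # assumed: table values are scaled by 2**16

# e^(-a) for the upper 8 bits, a = hi / 2**8.
# Entry 0 equals 2**16, so real hardware needs a 17th bit or saturation.
HI_TABLE = [round(math.exp(-hi / 256.0) * ONE) for hi in range(256)]

# e^(-b) for the lower 8 bits, b = lo / 2**16 (all values close to 1)
LO_TABLE = [round(math.exp(-lo / 65536.0) * ONE) for lo in range(256)]

def exp_neg_split(u):
    """u is a 16-bit unsigned code for t = u / 2**16; returns exp(-t) * 2**16."""
    hi = u >> 8          # address into the coarse table
    lo = u & 0xFF        # address into the fine table
    # e^(-(a + b)) = e^(-a) * e^(-b); one multiply, then rescale
    return (HI_TABLE[hi] * LO_TABLE[lo]) >> 16
```

Two 256-entry tables replace a single 65536-entry one, at the cost of one multiplier after the block RAM read.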

How fast can a multiply be done in this part?

Rick

Gabor · Jan 20, 2013

That's a good point. And the lower bits "b" represent a small enough
number that e^b can be approximated as 1+b, so you don't need a second
lookup table.

The multipliers in the DSP blocks are quite fast, and I know they can
do 500 MHz when you use full pipelining, but I'm not sure if they can
even do 300 MHz without pipelining. You'd have to run it through the
tools to see.
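
The 1+b approximation mentioned above can be checked numerically. The sketch below assumes a 16-bit unsigned exponent code with b taken from the low 8 bits (so b < 2^-8), and uses a negative exponent to match the original e^(-x/y) question, which turns 1+b into 1-b. The names and the scaling by 2^16 are illustrative.

```python
import math

ONE = 1 << 16    # assumed: values are scaled by 2**16

# Only the coarse table is kept: e^(-a) for a = hi / 2**8
HI_TABLE = [round(math.exp(-hi / 256.0) * ONE) for hi in range(256)]

def exp_neg_first_order(u):
    """u is a 16-bit unsigned code for t = u / 2**16; returns exp(-t) * 2**16."""
    hi = u >> 8
    lo = u & 0xFF                     # b = lo / 2**16
    # e^(-b) ~ 1 - b, i.e. multiply by (2**16 - lo) / 2**16
    return (HI_TABLE[hi] * (ONE - lo)) >> 16

# Largest b in this split, and the error of the first-order approximation.
# The neglected term is about b**2 / 2, which stays below 2**-17,
# i.e. under half an LSB of a 16-bit result.
B_MAX = 255 / 65536.0
WORST_CASE_ERROR = math.exp(B_MAX) - (1.0 + B_MAX)
```

The multiplier operand (2^16 - lo) comes straight from the input bits, so the second block RAM is no longer needed.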

-- Gabor
