pipelined divider

ykagarwal · Sep 9, 2003

I would like to know which algorithm is best for building a
pipelined divider in hardware: Newton-Raphson, Goldschmidt,
or SRT (is that even possible?).
I have enough space for as many as five radix-4
SRT dividers in a Xilinx Virtex-II FPGA.

thanks in advance--
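
For readers following along, here is a minimal sketch of the Newton-Raphson reciprocal iteration named in the question, modeled in Python with exact fractions rather than hardware fixed point. The function names, the crude seed value, and the iteration counts are illustrative assumptions, not taken from any particular core. Each step computes x <- x * (2 - d * x), which squares the relative error, so a pipelined design can unroll each step into its own multiplier stages.

```python
from fractions import Fraction

def nr_reciprocal(d, seed, iterations):
    """Approximate 1/d with Newton-Raphson: x <- x * (2 - d * x)."""
    x = Fraction(seed)
    for _ in range(iterations):
        x = x * (2 - d * x)
    return x

def relative_error(d, x):
    """Return |1 - d*x|, the relative error of x as an estimate of 1/d."""
    return abs(1 - d * x)

if __name__ == "__main__":
    d = Fraction(3, 2)       # divisor, assumed normalized to [1, 2)
    seed = Fraction(3, 4)    # crude seed; real hardware would use a lookup table
    for n in range(5):
        # The error is squared by every iteration.
        print(n, float(relative_error(d, nr_reciprocal(d, seed, n))))
```

The quotient is then the dividend times the reciprocal, followed by whatever rounding correction the number format requires, which this sketch leaves out.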

Glen Herrmannsfeldt · Sep 9, 2003

Pipelined dividers have been used on machines like the IBM 360/91 and the
Cray-1, and were well described in pipelined computer architecture books for
many years after those machines were built.

In both cases, though, they are used for floating point, where the
requirements are different. The 360/91, for example, rounds the low bit
instead of truncating, as the architecture specifies and as would be usual in
fixed point. I don't know how hard that would be to change.

-- glen

ykagarwal · Sep 10, 2003

Well, my requirement is also for double precision. Could you
suggest a pipelined computer architecture book for this purpose?
Anyway, which approach is best is what I want to explore first.

The Xilinx CoreGen divider core doesn't offer that much width in its
pipelined divider, and I don't know why.
Maybe the Xilinx gurus can explain. Does anybody know which algorithm
they are using?

regards
--yka

Steve Casselman · Sep 10, 2003

Look up online arithmetic.

Steve

Glen Herrmannsfeldt · Sep 10, 2003

"ykagarwal" <yog_aga@yahoo.co.in> wrote in message
news:4d05e2c6.0309092246.2ead33f0@posting.google.com...

(snip regarding pipelined divider)

> Well, my requirement is also for double precision. Could you
> suggest a pipelined computer architecture book for this purpose?
> Anyway, which approach is best is what I want to explore first.

The one I have here is "The Architecture of Pipelined Computers" by Kogge.

> The Xilinx CoreGen divider core doesn't offer that much width in its
> pipelined divider, and I don't know why.
> Maybe the Xilinx gurus can explain. Does anybody know which algorithm
> they are using?

I don't know that, either. It might be because they didn't imagine anyone
wanting to put something like that into an FPGA. They are likely pretty
big, but in some cases it might be worth the size.

-- glen

ykagarwal · Sep 20, 2003

Ray Andraka <ray@andraka.com> wrote in message news:<3F6B8B64.F3EBCA2B@andraka.com>...

> Depends on how clever the designer is. I'd wager that better than 95% of the
> hardware engineers today couldn't design the 360/91 from scratch with 10 times
> the logic resources of the original.
>
> Glen Herrmannsfeldt wrote:
>
> > I do wonder how many Virtex devices it would take to implement a 360/91.
> >
> > -- glen
>
> --
> --Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930 Fax 401/884-7950
> email ray@andraka.com
> http://www.andraka.com
>
> "They that give up essential liberty to obtain a little
> temporary safety deserve neither liberty nor safety."
> -Benjamin Franklin, 1759

hello,

The 360/91 machine and its history are really an inspiration to
younger designers like me, and so are your comments.

unnecessarily jumped
--yka

Jake Janovetz · Sep 20, 2003

Changing times... Logic resources are cheap compared to a designer's
time (and time to market considerations). Same argument can be made
with software. How many current software engineers could write a full
game (or complete programming language) that fits on an 8kbyte
cartridge?

It's certainly an interesting question.

Jake

Simon Peacock · Sep 21, 2003

But my PC, which runs the latest version of the CAD tools I bought 10 years
ago, has the power dissipation of a small heater, and the software runs
slower. Good thing I don't live in California, where there's not enough
power.

Simon

ykagarwal · Sep 22, 2003

passing thought ~~~

there exists one ultimate natural machine,
design of which can't even be copied

philosophy is junk, isn't it?
--yka

Ray Andraka · Sep 29, 2003

Not yet, anyway.

ykagarwal wrote:

> there exists one ultimate natural machine,
> design of which can't even be copied

ykagarwal · Sep 11, 2003

Fine, thanks. I could find the book here (probably a rather old edition),
but it has no detail about pipelined dividers as such.
Anyway, if somebody comes across something, please suggest it.
Also, Xilinx should probably offer at least a sequential version for
larger widths (I've made one anyway).

--yka
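
A sequential divider of the kind mentioned above can be sketched as a shift-and-subtract (restoring) loop that produces one quotient bit per step, which would correspond to one clock cycle in hardware. This is an illustrative Python model assuming unsigned operands. It is not the poster's design or the Xilinx core's algorithm, neither of which is described in the thread, and the name `restoring_divide` is made up for the example.

```python
def restoring_divide(dividend, divisor, width):
    """Unsigned restoring division, one quotient bit per loop iteration."""
    if divisor <= 0:
        raise ValueError("divisor must be a positive integer")
    if not 0 <= dividend < (1 << width):
        raise ValueError("dividend does not fit in the given width")
    remainder = 0
    quotient = 0
    for i in range(width - 1, -1, -1):
        # Shift the next dividend bit (MSB first) into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        # Trial subtraction: commit it only if the result stays non-negative.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder

if __name__ == "__main__":
    print(restoring_divide(100, 7, 8))
```

The loop body is a single subtract-and-select, so the area is small and the latency is `width` steps. Unrolling the loop into `width` stages is one straightforward way to pipeline it, at a correspondingly larger area.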

Tom Seim · Sep 11, 2003

Check these IEEE references:

Efficient designs of unified 2's complement division and square root
algorithm and architecture
Sau-Gee Chen; Chieh-Chih Li
TENCON '94, IEEE Region 10's Ninth Annual International Conference,
Theme: 'Frontiers of Computer Technology', Proceedings of 1994, 22-26
Aug. 1994
Page(s): 943-947, vol. 2

A new pipelined divider with a small lookup table
Jong-Chul Jeong; Woong Jeong; Hyun-Jae Woo; Seung-Ho Kwak; Woo-Chan
Park; Moon-Key Lee; Tak-don Han
ASIC, 2002, Proceedings, 2002 IEEE Asia-Pacific Conference on, 6-8
Aug. 2002
Page(s): 33-36

Efficient semisystolic architectures for finite-field arithmetic
Jain, S.K.; Song, L.; Parhi, K.K.
Very Large Scale Integration (VLSI) Systems, IEEE Transactions on,
Volume: 6, Issue: 1, March 1998
Page(s): 101-113

Glen Herrmannsfeldt · Sep 12, 2003

The references for the 360/91 are to the IBM Journal of Research and
Development, I believe Vol. 11, January 1967.

-- glen

ykagarwal · Sep 12, 2003

Thanks for the pointers. I have found some of them. I'm looking into
NR and its variants, and whether one can fit into some 3000 slices
in a Virtex-II. Maybe I'll have to increase the number of iterations per division step.

Glen Herrmannsfeldt · Sep 12, 2003

The 360/91 was built from transistors glued onto ceramic substrates, and
wired together. It did double precision floating point divide in 18 clock
cycles, though. I think it is three clock cycles per iteration, so six
iterations.
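
As a rough illustration of why an iterative multiplicative divide needs only a handful of iterations, here is a sketch of Goldschmidt's algorithm (named earlier in the thread) in Python with exact fractions. The seed value and function name are assumptions for the example, and nothing here models the 360/91's actual table size, cycle counts, or rounding. Numerator and denominator are scaled by the same factor each step, so their ratio never changes while the denominator converges quadratically to 1.

```python
from fractions import Fraction

def goldschmidt_divide(n, d, seed, iterations):
    """Goldschmidt division: returns (numerator, denominator) after scaling.

    The ratio numerator/denominator always equals n/d; as the denominator
    approaches 1, the numerator approaches the quotient.
    """
    n, d = Fraction(n), Fraction(d)
    # Initial scaling by a rough estimate of 1/d (a table lookup in hardware).
    n, d = n * seed, d * seed
    for _ in range(iterations):
        f = 2 - d                # correction factor
        n, d = n * f, d * f      # two independent multiplies per step
    return n, d

if __name__ == "__main__":
    for k in range(4):
        q, den = goldschmidt_divide(1, Fraction(3, 2), Fraction(3, 4), k)
        # 1 - den is the remaining error; it is squared on every iteration.
        print(k, float(q), float(1 - den))
```

Because the two multiplies in each step do not depend on each other, they can share or overlap in a pipelined multiplier, which is one reason this form suits pipelined hardware.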

I do wonder how many Virtex devices it would take to implement a 360/91.

-- glen

ykagarwal · Sep 13, 2003

hello,
Just curious: how much hardware did your implementation take?

I'm now thinking of a 3rd/4th-order NR with a 14/11-bit LUT approximation
and an unrolled loop (not independent squaring/cubing units), giving a fully
pipelined design with some tolerable latency. I don't know
whether it will fit.
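
The trade-off between lookup-table width and iteration order can be checked with a little arithmetic. The sketch below is a Python model with exact fractions. It assumes the higher-order step is the truncated series x1 = x0 * (1 + e + ... + e^(k-1)) with e = 1 - d * x0, whose residual is exactly e^k, so a seed good to b bits yields roughly k * b bits after one step. The helper names are hypothetical, `truncated_seed` only stands in for a real lookup table, and rounding error in the multipliers is ignored.

```python
import math
from fractions import Fraction

def truncated_seed(d, bits):
    """Stand-in for a lookup table: 1/d truncated to `bits` fractional bits."""
    return Fraction(math.floor((1 << bits) / d), 1 << bits)

def high_order_step(d, seed, order):
    """One order-k reciprocal step: seed * (1 + e + ... + e**(order-1))."""
    e = 1 - d * seed
    return seed * sum(e ** k for k in range(order))

def residual(d, x):
    """Return 1 - d*x, which is zero when x equals 1/d exactly."""
    return 1 - d * x

if __name__ == "__main__":
    d = Fraction(3, 2)
    for bits, order in ((14, 3), (14, 4), (11, 4)):
        x = high_order_step(d, truncated_seed(d, bits), order)
        # Report the accuracy of the result as a number of correct bits.
        print(bits, order, -math.log2(abs(residual(d, x))))
```

Under these assumptions a 14-bit seed with a 4th-order step gives about 56 bits, which covers a 53-bit double-precision significand. The other pairings come out lower and would need a wider table or a second step.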
