fast arbiters (was Re: How to design an abitration cicuit...

J

Joseph H Allen

Guest
This is the way I make arbiters:

// Concise priority arbiter
input [26:0] req; // Bit zero is highest priority
wire [26:0] gnt = req & -req; // Isolate least significant set bit

Since this method uses fast carry logic, it's quite fast- the best win is in
FPGAs, where the carry chain is much faster than regular routing. If you
can do a 32-bit add in one cycle, you can certainly do a 27-bit priority
arbiter. With ASICs or fully custom, you can pick the carry lookahead method
of your choice. Notice also that it is very easy to parameterize.

It's not hard to make a full round-robin arbiter out of this:

reg [26:0] prev; // Previous winner (saved from last cycle)
wire [26:0] req1 = req & ~((prev - 1) | prev); // Mask off previous winners
wire [26:0] gnt1 = req1 & -req1; // Select new winner
wire [26:0] winner = |gnt1 ? gnt1 : gnt; // Start from bit zero if none

// Save previous winner
always @(posedge clk) if (winner) prev <= winner;

The idea is that this expression converts from "one-hot" coding to
"thermometer" coding: ((x-1)|x))

Want more tricks? I highly recommend Henry Warren's _Hacker's Delight_:
http://www.hackersdelight.org

See also HAKMEM- MIT AI Lab Memo No. 239:
http://www.inwap.com/pdp10/hbaker/hakmem/hakmem.html
(It seems that the people at the AI lab were interested in just about
anything except AI :)

Also the unpublished (but downloadable) Volume 4A of Knuth's _The Art of
Computer Programming_:

http://www-cs-faculty.stanford.edu/~knuth/taocp.html

--
/* jhallen@world.std.com AB1GO */ /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
 
In article <1178033628.894456.55660@p77g2000hsh.googlegroups.com>,
Quang Anh <nvqanh@gmail.com> wrote:

// Concise priority arbiter
input [26:0] req; // Bit zero is highest priority
wire [26:0] gnt = req & -req; // Isolate least significant set bit

I'm so sorry that I can NOT understand your idea. Maybe, I should read
the documents you recommend first. Anyway, it would be great for me if
you spend your ttime explaining to me again.
Try an example: suppose (for 10 bits) req is 1101011000

The definition of '-' (2's complement negate) is to invert each bit and then
add 1, so lets see what happens:

Invert: 0010100111
Now if you AND this with req you'll get zero. But look at what happens when
you add 1: the carry propogates through all of the trailing ones until the
first zero is hit:

Negate: 0010100111 + 1 --> 0010101000
Now if you AND this with req you'll get: 0000001000

Bit 3 is the highest priority requester.

--
/* jhallen@world.std.com AB1GO */ /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
 
On Apr 28, 4:09 pm, jhal...@TheWorld.com (Joseph H Allen) wrote:
This is the way I make arbiters:

// Concise priority arbiter
input [26:0] req; // Bit zero is highest priority
wire [26:0] gnt = req & -req; // Isolate least significant set bit

Since this method uses fast carry logic, it's quite fast- the best win is in
FPGAs, where the carry chain is much faster than regular routing. If you
can do a 32-bit add in one cycle, you can certainly do a 27-bit priority
arbiter. With ASICs or fully custom, you can pick the carry lookahead method
of your choice. Notice also that it is very easy to parameterize.

It's not hard to make a full round-robin arbiter out of this:

reg [26:0] prev; // Previous winner (saved from last cycle)
wire [26:0] req1 = req & ~((prev - 1) | prev); // Mask off previous winners
wire [26:0] gnt1 = req1 & -req1; // Select new winner
wire [26:0] winner = |gnt1 ? gnt1 : gnt; // Start from bit zero if none

// Save previous winner
always @(posedge clk) if (winner) prev <= winner;

The idea is that this expression converts from "one-hot" coding to
"thermometer" coding: ((x-1)|x))

Want more tricks? I highly recommend Henry Warren's _Hacker's Delight_:
http://www.hackersdelight.org

See also HAKMEM- MIT AI Lab Memo No. 239:
http://www.inwap.com/pdp10/hbaker/hakmem/hakmem.html
(It seems that the people at the AI lab were interested in just about
anything except AI :)

Also the unpublished (but downloadable) Volume 4A of Knuth's _The Art of
Computer Programming_:

http://www-cs-faculty.stanford.edu/~knuth/taocp.html

--
/* jhal...@world.std.com AB1GO */ /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}

Hi Joseph,

Thank you so much for you instruction.

// Concise priority arbiter
input [26:0] req; // Bit zero is highest priority
wire [26:0] gnt = req & -req; // Isolate least significant set bit

Since this method uses fast carry logic, it's quite fast- the best win is in
FPGAs, where the carry chain is much faster than regular routing. If you
can do a 32-bit add in one cycle, you can certainly do a 27-bit priority
arbiter. With ASICs or fully custom, you can pick the carry lookahead method
of your choice. Notice also that it is very easy to parameterize.
I'm so sorry that I can NOT understand your idea. Maybe, I should read
the documents you recommend first. Anyway, it would be great for me if
you spend your ttime explaining to me again.

It's not hard to make a full round-robin arbiter out of this:
I've heard about it. And I understand your idea. Thanks a lots for
telling me.

Once again, thank you so much for your help.

Best regards,
Quang Anh
 
On May 2, 9:11 am, jhal...@TheWorld.com (Joseph H Allen) wrote:
In article <1178033628.894456.55...@p77g2000hsh.googlegroups.com>,
Quang Anh <nvq...@gmail.com> wrote:

// Concise priority arbiter
input [26:0] req; // Bit zero is highest priority
wire [26:0] gnt = req & -req; // Isolate least significant set bit
I'm so sorry that I can NOT understand your idea. Maybe, I should read
the documents you recommend first. Anyway, it would be great for me if
you spend your ttime explaining to me again.

Try an example: suppose (for 10 bits) req is 1101011000

The definition of '-' (2's complement negate) is to invert each bit and then
add 1, so lets see what happens:

Invert: 0010100111
Now if you AND this with req you'll get zero. But look at what happens when
you add 1: the carry propogates through all of the trailing ones until the
first zero is hit:

Negate: 0010100111 + 1 --> 0010101000
Now if you AND this with req you'll get: 0000001000

Bit 3 is the highest priority requester.

--
/* jhal...@world.std.com AB1GO */ /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
Since gnt1 implies that there was a req1, wouldn't it be better
(faster) to use req1 in the winner selection?

wire [26:0] winner = |req1 ? gnt1 : gnt; // Start from bit zero if
none
 
On May 3, 7:41 pm, jhal...@TheWorld.com (Joseph H Allen) wrote:
In article <1178200419.714442.227...@l77g2000hsb.googlegroups.com>,

romi <webe...@gmail.com> wrote:
On May 2, 9:11 am, jhal...@TheWorld.com (Joseph H Allen) wrote:
// Concise priority arbiter
input [26:0] req; // Bit zero is highest priority
wire [26:0] gnt = req & -req; // Isolate least significant set bit
reg [26:0] prev; // Previous winner (saved from last cycle)
wire [26:0] req1 = req & ~((prev - 1) | prev); // Mask off previous winners
wire [26:0] gnt1 = req1 & -req1; // Select new winner
wire [26:0] winner = |gnt1 ? gnt1 : gnt; // Start from bit zero if none
// Save previous winner
always @(posedge clk) if (winner) prev <= winner;
Since gnt1 implies that there was a req1, wouldn't it be better
(faster) to use req1 in the winner selection?
wire [26:0] winner = |req1 ? gnt1 : gnt; // Start from bit zero if
none

Yes. Cool. :)

Here are some other variations:

If the carry chain is really fast, get rid of the mux:

wire [53:0] req1 = { req, req & ~((prev - 1) | prev) };
wire [53:0] gnt1 = req1 & -req1;
wire [26:0] winner = gnt1[53:27] | gnt1[26:0];

Or use wrap-around carry, if you don't mind the combinatorial loop:

// Rotate previous winner one bit to the left. If no previous winner,
// pretend it was prev[26].
wire [26:0] prev1 = |prev ? { prev[25:0], prev[26] } : 27'd1;

// Wrap-around two's complement where the add 1 is just to the left of the
// previous winner instead of at bit 0.
wire [27:0] tmp = { 1'b0, ~req } + ({ 1'b0, prev1 } | { 27'b0, tmp[27] });

wire winner = req & tmp[26:0];

This is probably the fastest, but you need a synthesis tool which allows
combinatorial loops.

--
/* jhal...@world.std.com AB1GO */ /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}

Hello,

This is probably the fastest, but you need a synthesis tool which allows
combinatorial loops.
Since I'm doing a real job, I need to perform STA (Static Timing
Analysis) for my design. Therefore, combinational loops are not
allowed. Now, I'm investigating some ways and follow some instructions
from someone. But it seems that I can not make thing better. However,
my project manager told me that he would probably allow me to use a
different technology library to synthesize my design if the timing was
not met finally.

By the way, thank you so much for all your helps so far.
Quang Anh
 

Welcome to EDABoard.com

Sponsor

Back
Top