Video Blob Analysis on FPGAs

B

Brad Smallridge

Guest
I have been doing some investigation into blob analysis algorithms and would
like to know if there are any designs that have been renedered into FPGAs.
Thanks for your help.

Brad Smallridge
AIVISION.COM
415-661-8068
 
Brad Smallridge wrote:

Still looking for some suggestions.
Consider google.com

http://www.itee.uq.edu.au/~damien/GuRoo/theses/Wong.pdf

-- Mike Treseler
 
What I did for Google was this:

FPGA OR ASIC "blob analysis"

which gave about 165 hits, but none of them seem to
point me in the direction of any source code or even to
any theoretical discussions. If you think I should
try another search then please do suggest something.

Someone was king enough to email me personally
and suggested looking for some articals by Sumitomo,
which produced a blob analysis chip some years ago.
I don't know if that chip still exist and I can't find
the articles mentioned. I added Sumitomo to the
search above and received only three hits, none of
them talked about blob analysis.


Still looking for some suggestions.

Consider google.com

http://www.itee.uq.edu.au/~damien/GuRoo/theses/Wong.pdf

-- Mike Treseler
 
"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote in message
news:10ad0596rrmu4da@corp.supernews.com...
What I did for Google was this:

FPGA OR ASIC "blob analysis"

which gave about 165 hits, but none of them seem to
point me in the direction of any source code or even to
any theoretical discussions. If you think I should
try another search then please do suggest something.
Are you trying to what the CMU CAM does? Have you read the CMU papers? Blob
analysis doesn't mean anything to me so I'm guessing here, but perhaps you
meant center-of-mass type things?

The image processing stuff is generally DSP based as the algoritms are to
delineate an area using a standard convolver algorithm and then to track its
"center of mass" using standard graphics techniques for finding the center
of a polygon. (See Eberly's FAQ for that code).

Off the top of my head I'm guessing you could do some clever stuff whereby
you keep an array of pixels in a the block ram of an FPGA, and use some form
of iso-bar type system to connect regions with similar pixels. I'm not sure
what the logic would look like, something like

for (pixel = 2048 downto 0)
begin
if ( NEIGHBOR(0,pixel) or
NEIGHBOR(1, pixel) or
NEIGHBOR(2, pixel) or
NEIGHBOR(3, pixel) or
...
NEIGHBOR(7, pixel) ) then
group[pixel] <= group_id;
group_id++;
end if;
end for;

Like I said, I'm no expert. Basically you want logic that creates a
comparator whose inputs are the logically adjacent pixels for each
individual pixel, and a second array which is the group membership function
(in or out). If you set a specific criteria for membership (like the pixel
is a particular color) then this will probably work for you.

Then on each vertical retrace you need to compute the center of mass of
pixels in the "group."

--Chuck
 
"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote...
I have been doing some investigation into blob analysis algorithms
and would like to know if there are any designs that have been
rendered into FPGAs.
Brad,

We are doing some work in this area for a robotics application. When
we first started our investigation of the literature, we found that
the existing implementations did not meet with our requirements for
implementation in an FPGA.

In our system we require that the pixels be presented to any image
processing engine only once, or we will get killed on memory
bandwidth. Most of the existing literature details algorithms that
are stack based that iterate through the image in a pattern that can
not be known a priori. Additionally, these algorithms tend to
require a variable amount of time, depending on the number of blobs
that are identified, and the shapes of these blobs.

A run length encoding approach, which I have seen detailed in the
literature, is a good starting point for the requirements that we had,
especially since we could ignore non-convex blobs in our application.
Though, it should be noted, a linescan blob finder can find non-
convex blobs, as long as it is capable of merging two or more blobs
that are found to overlap later in the image.

For any discussion to go forward, it would be helpful to understand
what your requirements are. If this is a commercial application,
please feel free to contact me offline, as we may be able to adapt
an existing core from our library.

If this is to be a public exercise, please offer the group an idea
as to the following:
performance requirement
application
maximum number of objects that you wish to find in a given image
whether you need to just find objects, or track across images
do you need to find non-convex blobs?

Blobfinding can be done in guaranteed time in a linescan fpga
implementation, the details of the implementation are highly
dependent upon the answers to the above questions.


Regards,
Erik Widding.

---
Birger Engineering, Inc. -------------------------------- 617.695.9233
100 Boylston St #1070; Boston, MA 02116 -------- http://www.birger.com
 
Hey Chuck,

Always good to hear from fellow roboticist.

I have read some CMU documentation but nothing
about blob analysis. I had believed that the CMU
can only track one blob whereas I need to collect
data about lots of blobs. Center of gravity is good.
Long axis, area, etc. If you suggest I will look
deeper into the CMU docs and web sites. Thanks.

What is iso-bar?

On one of your other post you are comparing
a std_logic_vector to another. I suggest you
repost and state what libraries you are using.
I remember a difficulty in the original VHDL
spec that didn't allow for this that may have
been patched up by a library. VHDL has
been a work in progress for many years with
most improvements added via libraries.

Brad
AiVision.com

"Chuck McManis" <devnull@mcmanis.com> wrote in message
news:IEypc.67566$HV1.30587@newssvr25.news.prodigy.com...
"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote in message
news:10ad0596rrmu4da@corp.supernews.com...
What I did for Google was this:

FPGA OR ASIC "blob analysis"

which gave about 165 hits, but none of them seem to
point me in the direction of any source code or even to
any theoretical discussions. If you think I should
try another search then please do suggest something.

Are you trying to what the CMU CAM does? Have you read the CMU papers?
Blob
analysis doesn't mean anything to me so I'm guessing here, but perhaps you
meant center-of-mass type things?

The image processing stuff is generally DSP based as the algoritms are to
delineate an area using a standard convolver algorithm and then to track
its
"center of mass" using standard graphics techniques for finding the center
of a polygon. (See Eberly's FAQ for that code).

Off the top of my head I'm guessing you could do some clever stuff whereby
you keep an array of pixels in a the block ram of an FPGA, and use some
form
of iso-bar type system to connect regions with similar pixels. I'm not
sure
what the logic would look like, something like

for (pixel = 2048 downto 0)
begin
if ( NEIGHBOR(0,pixel) or
NEIGHBOR(1, pixel) or
NEIGHBOR(2, pixel) or
NEIGHBOR(3, pixel) or
...
NEIGHBOR(7, pixel) ) then
group[pixel] <= group_id;
group_id++;
end if;
end for;

Like I said, I'm no expert. Basically you want logic that creates a
comparator whose inputs are the logically adjacent pixels for each
individual pixel, and a second array which is the group membership
function
(in or out). If you set a specific criteria for membership (like the pixel
is a particular color) then this will probably work for you.

Then on each vertical retrace you need to compute the center of mass of
pixels in the "group."

--Chuck
 
Hi Erik,

Another roboticist. Thanks so much.

You are exactly right. I want an algorithm that is in sync with
the pixel clock. This is generally how I think about and build
all my algorithms. And so I would generally expect to keep
a several lines of binarized data in the FPGA but not an entire
frame. Or perhaps we could use line scan cameras and not
be thinking in terms of frames at all.

I could imagine having up to 100 blobs on any line and I
don't see why I should limit myself there. The maximum
number of active blobs would be one half the horizontal
resolution.

I have done some armchair thinking about how to keep and
combine blobs if they should merge in susbsequent lines.
This may be done by keeping start and stop points of the
various blobs in a RAM LUT. But maybe this isn't the
best approach?

I saw mention of run length encoding in a web page by
Wintress. I do not know what advantage that encoding has
over a fifo that would give you the pixel threshold on every
pixel clock. I do know that some camera manufacturers like
Basler make cameras that threshold and run length encode
their outputs presumably to cut down on cable bandwidth.

I am not sure what you mean by the term non-convex. To
a newbie that is concave but you don't use the word concave
so perhaps this has a different meaning.

I have also come across 4 neighbor or 8 neighbor terminology
which I assume represent the rules for making a connection,
based on a tic-tac-toe arrangement of pixels. I would suppose
that the 8 neighbor approach would be a tad more robust but
may lead to narrowly connected blobs with only a point in
common. Whis is better?

I do not need to "track across images" if you are talking about
some sort of temporal tracking from one frame to the next.

On my web searching I have tried to refine my search to
just source code by using unique VHDL keywords like
downto and inout. However I didn't find anything with
this search: downto inout blob. Any suggestions to
better search techniques or Verilog keywords will
be appreciated.

Do you want to share with the group what Birger does?

Brad Smallridge
AiVision.com


"Erik Widding" <widding@birger.com> wrote in message
news:afe40eec.0405161128.75b173ca@posting.google.com...
"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote...
I have been doing some investigation into blob analysis algorithms
and would like to know if there are any designs that have been
rendered into FPGAs.

Brad,

We are doing some work in this area for a robotics application. When
we first started our investigation of the literature, we found that
the existing implementations did not meet with our requirements for
implementation in an FPGA.

In our system we require that the pixels be presented to any image
processing engine only once, or we will get killed on memory
bandwidth. Most of the existing literature details algorithms that
are stack based that iterate through the image in a pattern that can
not be known a priori. Additionally, these algorithms tend to
require a variable amount of time, depending on the number of blobs
that are identified, and the shapes of these blobs.

A run length encoding approach, which I have seen detailed in the
literature, is a good starting point for the requirements that we had,
especially since we could ignore non-convex blobs in our application.
Though, it should be noted, a linescan blob finder can find non-
convex blobs, as long as it is capable of merging two or more blobs
that are found to overlap later in the image.

For any discussion to go forward, it would be helpful to understand
what your requirements are. If this is a commercial application,
please feel free to contact me offline, as we may be able to adapt
an existing core from our library.

If this is to be a public exercise, please offer the group an idea
as to the following:
performance requirement
application
maximum number of objects that you wish to find in a given image
whether you need to just find objects, or track across images
do you need to find non-convex blobs?

Blobfinding can be done in guaranteed time in a linescan fpga
implementation, the details of the implementation are highly
dependent upon the answers to the above questions.


Regards,
Erik Widding.

---
Birger Engineering, Inc. -------------------------------- 617.695.9233
100 Boylston St #1070; Boston, MA 02116 -------- http://www.birger.com
 
"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote in message
news:10ai4se7jrp67d7@corp.supernews.com...
I have read some CMU documentation but nothing
about blob analysis. I had believed that the CMU
can only track one blob whereas I need to collect
data about lots of blobs.
That is basically true, but I expect it was a limitation of the compute
cycles on the scenix processor more than anything else.

Center of gravity is good.
Long axis, area, etc. If you suggest I will look
deeper into the CMU docs and web sites. Thanks.

What is iso-bar?
Comes from the weather service, imagine that you have a "cloud" of pixels
that you characterise by color (or some other attribute, intensity perhaps).
Now all of the pixels that fall in the desired "range" get mapped with a
line that defines the edge of the group (where a neighbor is not in the
range) On a weather map you use pressure ranges, on a topographic map you
use altitude, but the effect is the same, you define a blob by creating a
closed line around all pixels that are "in range" This algorithm can convert
raw video into something with "blob" semantics (a number of closed
polygons), note however you can still get polygons within polygons (donuts).

Back in the day at the Image Processing Institute at USC a couple of the
grad students were using techniques like this to help generate maps from
arial photographs. ETAK picked up the technology and did some of the first
digital maps (actually DARPA used it first, but ETAK was the commercial
application).

Until you posted I hadn't really thought about building pixel engines in
FPGAs for image processing type applications. (Which is silly considering I
did some simulation work on Intel's 82786 Graphics chip which had a built in
BLIT engine). The thing I can't visualize is how you could get the
information out of the FPGA fast enough to act on it.

On one of your other post you are comparing
a std_logic_vector to another. I suggest you
repost and state what libraries you are using.
I remember a difficulty in the original VHDL
spec that didn't allow for this that may have
been patched up by a library. VHDL has
been a work in progress for many years with
most improvements added via libraries.
This is one of those times where leaving it for a year or so and coming back
to it has given me the opportunity to use better tools. Xilinx ships the
Coolrunner II in a 56 ball grid array that is just 6 mm x 6 mm (yes about
1/4" on a side) and it occurred to me that you could use that to build a
kick-arse PWM unit that was tuned to robotic motor applications. Mounting it
on an 14 pin DIP carrier with appropriate 5v tolerance stuff should make it
possible for non-SMT capable hobbiests to use it. Then with the left over
gates I was thinking quadarature decoder/counter and it seems like a nice
little bundle.

So I did a quick layout in latest WebPack which insists on making inputs and
outputs to VHDL modules Std_logic_vectors but my counter code was using
unsigned.

--Chuck
 
On Fri, 14 May 2004 09:04:19 -0700, "Brad Smallridge"
<bradsmallridge@dslextreme.com> wrote:

Still looking for some suggestions.
hi Brad,

yeah, it's the robot vision community you need to talk with.
A few years ago I did a blob analysis module that needed only
one pass over the pixels. It used a kind of runlength
encoding, although in fact you can do the runlength stuff
on-the-fly quite easily. My technique coped correctly with
arbitrary blob topologies, BUT at the expense of building
a fairly complicated tree as the data structure. It was
VERY quick (I got a 30x speedup by recoding someone else's
blob finder, and mine was more general!). The application
was a bit messy - identifying fish on a conveyor belt,
so that the de-heading machine could pick them up by the
correct end. We used a line-scan camera, with conveyor motion
providing the other direction of scan, but the technique
is just as good on a 2-dimensional image.

The basic algorithm is easy to map onto FPGA because it
simply looks once at each pixel, but the problem is that
it needs a very variable amount of working storage and
FPGAs generally aren't very good at that sort of thing.
It's easy only if you can plan in advance what features
you want to extract - centre of gravity? centre of
gravity of the hull (outermost enclosing blob, including
all its holes and any blobs within those holes)?
centre of gravity of convex hull (quite a bit harder)?
bounding box? It's also VERY helpful if you can
provide a nice clean margin to the picture, so that
there's an unambiguous "background" area extending
to the limits of the useful image. If any blobs
overlap the image edges, you need to work out what
that means and how to handle it.

I'll try to look out the old papers I did on it, although
I think the C code is long gone - it wasn't mine to keep.
You can guess the vintage because it ran on one of the
embedded AMD29K processors (29500?????)...

Nice to know that people are still messing with that
kind of stuff.
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL, Verilog, SystemC, Perl, Tcl/Tk, Verification, Project Services

Doulos Ltd. Church Hatch, 22 Market Place, Ringwood, BH24 1AW, UK
Tel: +44 (0)1425 471223 mail:jonathan.bromley@doulos.com
Fax: +44 (0)1425 471573 Web: http://www.doulos.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
 
"Brad Smallridge" <bradsmallridge@dslextreme.com> wrote...
I have done some armchair thinking about how to keep and
combine blobs if they should merge in susbsequent lines.
This may be done by keeping start and stop points of the
various blobs in a RAM LUT. But maybe this isn't the
best approach?
We identified the option that you described, and a second approach
that involved keeping track of an ID for each blob that was in
process and storing with each pixel in a line the blob ID that the
pixel corresponded to. We did this as we had to be able to both
fill the voids in blobs to get an accurate centroid of the entire
blob, and also find an accurate mass and centroid for the voids in
the blobs as well. We are basically looking for black circular
blobs, that contain six white circular voids of varying mass. By
decoding the locations of the voids, we identify one of 729
possible marks. This is used for localization of a robot, as an
interim step to scene based localization.

I saw mention of run length encoding in a web page by
Wintress. I do not know what advantage that encoding has
over a fifo that would give you the pixel threshold on every
pixel clock. I do know that some camera manufacturers like
Basler make cameras that threshold and run length encode
their outputs presumably to cut down on cable bandwidth.
Exactly. In some vision applications, very few features need to be
found, and this can drastically reduce the bandwidth.

I am not sure what you mean by the term non-convex. To
a newbie that is concave but you don't use the word concave
so perhaps this has a different meaning.
The term evolved from the notion that blobs are inherently convex.
A non-convex blob has a concave feature somewhere on its boundary.

I have also come across 4 neighbor or 8 neighbor terminology
which I assume represent the rules for making a connection,
based on a tic-tac-toe arrangement of pixels. I would suppose
that the 8 neighbor approach would be a tad more robust but
may lead to narrowly connected blobs with only a point in
common. Whis is better?
Which mode of connection you choose is entirely application
dependent. Four-connected blobs are sufficient for convex objects,
eight-connected are more important if the feature size to pixel
size ratio is quite small.

On my web searching I have tried to refine my search to
just source code by using unique VHDL keywords like
downto and inout. However I didn't find anything with
this search: downto inout blob. Any suggestions to
better search techniques or Verilog keywords will
be appreciated.
I do not suspect you will find any source code, but please don't use
my skepticism as a reason not to look.

Another roboticist. Thanks so much.
Do you want to share with the group what Birger does?
Actually, our involvement with robotics has been limited to the
development of a vision system for a robot. Birger is primarily
involved in the development of vision and compression algorithms on
FPGAs. The vision system that we have devloped has the following
characteristics:
PC/104+ form
Xilinx XC2VP20-FG676 FPGA (up to a XC2VP40 will fit)
64MB DDR SDRAM (32 bit datapath)
Xilinx SystemAceCF
VGA LCD interface w/HV inverter for backlight
Stereo Audio CODEC
Nine VGA camera inputs (one color, eight mono)
MJPEG compression, and color converion for color camera
Stereo disparity search on four QVGA camera pairs

Features that we are currently working on:
Dewarp - 8 x VGA input, 8 x QVGA output
Blobfinder/FiducialFinder
MJPEG decompression

Everything runs at 15FPS. Many of the engines run at 30+FPS. Due
to power considerations, we developed the VGA cameras from scratch,
which burn approximately 200mW each, and are approximately 20mm on a
side. Is this hardware/IP that other robotcs developers would find
useful? This has been developed for a government agency, and we are
investigating how we might productize it.


Regards,
Erik Widding.

Birger Engineering, Inc. -------------------------------- 617.695.9233
100 Boylston St #1070; Boston, MA 02116 -------- http://www.birger.com
 

Welcome to EDABoard.com

Sponsor

Back
Top