“Fixing” OpenCV’s implementation of the Viola-Jones algorithm

Posted on : 10-04-2009 | By : rhondasw | In : OpenCV

Today’s story is about improving the performance of the OpenCV library on ARM-based platforms.

As you already know (from here, or from here, or maybe even from here), the face detection algorithm implemented in the OpenCV library doesn’t work perfectly on ARM processors. Science doesn’t know for certain why this happens. There are several possible reasons. One of our assumptions was the lack of hardware support for floating-point operations. So we tried to translate the Viola-Jones algorithm from floating point to fixed point. And here is how we did it…

First of all, to make life easier, we decided to limit the functionality of cvHaarDetectObjects a little:

  • Our version will work with one specific cascade for frontal face detection from the bundle distributed with OpenCV
  • We will not support Canny pruning
  • No image scaling

Actually, it is not necessary to use only the frontal face cascade: any cascade without trees (a stump-based cascade) can be used.

Second, we made a few observations:

  • The sizes of the rectangles used as classifiers in the frontal face cascade are integers. In the OpenCV implementation they are defined as float, though.
  • The rectangle weights are also integers.
  • The “left” and “right” values for rectangles are floating-point numbers in the range (0, 1). Their sum is no bigger than a few thousand, so even after scaling it still fits comfortably in a 32-bit integer. Thus, they can easily be converted to integers by multiplying them by some constant. We chose 65536. This provides good enough accuracy (you will still lose some precision).
  • The same is valid for the stage threshold.
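The observations above can be illustrated with a minimal sketch in C. This is not our production code: the names FixStump, to_fix, stump_eval and stage_pass are hypothetical, and the way the node threshold is compared against a normalization factor is an assumption made for the example. It only shows the idea of scaling the “left”, “right” and threshold values by 65536 once, at load time, and then evaluating a stage of stumps with integer arithmetic only.

```c
#include <stdint.h>

#define FIX_SHIFT 16
#define FIX_ONE   (1 << FIX_SHIFT)   /* 65536, the scaling constant */

/* Hypothetical fixed-point stump: all three values are pre-scaled by 65536. */
typedef struct {
    int32_t threshold;  /* node threshold, scaled by 65536 */
    int32_t left;       /* value added when feature < threshold */
    int32_t right;      /* value added otherwise */
} FixStump;

/* Convert a floating-point cascade value to fixed point (round to nearest).
 * Floating point is used only here, once, when the cascade is loaded. */
int32_t to_fix(double v)
{
    return (int32_t)(v * FIX_ONE + (v >= 0 ? 0.5 : -0.5));
}

/* feature: weighted sum of rectangle sums (a plain integer).
 * norm:    integer normalization factor for the current window (assumed).
 * The float test "feature < threshold * norm" becomes
 * "feature * 65536 < threshold_fixed * norm". */
int32_t stump_eval(const FixStump *s, int64_t feature, int64_t norm)
{
    return (feature * FIX_ONE < (int64_t)s->threshold * norm) ? s->left
                                                              : s->right;
}

/* A stage passes when the sum of stump outputs reaches the stage threshold
 * (also scaled by 65536). */
int stage_pass(const FixStump *stumps, const int64_t *features, int n,
               int64_t norm, int32_t stage_threshold)
{
    int64_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += stump_eval(&stumps[i], features[i], norm);
    return sum >= stage_threshold;
}
```

The products are computed in 64 bits so that a large feature value multiplied by 65536 cannot overflow; on a 32-bit-only target the scaling would have to be arranged differently.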

Also we replaced using of sqrt with its integral version. This let us calculate variance in integer numbers.
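As clarified in the comments below, the “integral version” of sqrt here means a square root computed in integer numbers. The sketch below is one common way to do that (the bit-by-bit method); it is an illustration, not the routine we actually used. The helper window_norm is a hypothetical name, and it assumes that the window’s pixel sum and sum of squares (kept in 64 bits) have already been read from the integral images.

```c
#include <stdint.h>

/* floor(sqrt(x)) using only integer shifts, adds and compares. */
uint32_t isqrt64(uint64_t x)
{
    uint64_t res = 0;
    uint64_t bit = (uint64_t)1 << 62;   /* highest power of four in 64 bits */

    while (bit > x)                     /* start at the largest power of 4 <= x */
        bit >>= 2;

    while (bit != 0) {
        if (x >= res + bit) {
            x -= res + bit;
            res = (res >> 1) + bit;
        } else {
            res >>= 1;
        }
        bit >>= 2;
    }
    return (uint32_t)res;
}

/* Integer stand-in for the window's standard deviation.
 * Since area^2 * variance = area * sqsum - sum^2, the function returns
 * floor(area * stddev) without any division or floating point.
 * Returns 0 for a perfectly flat window. */
uint32_t window_norm(uint32_t sum, uint64_t sqsum, uint32_t area)
{
    uint64_t s2 = (uint64_t)sum * sum;
    uint64_t a  = (uint64_t)area * sqsum;
    return (a > s2) ? isqrt64(a - s2) : 0;
}
```

For example, a 2x2 window with pixels 1, 2, 3, 4 has sum 10 and sum of squares 30, so window_norm(10, 30, 4) computes isqrt64(120 - 100) and returns 4.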

Thus, with all of the above, it became possible to convert the entire detection algorithm to integer arithmetic. The only floating-point calculations that remain are those that compute the current search window from the scale factor, and their contribution to the overall execution time is insignificant.
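To show why the remaining floating-point work is so small, here is a rough sketch of that outer loop. The function name window_sizes and its parameters are made up for the example; the point is only that the scale factor is applied once per scale, while everything inside each window can stay in integers.

```c
/* Fills out[] with the square search-window sizes to try, growing the base
 * cascade window by scale_factor until it no longer fits in the image.
 * Returns the number of sizes written (at most max_out). */
int window_sizes(int base, int img_w, int img_h, double scale_factor,
                 int *out, int max_out)
{
    int n = 0;
    for (double s = 1.0; n < max_out; s *= scale_factor) {
        int win = (int)(base * s + 0.5);     /* round to nearest pixel */
        if (win > img_w || win > img_h)
            break;
        out[n++] = win;                      /* integer-only work starts here */
    }
    return n;
}
```

With a base window of 20 pixels, a 100x100 image and a scale factor of 2.0, this yields three sizes: 20, 40 and 80. Only a few floating-point multiplications are done per image, compared with the many window evaluations at each scale.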

So, what are the numbers, you’ll ask. And I’ll tell you.

With OpenCV, Lena’s face (everybody knows Lena, right?) was found on the iPhone in about 3-16 seconds (depending on the parameters).
With our fixed-point algorithm, the same face was found in 0.5-3 seconds (again depending on the parameters).

Thus, switching from floating point to fixed point improves detection performance by up to 5 times. The search will not be as precise as in the original OpenCV version (it might miss the face in some pictures), but it will be accurate enough for most tasks.

The precision problem can be solved if the OpenCV implementation is not reworked as radically as in our version: some of the floating-point calculations can be left as is. This keeps the original accuracy of Viola-Jones while still improving performance (by about 2 times).

Comments (10)

Hi Sergey,

Great work!!! Is it possible to have the sources of your fixed-point library?

Thanks,
Eyal

Hi Eyal,

The style of our reports is rather scientific, with some confidentiality restrictions, much like Paul Viola and Michael Jones: they work for Mitsubishi Electric Research Laboratories and don’t share their code, but they do share their results. So our strategy is to publish only results. That said, in the comments on another post, a reader was able to reproduce the same result and added some comments of his own.

Thanks,
Aleksey

It is really a great job. Recently, I have been trying to implement Haar detection in fixed point. I made some of the same observations as you did in the article. After analyzing some methods, I found it hard to find a suitable one for computing the integral of the squared image pixels. In your article you mentioned:

“Also we replaced using of sqrt with its integral version. This let us calculate variance in integer numbers”.

It might be the solution to the problem. But I do not know what it means. Could you give me some hints?

BR,

CM

Hi Ching-Min Wang,

“Also we replaced using of sqrt with its *integral* version.”

That was an inaccurate statement by Sergey Kislov. He meant that he used a fixed-point sqrt function, which is calculated in integer numbers. The calculation of the squared integral image was essentially unchanged.

Aleksey

Hi, Aleksey,

Thanks for your reply. My target processor only supports 32-bit integer arithmetic, and the amount of memory is also critical. A straightforward method to compute the squared integral image is to use 64-bit arithmetic. Compared with the sum integral image, it would cost twice as much memory. So I wonder if there is any smart and efficient method to calculate and store the squared integral image.

CM

We used a 64-bit DWORD to store the square sums. As I remember, computing the integral image takes a fixed amount of time, about 5-7% of the total. Hope this is useful.

It’s useful software, but their input in the overall execution time is very insignificant.

Hi,

Personally, I don’t agree with your comment. I understand that you would expect a more significant reduction in execution time while keeping quality at least at the same level. If so, you will probably want to read our other post:

http://www.computer-vision-software.com/blog/2009/06/fastfurious-face-detection-with-opencv/

where speed is increased by up to 2-3 times even without a fixed-point re-implementation of the Viola-Jones algorithm; that implementation keeps quality at the same level.

Let me know if you have any more questions.

Aleksey

Hi,
I tried to do the steps you cover here in your article, and added a few of my own.
I posted an article of my own about it:
http://www.morethantechnical.com/2009/08/09/near-realtime-face-detection-on-the-iphone-w-opencv-port-wcodevideo/

It also includes code, and a sample video.
Roy.

Roy,

Thank you for your feedback. It is very important for us to know that the development community is using our results.

Aleksey
