Profiling OpenCV
Posted on : 18-03-2009 | By : Yuri Vashchenko | In : OpenCV
Compiling OpenCV part 2
Compiling without profiling
This time I compiled the OpenCV library on a Debian Linux installation running under VMware 6.5.
I set up the virtual machine with 10 GB of hard drive space and 512 MB of system memory.
I installed the latest version of the CodeSourcery G++ toolchain (version arm-2008q3-72-arm-none-linux-gnueabi), downloaded from http://www.codesourcery.com/sgpp/lite/arm. The installation directory was /opt/crosstool/codesourcery.
I used the latest version of the OpenCV library (opencv-1.1pre1.tar.gz) and unpacked it into my home directory.
To compile the library I ran the configure script from a bash script containing the following:
#!/bin/bash
export DEVROOT=/opt/crosstool/codesourcery
export APP_PREFIX=arm-none-linux-gnueabi
export GCC_HOST=i686-pc-linux-gnu
export PATH=./:$DEVROOT/bin:$PATH
./configure \
    --target=$APP_PREFIX \
    --host=$APP_PREFIX \
    --disable-shared \
    --enable-static \
    --without-imageio --without-carbon \
    --without-quicktime --without-python \
    --without-gtk --without-swig \
    --without-v4l \
    --disable-apps \
    --prefix=$HOME/arm \
    --exec-prefix=$HOME/arm \
    CXXFLAGS="-fsigned-char"
After configure completes, run the “make” and “make install” commands. That compiles the code, creates the static libraries and copies the needed headers and libraries into the specified directory ($HOME/arm).
After that I used a modified version of the facedetect.c sample, which now supports the yml and bmp file formats (selected by file extension), detects faces and writes the resulting image as yml and bmp files.
To compile the application I used the following Makefile:
DEVROOT=/opt/crosstool/codesourcery
APP_PREFIX=arm-none-linux-gnueabi
INCLUDE_DIR=$(HOME)/arm/include/opencv
LIB_DIR=$(HOME)/arm/lib

CC = $(APP_PREFIX)-gcc
CPP = $(APP_PREFIX)-g++
LD= $(APP_PREFIX)-ld

CXXFLAGS = -I$(INCLUDE_DIR)
CXXFLAGS+= -fsigned-char
LDFLAGS = -g -pg -static -L$(LIB_DIR) -lcv -lhighgui -lcxcore -lml -lcvaux -lm -lstdc++ -lpthread -ldl

all: facedetect.o
	$(CPP) facedetect.o $(LDFLAGS) -o facedetect

facedetect.o:
	$(CPP) $(CXXFLAGS) -c facedetect.cpp

clean:
	rm -f facedetect facedetect.o
The program compiled successfully and produced a “facedetect” binary, which I was able to run on the TS-7800 Linux board.
Compiling with profiling
To enable profiling I updated the script that runs OpenCV’s configure: I added the --enable-debug option and the -pg and -g compiler options (the script also sets OFLAGS=-O0):
#!/bin/bash
export DEVROOT=/opt/crosstool/codesourcery
export APP_PREFIX=arm-none-linux-gnueabi
export GCC_HOST=i686-pc-linux-gnu
export OFLAGS=-O0
export PATH=./:$DEVROOT/bin:$PATH
./configure \
    --target=$APP_PREFIX \
    --host=$APP_PREFIX \
    --disable-shared \
    --enable-static \
    --enable-debug \
    --without-imageio --without-carbon \
    --without-quicktime --without-python \
    --without-gtk --without-swig \
    --without-v4l \
    --disable-apps \
    --prefix=$HOME/arm \
    --exec-prefix=$HOME/arm \
    CXXFLAGS="-fsigned-char -pg -g"
After that I ran “make clean” and “make uninstall” to remove the old libraries and object files.
Then I reran the configure script, make and make install. As a result, new versions of the libraries (with profiling support enabled) were generated.
I also had to slightly tweak the application’s Makefile to enable profiling:
DEVROOT=/opt/crosstool/codesourcery
APP_PREFIX=arm-none-linux-gnueabi
INCLUDE_DIR=$(HOME)/arm/include/opencv
LIB_DIR=$(HOME)/arm/lib

CC = $(APP_PREFIX)-gcc
CPP = $(APP_PREFIX)-g++
LD= $(APP_PREFIX)-ld

CXXFLAGS = -I$(INCLUDE_DIR)
CXXFLAGS+= -g -pg -fsigned-char
LDFLAGS = -g -pg -static -L$(LIB_DIR) -lcv -lhighgui -lcxcore -lml -lcvaux -lm -lstdc++ -lpthread -ldl

all: facedetect.o
	$(CPP) facedetect.o $(LDFLAGS) -o facedetect

facedetect.o:
	$(CPP) $(CXXFLAGS) -c facedetect.cpp

clean:
	rm -f facedetect facedetect.o
I ran make with this Makefile and got a binary executable with profiling enabled. After running it on the board, the gmon.out file was generated.
Run the “arm-none-linux-gnueabi-gprof facedetect gmon.out > gprof_result.txt” command to get a readable profiling result.
Again, the profiling result was not very accurate. The actual execution time was about 81 seconds, while the report indicates only about 40 seconds of total execution time! Anyway, here is what took most of the time:
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 39.80     15.91    15.91   429405     0.00     0.00  cvRunHaarClassifierCascade
 22.62     24.95     9.04                             __adddf3
  5.25     27.05     2.10                             __muldf3
  4.30     28.77     1.72                             __aeabi_f2d
  4.15     30.43     1.66                             mcount_internal
  3.10     31.67     1.24                             __aeabi_fmul
  2.95     32.85     1.18        1     1.18     1.78  icvYMLParseValue(CvFileStorage*, char*, CvFileNode*, int, int)
  1.50     33.45     0.60                             ____strtol_l_internal
  1.48     34.04     0.59                             __gnu_mcount_nc
  1.20     34.52     0.48                             __aeabi_l2f
  1.05     34.94     0.42                             __cmpdf2
  0.95     35.32     0.38   913927     0.00     0.00  icvYMLWrite(CvFileStorage*, char const*, char const*, char const*)
  0.90     35.68     0.36                             __floatsisf
  0.73     35.97     0.29        2     0.14     0.14  icvIntegralImage_8u32s_C1R(unsigned char const*, int, int*, int, double*, int, int*, int, CvSize)
  0.70     36.25     0.28        1     0.28    16.88  cvHaarDetectObjects
  0.60     36.49     0.24  3935016     0.00     0.00  cvGetErrStatus
  0.55     36.71     0.22                             __aeabi_dcmplt
  0.53     36.92     0.21  1827859     0.00     0.00  icvYMLSkipSpaces(CvFileStorage*, char*, int, int)
  0.53     37.13     0.21                             memset
  0.50     37.33     0.20                             __aeabi_cdcmple
  0.48     37.52     0.19                             __ieee754_sqrt
  0.43     37.69     0.17        1     0.17     0.20  cvReadRawDataSlice
  0.40     37.85     0.16                             cvRound(double)
  0.38     38.00     0.15       29     0.01     0.01  cvSetImagesForHaarClassifierCascade
  0.35     38.14     0.14  3935016     0.00     0.00  icvGetContext()
  0.35     38.28     0.14   913922     0.00     0.00  icv_itoa(int, char*, int)
  0.35     38.42     0.14        1     0.14     0.86  cvWriteRawData
  0.28     38.53     0.11   946095     0.00     0.00  cvSeqPush
  0.28     38.64     0.11        1     0.11     0.22  cvCanny
  0.28     38.75     0.11                             strtol
  0.23     38.84     0.09        3     0.03     0.06  icvXMLParseValue(CvFileStorage*, char*, CvFileNode*, int)
  0.20     38.92     0.08   913934     0.00     0.00  icvFSResizeWriteBuffer(CvFileStorage*, char*, int)
  0.20     39.00     0.08      120     0.00     0.00  icvFilterColSymm_32s16s(int const**, short*, int, int, void*)
  0.18     39.07     0.07                             write
  0.15     39.13     0.06                             Laligned
  0.15     39.19     0.06                             __floatsidf
  0.15     39.25     0.06                             memcpy
  0.13     39.30     0.05                             __divdf3
  0.10     39.34     0.04                             memchr
  0.08     39.37     0.03   429409     0.00     0.00  cvPoint(int, int)
  0.08     39.40     0.03   331649     0.00     0.00  cvAlign(int, int)
  0.08     39.43     0.03    60404     0.00     0.00  icvFSFlush(CvFileStorage*)
  0.08     39.46     0.03    43649     0.00     0.00  icvXMLParseTag(CvFileStorage*, char*, CvStringHashNode**, CvAttrList**, int*)
  0.08     39.49     0.03    13231     0.00     0.00  cvCreateSeq
  0.08     39.52     0.03        1     0.03     0.03  icvBGRx2Gray_8u_CnC1R(unsigned char const*, int, unsigned char*, int, CvSize, int, int)
  0.08     39.55     0.03                             _wordcopy_fwd_dest_aligned
  0.08     39.58     0.03                             read
  0.05     39.60     0.02    43658     0.00     0.00  cvGetHashedKey
  0.05     39.62     0.02    26550     0.00     0.00  cvSetSeqBlockSize
  0.05     39.64     0.02        1     0.02     0.02  icvLUT_Transform8u_8u_C1R(unsigned char const*, int, unsigned char*, int, CvSize, unsigned char const*)
  0.05     39.66     0.02                             ____strtod_l_internal
  0.05     39.68     0.02                             fgets
  0.05     39.70     0.02                             isalnum
  0.05     39.72     0.02                             isalpha
  0.05     39.74     0.02                             munmap
  0.05     39.76     0.02                             sqrt
  0.03     39.77     0.01    57613     0.00     0.00  cvAlignLeft(int, int)
  0.03     39.78     0.01    13470     0.00     0.00  icvGrowSeq(CvSeq*, int)
  0.03     39.79     0.01     4299     0.00     0.00  cvSetAdd
  0.03     39.80     0.01     2306     0.00     0.00  icvDefaultAlloc(unsigned int, void*)
  0.03     39.81     0.01      952     0.00     0.00  icvFilterRowSymm_8u32s(unsigned char const*, int*, void*)
  0.03     39.82     0.01      216     0.00     0.00  icvFillConvexPoly(CvMat*, CvPoint*, int, void const*, int, int)
  0.03     39.83     0.01      120     0.00     0.00  CvBaseImageFilter::fill_cyclic_buffer(unsigned char const*, int, int, int, int)
  0.03     39.84     0.01       89     0.00     0.00  icvGoNextMemBlock(CvMemStorage*)
  0.03     39.85     0.01        2     0.01     0.06  CvBaseImageFilter::process(CvMat const*, CvMat*, CvRect, CvPoint, int)
  0.03     39.86     0.01        1     0.01     0.01  icvCalcHist_8u_C1R(unsigned char**, int, unsigned char*, int, CvSize, CvHistogram*)
  0.03     39.87     0.01        1     0.01     0.02  icvReadHaarClassifier(CvFileStorage*, CvFileNode*)
  0.03     39.88     0.01                             Llastword
  0.03     39.89     0.01                             _IO_getline_info
  0.03     39.90     0.01                             _IO_new_do_write
  0.03     39.91     0.01                             __aeabi_dcmpgt
  0.03     39.92     0.01                             _int_malloc
  0.03     39.93     0.01                             brk
  0.03     39.94     0.01                             isnanl
  0.03     39.95     0.01                             isspace
  0.03     39.96     0.01                             memcmp
  0.03     39.97     0.01                             str_to_mpn
The remaining report data lists many functions whose execution took 0.00 seconds.
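To put a number on the gap between the report and the stopwatch, here is a minimal sketch that computes what share of the wall-clock time the gprof samples account for. The function name profile_coverage is my own, not part of any tool. As far as I understand, the flat profile is built from periodic samples of the program counter, so time the process spends blocked or inside the kernel generally does not show up in it, which may explain part of the difference.

```shell
#!/bin/sh
# profile_coverage is a hypothetical helper name used for illustration.
# $1 = seconds accounted for in the gprof flat profile (last "cumulative" value)
# $2 = wall-clock seconds measured for the same run
profile_coverage() {
    awk -v sampled="$1" -v wall="$2" \
        'BEGIN { printf "%.1f\n", 100 * sampled / wall }'
}

# Values from this run: 39.97 s in the report, about 81 s measured.
profile_coverage 39.97 81
```

For the numbers above this prints 49.3, that is, roughly half of the run is visible in the report.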
As you can see, the following functions took most of the execution time:
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 39.80     15.91    15.91   429405     0.00     0.00  cvRunHaarClassifierCascade
 22.62     24.95     9.04                             __adddf3
  5.25     27.05     2.10                             __muldf3
  4.30     28.77     1.72                             __aeabi_f2d
  4.15     30.43     1.66                             mcount_internal
  3.10     31.67     1.24                             __aeabi_fmul
  2.95     32.85     1.18        1     1.18     1.78  icvYMLParseValue(CvFileStorage*, char*, CvFileNode*, int, int)
  1.50     33.45     0.60                             ____strtol_l_internal
  1.48     34.04     0.59                             __gnu_mcount_nc
  1.20     34.52     0.48                             __aeabi_l2f
  1.05     34.94     0.42                             __cmpdf2
  0.95     35.32     0.38   913927     0.00     0.00  icvYMLWrite(CvFileStorage*, char const*, char const*, char const*)
  0.90     35.68     0.36                             __floatsisf
  0.73     35.97     0.29        2     0.14     0.14  icvIntegralImage_8u32s_C1R(unsigned char const*, int, int*, int, double*, int, int*, int, CvSize)
  0.70     36.25     0.28        1     0.28    16.88  cvHaarDetectObjects
  0.60     36.49     0.24  3935016     0.00     0.00  cvGetErrStatus
  0.55     36.71     0.22                             __aeabi_dcmplt
  0.53     36.92     0.21  1827859     0.00     0.00  icvYMLSkipSpaces(CvFileStorage*, char*, int, int)
  0.53     37.13     0.21                             memset
  0.50     37.33     0.20                             __aeabi_cdcmple
  0.48     37.52     0.19                             __ieee754_sqrt
  0.43     37.69     0.17        1     0.17     0.20  cvReadRawDataSlice
  0.40     37.85     0.16                             cvRound(double)
  0.38     38.00     0.15       29     0.01     0.01  cvSetImagesForHaarClassifierCascade
  0.35     38.14     0.14  3935016     0.00     0.00  icvGetContext()
  0.35     38.28     0.14   913922     0.00     0.00  icv_itoa(int, char*, int)
  0.35     38.42     0.14        1     0.14     0.86  cvWriteRawData
  0.28     38.53     0.11   946095     0.00     0.00  cvSeqPush
  0.28     38.64     0.11        1     0.11     0.22  cvCanny
  0.28     38.75     0.11                             strtol
Functions like __adddf3, __muldf3 and __aeabi_fmul emulate floating-point operations in software, and __gnu_mcount_nc and mcount_internal are functions required to generate the profiling output file.
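To see how much of the sampled time goes into floating-point emulation as a whole, the self seconds of those helper rows can be added up. Below is a rough sketch; the function name softfloat_seconds is my own, and the list of helper names is simply the set of emulation-looking routines that appear in this particular report, not a complete list of soft-float routines. It also assumes those rows carry no call count, as is the case above.

```shell
#!/bin/sh
# softfloat_seconds is a hypothetical helper name used for illustration.
# Reads a gprof flat profile on stdin and prints the summed "self seconds"
# of the soft-float helper routines named below (names taken from this
# report only, so treat the list as an assumption).
softfloat_seconds() {
    awk '
        # Flat-profile rows without a call count have exactly four fields:
        # percent time, cumulative seconds, self seconds, name.
        NF == 4 && $4 ~ /^(__adddf3|__muldf3|__divdf3|__cmpdf2|__floatsisf|__floatsidf|__aeabi_f2d|__aeabi_fmul|__aeabi_l2f|__aeabi_dcmplt|__aeabi_cdcmple|__aeabi_dcmpgt)$/ {
            total += $3
        }
        END { printf "%.2f\n", total }
    '
}

# Usage, with the report file produced by the gprof command earlier:
#   softfloat_seconds < gprof_result.txt
```

Rows with a call count, such as cvRunHaarClassifierCascade, are skipped, so the result covers only the emulation helpers themselves.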
Performance
Without profiling, the TS-7800 showed better results than the Linux phones:
faces in the sqa.bmp file (resolution 314×209) were detected within 5.8 seconds;
faces in the pic.jpg.yml file (resolution 640×476) were detected within 33.3 seconds.
With profiling enabled it took about 66-68 seconds to process the pic.jpg.yml file (resolution 640×476).



Hello,
I am a new OpenCV user. I have successfully compiled the OpenCV source code, version 1.1.0, on Ubuntu Hardy Heron (Intel). Now I want to compile cxcore on its own for the same platform. I tried to hack the root configure script and Makefile, but it did not work out. I guess that is the wrong way to do it?
My ultimate aim is to compile cxcore, cv and cvaux for Ubuntu and then cross-compile them for another platform.
Can you suggest the steps to do that,
or point me to a source where I can find them? I have posted the query on the OpenCV forum but got no response.
Vihang,
I am not sure I understand your question. What do you mean by “compile individually cxcore”? What was the purpose of “hacking” the configure script and Makefile? Please clarify.
If you compiled the whole of OpenCV, you have already compiled cxcore, cv and cvaux. You can use the generated libraries separately with your own applications.
If you ran the configure script, it should have generated Makefiles for all OpenCV components. When you run make, it first compiles cxcore; even if the build later fails (for any reason) on other OpenCV components, as long as cxcore built successfully you can use it separately. Next comes cv, then cvaux and finally ml.
If you need OpenCV for multiple platforms, you need a cross-compiler. Just give the “configure” script the correct host, build and target platform names and it will generate the correct Makefiles for you. If your cross-compiler is configured correctly (I recommend the CodeSourcery G++ toolchain; they build gcc for different platforms), everything should work, and you do not need to modify the configure script or any Makefiles.
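As a rough illustration, using the platform names from the build script in the post above, the call could be wrapped like this. The wrapper name cross_configure is hypothetical, and the --build value assumes an x86 Linux machine doing the compiling; adjust both triplets to your own setup.

```shell
#!/bin/sh
# Hypothetical wrapper; call it from the top of the unpacked OpenCV source tree
# with the cross toolchain's bin directory already on PATH.
cross_configure() {
    # --build: machine doing the compiling (assumed here to be x86 Linux)
    # --host/--target: platform the libraries will run on
    ./configure \
        --build=i686-pc-linux-gnu \
        --host=arm-none-linux-gnueabi \
        --target=arm-none-linux-gnueabi \
        "$@"
}

# Usage: pass any further options straight through, for example
#   cross_configure --disable-shared --enable-static --prefix=$HOME/arm
```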
Thanks,
Yuri.
dear Yuri,
Sorry, I’m a newbie.
I found some source code to compile in MS Visual Studio 2008. This code needs OpenCV to be configured in MS Visual Studio 2008, which I did. Now I don’t know how to combine and integrate the source code, OpenCV and MS Visual Studio 2008.
Any suggestions and help would be much appreciated.
Thank you,
Hello daynie,
I am willing to help you, but I need more information.
As I understand it, you are trying to compile some code that requires OpenCV, and you are running Windows and Visual Studio 2008.
Do you need to make changes to the existing OpenCV release, or do you just need to be able to use the OpenCV library in your own applications?
If you do not need to make any changes to the OpenCV library, the easiest way to start using it is to install it with the provided installer (OpenCV_1.1pre1a.exe). Once it is installed, you need to update your Visual Studio project properties to be able to compile your application.
First, go to the C/C++->General page and put the paths to your OpenCV headers into the “Additional Include Directories” field so that the compiler knows where to find them. The paths depend on where you installed OpenCV, but by default the field should read "C:\Program Files\OpenCV\otherlibs\highgui";"C:\Program Files\OpenCV\cxcore\include";"C:\Program Files\OpenCV\cv\include".
Next, you need to tell the linker where to look for the OpenCV libraries: go to the Linker->General properties page and put the path into the “Additional Library Directories” field. By default, the path is "C:\Program Files\OpenCV\lib".
Finally, you need to tell the linker which libraries to use. Go to the Linker->Input page and put “cv.lib cxcore.lib highgui.lib” (without quotes) into the “Additional Dependencies” field. If you are using the debug configuration, this should read “cvd.lib cxcored.lib highguid.lib” (without quotes).
That should make your application compile and run.
If not, you may want to send me your source code to look at. If I have it, I can probably be more helpful to you.
Thanx Yuri Vashchenko…
your last post has been very helpful… I successfully recompiled facedetect.c in Dev-C++ using this information…
thanx alot…