The OpenCV User Guide: Release 2.4.13.1
CONTENTS
Features2d
    2.1 Detectors
    2.2 Descriptors
    2.3 Matching keypoints
Bibliography
CHAPTER
ONE
OPERATIONS WITH IMAGES
1.1 Input/Output
Images
Load an image from a file:
Mat img = imread(filename);
If you read a jpg file, a 3 channel image is created by default. If you need a grayscale image, use:
Mat img = imread(filename, 0);
Note: the format of the file is determined by its content (the first few bytes).
Save an image to a file:
imwrite(filename, img);
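For instance, here is a small round-trip sketch (the file names are placeholders) that loads a color image, converts it to grayscale and saves the result; the output format is deduced from the file extension:
Mat color = imread("input.jpg");        // 3-channel BGR image by default
if (!color.empty())
{
    Mat gray;
    cvtColor(color, gray, CV_BGR2GRAY); // convert to a single-channel image
    imwrite("output.png", gray);        // stored as PNG because of the extension
}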
XML/YAML
TBD
1.2 Basic operations with images
Accessing pixel intensity values
In order to access pixel intensity values, you have to know the type of the image and the number of channels. For a single-channel grayscale image (type 8UC1) and pixel coordinates x and y:
Scalar intensity = img.at<uchar>(y, x);
intensity.val[0] contains a value from 0 to 255. Note the ordering of x and y. Since in OpenCV images are represented by the same structure as matrices, we use the same convention for both cases - the 0-based row index (or y-coordinate) goes first and the 0-based column index (or x-coordinate) follows it. Alternatively, you can use the following notation:
Scalar intensity = img.at<uchar>(Point(x, y));
Now let us consider a 3 channel image with BGR color ordering (the default format returned by imread):
Vec3b intensity = img.at<Vec3b>(y, x);
uchar blue = intensity.val[0];
uchar green = intensity.val[1];
uchar red = intensity.val[2];
You can use the same method for floating-point images (for example, you can get such an image by running Sobel on
a 3 channel image):
Vec3f intensity = img.at<Vec3f>(y, x);
float blue = intensity.val[0];
float green = intensity.val[1];
float red = intensity.val[2];
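The same at<>() access can also be used to change pixel values; a brief illustration (assuming img has the corresponding type):
img.at<uchar>(y, x) = 128;              // set a pixel of a single-channel 8-bit image
img.at<Vec3b>(y, x) = Vec3b(255, 0, 0); // set a pixel of a 3-channel 8-bit image to blue (BGR order)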
There are functions in OpenCV, especially from the calib3d module, such as projectPoints, that take an array of 2D or 3D points in the form of a Mat. The matrix should contain exactly one column, each row corresponds to a point, and the matrix type should be 32FC2 or 32FC3 correspondingly. Such a matrix can easily be constructed from std::vector:
vector<Point2f> points;
//... fill the array
Mat pointsMat = Mat(points);
One can access a point in this matrix using the same method Mat::at:
Point2f point = pointsMat.at<Point2f>(i, 0);
Note that Mat(points) does not copy the data: pointsMat uses the data owned by points and will not deallocate the memory when destroyed, so the developer has to make sure that the lifetime of points is longer than that of pointsMat. (Calling Mat(points).reshape(1) instead would give a 32FC1 matrix with two columns rather than a 32FC2 matrix with one column, still sharing the same data.) If the data needs to be copied, this is done using, for example, Mat::copyTo or Mat::clone:
Mat img = imread("image.jpg");
Mat img1 = img.clone();
In contrast with the C API, where an output image had to be created by the developer, in the C++ API an empty output Mat can be supplied to each function. Each implementation calls Mat::create for the destination matrix. This method allocates data for a matrix if it is empty. If the matrix is not empty and has the correct size and type, the method does nothing. If, however, the size or type differs from the input arguments, the data is deallocated (and lost) and new data is allocated. For example:
Mat img = imread("image.jpg");
Mat sobelx;
Sobel(img, sobelx, CV_32F, 1, 0);
Primitive operations
A number of convenient operators are defined on matrices. For example, here is how we can make a black image from an existing grayscale image img:
img = Scalar(0);
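Another frequently used primitive operation is selecting a rectangular region of interest; a short sketch (the coordinates are arbitrary): the ROI header shares data with the original image, so writing to it modifies img in place:
Rect r(10, 10, 100, 100); // x, y, width, height
Mat roi = img(r);         // no data is copied, roi points into img
roi = Scalar(0);          // clears only the selected region of img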
Visualizing images
It is very useful to see intermediate results of your algorithm during the development process. OpenCV provides a convenient way of visualizing images. An 8U image can be shown using:
Mat img = imread("image.jpg");
namedWindow("image", CV_WINDOW_AUTOSIZE);
imshow("image", img);
waitKey();
A call to waitKey() starts a message-passing loop that waits for a keystroke in the "image" window. A 32F image needs to be converted to 8U before it can be displayed. For example:
Mat img = imread("image.jpg");
Mat grey;
cvtColor(img, grey, CV_BGR2GRAY);
Mat sobelx;
Sobel(grey, sobelx, CV_32F, 1, 0);
double minVal, maxVal;
minMaxLoc(sobelx, &minVal, &maxVal); //find minimum and maximum intensities
Mat draw;
sobelx.convertTo(draw, CV_8U, 255.0/(maxVal - minVal), -minVal * 255.0/(maxVal - minVal));
namedWindow("image", CV_WINDOW_AUTOSIZE);
imshow("image", draw);
waitKey();
CHAPTER
TWO
FEATURES2D
2.1 Detectors
2.2 Descriptors
2.3 Matching keypoints
The code
We will start with a short sample opencv/samples/cpp/matcher_simple.cpp:
Mat img1 = imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE);
Mat img2 = imread(argv[2], CV_LOAD_IMAGE_GRAYSCALE);
if(img1.empty() || img2.empty())
{
    printf("Can't read one of the images\n");
    return -1;
}
// detecting keypoints
SurfFeatureDetector detector(400);
vector<KeyPoint> keypoints1, keypoints2;
detector.detect(img1, keypoints1);
detector.detect(img2, keypoints2);
// computing descriptors
SurfDescriptorExtractor extractor;
Mat descriptors1, descriptors2;
extractor.compute(img1, keypoints1, descriptors1);
extractor.compute(img2, keypoints2, descriptors2);
// matching descriptors
BruteForceMatcher<L2<float> > matcher;
vector<DMatch> matches;
matcher.match(descriptors1, descriptors2, matches);
// drawing the results
namedWindow("matches", 1);
Mat img_matches;
drawMatches(img1, keypoints1, img2, keypoints2, matches, img_matches);
imshow("matches", img_matches);
waitKey(0);
First, we create an instance of a keypoint detector. All detectors inherit the abstract FeatureDetector interface, but the constructors are algorithm-dependent. The first argument of each detector usually controls the balance between the number of keypoints and their stability. The range of values differs between detectors (for instance, the FAST threshold is a pixel intensity difference and usually lies in the range [0, 40], while the SURF threshold is applied to the Hessian of the image and usually takes values larger than 100), so use the defaults in case of doubt.
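To illustrate the different threshold scales mentioned above, here is a small sketch (not part of the sample; in OpenCV 2.4, SurfFeatureDetector additionally requires the nonfree module):
FastFeatureDetector fastDetector(20);  // FAST: pixel intensity difference threshold, roughly in [0, 40]
SurfFeatureDetector surfDetector(400); // SURF: Hessian threshold, usually larger than 100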
// computing descriptors
SurfDescriptorExtractor extractor;
Mat descriptors1, descriptors2;
extractor.compute(img1, keypoints1, descriptors1);
extractor.compute(img2, keypoints2, descriptors2);
We create an instance of a descriptor extractor. Most OpenCV descriptors inherit the DescriptorExtractor abstract interface. Then we compute descriptors for each of the keypoints. The output Mat of the DescriptorExtractor::compute method contains the descriptor for the i-th keypoint in row i. Note that the method can modify the keypoints vector by removing keypoints for which a descriptor is not defined (usually these are the keypoints near the image border). The method makes sure that the output keypoints and descriptors are consistent with each other (so that the number of keypoints equals the number of descriptor rows).
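A quick way to check this consistency in practice (illustrative only):
CV_Assert( descriptors1.rows == (int)keypoints1.size() );
CV_Assert( descriptors2.rows == (int)keypoints2.size() );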
// matching descriptors
BruteForceMatcher<L2<float> > matcher;
vector<DMatch> matches;
matcher.match(descriptors1, descriptors2, matches);
Now that we have descriptors for both images, we can match them. First, we create a matcher that, for each descriptor from image 1, performs an exhaustive search for the nearest descriptor in image 2 using the Euclidean metric. The Manhattan distance is also implemented, as is the Hamming distance for the BRIEF descriptor. The output vector matches contains pairs of corresponding point indices.
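Each DMatch stores the indices of the matched descriptors, which can be mapped back to keypoint coordinates; a minimal sketch (not part of the original sample):
for( size_t i = 0; i < matches.size(); i++ )
{
    Point2f p1 = keypoints1[matches[i].queryIdx].pt; // keypoint in img1 (query set, descriptors1)
    Point2f p2 = keypoints2[matches[i].trainIdx].pt; // corresponding keypoint in img2 (train set, descriptors2)
    // (p1, p2) is a pair of corresponding points; matches[i].distance is the L2 distance between their descriptors
}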
// drawing the results
namedWindow("matches", 1);
Mat img_matches;
drawMatches(img1, keypoints1, img2, keypoints2, matches, img_matches);
imshow("matches", img_matches);
waitKey(0);
The final part of the sample is about visualizing the matching results.
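If the raw matches look too cluttered, one optional refinement (not part of the original sample; the factor 3 and the 0.02 floor are arbitrary illustrative choices) is to keep only matches whose descriptor distance is close to the smallest one before drawing:
if( !matches.empty() )
{
    double minDist = matches[0].distance;
    for( size_t i = 1; i < matches.size(); i++ )
        if( matches[i].distance < minDist )
            minDist = matches[i].distance;
    double goodThreshold = 3 * minDist > 0.02 ? 3 * minDist : 0.02; // avoid an all-zero threshold
    vector<DMatch> goodMatches;
    for( size_t i = 0; i < matches.size(); i++ )
        if( matches[i].distance <= goodThreshold )
            goodMatches.push_back(matches[i]);
    drawMatches(img1, keypoints1, img2, keypoints2, goodMatches, img_matches);
}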
CHAPTER
THREE
HIGHGUI
If one or both products were installed to other folders, the user should change the corresponding CMake variables OPENNI_LIB_DIR, OPENNI_INCLUDE_DIR and/or OPENNI_PRIME_SENSOR_MODULE_BIN_DIR.
2. Configure OpenCV with OpenNI support by setting the WITH_OPENNI flag in CMake. If OpenNI is found in the install folders, OpenCV will be built with the OpenNI library (see the OpenNI status in the CMake log) even if the PrimeSensor Modules are not found (see the OpenNI PrimeSensor Modules status in the CMake log). Without the PrimeSensor module OpenCV will still be successfully compiled with the OpenNI library, but VideoCapture objects will not grab data from the Kinect sensor.
3. Build OpenCV.
VideoCapture can retrieve the following data:
1. data given from the depth generator:
CV_CAP_OPENNI_DEPTH_MAP - depth values in mm (CV_16UC1)
CV_CAP_OPENNI_POINT_CLOUD_MAP - XYZ in meters (CV_32FC3)
CV_CAP_OPENNI_DISPARITY_MAP - disparity in pixels (CV_8UC1)
CV_CAP_OPENNI_DISPARITY_MAP_32F - disparity in pixels (CV_32FC1)
CV_CAP_OPENNI_VALID_DEPTH_MASK - mask of valid pixels (CV_8UC1)
2. data given from the BGR image generator:
CV_CAP_OPENNI_BGR_IMAGE - color image (CV_8UC3)
CV_CAP_OPENNI_GRAY_IMAGE - gray image (CV_8UC1)
To retrieve several data maps, use VideoCapture::grab and VideoCapture::retrieve, e.g.:
VideoCapture capture(0); // or CV_CAP_OPENNI
for(;;)
{
    Mat depthMap;
    Mat bgrImage;

    capture.grab();

    capture.retrieve( depthMap, CV_CAP_OPENNI_DEPTH_MAP );
    capture.retrieve( bgrImage, CV_CAP_OPENNI_BGR_IMAGE );

    if( waitKey( 30 ) >= 0 )
        break;
}
To set and get properties of the sensor data generators, use the VideoCapture::set and VideoCapture::get methods respectively, e.g.:
VideoCapture capture( CV_CAP_OPENNI );
capture.set( CV_CAP_OPENNI_IMAGE_GENERATOR_OUTPUT_MODE, CV_CAP_OPENNI_VGA_30HZ );
cout << "FPS
" << capture.get( CV_CAP_OPENNI_IMAGE_GENERATOR+CV_CAP_PROP_FPS ) << endl;
Since two types of sensor data generators are supported (image generator and depth generator), there are two flags that should be used to set/get a property of the needed generator:
CV_CAP_OPENNI_IMAGE_GENERATOR - a flag for access to the image generator properties.
CV_CAP_OPENNI_DEPTH_GENERATOR - a flag for access to the depth generator properties. This flag value is assumed by default if neither of the two possible values of the property is set.
Some depth sensors (for example, the XtionPRO) do not have an image generator. To check whether one is present, query the CV_CAP_OPENNI_IMAGE_GENERATOR_PRESENT property:
bool isImageGeneratorPresent = capture.get( CV_CAP_PROP_OPENNI_IMAGE_GENERATOR_PRESENT ) != 0; // or == 1
Flags specifying the needed generator type must be used in combination with a particular generator property. The following properties of cameras available through the OpenNI interface are supported for the depth generator:
CV_CAP_OPENNI_DEPTH_GENERATOR_BASELINE = CV_CAP_OPENNI_DEPTH_GENERATOR + CV_CAP_PROP_OPENNI_BASELINE
CV_CAP_OPENNI_DEPTH_GENERATOR_FOCAL_LENGTH = CV_CAP_OPENNI_DEPTH_GENERATOR + CV_CAP_PROP_OPENNI_FOCAL_LENGTH
CV_CAP_OPENNI_DEPTH_GENERATOR_REGISTRATION = CV_CAP_OPENNI_DEPTH_GENERATOR + CV_CAP_PROP_OPENNI_REGISTRATION
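For example, a short sketch of querying and setting depth generator properties through these composed flags (the returned values depend on the particular sensor; typically the baseline is reported in millimetres and the focal length in pixels):
double baseline = capture.get( CV_CAP_OPENNI_DEPTH_GENERATOR_BASELINE );
double focalLength = capture.get( CV_CAP_OPENNI_DEPTH_GENERATOR_FOCAL_LENGTH );
capture.set( CV_CAP_OPENNI_DEPTH_GENERATOR_REGISTRATION, 1 ); // map depth onto the color camera coordinate frame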
For more information please refer to the usage example openni_capture.cpp in the opencv/samples/cpp folder.
CHAPTER
FOUR
CASCADE CLASSIFIER TRAINING
4.1 Introduction
Working with a cascade classifier includes two major stages: training and detection. The detection stage is described in the documentation of the objdetect module of the general OpenCV documentation, which also gives some basic information about cascade classifiers. The current guide describes how to train a cascade classifier: preparing the training data and running the training application.
Important notes
There are two applications in OpenCV to train a cascade classifier: opencv_haartraining and opencv_traincascade. opencv_traincascade is the newer version, written in C++ in accordance with the OpenCV 2.x API. The main difference between these two applications is that opencv_traincascade supports both Haar [Viola2001] and LBP [Liao2007] (Local Binary Patterns) features. LBP features are integer-valued in contrast to Haar features, so both training and detection with LBP are several times faster than with Haar features. As for LBP and Haar detection quality, it depends on the training: first of all on the quality of the training dataset, and on the training parameters too. It is possible to train an LBP-based classifier that provides almost the same quality as a Haar-based one.
opencv_traincascade and opencv_haartraining store the trained classifier in different file formats. Note that the newer cascade detection interface (see the CascadeClassifier class in the objdetect module) supports both formats. opencv_traincascade can save (export) a trained cascade in the older format, but opencv_traincascade and opencv_haartraining cannot load (import) a classifier stored in the other application's format to continue training after an interruption.
Note that the opencv_traincascade application can use TBB for multi-threading. To use it in multi-core mode, OpenCV must be built with TBB.
There are also some auxiliary utilities related to training.
opencv_createsamples is used to prepare a training dataset of positive and test samples. It produces a dataset of positive samples in a format supported by both the opencv_haartraining and opencv_traincascade applications. The output is a file with a .vec extension; it is a binary format which contains images.
opencv_performance may be used to evaluate the quality of classifiers, but only of those trained by opencv_haartraining. It takes a collection of marked up images, runs the classifier and reports the performance, i.e. the number of found objects, the number of missed objects, the number of false alarms and other information.
Since opencv_haartraining is an obsolete application, only opencv_traincascade will be described further. The opencv_createsamples utility is needed to prepare the training data for opencv_traincascade, so it will be described too.
Negative Samples
Negative samples are taken from arbitrary images. These images must not contain the objects to be detected. Negative samples are enumerated in a special file: a text file in which each line contains the filename (relative to the directory of the description file) of a negative sample image. This file must be created manually. Note that negative samples and sample images are also called background samples or background sample images, and these terms are used interchangeably in this document. The described images may be of different sizes, but each image should (though not necessarily) be larger than the training window size, because these images are used to subsample a negative image down to the training window size.
An example of description file:
Directory structure:
/img
img1.jpg
img2.jpg
bg.txt
File bg.txt:
img/img1.jpg
img/img2.jpg
Positive Samples
Positive samples are created by the opencv_createsamples utility. They may be created from a single image containing the object or from a collection of previously marked up images.
Please note that you may need a large dataset of positive samples before you give it to the mentioned utility, because it only applies a perspective transformation. For example, you may need only one positive sample for an absolutely rigid object like an OpenCV logo, but you definitely need hundreds and even thousands of positive samples for faces. In the case of faces you should consider all the race and age groups, emotions and perhaps beard styles.
So, a single object image may contain a company logo. Then a large set of positive samples is created from the given object image by randomly rotating it, changing the logo intensity, and placing the logo on arbitrary backgrounds. The amount and range of randomness can be controlled by command line arguments of the opencv_createsamples utility.
Command line arguments:
-vec <vec_file_name>
Name of the output file containing the positive samples for training.
-img <image_file_name>
Source object image (e.g., a company logo).
Creating a training set from a single image and a collection of backgrounds, with a single vec file as output
The following procedure is used to create a sample object instance: the source image is rotated randomly around all three axes. The chosen angle is limited by -maxxangle, -maxyangle and -maxzangle. Then pixels having an intensity in the [bg_color-bg_color_threshold; bg_color+bg_color_threshold] range are interpreted as transparent. White
noise is added to the intensities of the foreground. If the -inv key is specified then foreground pixel intensities are
inverted. If -randinv key is specified then algorithm randomly selects whether inversion should be applied to this
sample. Finally, the obtained image is placed onto an arbitrary background from the background description file,
resized to the desired size specified by -w and -h and stored to the vec-file, specified by the -vec command line
option.
The *.txt files in the annotations directory contain information about the object bounding box for each sample, in the following format:
Image filename : "/home/user/pos/0002_0107_0115_0195_0139.png"
Bounding box for object 1 "PASperson" (Xmin, Ymin) - (Xmax, Ymax) : (107, 115) - (302, 254)
opencv_createsamples -img /home/user/logo.png -bg /home/user/bg.txt -info annotations.lst -maxxangle 0.1 -maxyang
Directory structure:
/img
  img1.jpg
  img2.jpg
info.dat
File info.dat:
img/img1.jpg  1  140 100 45 45
img/img2.jpg  2  100 200 50 50   50 30 25 25
Image img1.jpg contains a single object instance with the following coordinates of the bounding rectangle: (140, 100, 45, 45). Image img2.jpg contains two object instances.
In order to create positive samples from such a collection, the -info argument should be specified instead of -img:
-info <collection_file_name>
Description file of marked up images collection.
The scheme of sample creation in this case is as follows: the object instances are taken from the images, resized to the target sample size, and stored in the output vec-file. No distortion is applied, so the only affecting arguments are -w, -h, -show and -num.
Note that for training it does not matter how the vec-files with positive samples are generated, but the opencv_createsamples utility is the only way provided by OpenCV to collect/create a vec-file of positive samples.
An example vec-file is available at opencv/data/vec_files/trainingfaces_24-24.vec. It can be used to train a face detector with the following window size: -w 24 -h 24.
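Once the vec-file and the background description file are prepared, a hypothetical training invocation could look like the following (a sketch; the output directory name is a placeholder, and the remaining opencv_traincascade parameters keep their default values here):
opencv_traincascade -data cascade_dir -vec trainingfaces_24-24.vec -bg bg.txt -w 24 -h 24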
CHAPTER
FIVE
HIGHGUI
To retrieve several data maps, use VideoCapture::grab and VideoCapture::retrieve, e.g.:
VideoCapture capture(CV_CAP_INTELPERC);
for(;;)
{
    Mat depthMap;
    Mat image;
    Mat irImage;

    capture.grab();

    capture.retrieve( depthMap, CV_CAP_INTELPERC_DEPTH_MAP );
    capture.retrieve( image, CV_CAP_INTELPERC_IMAGE );
    capture.retrieve( irImage, CV_CAP_INTELPERC_IR_MAP );

    if( waitKey( 30 ) >= 0 )
        break;
}
To set and get properties of the sensor data generators, use the VideoCapture::set and VideoCapture::get methods respectively, e.g.:
VideoCapture capture( CV_CAP_INTELPERC );
capture.set( CV_CAP_INTELPERC_DEPTH_GENERATOR | CV_CAP_PROP_INTELPERC_PROFILE_IDX, 0 );
cout << "FPS
" << capture.get( CV_CAP_INTELPERC_DEPTH_GENERATOR+CV_CAP_PROP_FPS ) << endl;
Since two types of sensor data generators are supported (image generator and depth generator), there are two flags that should be used to set/get a property of the needed generator:
CV_CAP_INTELPERC_IMAGE_GENERATOR - a flag for access to the image generator properties.
CV_CAP_INTELPERC_DEPTH_GENERATOR - a flag for access to the depth generator properties. This flag value is assumed by default if neither of the two possible values of the property is set.
For more information please refer to the usage example intelperc_capture.cpp in the opencv/samples/cpp folder.
BIBLIOGRAPHY
[Viola2001] Paul Viola, Michael J. Jones. Rapid Object Detection using a Boosted Cascade of Simple Features.
Conference on Computer Vision and Pattern Recognition (CVPR), 2001, pp. 511-518.
[Viola2004] Paul Viola, Michael J. Jones. Robust real-time face detection. International Journal of Computer Vision, 57(2):137-154, 2004.
[Rainer2002] Rainer Lienhart and Jochen Maydt. An Extended Set of Haar-like Features for Rapid Object Detection.
Submitted to ICIP2002.
[Liao2007] Shengcai Liao, Xiangxin Zhu, Zhen Lei, Lun Zhang and Stan Z. Li. Learning Multi-scale Block Local
Binary Patterns for Face Recognition. International Conference on Biometrics (ICB), 2007, pp. 828-837.