ISSN 1859-1531 - THE UNIVERSITY OF DANANG, JOURNAL OF SCIENCE AND TECHNOLOGY, NO. 6(127).2018 45
FAST GAUSSIAN DISTRIBUTION BASED ADABOOST ALGORITHM
FOR FACE DETECTION
Tuan M. Pham¹, Hao P. Do², Danh C. Doan², Hoang V. Nguyen²
¹University of Science and Technology - The University of Danang; pmtuan@dut.udn.vn
²Hippo Tech Vietnam; {haodophuc, danhdoan.es, nguyenviethoang.25}@gmail.com
Abstract - The face detection approach developed by Paul Viola and Michael J. Jones has been widely applied in many detection systems. Although its efficiency and robustness have been demonstrated in both speed and accuracy, there is still room to improve the algorithm. This paper builds on the Viola-Jones face detection framework and introduces two key contributions. First, we modify the way the integral image is applied so that features are more informative and help increase detection performance. Second, we propose a new way to use AdaBoost: a Gaussian probability distribution is used to measure how close a feature value is to the means of the positive and negative distributions, so that examples are classified more efficiently. Furthermore, we show experimentally that a small fraction of the feature set is sufficient to build a good strong classifier, so the whole feature set is not needed. As a result, both the memory required and the training time are reduced.
Key words - face detection; Gaussian distribution; AdaBoost;
Haar-like pattern; weak classifier
1. Introduction
In recent decades, along with rapid advances in technology, face detection has become one of the most popular research topics, with applications in many fields of industry and daily life. Face detection algorithms have developed quickly and now support complicated applications such as multi-view face detection [1-4] and occluded face detection [3, 5], as well as related tasks such as pedestrian detection [6, 7]. In this paper, we build on the work of Viola and Jones [8], which has proved successful in both accuracy and speed.
Thanks to their work, a number of practical real-time applications and systems have been built for face detection and related topics. We propose new methods for feature extraction and implementation so that the system trains and detects faster than the original Viola-Jones system while using less memory. In addition, new Haar-like patterns are proposed to improve detection efficiency.
Our face detection system makes two main contributions. They are introduced briefly below and described in detail in the following sections.
First, we use the integral image representation as the main component for computing feature values quickly. However, it becomes more effective when applied to non-integer-sized patterns, as described in Section 3. In this way, a given feature yields much more of the information needed for classification. At this step, the Viola-Jones system pre-calculates all feature values and stores them on the hard drive. This is convenient during training because all the information is already computed, but the calculation and the hard drive access take a significantly long time. Instead of following their method, we introduce another approach. Given the size of the data set, the size of the images and the feature patterns, a lookup table for feature indexing is generated separately before training starts. A specific feature value is then computed only when it is needed. This is more efficient than the previous work [8] in both speed and memory consumption.
The second enhancement is in how AdaBoost [9] is applied. Viola and Jones used a threshold to separate positive and negative example images. Multiple weak classifiers are combined to form a strong one, which may separate the training data perfectly but can still fail on test data. The detection quality depends heavily on how well the AdaBoost classification function separates the data. When the positive and negative distributions overlap, it is difficult to choose a good threshold between the two regions. In this paper, we apply a Gaussian probability distribution to the classification task. Why we use this method and how we apply it within AdaBoost are described in Section 3.2, where pseudo-code is also provided. In addition, the Gaussian comparison needs fewer operations than the threshold search, so training and detection time are reduced.
a. Overview
We start by reviewing the Viola-Jones system and pointing out the parts that we improve; this review is in Section 2. In Section 3, we describe how we choose and implement our new algorithms and explain why they are effective in a face detection system. The experiments and results are presented in Section 4.
2. Review of Viola-Jones Algorithms
a. Haar-like patterns
In every vision system, the efficiency and accuracy
depend strongly on the features it uses and the quality of
those features. Feature design and its calculation is the key
to the success of a computer vision or machine learning
system. To extract features from example images, Viola-
Jones used Haar-like rectangle features in their systems as
shown in Figure 1.
Figure 1. Haar-like rectangle patterns in Viola-Jones system
The feature value of a rectangle pattern is the difference between the sums of the pixel values in its white and black regions. Computed directly, this costs O(HW), where H and W are the height and width of the pattern.
The integral image representation is one of the contributions of their paper. It is used to compute feature values rapidly and is defined as:
$$ii(x, y) = \sum_{0 \le x' \le x,\; 0 \le y' \le y} i(x', y')$$
where $i(x, y)$ is the image intensity at pixel $(x, y)$ and $ii(x, y)$ is the value of the integral image at pixel $(x, y)$.
With the integral image, the sum over any rectangle can be calculated from a few pre-computed reference values at its corners. Thus, the complexity is O(1).
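The constant-time rectangle sum can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and the table is padded with one extra row and column of zeros (an assumption made here for convenience), so that $ii(x, y)$ in the formula above corresponds to `table[y + 1][x + 1]`.

```python
def integral_image(img):
    """Return a padded summed-area table with table[r][c] = sum of img[0..r-1][0..c-1]."""
    h, w = len(img), len(img[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def rect_sum(table, top, left, height, width):
    """Sum of the pixels in rows top..top+height-1 and columns left..left+width-1 (four lookups)."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
print(rect_sum(table, 1, 1, 2, 2))  # 5 + 6 + 8 + 9 = 28
```

Building the table costs one pass over the image; after that, every rectangle sum needs the same four lookups regardless of the rectangle size.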
Their detection system was trained and tested on 24x24 grayscale PNG images. To each image they applied the 5 rectangle features above, varying the height, width and position of each rectangle. Because the image size and the feature patterns are given, the number of features that can be applied to an image is known: there were 43200, 43200, 27600, 27600 and 20736 features for categories (a), (b), (c), (d) and (e) respectively, thus 162336 features in total.
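These counts can be reproduced by brute-force enumeration. The sketch below assumes that each pattern is a grid of equal integer-sized cells (2x1, 1x2, 3x1, 1x3 and 2x2 cells) placed at every position of a 24x24 window; the mapping of these layouts to the labels (a)-(e) of Figure 1 is an assumption, and `count_features` is a hypothetical helper name.

```python
def count_features(window, cell_cols, cell_rows):
    """Count placements of a pattern made of cell_cols x cell_rows equal integer-sized cells.

    Widths are multiples of cell_cols, heights are multiples of cell_rows,
    and every position that keeps the pattern inside the window is counted.
    """
    total = 0
    for width in range(cell_cols, window + 1, cell_cols):
        for height in range(cell_rows, window + 1, cell_rows):
            total += (window - width + 1) * (window - height + 1)
    return total


# Cell layouts as (columns, rows); the assignment to labels (a)-(e) is an assumption.
patterns = {"a": (2, 1), "b": (1, 2), "c": (3, 1), "d": (1, 3), "e": (2, 2)}
counts = {name: count_features(24, cols, rows) for name, (cols, rows) in patterns.items()}
print(counts)
print(sum(counts.values()))  # 162336
```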
b. AdaBoost algorithm
As a result, a huge number of features corresponds to each image sub-window. Evaluating all of them is still a lot of work even when each feature is calculated quickly and efficiently. However, a detection system can form an effective classifier from a very small subset of these features.
AdaBoost [9] can be used both to select the essential features and to train a strong classifier. It constructs a strong classifier as a linear combination of weak classifiers:
$$F(x) = \sum_{t=1}^{T} a_t h_t(x)$$
where $x$ is an example (a 19x19 image in our system) and $h_t(x)$ is a weak, or basis, classifier. Normally, the set $\mathcal{H} = \{h(x)\}$ is finite.
A weak classifier is a function of a feature ($f$), a threshold ($\theta$) and a polarity ($p$) that denotes the direction of the inequality:
$$h(x, f, p, \theta) = \begin{cases} 1 & \text{if } p\, f(x) < p\, \theta \\ 0 & \text{otherwise} \end{cases}$$
A weak learner is not expected to give the best classification. In each round of AdaBoost, the weak classifier with the smallest weighted error is chosen:
$$\hat{h}_t = \arg\min_{h_j \in \mathcal{H}} \varepsilon_j, \qquad \varepsilon_j = \sum_i w_i \left| h_j(x_i, f, p, \theta) - y_i \right|$$
where $y_i$ is the correct label of example $x_i$: $y_i$ is 1 if $x_i$ represents a face, otherwise it is 0.
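As a small illustration of the threshold weak classifier and its weighted error, the following Python sketch evaluates a few candidate (polarity, threshold) pairs on toy data and keeps the one with the smallest error. The data, the candidate list and the function names are made up for the example.

```python
def weak_classify(value, polarity, theta):
    """Threshold weak classifier: 1 if p * f(x) < p * theta, else 0."""
    return 1 if polarity * value < polarity * theta else 0


def weighted_error(values, labels, weights, polarity, theta):
    """Sum of the weights of the examples that the classifier labels incorrectly."""
    return sum(w * abs(weak_classify(v, polarity, theta) - y)
               for v, y, w in zip(values, labels, weights))


# Toy data: feature values f(x_i), labels y_i (1 = face) and uniform weights w_i.
values = [0.2, 0.4, 0.5, 0.9]
labels = [1, 1, 0, 0]
weights = [0.25, 0.25, 0.25, 0.25]

# Candidate (polarity, threshold) pairs; the one with the smallest error is kept.
candidates = [(1, 0.3), (1, 0.45), (-1, 0.45)]
best = min(candidates, key=lambda c: weighted_error(values, labels, weights, *c))
print(best, weighted_error(values, labels, weights, *best))
```

With polarity 1 and threshold 0.45 every toy example is labeled correctly, while flipping the polarity to -1 at the same threshold mislabels all of them.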
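In addition, every example is re-weighted after each round so that misclassified examples are emphasized in the next training round: an example that is labeled incorrectly in the current round ends up with a greater weight than the correctly labeled ones.
$$w_{t+1,i} = w_{t,i}\, \beta_t^{1 - e_i}$$
where $e_i = 0$ if $x_i$ is correctly classified and $e_i = 1$ otherwise, and $\beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t}$, with $\varepsilon_t$ the weighted error of the weak classifier chosen in round $t$ (following [8]).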
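The re-weighting step can be sketched in Python as follows. The sketch takes $\beta$ to be the ratio $\varepsilon / (1 - \varepsilon)$ of the round's weighted error, as in Viola-Jones [8], and folds the normalization of the next round into the same function; both the function name and that packaging are choices made for this example.

```python
def update_weights(weights, predictions, labels):
    """One AdaBoost re-weighting step: w_i <- w_i * beta^(1 - e_i), then normalize."""
    errors = [0 if p == y else 1 for p, y in zip(predictions, labels)]
    epsilon = sum(w * e for w, e in zip(weights, errors))  # weighted error of this round
    beta = epsilon / (1.0 - epsilon)                       # assumes 0 < epsilon < 0.5
    updated = [w * beta ** (1 - e) for w, e in zip(weights, errors)]
    total = sum(updated)
    return [w / total for w in updated]


weights = [0.25, 0.25, 0.25, 0.25]
labels = [1, 1, 0, 0]
predictions = [1, 0, 0, 0]  # the second example is misclassified
new_weights = update_weights(weights, predictions, labels)
print(new_weights)
```

Here the weighted error is 0.25, so $\beta = 1/3$: the three correct examples are scaled down and, after normalization, the misclassified example carries half of the total weight.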
AdaBoost is simple to implement and extracts good features efficiently from a very large set. One of its disadvantages is the risk of over-fitting when the trained model becomes too complex, which is the key challenge in applying this method.
The AdaBoost process after successive rounds is visualized in Figure 2. In this example, the blue dots must be separated from the red ones. We apply AdaBoost and, in each round, look for the weak classifier that best separates the two regions.
Figure 2. Visualization for AdaBoost process after t=1 and t=3
After the first round, one weak classifier is chosen, shown as the black straight line, and the overall detection quality is still very low. However, by combining 3 weak classifiers, the accuracy is significantly improved.
The Viola-Jones system used AdaBoost and, for each weak learner, searched for the optimal threshold classification function. Suppose there are N image examples and K features per image; there are then KN feature/threshold combinations. For the data set used by Viola-Jones, K was 162336 features and N was 6977 training images. In a training round of AdaBoost, the whole data set must be scanned to evaluate the training error of each feature/threshold combination, so finding one weak classifier costs O(N·KN). With M weak classifiers, the total training complexity is O(M·N·KN). With M = 200, at least 1.58×10^15 operations must be processed by the training machine. Even on a supercomputer, this is a tremendous amount of work that takes a significantly long time to finish.
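As a quick check of this figure, substituting the values quoted above into the product M·N·K·N gives:

$$M \cdot N \cdot K \cdot N = 200 \times 6977 \times 162336 \times 6977 = 1\,580\,455\,536\,748\,800 \approx 1.58 \times 10^{15}$$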
To reduce the training time, they proposed a modified algorithm. For a specific feature, the optimal threshold can be found from the current example weights without generating every combination of this feature with every image example. To do so, the examples are first sorted by their feature values, which costs O(N log2 N) per feature. A single pass of O(N) then finds the optimal threshold for the current feature. Hence, the complexity of this sub-task is O(max(N, N log2 N)) = O(N log2 N). This algorithm reduces the total complexity to O(MKN log2 N) and brings a substantial decrease in the number of operations, to 2.89×10^12.
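The sort-then-scan search for one feature can be sketched as below. This is an illustrative Python version, not the original code: it assumes distinct feature values, places each candidate threshold halfway between two neighbouring sorted values (and one unit below the smallest value for the first candidate), and uses the convention that polarity +1 labels values below the threshold as faces.

```python
def best_threshold(values, labels, weights):
    """Return (error, theta, polarity) with the smallest weighted error for one feature.

    Sorting costs O(N log N); the scan over the sorted examples costs O(N).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    total_pos = sum(w for w, y in zip(weights, labels) if y == 1)
    total_neg = sum(w for w, y in zip(weights, labels) if y == 0)
    below_pos = below_neg = 0.0  # weight of faces / non-faces below the threshold
    best = (float("inf"), None, None)
    prev = None
    for i in order:
        # Candidate threshold just below the current example.
        theta = values[i] - 1.0 if prev is None else (prev + values[i]) / 2.0
        # Polarity +1: below theta -> face. Polarity -1: above theta -> face.
        err_plus = below_neg + (total_pos - below_pos)
        err_minus = below_pos + (total_neg - below_neg)
        for err, polarity in ((err_plus, 1), (err_minus, -1)):
            if err < best[0]:
                best = (err, theta, polarity)
        if labels[i] == 1:
            below_pos += weights[i]
        else:
            below_neg += weights[i]
        prev = values[i]
    return best


values = [0.9, 0.2, 0.5, 0.4]
labels = [0, 1, 0, 1]
weights = [0.25, 0.25, 0.25, 0.25]
print(best_threshold(values, labels, weights))
```

The running sums make each candidate threshold cost O(1) to evaluate, which is why the scan after sorting is linear in N.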
Pseudo-code for the Viola-Jones algorithm is shown below:
1. Given example images $(x_1, y_1), \ldots, (x_n, y_n)$ where $y_i = 0, 1$ for negative and positive examples respectively.
2. Initialize the weights $w_{1,i} = \frac{1}{2m}, \frac{1}{2l}$ for $y_i = 0, 1$ respectively, where $m$ and $l$ are the numbers of negatives and positives respectively.
3. For $t = 1, \ldots, T$:
   a. Normalize the weights: $w_{t,i} \leftarrow \frac{w_{t,i}}{\sum_{j=1}^{n} w_{t,j}}$
   b. Select the best weak classifier with respect to the weighted error: $\varepsilon_t = \min_{f, p, \theta} \sum_i w_i \left| h(x_i, f, p, \theta) - y_i \right|$
   c. Define $h_t(x) = h(x, f_t, p_t, \theta_t)$ where $f_t$, $p_t$ and $\theta_t$ are the minimizers of $\varepsilon_t$.
   d. Update the weights: $w_{t+1,i} = w_{t,i}\, \beta_t^{1 - e_i}$, where $e_i = 0$ if example $x_i$ is classified correctly, $e_i = 1$ otherwise, and $\beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t}$.
4. Combine the selected weak classifiers into the strong classifier.
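Step 4 is not spelled out in the pseudo-code. In the Viola-Jones paper [8], the weak classifiers are combined with weights $\alpha_t = \log(1/\beta_t)$, and an example is labeled as a face when the weighted vote reaches half of the total weight. A minimal Python sketch of that rule, with made-up $\beta_t$ values and a hypothetical function name:

```python
import math


def strong_classify(weak_outputs, betas):
    """Viola-Jones style vote: 1 if sum(alpha_t * h_t) >= 0.5 * sum(alpha_t), else 0."""
    alphas = [math.log(1.0 / beta) for beta in betas]
    score = sum(a * h for a, h in zip(alphas, weak_outputs))
    return 1 if score >= 0.5 * sum(alphas) else 0


betas = [0.2, 0.5, 0.5]  # toy values; a smaller beta means a more reliable weak classifier
print(strong_classify([1, 0, 0], betas))  # the most reliable classifier alone votes "face"
print(strong_classify([0, 1, 0], betas))  # a single less reliable vote is not enough
```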
3. Proposed Method to Improve Viola-Jones system
The main purpose of this paper is to propose a method that performs detection faster than the traditional Viola-Jones system, so we use the same Haar-like rectangle features as their system. In addition, we introduce new features that are more efficient for face detection as the difficulty increases, for example when detecting rotated faces.
3.1. New feature selection
The Viola-Jones system uses integer-sized rectangles. In our research, we use the same rectangles but with non-integer sizes. With this method, features can be more informative, so the detection performance is higher than that of the Viola-Jones system. Figure 3 shows an example of the new non-integer-sized rectangles: a 2x2 sub-window of an image to which a rectangle feature of height 3 is applied.
Figure 3. 2x2 sub-window image with non-integer-sized feature
Because the size is a non-integer number, feature values are represented by floating-point numbers. This creates new difficulties when the system uses complicated feature patterns. To address this, we devise an approach that computes the feature values in a few operations with a complexity of O(1). Compared with the traditional feature calculation, the processing time is nearly the same.
To apply new rectangle features, users only need to specify the new pattern before training their models. Below is an example for pattern (d) in Figure 1. The matrix is the color map of the feature, with 1 for white and -1 for black.
1 1 1
-1 -1 -1
1 1 1
By using the color map above, we can quickly compute
the feature value for each rectangle with non-integer size.
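The text does not give the formula used for non-integer sizes, so the following Python sketch shows only one possible way to obtain such values in O(1) per cell. It assumes that each pixel is a unit square of constant intensity, so that the integral of the image up to a real-valued corner can be read from the padded integral image by bilinear interpolation; all function names are hypothetical.

```python
import math


def integral_image(img):
    """Padded summed-area table: table[r][c] = sum of img[0..r-1][0..c-1]."""
    h, w = len(img), len(img[0])
    table = [[0.0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0.0
        for c in range(w):
            row_sum += img[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def integral_at(table, y, x):
    """Integral of the image over [0, y] x [0, x] for real-valued y, x >= 0 (bilinear lookup)."""
    r0 = min(int(math.floor(y)), len(table) - 2)
    c0 = min(int(math.floor(x)), len(table[0]) - 2)
    fy, fx = y - r0, x - c0
    return ((1 - fy) * (1 - fx) * table[r0][c0] + (1 - fy) * fx * table[r0][c0 + 1]
            + fy * (1 - fx) * table[r0 + 1][c0] + fy * fx * table[r0 + 1][c0 + 1])


def feature_value(table, color_map, top, left, height, width):
    """Signed sum over the cells of color_map stretched over a height x width window."""
    rows, cols = len(color_map), len(color_map[0])
    cell_h, cell_w = height / rows, width / cols
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            y0, x0 = top + r * cell_h, left + c * cell_w
            y1, x1 = y0 + cell_h, x0 + cell_w
            cell_sum = (integral_at(table, y1, x1) - integral_at(table, y0, x1)
                        - integral_at(table, y1, x0) + integral_at(table, y0, x0))
            total += color_map[r][c] * cell_sum
    return total


pattern_d = [[1, 1, 1], [-1, -1, -1], [1, 1, 1]]  # color map from the text
image = [[10, 20], [30, 40]]                      # toy 2x2 sub-window
value = feature_value(integral_image(image), pattern_d, 0, 0, 2, 2)
print(value)
```

On this toy 2x2 window each row of the pattern covers 2/3 of a pixel in height, so the white rows contribute 20 and 140/3 and the black row subtracts 100/3, giving 100/3 in total.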
The size and position of a feature vary within the 19x19 image. Given the size of a pattern, we can control the number of resulting features. This leads to another improvement in our system: pre-calculating and storing feature values is no longer required. The Viola-Jones approach, by contrast, has to store the whole set of feature values on the hard drive. This takes a tremendously long time before the training data is available, and it consumes a large amount of storage that our system does not need. There are 29241, 29241, 23409, 23409 and 29241 features for categories (a), (b), (c), (d) and (e) respectively.
Thanks to the constraint on rectangle sizes, we build a lookup table separately from the given information about those rectangles. Whenever a single feature value is required, our system can immediately look up the exact feature together with its size and position, and then compute its value with the method described above. To generate this lookup table, we simply use a brute-force algorithm that iterates over all feature configurations.
3.2. Gaussian distribution as classification function
Even though the training time is significantly reduced, it is still long. In our research and experiments, we propose a new way to train the detection system by applying a Gaussian probability distribution instead of finding an optimal threshold for each feature.
Our starting point is that the positive and negative distributions of a Haar-like feature are very hard to separate with a single threshold. When multiple weak classifiers with many thresholds are combined, the number of operations grows rapidly without any guarantee of better detection accuracy. It can also lead to over-fitting during training.
Figure 4. Histogram of a specific feature for face and
non-face images
Figure 4 shows the histograms of the values of a specific feature computed on face and non-face images. The blue region denotes the feature values of face images, and non-faces are drawn in orange. The x-axis is the feature value and the y-axis is the normalized frequency of each value. The two histograms clearly overlap, which makes it difficult to select a threshold. In such situations a weak classifier performs poorly even though training takes longer, and the final test result is poor as a consequence.
In this paper, we propose an AdaBoost algorithm that uses the Gaussian distribution of feature values. The Gaussian distribution is one of the most important probability distributions for continuous variables and is widely used in the natural sciences. In theory, the averages of independent samples of a variable converge in distribution to the normal; in other words, they become normally distributed when there are enough observations. The Gaussian distribution of a single real-valued variable $x$ is:
$$f_g(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
where $\mu$ is the mean and $\sigma^2$ is the variance of the feature values computed over the data set for a specific feature.
To overcome the overlapping problem, we use the Gaussian distribution to compare how far a feature value lies from the means of the positive and negative distributions. The classification function is:
$$h(x, f, \mu_p, \sigma_p^2, \mu_n, \sigma_n^2) = \begin{cases} 1 & \text{if } f_g(x \mid \mu_p, \sigma_p^2) > f_g(x \mid \mu_n, \sigma_n^2) \\ 0 & \text{otherwise} \end{cases}$$
where $f_g(x \mid \mu, \sigma^2)$ is the Gaussian function given above, $x$ is the feature value of an example image, and $\mu_p, \sigma_p^2, \mu_n, \sigma_n^2$ are the means and variances of the positive and negative distributions respectively.
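A direct Python rendering of this classification function could look like the sketch below. It fits unweighted means and population variances to toy feature values; whether the original system weights the examples when estimating these statistics is not stated, so that choice is an assumption, as are the function names.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Gaussian density f_g(x | mu, sigma^2)."""
    return math.exp(-(x - mean) ** 2 / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def fit_gaussian(values):
    """Mean and population variance of a list of feature values."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


def gaussian_classify(x, pos_params, neg_params):
    """Return 1 (face) if the positive density at x exceeds the negative one, else 0."""
    return 1 if gaussian_pdf(x, *pos_params) > gaussian_pdf(x, *neg_params) else 0


face_values = [1.0, 2.0, 3.0]        # toy feature values of face examples
non_face_values = [6.0, 8.0, 10.0]   # toy feature values of non-face examples
pos = fit_gaussian(face_values)
neg = fit_gaussian(non_face_values)
print(gaussian_classify(2.5, pos, neg), gaussian_classify(7.0, pos, neg))
```

No threshold is searched here: the decision follows entirely from the two fitted (mean, variance) pairs.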
Our method proceeds as follows. For a given feature, the feature values of all image examples are computed. Next, we calculate the means and variances of the two distributions. The Gaussian formula is then applied to measure the distance of a feature value to each mean before comparing them. The more likely an image example is to be a face, the closer its feature value lies to the mean of the positive distribution; otherwise, it is closer to the negative distribution. With this method, the difficulty caused by overlapping is overcome because the means of the two distributions are always separated.
After the classification, the error is computed in order to select the best feature of the current AdaBoost round. The weights of the image examples are then re-computed to emphasize the incorrectly classified examples in later rounds.
Our procedure is described in the pseudo-code below. The main difference between our algorithm and that of Viola-Jones is the classification function, which uses the Gaussian method.
1. Given example images $(x_1, y_1), \ldots, (x_n, y_n)$ where $y_i = 0, 1$ for negative and positive examples respectively.
2. Initialize the weights $w_{1,i} = \frac{1}{2m}, \frac{1}{2l}$ for $y_i = 0, 1$ respectively, where $m$ and $l$ are the numbers of negatives and positives respectively.
3. For $t = 1, \ldots, T$:
   a. Normalize the weights: $w_{t,i} \leftarrow \frac{w_{t,i}}{\sum_{j=1}^{n} w_{t,j}}$
   b. Select K' features from the full set of features.
   c. For each selected feature:
      i. Compute the feature values for every example.
      ii. Calculate the means and variances of the positive and negative distributions.
      iii. Select the best weak classifier that minimizes the error: $\varepsilon_t = \min_{f} \sum_i w_i \left| h(x_i, f, \mu_p, \sigma_p^2, \mu_n, \sigma_n^2) - y_i \right|$
      iv. Define $h_t(x) = h(x, f_t)$ where $f_t$ is the minimizer of $\varepsilon_t$.
   d. Update the weights: $w_{t+1,i} = w_{t,i}\, \beta_t^{1 - e_i}$, where $e_i = 0$ if example $x_i$ is classified correctly, $e_i = 1$ otherwise, and $\beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t}$.
4. Combine the selected weak classifiers into the strong classifier.
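Steps 3b and 3c of one round can be sketched in Python as follows. The sketch is hedged in several ways: feature values are supplied as a precomputed toy matrix instead of being computed from images, the K' features are drawn uniformly at random (the pseudo-code does not say how they are selected), means and variances are unweighted, and the comparison is done in the logarithmic form of the Gaussian test. All names are hypothetical.

```python
import math
import random


def fit(values):
    """Mean and population variance, with a small floor to avoid a zero variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-12)


def classify(x, pos, neg):
    """1 if x is more likely under the positive Gaussian (log-domain comparison)."""
    (mp, vp), (mn, vn) = pos, neg
    return 1 if math.log(vn / vp) > (x - mp) ** 2 / vp - (x - mn) ** 2 / vn else 0


def select_weak_classifier(feature_values, labels, weights, k_prime, rng):
    """Pick the feature with the lowest weighted error among k_prime sampled features.

    feature_values[j][i] is the value of feature j on example i.
    Returns (error, feature index, positive params, negative params).
    """
    sampled = rng.sample(range(len(feature_values)), k_prime)
    best = None
    for j in sampled:
        vals = feature_values[j]
        pos = fit([v for v, y in zip(vals, labels) if y == 1])
        neg = fit([v for v, y in zip(vals, labels) if y == 0])
        err = sum(w * abs(classify(v, pos, neg) - y)
                  for v, y, w in zip(vals, labels, weights))
        if best is None or err < best[0]:
            best = (err, j, pos, neg)
    return best


# Toy data: feature 0 separates the classes, feature 1 does not.
feature_values = [
    [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],
    [5.0, 1.0, 9.0, 5.0, 1.0, 9.0],
]
labels = [1, 1, 1, 0, 0, 0]
weights = [1.0 / 6.0] * 6
best = select_weak_classifier(feature_values, labels, weights, 2, random.Random(0))
print(best[0], best[1])
```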
In our method, computing the mean and variance of each positive or negative distribution takes linear time, O(N), and uses only simple operations. The total complexity of our algorithm is O(MKN). However, the Gaussian distribution requires floating-point operations, and the exponential of a floating-point number is far more expensive than simple arithmetic. In our system, the exponential $e^x$ appears in the classification function $f_g(x \mid \mu_p, \sigma_p^2)$. Although the complexity is O(MKN), the algorithm can take even longer than the O(MKN log2 N) one if these operations are not computed carefully. Thus, for this step, we replace the exponential with its inverse operation, the natural logarithm $\ln(x)$. In our experiments, evaluating the comparison after taking logarithms, where $\ln(e^x)$ reduces to $x$, takes much less time than computing $e^x$. Simplifying the expressions for the Gaussian function also reduces the number of operations remarkably.
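If an image is a face, the ratio of the Gaussian functions of the two distributions should be greater than 1:
$$\frac{\frac{1}{\sqrt{2\pi\sigma_p^2}}\, e^{-\frac{(x-\mu_p)^2}{2\sigma_p^2}}}{\frac{1}{\sqrt{2\pi\sigma_n^2}}\, e^{-\frac{(x-\mu_n)^2}{2\sigma_n^2}}} > 1$$
or,
$$\frac{\sqrt{\sigma_n^2}}{\sqrt{\sigma_p^2}} > e^{\frac{(x-\mu_p)^2}{2\sigma_p^2} - \frac{(x-\mu_n)^2}{2\sigma_n^2}}$$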
Because both sides are positive, the inequality is preserved when both sides are squared, which doubles the exponent:
$$\frac{\sigma_n^2}{\sigma_p^2} > e^{\frac{(x-\mu_p)^2}{\sigma_p^2} - \frac{(x-\mu_n)^2}{\sigma_n^2}}$$
Moreover, since the natural logarithm is a one-to-one and increasing function, we can apply it to both sides of the inequality:
$$\ln\left(\frac{\sigma_n^2}{\sigma_p^2}\right) > \frac{(x-\mu_p)^2}{\sigma_p^2} - \frac{(x-\mu_n)^2}{\sigma_n^2}$$
We can now use the above inequality for the comparison. All of its operations can be calculated in a short time: many operations have been eliminated and the computation is much simpler.
The following pseudo-code summarizes our approach:
1. Compute the feature values for the whole data set.
2. Find the mean $\mu$ and variance $\sigma^2$ of the positive and negative distributions.
3. Use the natural logarithm to find $a = \ln\left(\frac{\sigma_n^2}{\sigma_p^2}\right)$.
4. Find the value of $b = \frac{(x-\mu_p)^2}{\sigma_p^2} - \frac{(x-\mu_n)^2}{\sigma_n^2}$.
5. Compare $a$ and $b$: return 1 if $a > b$, otherwise 0.
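Steps 3 to 5 can be checked against the direct comparison of the two densities with a short Python sketch. The means and variances below are toy numbers and the function names are hypothetical; the point is only that both forms return the same label, while the logarithmic form needs no exponential and its constant $a$ can be computed once per feature.

```python
import math


def classify_pdf(x, mp, vp, mn, vn):
    """Direct comparison of the two Gaussian densities (two exponentials per example)."""
    fp = math.exp(-(x - mp) ** 2 / (2 * vp)) / math.sqrt(2 * math.pi * vp)
    fn = math.exp(-(x - mn) ** 2 / (2 * vn)) / math.sqrt(2 * math.pi * vn)
    return 1 if fp > fn else 0


def classify_log(x, mp, vp, mn, vn, a=None):
    """Steps 3-5: compare a = ln(vn / vp) with b; a may be precomputed per feature."""
    if a is None:
        a = math.log(vn / vp)
    b = (x - mp) ** 2 / vp - (x - mn) ** 2 / vn
    return 1 if a > b else 0


mp, vp, mn, vn = 2.0, 1.5, 8.0, 4.0  # toy means and variances
a = math.log(vn / vp)                # computed once, reused for every example
samples = [0.0, 2.5, 4.0, 5.0, 7.0, 12.0]
results = [classify_log(x, mp, vp, mn, vn, a) for x in samples]
print(results)  # [1, 1, 1, 0, 0, 0]
```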
4. Experiments and Results
a. Experiments
As mentioned before, to train and test this system we use the MIT CBCL Face Data set [10], which contains 19x19 grayscale PGM images.
We conduct several experiments to evaluate and compare the processing time of the original method and our proposed one. Before preparing the lookup table and training, we normalize the whole data set, both training and test images [11], so that all images follow the same standard. Samples of images before and after normalization are shown below:
Figure 5. Original images before normalization
Figure 6. Images after normalization
We implement our algorithm from scratch in Java. The system runs on an ordinary computer with Mac OS, 8 GB of 1600 MHz DDR3 memory and a 2.6 GHz Intel Core i5 processor. The hard drive capacity is not specified because our system uses only RAM in the experiments; a hard drive is not required, unlike in the Viola-Jones system.
b. Results
In principle, in each round of AdaBoost a total of 134541 features are tested to choose the best weak classifier. Our analysis and experiments show that it is unnecessary to test all of them: a small number of features can be used to reduce the training time while maintaining the accuracy level. After experimenting with different numbers of features per training round, we found that K' = 5000 features are sufficient to meet both criteria.
The graph in Figure 7 compares choosing K' = 5000 features with using all 134541 features. Both AdaBoost variants are included: Viola-Jones with thresholds and the proposed method with the Gaussian distribution. The Viola-Jones results are drawn as blue and gray dashed lines.
Figure 7. Processing time and accuracy between 2 methods
with different number of chosen features on training image set
In this experiment, we do not follow the Viola-Jones approach of storing feature values on the hard drive, because of the long processing time that this storage involves. Instead, the lookup table of our proposed method is used to reduce the training time.
The x-axis is the processing time in seconds and the y-axis is the corresponding accuracy in percent. When the Viola-Jones threshold method is used to classify face/non-face images, selecting one weak classifier takes approximately 760 seconds if the whole feature set is tested, and 30 seconds with 5000 features. With the Gaussian distribution, the processing time decreases to 600 seconds for the full feature set and 22 seconds for 5000 features.
Figure 8. Result from experimenting with testing image set
This experiment shows that the detection system can still achieve high performance
without using the whole set of features. Figure 7 shows that applying the Gaussian distribution is better than the original Viola-Jones method in both cases. That result is obtained on the training image set; a similar result on the testing image set is shown in Figure 8.
In these experiments, we measure the detection accuracy within a fixed processing time of about 1 hour. In this time, AdaBoost with the Viola-Jones algorithm produces 5 weak classifiers with the whole feature set and 125 with the 5000-feature set. Similarly, the Gaussian distribution algorithm produces 6 and 160 weak classifiers respectively.
We also conduct another experiment that fixes T, the number of weak classifiers. For T = 200, if we apply the Viola-Jones system, which pre-computes all feature values, stores them on the hard drive and then uses them for training, it takes 46 seconds to compute and choose one weak classifier. The complete training process requires approximately 153 minutes, or 2 hours 33 minutes. With the same value of T and still using thresholds in AdaBoost, our new system requires 30 seconds per weak classifier, even though its computation is more complicated because of the floating-point numbers. The total training time is then about 96 minutes, or 1 hour 36 minutes, roughly 2/3 of the time of the Viola-Jones system.
The training time is reduced significantly when the Gaussian probability distribution is applied. The processing time per weak classifier decreases to 22 seconds, and the whole training process takes 76 minutes, or 1 hour 16 minutes, about half of the original time.
With the Gaussian probability distribution, the number of operations is reduced and training is now about 2 times faster than in the previous work. Moreover, the Viola-Jones method had to pre-compute the training data and store it on the hard drive, which took a significant amount of time, as described in the previous section. If this factor is taken into account, our new method is far more efficient, not only in processing time but also in memory usage.
5. Conclusion
In this paper, we propose a new way to apply Haar-like patterns and to use the integral image technique for computing feature values. For the same feature, much more informative values can be extracted and hence the detection rate is better as well.
The more important contribution is the application of the Gaussian probability distribution to AdaBoost to improve its performance. With this function, we avoid the difficulty of choosing optimal thresholds in each round of AdaBoost. The classification problem becomes simpler and more straightforward: it is decided by how far a feature value lies from the means of the positive and negative distributions. In addition, detection is faster because classification requires fewer operations.
These two contributions account for the results of our paper. With this system, or with the implementation ideas behind it, a face detection system can run on an ordinary computer. Given the gain in performance, the method can also be used in other real-time detection systems in practice.
Acknowledgement
This research was funded by Vietnam Ministry of
Science and Technology Research Project in 2017-2018,
No. CNTT-10
REFERENCES
[1] Bo Wu, Haizhou AI, Chang Huang and Shihong Lao, “Fast Rotation
Invariant Multi-View Face Detection Based on Real Adaboost”,
IEEE FGR'04, 2004.
[2] Paul Viola, Michael J. Jones, “Fast Multi-view Face Detection”,
Mitsubishi Electric Research Lab TR-2003-96, 2003.
[3] Shengcai Liao, Anil K. Jain, and Stan Z. Li, “A Fast and Accurate
Unconstrained Face Detector”, 2015.
[4] T. Mita, T. Kaneko, and O. Hori, “Joint Haar-like Features for Face
Detection”, ICCV 2005.
[5] X. P. Burgos-Artizzu, P. Perona, “Robust Face Landmark
Estimation Under Occlusion”, ICCV, 2013.
[6] B. Leibe, E. Seemann, and B. Schiele, “Pedestrian Detection in
Crowded Scenes”, CVPR, 2005.
[7] S. Zhang, R. Benenson, M. Omran, J. Hosang, and B. Schiele,
“Towards Reaching Human Performance In Pedestrian Detection”,
IEEE PAMI, 2017.
[8] Paul Viola, Michael J. Jones, “Robust Real-time Face Detection”,
International Journal of Computer Vision, 2004, pp. 138-143.
[9] Robert E. Schapire, “Explaining AdaBoost”, In Empirical Inference,
2013.
[10] CBCL Face Database. Retrieved from
datasets/FaceData2.html
[11] Dwayne Phillips, “Image Processing in C”, 2nd ed., R&D Publications, 2000.
(The Board of Editors received the paper on 03/01/2018, its review was completed on 03/4/2018)