International Journal of Mathematics and Computer Sciences (IJMCS) ISSN: 2305-7661 Vol.34 Oct 2014 www.scholarism.net
Object Tracking in Video Sequence Images Based on Color Histogram and Central Voting
Ahmad Samaeifar1, Iman Attarzadeh2, Bita Shadgar3
1 Department of Computer, Dezful Branch, Islamic Azad University, Dezful, Iran. Email: [email protected]
2 Department of Computer, Dezful Branch, Islamic Azad University, Dezful, Iran. Email: [email protected]
3 Department of Computer, Shahid Chamran University of Ahvaz, Iran. Email: [email protected]
Abstract
Machine vision combines image processing methods with machine learning tools so that a computer can
interpret the content of images intelligently. Object tracking is a fundamental operation for many high-level
machine vision applications, such as motion-based recognition, automatic monitoring, video indexing,
human-computer interaction, traffic monitoring and vehicle guidance, and it currently receives a great deal of
attention. In this paper, we introduce a new algorithm for tracking in video sequences. We present a specific
color histogram model to represent the tracked target. Using this model, a central voting method based on the
generalized Hough transform estimates the object position in each frame. In this method, each pixel in the
target area obtained from the previous frame casts one vote for the position of the target center in the new
frame, and the candidate pixel with the maximum number of votes is taken as the new target center. After the
new center position is found, a recursive stage separates the target from the background and updates the model
parameters. The simulation results show successful tracking even when the target changes in size or has the
same color as the image background.
Key words
Machine vision, Object tracking, Generalized Hough transform, Color histogram, Central voting.
1-Introduction
Object tracking in video image sequences raises several problems that we intend to overcome. First,
conventional object tracking systems may be confused when the image background is changing. The next
problem is the complexity of the motion of the tracked object. In addition, whether the tracked object is rigid
or non-rigid, partial occlusion in the image, and changes in image lighting also challenge the system [1].
The two main parts that determine the performance of a video tracking system are the target representation
model and the algorithm that detects the target position in the image. The target representation model is the
framework in which the target is defined; more precisely, it specifies which features of the target stand for
that target in the algorithm under investigation. The position detection algorithm specifies how the image is
searched to find the target. Different search algorithms are used depending on which model represents the
target.
Contour-based tracking performs very well even if the object is not rigid [2, 3], but it has a very high
computational cost and cannot be used for online tracking. Furthermore, this method cannot distinguish the
target from the background where the two have the same colors. Tracking based on feature points has so far
given good results [4, 7] for targets with a complicated color structure. In this method, the image is described
by feature points and the target position in the desired frame is found with a Newton-Raphson optimization
algorithm. Tracking with this method is fast and reliable. However, its performance degrades when the target
changes pose or rotates, or when the image contains occlusion.
Representing the target with a color histogram is a very popular approach in tracking applications because it
is stable against noise and sudden changes in the position of the tracked target [8-10]. An algorithm called
CAM_shift tracks video targets by means of the color histogram of their body. To find the target position from
one frame to the next, it performs a series of iterations; once the target position is found, a rectangular area is
placed around it, and this rectangle is recomputed and moved to its new position in each frame. Although this
algorithm was designed for face tracking, it can also be used to track other objects [8]. The algorithm
presented in [9] uses kernel coefficients and applies the color histogram model to the target in weighted form,
so that pixels close to the target boundary, which are exposed to background colors and are more likely to
cause errors, have less influence on the result than pixels located at the center.
In this paper we present a new algorithm for both important stages of tracking, namely target representation
and position finding. To represent the target, we present a new class of color histogram derived from the
Spatiogram. In addition to recording which pixels fall in which histogram bin, this model also takes the
distance between pixels into account. A central voting method based on the Hough transform is used to obtain
the position of the object in each image. After the position is obtained, the object dimensions are estimated
and stored for use in the next frame; in a recursive stage the object is separated from the background and the
parameters in use are updated.
In the following, we explain the color histogram method and central voting using the Hough transform,
together with the other concepts required to study the method presented in this paper.
2-Generalized Hough transform
The Hough transform was first designed for finding the break points of lines and connecting them to each
other, but after its generalization to other applications this transform is widely used for detecting shapes in
images. This transform consists of two parts: the formation of an R-Table, which represents the shape, and a
voting method, which detects the desired shape in the image.
2-1-Formation of R-Table:
For an arbitrary shape in the image under investigation, the gradient direction and the position vector to a
reference point are calculated for each point on the boundary of the shape. The position vector is then stored
in a table, called the R-Table, as a function of the gradient direction. Figure (1) and Table (1) show this
procedure:
Figure (1): Implementation of Hough method for finding the center of shapes.
Table (1): R-Table
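The formation of the R-Table described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' MATLAB code: the function name `build_r_table`, the use of degrees for the gradient direction, and the 10-degree quantization step are all assumptions made for the example.

```python
from collections import defaultdict

def build_r_table(boundary, gradient_dirs_deg, reference, bin_width=10.0):
    """Map each quantized gradient direction to the position vectors
    that lead from a boundary point to the reference point."""
    r_table = defaultdict(list)
    for (x, y), phi in zip(boundary, gradient_dirs_deg):
        idx = int((phi % 360.0) // bin_width)  # quantized gradient direction (assumed step)
        r_table[idx].append((reference[0] - x, reference[1] - y))
    return dict(r_table)

# Four boundary points around the reference point (5, 5), with made-up gradient directions.
boundary = [(7, 7), (3, 7), (3, 3), (7, 3)]
gradient_dirs_deg = [45.0, 135.0, 225.0, 315.0]
r_table = build_r_table(boundary, gradient_dirs_deg, reference=(5, 5))
```

Each entry of `r_table` is a list because, for a general shape, several boundary points can share the same quantized gradient direction.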
3-Central voting to determine the position of the object
For each pixel of the shape in figure (1-b), the gradient direction is calculated. Using this gradient direction
and the corresponding R-Table entry, the central point is calculated for each pixel and recorded as a vote. A
search then finds the points that the largest number of pixels have selected as their central point, and the
points with the maximum vote are taken as the centers of the shapes present in the image. Note that if the
desired shape is a circle, all points on its perimeter vote for the circle center. The Hough transform also finds
the center of any other shape: although the pixels do not all agree on one center and several points are
proposed, the point with the maximum vote is selected as the center. In figure (1-c), the brightness of each
point is proportional to its number of votes; as can be seen, the centers are brighter than the other points.
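The voting step can be illustrated with a short Python sketch. It assumes an R-Table stored as a dictionary from quantized gradient direction (in 10-degree steps) to a list of position vectors; the function name `vote_for_center` and the sample data are hypothetical and serve only to show the mechanism.

```python
from collections import Counter

def vote_for_center(edge_points, r_table, bin_width=10.0):
    """edge_points: iterable of ((x, y), gradient_direction_in_degrees).
    Every edge point votes for each center its R-Table entry allows."""
    votes = Counter()
    for (x, y), phi in edge_points:
        idx = int((phi % 360.0) // bin_width)
        for dx, dy in r_table.get(idx, []):
            votes[(x + dx, y + dy)] += 1
    return votes

# R-Table of a small shape whose reference point was (5, 5) (assumed example data).
r_table = {4: [(-2, -2)], 13: [(2, -2)], 22: [(2, 2)], 31: [(-2, 2)]}
# The same shape translated by (10, 4) in a new image.
edge_points = [((17, 11), 45.0), ((13, 11), 135.0), ((13, 7), 225.0), ((17, 7), 315.0)]
votes = vote_for_center(edge_points, r_table)
center, count = votes.most_common(1)[0]
```

In this toy case all four edge points agree on the translated reference point; for real shapes the votes spread over several candidates and the maximum is taken.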
3-1-Tracking and determination of object position by Spatiogram:
The Spatiogram presented in [10] represents the tracked object according to equation (1):
$$h(b) = \langle n_b, \boldsymbol{\mu}_b, \boldsymbol{\Sigma}_b \rangle, \quad b = 1, \dots, B \qquad (1)$$
in which $n_b$ is the number of pixels that fall in the $b$-th bin of the object color histogram,
$\boldsymbol{\mu}_b$ is the mean position vector of the pixels in the $b$-th bin, and $\boldsymbol{\Sigma}_b$
is the covariance matrix of the positions of the pixels in the $b$-th bin. $B$ is the total number of bins.
Assume that the Spatiogram $h$ of the target object is available from the previous frame. The object position
in the current frame is found by an algorithm that calculates the similarity between $h$ and the Spatiogram
$h'$ of each candidate region $y$ in the current frame image, which is defined as follows:
$$h'(b) = \langle n'_b, \boldsymbol{\mu}'_b, \boldsymbol{\Sigma}'_b \rangle, \quad b = 1, \dots, B$$
Here $y$ is a region with the dimensions of the target shape (which are known from the previous frame). The
point at which the similarity between the candidate Spatiogram and $h$ is maximized is taken as the object
position.
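To make the three quantities stored per bin concrete, the following Python sketch computes the pixel count, mean position and position covariance of each bin. It is a minimal illustration under stated assumptions: the function name `spatiogram` and the input layout are hypothetical, and the covariance is the population covariance (division by the pixel count), which the source does not specify.

```python
def spatiogram(pixels, n_bins):
    """pixels: list of ((x, y), bin_index).
    Returns one (count, mean, covariance) tuple per bin."""
    groups = [[] for _ in range(n_bins)]
    for pos, b in pixels:
        groups[b].append(pos)
    result = []
    for pts in groups:
        n = len(pts)
        if n == 0:
            result.append((0, None, None))  # empty bin
            continue
        mx = sum(p[0] for p in pts) / n
        my = sum(p[1] for p in pts) / n
        sxx = sum((p[0] - mx) ** 2 for p in pts) / n
        syy = sum((p[1] - my) ** 2 for p in pts) / n
        sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / n
        result.append((n, (mx, my), ((sxx, sxy), (sxy, syy))))
    return result

# Two pixels in bin 0 and one pixel in bin 1 (made-up data).
h = spatiogram([((0, 0), 0), ((2, 0), 0), ((1, 3), 1)], n_bins=2)
```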
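The similarity between two Spatiograms is calculated according to equation (2):
$$\rho(h, h') = \sum_{b=1}^{B} \psi_b \, \rho_n(n_b, n'_b) \qquad (2)$$
in which $(n_b, \boldsymbol{\mu}_b, \boldsymbol{\Sigma}_b)$ and $(n'_b, \boldsymbol{\mu}'_b, \boldsymbol{\Sigma}'_b)$
are the pixel count, mean position and position covariance of the $b$-th bin of the two Spatiograms, $\rho_n$ is
the similarity between the two bin counts, and the weight parameter $\psi_b$ is calculated from the following
equation:
$$\psi_b = \eta_G \exp\left\{ -\tfrac{1}{2} (\boldsymbol{\mu}_b - \boldsymbol{\mu}'_b)^{T} \hat{\boldsymbol{\Sigma}}_b^{-1} (\boldsymbol{\mu}_b - \boldsymbol{\mu}'_b) \right\}$$
in which $\eta_G$ is the Gaussian normalization constant, and for $\hat{\boldsymbol{\Sigma}}_b^{-1}$ we have:
$$\hat{\boldsymbol{\Sigma}}_b^{-1} = \boldsymbol{\Sigma}_b^{-1} + (\boldsymbol{\Sigma}'_b)^{-1}$$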
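A spatially weighted histogram similarity of this kind can be sketched as follows. The sketch rests on several assumptions that are not taken from the source: the bin similarity is the Bhattacharyya term (square root of the product of normalized bin values), the Gaussian normalization constant (`eta_g` here) defaults to 1, the spatial weight is a Gaussian of the distance between the bin means under the sum of the two inverse covariances, and every non-empty bin has an invertible 2x2 covariance.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def spatiogram_similarity(h1, h2, eta_g=1.0):
    """h1, h2: per-bin tuples (p_b, (mean_x, mean_y), 2x2 covariance),
    where the p_b of each Spatiogram sum to one."""
    total = 0.0
    for (p1, mu1, s1), (p2, mu2, s2) in zip(h1, h2):
        if p1 == 0 or p2 == 0:
            continue  # an empty bin contributes nothing
        i1, i2 = inv2(s1), inv2(s2)
        # Combined inverse covariance: sum of the two inverse covariances.
        a, b = i1[0][0] + i2[0][0], i1[0][1] + i2[0][1]
        c, d = i1[1][0] + i2[1][0], i1[1][1] + i2[1][1]
        dx, dy = mu1[0] - mu2[0], mu1[1] - mu2[1]
        quad = a * dx * dx + (b + c) * dx * dy + d * dy * dy
        psi = eta_g * math.exp(-0.5 * quad)  # spatial weight of this bin
        total += psi * math.sqrt(p1 * p2)    # Bhattacharyya term (assumed)
    return total

eye = ((1.0, 0.0), (0.0, 1.0))
h = [(0.5, (0.0, 0.0), eye), (0.5, (4.0, 4.0), eye)]
h_shifted = [(0.5, (1.0, 0.0), eye), (0.5, (4.0, 4.0), eye)]
same = spatiogram_similarity(h, h)
shifted = spatiogram_similarity(h, h_shifted)
```

With identical Spatiograms the similarity is 1; moving the mean of one bin lowers only that bin's contribution.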
The position y that maximizes the similarity function is taken as the target object position. The procedure for
positioning with the central voting method and the Hough transform is therefore as follows: first the user
manually specifies the position of the object shape in the first frame; the tracking algorithm then applies the
Hough transform to the object shape and passes the results to the central voting function, which finds the
position of the target object as described above. Equation (2) can also be maximized by the gradient descent
method, but the central voting method using the Hough transform operates about three times faster. In the
next section we present an algorithm that develops the method presented in this section and improves its
performance and precision.
4-Proposed algorithm
The video tracking problem is defined as follows: given information about an object in the current frame, we
intend to track it in the following frames. It is assumed that the tracked object is specified in the first frame
by an operator, or that the tracking system finds the object position in the first frame from the features
supplied to it. In either case, it is assumed that in the first frame the system knows the object position as a
rectangle surrounding the object, and the problem is to find this rectangle in the following frames.
4-1-Object display model
We assume that $\mathbf{x}_i$, $i = 1, \dots, N$, are the positions of all pixels that belong to the object in the
image, $\mathbf{c}$ is the object center, and $w$ and $h$ are half the width and half the height of the
rectangle that contains the object, figure (2-a).
Figure (2)
We use a particular histogram model, $h = \{\boldsymbol{\mu}_k^b, p_k^b\}$, to represent the object, in which
$\boldsymbol{\mu}_k^b$ is the average position of all pixels in the $k$-th cluster of the $b$-th bin. Here a
cluster is the set of pixels in one histogram bin whose Euclidean distance from each other is not more than a
threshold. In this representation, not only the color distribution of the pixels but also the distance between
them is treated as a characteristic of the target.
$p_k^b$ is the probability of the pixels in the $k$-th cluster, in other words the number of pixels in the
$k$-th cluster divided by the number of all pixels forming the target shape, with each pixel $\mathbf{x}_i$
weighted by its kernel coefficient $K(r_i)$, which is defined later in this section:
$$p_k^b = C \sum_{i=1}^{N} K(r_i)\, \delta_k^b(\mathbf{x}_i) \qquad (4)$$
in which $C$ is a normalization constant that ensures $\sum_b \sum_k p_k^b = 1$; for this purpose, $C$ is
calculated as follows:
$$C = \frac{1}{\sum_{i=1}^{N} K(r_i)}$$
$\delta_k^b(\mathbf{x}_i)$ is equal to one when $\mathbf{x}_i$ is a pixel that falls in the $b$-th histogram bin
and lies within the threshold distance of the $k$-th cluster, and is equal to zero for all other pixels. The
following algorithm shows how $\delta_k^b(\mathbf{x}_i)$ is calculated:
Algorithm 1: Calculation of the cluster membership of a pixel
Input: Position of the pixel
Output: Cluster of the pixel
1. Calculate the bin b of the pixel.
2. Create a new cluster whose average vector is the pixel position if there is no cluster in the b-th bin yet.
3. Otherwise, for each cluster k of the b-th bin:
(a). Calculate the distance of the pixel from the average vector of cluster k.
(b). Place the pixel in cluster k if the distance calculated in the previous step is less than the threshold.
(c). Update the average vector of cluster k.
4. Create a new cluster whose average vector is the pixel position if the pixel was not placed in any of the above clusters.
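The steps of Algorithm 1 can be sketched in Python as below. The data layout (a dictionary from bin to a list of clusters with a running mean and a count), the function name `assign_to_cluster` and the choice to stop at the first cluster within the threshold are assumptions made for this illustration; the source does not say what happens when several clusters are within the threshold.

```python
import math

def assign_to_cluster(model, bin_index, pos, threshold):
    """model: dict bin -> list of clusters {'mean': (x, y), 'count': n}.
    Adds the pixel to the model and returns the index of its cluster."""
    clusters = model.setdefault(bin_index, [])           # step 1: bin of the pixel
    for k, cl in enumerate(clusters):                    # step 3: existing clusters
        if math.dist(pos, cl['mean']) < threshold:       # steps 3(a) and 3(b)
            n = cl['count']
            cl['mean'] = ((cl['mean'][0] * n + pos[0]) / (n + 1),
                          (cl['mean'][1] * n + pos[1]) / (n + 1))  # step 3(c)
            cl['count'] = n + 1
            return k
    # steps 2 and 4: no cluster yet, or no cluster close enough
    clusters.append({'mean': (float(pos[0]), float(pos[1])), 'count': 1})
    return len(clusters) - 1

model = {}
for pos, b in [((0, 0), 'a'), ((9, 9), 'a'), ((4, 4), 'e'), ((5, 4), 'e')]:
    assign_to_cluster(model, b, pos, threshold=3.0)
```

In this made-up example the two distant pixels of bin `a` end up in two clusters, while the two neighbouring pixels of bin `e` share one cluster whose mean is their average position.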
We use a kernel function [9] to reduce the errors that arise when the object has the same color as the image
background, since this similarity may mislead the system. With this function, the pixels in the target
rectangle do not all influence the estimate of the target position equally: pixels at the center of the rectangle
receive larger coefficients than pixels at its margin. As a result, the margin pixels, which are more likely to
mislead the system, have less impact on the calculations. The kernel function $K(r_i)$ is defined in terms of
$r_i$, the normalized Euclidean distance between pixel $\mathbf{x}_i = (x_i, y_i)$ and the target object center
$\mathbf{c} = (c_x, c_y)$, which is calculated from the following equation, where $w$ and $h$ are half the
width and half the height of the target rectangle:
$$r_i = \sqrt{\left(\frac{x_i - c_x}{w}\right)^2 + \left(\frac{y_i - c_y}{h}\right)^2} \qquad (7)$$
Equation (7) has the form of an ellipse equation. If $r_i = 1$, it describes an ellipse with center $\mathbf{c}$
and semi-axes $w$ and $h$, figure (2-b). Pixels located outside this ellipse, most of which are image
background pixels, have a normalized distance from the center greater than 1, and their kernel coefficient is
equal to zero. Pixels inside the ellipse, on the other hand, have coefficients between zero and one, with pixels
at the center having the larger coefficients, figure (2-c).
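A kernel with the properties described above can be sketched as follows. The exact profile is not recoverable from the text, so the sketch assumes an Epanechnikov-style profile (one minus the squared normalized distance inside the ellipse, zero outside), which is a common choice in kernel-based tracking; the function names are hypothetical.

```python
import math

def normalized_distance(pos, center, w, h):
    """Normalized Euclidean distance of a pixel from the target center,
    with w and h the half-width and half-height of the target rectangle."""
    return math.hypot((pos[0] - center[0]) / w, (pos[1] - center[1]) / h)

def kernel(r):
    """Assumed profile: largest at the center, zero on and outside the ellipse r = 1."""
    return 1.0 - r * r if r < 1.0 else 0.0

center, w, h = (5, 5), 4, 2
k_center = kernel(normalized_distance((5, 5), center, w, h))
k_inside = kernel(normalized_distance((7, 5), center, w, h))
k_edge = kernel(normalized_distance((9, 5), center, w, h))
k_outside = kernel(normalized_distance((5, 8), center, w, h))
```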
Unlike the Spatiogram presented in the previous section, in which each bin had one average vector and one
covariance, in the algorithm presented in this section each bin includes several average vectors, each of which
is the average of all pixels that fall in one bin and whose distance from each other is less than a threshold.
The covariance has thus been removed from this algorithm; in return, the distance between pixels enters the
measurement directly. For example, consider the target object in figure (3-a).
Figure (3)
Table (2)
The Latin letters in the image show the bins of the pixels in the target rectangle. The color histogram of this
target is presented in table (2). For simplicity, we use uniform kernel coefficients for all pixels in this example.
Note that the two pixels in bin a do not belong to one cluster because they are far from each other, so bin a
has two average vectors. On the other hand, the two pixels in bin e are close to each other in the image, so
they are placed in one cluster and the average vector of this bin is the average position of these two pixels.
Bin a therefore consists of two clusters, whereas bin e consists of one cluster.
4-2- Determination of target position:
Determining the target position involves two steps: central voting and a recursive stage. In the central voting
step, all pixels in the target rectangle obtained from the previous frame take part in voting for the object
center according to the procedure presented below. The rules of the central voting step are as follows:
1. Only pixels whose color is present in the model histogram obtained in the previous step are able to vote.
2. Each pixel at position $\mathbf{x}$ whose color falls in the $b$-th bin can add one vote to the pixel $\mathbf{x} - \boldsymbol{\mu}_k^b$ as center, where $\boldsymbol{\mu}_k^b$ is the average vector of the $k$-th cluster of that bin, measured from the target center.
3. The vote of a reliable pixel has a larger coefficient than the vote of an unreliable pixel. Reliability is explained later.
Figure (3-b) shows the central voting process. The arrows in this image show which pixel each pixel votes for
as the shape center. The specially marked pixels are those whose color does not fall in any of the bins of the
object histogram. The pixels that fall in bins b, c and e each vote for the pixel obtained by subtracting the
average vector of their bin from their own position. In this example the coordinate origin is placed at point d,
and the positions of the pixels relative to this origin are presented in table (2). In this table, the average
vector is obtained by averaging the positions of the pixels of a cluster in the X and Y directions. Figure (3-b)
is a later frame than figure (3-a), and the central voting process is applied to it using the rules above and
table (2), which was obtained from the previous frame. This example shows that the two stages of determining
the target representation model (table (2)) and determining the position (central voting) are carried out in
separate frames, and that the position determination of a frame always runs ahead of the target
representation stage. After computing which pixel each pixel votes for as the target object center, the
maximum over all pixels is taken and the pixel with the maximum vote is taken as the target object center; in
this example, the pixel in bin d has the maximum vote and is selected as the center. The new center is taken
as the target position in the current frame and is used for the calculations in the following frames.
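The voting rules can be illustrated with a small Python sketch. It assumes that the model stores, for every bin, the average vectors of its clusters measured from the target center, so that a pixel votes for its own position minus the stored vector; the function name `central_voting`, the optional per-bin weights (standing in for the reliability coefficients of rule 3) and the sample data are hypothetical.

```python
from collections import Counter

def central_voting(pixels, model, weights=None):
    """pixels: list of ((x, y), bin); model: bin -> list of (mx, my) offsets
    of the cluster means from the target center."""
    votes = Counter()
    for (x, y), b in pixels:
        for mx, my in model.get(b, []):  # rule 1: colors absent from the model cast no vote
            weight = 1.0 if weights is None else weights.get(b, 1.0)  # rule 3 (assumed form)
            votes[(round(x - mx), round(y - my))] += weight           # rule 2
    return votes

# Model from the previous frame: bin 'a' has two clusters, bins 'b' and 'e' one each.
model = {'a': [(-1, -1), (1, 1)], 'b': [(1, 0)], 'e': [(0, 1)]}
# Current frame: the same target, now centered at (6, 4); 'z' is a color not in the model.
pixels = [((5, 3), 'a'), ((7, 5), 'a'), ((7, 4), 'b'), ((6, 5), 'e'), ((0, 0), 'z')]
votes = central_voting(pixels, model)
center, score = votes.most_common(1)[0]
```

Because bin `a` has two clusters, each of its pixels also casts one stray vote, but only the true center collects votes from all pixels.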
This example made the simplifying assumption that no target pixel has the same color as the background.
This assumption is unrealistic; in most cases it does not hold, and the tracking algorithm fails when the target
has pixels of the same color as the background. The problem can be solved by applying rule 3, which gives
reliable pixels more weight than unreliable pixels in determining the center. A reliable pixel is a pixel whose
color is present in the target histogram but does not occur in the image background. In our algorithm, pixels
with higher reliability have more influence on the determination of the center than pixels with lower
reliability, which is achieved by giving these pixels a higher impact factor. According to this definition, the
impact factor of the pixels whose average vector lies in the b-th bin of the target histogram is defined in terms
of the cluster probability given in equation (4) and the probability of the background pixels that fall in the
b-th bin, which is obtained by dividing the number of these pixels by the total number of background pixels.
Consider the set of background pixels as all pixels that lie between the ellipse whose semi-axes are the target
half-dimensions and a larger ellipse, both centered at C, figure (2-b). The background probability of the b-th
bin is calculated as a kernel-weighted, normalized count of the pixels of this set that fall in the b-th bin. The
calculation uses a normalization constant, an indicator that is one when a pixel falls in the b-th histogram
bin and zero for all other pixels, and a kernel function of the normalized Euclidean distance of equation (7),
with the difference that the semi-axes of the larger ellipse are used instead of those of the target ellipse.
The extended kernel function introduced above assigns zero weight to all pixels outside the background
region, as can be seen in figure (2-b). Note that the extent of the background region is determined by the
dimensions of the larger ellipse, which can vary as a function of the target dimensions. Here, they are obtained
by multiplying the target dimensions by a coefficient η that is chosen according to the speed of the object
movement. In most tracking systems η is set to 2, which assumes that the target displacement between two
consecutive frames is not more than the target dimensions. Larger values should be used when the target
moves faster.
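The background region and its histogram can be sketched as follows. The sketch assumes that the background set is the ring between the target ellipse and the same ellipse scaled by η, and it uses uniform weights instead of the background kernel, whose exact form is not recoverable from the text; `background_histogram` and the sample pixels are hypothetical.

```python
import math
from collections import Counter

def background_histogram(pixels, center, w, h, eta=2.0):
    """pixels: list of ((x, y), bin). Returns, per bin, the share of the
    pixels in the ring between the target ellipse and the ellipse scaled by eta."""
    counts = Counter()
    for (x, y), b in pixels:
        r = math.hypot((x - center[0]) / w, (y - center[1]) / h)
        if 1.0 < r <= eta:  # outside the target ellipse, inside the enlarged one
            counts[b] += 1
    total = sum(counts.values())
    return {b: n / total for b, n in counts.items()} if total else {}

pixels = [((0, 0), 't'), ((3, 0), 'g'), ((0, 3), 'g'),
          ((-3, 0), 's'), ((5, 0), 's'), ((0, -3), 'g')]
q = background_histogram(pixels, center=(0, 0), w=2, h=2, eta=2.0)
```

The pixel at the center belongs to the target and the pixel at distance 5 lies beyond the enlarged ellipse, so neither is counted.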
The choice of the coefficient η, which scales the background region, poses a challenge: a large η must be
selected for high speeds, yet at these speeds the computation also needs to be fast. Choosing a large η means
that a larger area of pixels is selected for processing, so the computational cost and time increase, which
conflicts with the need for high computational speed. We therefore have to compromise between these two
requirements and choose η accordingly.
The set of pixels that the algorithm processes in order to find the center is called the algorithm range. The
range, which is directly related to the parameter η, should be selected so that the tracking system does not
lose the target. For example, if the target position in the current frame is close to its position in the previous
frame, a short range is sufficient for tracking, but a larger range should be selected when the target moves
fast. In figure (3-b), for instance, the dotted lines forming a rectangle with center C show the range of the
tracking system.
The algorithm is repeated after the new center is found. So far the algorithm tracks targets well, but a further
problem arises if the object is not rigid and its dimensions vary with time, so we use a distance parameter to
account for these variations in the algorithm. When the target dimensions gradually increase, the algorithm
should add new pixels with the right to vote to the voting process, and when the dimensions get smaller a
number of pixels should be removed from the voting process. The parameter determines, in each frame, that
all pixels whose distance from the center C is less than or equal to it can take part in voting; all other pixels
are considered background pixels. Moreover, the new pixels that lie within this distance are also allowed to be
candidates for the object center. This part of the algorithm belongs to the recursive stage. The following
algorithm shows this process:
Algorithm 2: Recursive stage
1. Calculate the distance from C for all pixels of the system range rectangle centered at point C.
2. If the distance is less than the distance parameter, place the pixel in the system range; otherwise it is considered background.
Adding the above algorithm to the system makes the tracking algorithm take the variations of the target
dimensions into account. After the target center is obtained in each frame, the recursive stage determines the
new set of pixels that will take part in central voting in the next frame, and the kernel function parameters
are subsequently updated. The new dimensions of the target rectangle are obtained as a weighted combination
of the old dimensions and the half-length and half-width of a background rectangle calculated by the kernel
function. A weighting parameter determines in what proportion the old dimensions and the updated values
contribute to the new dimensions.
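One simple way to realize such a proportional update is a convex combination of the old and the newly measured half-dimensions. The form below is an assumption, since the original equation is not available here; the function name `update_dimensions` and the weight name `gamma` are hypothetical.

```python
def update_dimensions(old, measured, gamma):
    """Blend old and newly measured half-dimensions.
    gamma = 0 keeps the old size, gamma = 1 adopts the measured size (assumed form)."""
    w_old, h_old = old
    w_new, h_new = measured
    return ((1 - gamma) * w_old + gamma * w_new,
            (1 - gamma) * h_old + gamma * h_new)

dims = update_dimensions(old=(20.0, 10.0), measured=(24.0, 8.0), gamma=0.5)
```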
5-Updating model parameters
The target model and image background parameters should be updated during tracking, because variations in
the scene affect the model. Assuming that the image background histogram of the current frame has been
calculated with the kernel function centered at C, the updated background histogram is calculated as a
weighted combination of the previous histogram and the current one. The update ratio sets the relative impact
of the previous and current values on the new value. The larger the ratio, the smaller the impact of the
previous histogram, so that the current histogram largely determines the new value; the smaller the ratio, the
less attention the system pays to the current value and the more it relies on the previous histogram. A ratio of
0.5 is usually used when the background is changing rapidly.
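A weighted update with the behaviour described above can be written as a convex combination. The following form is an assumption consistent with the text (a larger ratio reduces the impact of the previous histogram); the symbols $\alpha$, $q_{\text{prev}}^b$, $q_{\text{cur}}^b$ and $q_{\text{new}}^b$ are introduced here only for illustration.

```latex
q_{\text{new}}^{b} = (1 - \alpha)\, q_{\text{prev}}^{b} + \alpha\, q_{\text{cur}}^{b},
\qquad 0 \le \alpha \le 1
```

With $\alpha = 0.5$ the previous and current background histograms contribute equally; with $\alpha = 0$ the background histogram is never updated.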
Updating the histogram model parameters consists of three parts: combining, adding and pruning. Given the
object histogram calculated in the current frame by placing the kernel function at center C, the histogram
model parameters are updated by the following procedure:
1. Combine a cluster of the model with a cluster of the current histogram by averaging their average vectors if they match, which means that their distance is less than a threshold between clusters; the probability value is then updated as a weighted combination of the two probabilities. If the distance to every cluster of the current histogram is larger than the threshold and no match is found, the probability value of the model cluster is updated from its previous value alone.
2. Add a cluster of the current histogram to the model histogram if no match for it is found in the histogram h.
3. Delete a cluster from the model if its probability value is less than a threshold. In practice, 0.0001 can be a good choice for this threshold.
The update parameter of the model behaves like the update ratio of the background histogram. When the model
does not change rapidly, we select a small value because the model changes little; when there are large
variations from one frame to another, we select a large value so that the system can change the model
according to the presented algorithm and adapt itself to the variations. In practice, we use 0.1 in situations
with low variation and a larger value when the variation is high. After the model parameters are updated, the
new probability values should be normalized by dividing them by their sum, so that each value lies between
zero and one and the sum over all bins and clusters is equal to one.
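The combine, add, prune and normalize steps can be sketched together in Python. Several details are assumptions, because the original update equations are not available here: matched probabilities are blended with a weight `beta`, unmatched model clusters keep a share (1 - beta) of their probability, newly added clusters enter with a share beta of theirs, and matched average vectors are averaged with equal weight. The function name and data layout are hypothetical.

```python
import math

def update_model(model, current, match_dist, beta, prune_below=1e-4):
    """model, current: dict bin -> list of (mean, probability) clusters.
    Returns the updated, pruned and renormalized model."""
    updated = {}
    for b in set(model) | set(current):
        new = list(current.get(b, []))
        out = []
        for mean_o, p_o in model.get(b, []):
            hit = next((c for c in new if math.dist(c[0], mean_o) < match_dist), None)
            if hit is not None:  # combine matched clusters
                new.remove(hit)
                mean = ((mean_o[0] + hit[0][0]) / 2, (mean_o[1] + hit[0][1]) / 2)
                out.append((mean, (1 - beta) * p_o + beta * hit[1]))
            else:                # no match: the old cluster fades (assumed form)
                out.append((mean_o, (1 - beta) * p_o))
        out.extend((mean_c, beta * p_c) for mean_c, p_c in new)  # add unmatched clusters
        out = [(m, p) for m, p in out if p >= prune_below]       # prune
        if out:
            updated[b] = out
    total = sum(p for clusters in updated.values() for _, p in clusters)
    if total == 0:
        return {}
    return {b: [(m, p / total) for m, p in clusters] for b, clusters in updated.items()}

model = {0: [((0.0, 0.0), 0.6)], 1: [((5.0, 5.0), 0.4)]}
current = {0: [((1.0, 0.0), 0.5)], 2: [((9.0, 9.0), 0.5)]}
result = update_model(model, current, match_dist=2.0, beta=0.5)
total_probability = sum(p for clusters in result.values() for _, p in clusters)
```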
The algorithm presented in this paper is the result of developing and combining the two methods of
Spatiogram tracking and the Hough transform that were presented in sections 2 and 3. Unlike other methods,
in which the target position is obtained from the Spatiogram and the Hough transform directly, this algorithm
uses an adaptive voting method based on the Hough transform. It should be noted that before this algorithm
is used, the Spatiogram results must be stored in a table like the R-Table.
The presented algorithm has several advantages over the other methods introduced above. First, each
histogram bin is allowed to have several average vectors, which provides more complete information about the
histogram and makes the tracking process more accurate. Second, the central voting method, which is applied
adaptively to the pixels of the target rectangle, tracks well and is not misled when the target object has the
same color as the image background. Third, the separation of the target from the background is performed
simply in the recursive stage.
6- Experimental Results
The simulations were carried out in the MATLAB programming environment, and this section presents the
results of applying the presented algorithm to two video sequences. Each video frame had a size of 480 * 720,
and the simulations were run on a computer with a dual-core 3 GHz processor (CPU) and 3 GB of memory
(RAM). The MATLAB version used was MATLAB 2013b. In this section we also compare the performance of
the presented algorithm with the kernel algorithm for tracking an object in video, and compare it with similar
algorithms in terms of computational complexity and other features such as performance under occlusion,
lighting change, and change of target dimensions.
To apply the algorithm to the video sequences under investigation, the location of the target rectangle in the
image is first determined manually in a preprocessing stage and supplied to the main code. The location of the
target rectangle consists of the pixel coordinates of its vertices, which are extracted from the first video frame
and entered in the MATLAB code. A 16*16*16 histogram in the RGB color model is used for the images, frame
by frame. The same parameter values were used during the whole tracking stage in every video. The extracted
images presented below are shown at intervals of 30 frames because of their large number.
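A 16*16*16 RGB histogram assigns every pixel to one of 4096 bins. The mapping from a color to its bin can be sketched as below, assuming 8-bit channels and uniform quantization; the function name `rgb_bin` and the ordering of the channels within the index are choices made for this example, not taken from the paper.

```python
def rgb_bin(r, g, b, levels=16):
    """Index of the histogram bin of an 8-bit RGB color
    when each channel is quantized uniformly into `levels` levels."""
    step = 256 // levels  # 16 intensity values per level for 16 levels
    return (r // step) * levels * levels + (g // step) * levels + (b // step)

first = rgb_bin(0, 0, 0)
last = rgb_bin(255, 255, 255)
sample = rgb_bin(15, 31, 47)
```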
To study the performance of the algorithm when the tracked object has a color and histogram similar to the
background, the algorithm was applied to a football game video. In this video there are many players with the
same shirt color, and the position of the target relative to the camera changes continuously, which causes
errors in the system. The proposed algorithm overcomes this problem by using the adaptive coefficients in
central voting that were introduced in the previous section, and as can be seen in figure (4) the system has
tracked the target well:
Figure (4)
The next video is used to study the performance of the tracking system when the tracked object is not rigid
in the video image sequence and its dimensions vary with time. A change in target dimensions is one of the
most challenging problems for a tracking system. The example of a target whose dimensions change with time
is a vehicle tracked by a fixed camera. As can be seen in the images of figure (5), the system follows the target
vehicle until it almost disappears from the image. In this video not only do the dimensions of the target
vehicle get smaller as it moves away from the camera, but its pose and position also change:
Figure (5)
Figure (6) illustrates the results of tracking the target with the conventional Spatiogram method. In this
algorithm the system does not adapt the model when the target dimensions change, so it makes mistakes and
largely loses the target:
Figure (6)
To study the computational complexity of the presented algorithm and compare it with other algorithms, we
calculate the average time required to process one video frame; the results are presented in table (3). The
calculations were performed on a system with a dual-core 3 GHz processor and 3 GB of memory. Although our
algorithm has a higher computational complexity than some of the conventional algorithms in this table, it is
still fast enough for online, real-time tracking.
Table (3): Comparison of the presented algorithm with other algorithms in terms of processing speed [10]
7-Conclusions
In this paper, we introduced an algorithm for tracking an object in a video sequence. In this algorithm, the
object color histogram is used as the object-specific feature for detecting it in the video sequence. We
introduced central voting with the Hough transform and the Spatiogram model and studied how to use these
algorithms in a general algorithm. Furthermore, we developed these two algorithms into a new algorithm that
has more capabilities and flexibility than the conventional Spatiogram. In the new method both the modeling
and the positioning method have been developed. In addition to these two stages, a further stage called model
parameter updating has been added in each frame, which enables the system to track when the target and the
background have the same color and also when the target is a non-rigid object. The results of the algorithm
were presented for several series of video frames under various conditions. First, we studied its performance in
following a football player, a condition in which the target has the same color as the background, and the
system accomplished this well. Next, we used the system for vehicle tracking; this test was performed to
evaluate the system performance in non-rigid tracking. Finally, we compared the system with the conventional
Spatiogram algorithm in terms of performance and with several target tracking algorithms in terms of
computational complexity. Our results indicate an improvement in performance.
References
[1] A. Yilmaz, O. Javed, M. Shah, Object tracking: a survey, ACM Comput. Surv. 38 (2006) 13.
[2] M. Isard, A. Blake, CONDENSATION—conditional density propagation for visual tracking, Int. J. Comput.
Vision 29 (1998) 5–28.
[3] Y. Shi, W.C. Karl, Real-time tracking using level sets, Proceedings of IEEE Conference on Computer Vision
and Pattern Recognition, 2, 2005, pp. 34–41.
[4] J. Shi, C. Tomasi, Good features to track, Proceedings of IEEE Conference on Computer Vision and Pattern
Recognition, 1994, pp. 593–600.
[5] C. Tomasi, T. Kanade, Detection and tracking of point features, Technical Report, Carnegie Mellon
University, 1991.
[6] D.G. Lowe, Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vision 60 (2004) 91–
110.
[7] H. Bay, A. Ess, T. Tuytelaars, L. Van Gool, Speeded-up robust features (surf), Comput. Vis. Image Underst.
110 (2008) 346–359.
[8] G.R. Bradski, Real time face and object tracking as a component of a perceptual user interface, IEEE
Workshop on Applications of Computer Vision, 1998, pp. 214–219.
[9] D. Comaniciu, V. Ramesh, P. Meer, Kernel-based object tracking, IEEE Trans. Pattern Anal. Mach. Intell.
25 (2003) 564–575.
[10] X. Li, W. Hu, C. Shen, Z. Zhang, A. Dick, A. van den Hengel, A survey of appearance models in visual
object tracking, ACM Trans. Intell. Syst. Technol. 4 (2013) 58.