ComputerVision
TrackingTracking
Many thanks to: H. Bischof, B. Leibe, V. Ferrari, K. Graumann, Y. Ukrainitz, D. Wagner, V Lepetit, M. Breitenstein, P. Sabzmeydani, Z. Kalal from whom I borrowed many slides and videos.
ComputerVision
CV
PR
’06]
We all know what tracking is, right?
[Gra
bner
et al.,
Vid
eoP
roc
CV
PR
’06]
ComputerVision Tracking
actual object position
Time t+1Time t
„ FIND IT AGAIN“
ComputerVision What to track?
ComputerVision What to track?
centerpoint
ComputerVision What to track?
multiplepoints
ComputerVision What to track?
(body)parts
ComputerVision What to track?
region
ComputerVision What to track?
outline
ComputerVision What to track?
structure
ComputerVision Approaches
(i) Model-based trackingapplication-specific
human body, faces, space shuttle,…
ComputerVision Approaches
(i) Model-based trackingapplication-specific
human body, faces, space shuttle,…
(ii) Feature trackingmore genericcorner tracking
blob/contour trackingintensity profile tracking
region tracking
ComputerVision
Saliency
Object
Tracking Cues
Model/ Tracking History
Scene
ComputerVision Applications!
• Structure-from-Motion
• Gesture/Action Recognition
• Video editing
• Augmented Reality• Augmented Reality
• Navigation
• ….
ComputerVision Applications: Game Interface
ComputerVision Applications: SfM
• Tracked Points gives correspondences
ComputerVision Applications: SfM
et
al.
IJC
V 2
00
4]
[Po
llefe
yes
et
al.
IJC
V 2
00
4]
ComputerVision
Applications: Analysis of Motion Pattern
Single-Agent Level
Multi-Agent Level Scene Level
Detail Level
ComputerVision
[Ess
et
al.
CV
PR
’08
]
ComputerVision Outline
• Point Tracking• Template Tracking• Region Tracking• Model-based Tracking• Foreground vs. Background• Foreground vs. Background• Tracking-by-Detection
– Object classes– specific object
• Combining Tracking and Detection• Context in Tracking
ComputerVision
x
y
Vision = Inverse Graphics
x
y
z
ComputerVision Motion
ComputerVision Motion
ComputerVision Motion
ComputerVision
x
y
Vision = Inverse Graphics
x
y
z
Tracking = Inverse Animation
ComputerVision Steps of Tracking
predictpredict correctcorrect
• Recap: Particle filtering– Tracking can be seen as the process of
propagating the posterior distribution of state given measurements across time.
ComputerVision )|,( 111 −−− ttt zppp &
)|,( 1−ttt zppp &
prediction CONDEN
Particle Filter
)|( tt pzpweighing with
)|,( ttt zppp &
update
NSATION
ComputerVision General Tracking Loop
predict to t+1
time t
measure at t+1
update locationupdate model
ComputerVision
Point TrackingPoint Tracking
ComputerVision Estimate Optimal
Transformation
ComputerVision Simple 1D Problem
I0(x)
I (x+h)I1(x+h)
ComputerVision Sum of Squared Differences
I0(x)
I (x+h)
h
I1(x+h)
E(h) = [I0(x) – I1(x+h) ]2
ComputerVision Calculation of Displacement
E(h) [ I0 (x) – I1(x) – hI1’(x) ]2≈
E(h) = [I0(x) – I1(x+h) ]2
-2[I1’(x)(I0(x) – I1(x) ) – hI1’(x)2] h
E
∂∂ ≈
I0(x) – I1(x)h I1’(x)≈
ComputerVision Interpretation
h I0(x)
I (x+h)
I0(x) – I1(x)
I1’(x)I1(x+h)
I0(x) – I1(x)h I1’(x)≈
I1’(x)
ComputerVision Problem A: Local Minima
(a) (b)
ComputerVision Problem A: Local Minima
ComputerVision Problem B: Zero Gradient
I(x) - I0(x)h ≈
I0’(x)
?
ComputerVision Problem B: Aperture problem
No gradient along one direction:
ComputerVision Problem B: Aperture problem
No gradient along one direction:
ComputerVision “Solving” the Aperture Problem
• How to get more equations for a pixel?
• Spatial coherence constraint: Pixel’s neighbors have the same movementneighbors have the same movement
I0(x)
I1(x+h)
ComputerVision
, I
I x ∂∂= ,
∂∂=y
II y
t
II t ∂
∂=
Recall: Optical Flow
0 =++ tyx IvIuI
1 equation in 2 unknowns
, =dt
dxu
dt
dyv =
, x
I x ∂= ,
∂=
yI y
tI t ∂
=
ComputerVision Least Squares Problem
Pseudo Inverse
Over determined System of Equations
ComputerVision Eigenvectors of ATA
• Haven’t we seen an equation like this • Haven’t we seen an equation like this before?
• Recall the Harris corner detector!
• “Good Features to Track”
ComputerVision Interpreting the Eigenvalues
λ2
“Corner”λλλλ1 ~ λλλλ2 and large
“Edge” λλλλ2 >> λλλλ1
λ1
“Edge” λλλλ1 >> λλλλ2
“Flat” region
ComputerVision
Samples: Edge / Low Texture / High Texture
ComputerVision Example
ComputerVision
Template TrackingTemplate Tracking
ComputerVision
Lucas-KanadeTemplate Tracker
• From Points to templates
• Estimate „optimal“ warp W
ComputerVision
Ba
ker
& M
att
he
ws,
IJC
V’0
4,
Lu
cas
-Ka
na
de
20
Ye
ars
On
: A
Un
ifyin
g F
ram
ew
ork
][
Ba
ker
& M
att
he
ws,
IJC
V’0
4,
Lu
cas
20
Ye
ars
On
: A
Un
ifyin
g F
ram
ew
ork
ComputerVision Example
ComputerVision
Tracking by Detection of Local
Image FeaturesImage Features(specific target)
ComputerVision
3D Object Detection
Reference image(s) of the object to detect
Test image
ComputerVision Standard Approach
• Step 1: Keypoint detection– invariant to scale, rotation, or perspective
ComputerVision Standard Approach
• Step 2: Patch rectification
ComputerVision Standard Approach
• Step 3: Build descriptor vector
ComputerVision Standard Approach
• Step 4: Match descriptor vectors
Query
Database
ComputerVision Summary
Search in the Database
Search in the Database
Keypoint Detection
Keypoint Recognition
DatabaseDatabase
Pre-processingMake the actual classification easier
Robust 3D Pose Calculation
(RANSAC)
Robust 3D Pose Calculation
(RANSAC)
Geometric verification
ComputerVision
[Wa
gn
er
et
al.
ISM
AR
’08
]
ComputerVision
[Wa
gn
er
et
al.
‘09
]
ComputerVision
Region TrackingRegion Tracking
ComputerVision Background Modeling
Input Background Model
Moving Foreground
Blobs (Objects)
-
ComputerVision Mean Shift Tracking
• The mean shift tracker tracks a region, with a prescribed (color) distribution
• The similarity between the tracked region and the target region is
an
d M
ee
r, I
CC
V’9
9] region and the target region is
maximized, through evolution towards higher density in a parameter space
• Typically this search only takes a few iterations
[Co
ma
nic
iua
nd
Me
er,
IC
CV
’99
]
ComputerVision Meanshift Tracking
Region ofinterest (Kernel) Center of
mass
Mean Shiftvector Measurements
ComputerVision Intuitive Description
ComputerVision Intuitive Description
ComputerVision Intuitive Description
ComputerVision Intuitive Description
ComputerVision Intuitive Description
ComputerVision Intuitive Description
ComputerVision Intuitive Desciption
ComputerVision Example
ComputerVision Elderly People Monitoring
ComputerVision
Model based Model based Tracking
ComputerVision
Articulated Tracking with Part-Based Model
• part appearance + relative geometry.
ComputerVision Using Models
• Goal– Recover a person’s body articulation
– Detailed parameterization in terms of joint locations or joint angles
• Two basic classes of approaches– Articulated tracking as high-
dimensional inference
– Part-based models
ComputerVision
[Ra
ma
na
ne
t a
l. C
VP
R’0
5]
ComputerVision
Tracking as On-line Foreground vs.
Background Background Classification
ComputerVision Tracking as Classification
• Learning current object appearance vs. local background.
currentbackgroundbackground
[Gra
bn
er
et
al.
CV
PR
’06
]
currentobject
appearance
ComputerVision Tracking as Classification
object
backgroundvs.
ComputerVision Tracking as Classification
object
backgroundvs.
ComputerVision Tracking Loop
search Region
actual object position
from time t to t+1 evaluate classifier on sub-patches
-
+
- -
-
create confidence mapanalyze map and set new
object position update classifier (tracker)
ComputerVision
ComputerVision “Simple tracking”
ComputerVision “Tracking the Invisible”
ComputerVision When does it fail…
ComputerVision
search Region
actual object position
from time t to t+1
evaluate classifier on sub-patches
-
+
- -
-
create confidence mapanalyze map andset new object
position
update classifier(tracker)
Self-learning!
ComputerVision Drifting
Tracked Patches Confidence
ComputerVision Drifting
ComputerVision
Tracking by DetectionDetection(object class)
ComputerVision Traditional Tracking
t=1
initialization
t=2position in prev. frame
candidate new positions(e.g., dynamics)
best new position(e.g., max color similarity)
ComputerVision Tracking-by-Detection
…
detect object(s) independently in each frame
associate detections over time into tracks
ComputerVision Multiple Objects
Frame 5Frame 1 Frame 9
ComputerVision
Example: Multiple Object Tracking
ComputerVision How to get the detections?
Persons Background
Supervised Learning
ComputerVision Using the classifier
ComputerVision How to link them?
• Space-Time Analysis:(a) collect detections
Detections
Space Time Volume
[Le
ibe
et
al.
CV
PR
’07
]
ComputerVision Trajectory Estimation
(a) collect detections(b) trajectory growing and selection
t
x
t
z
Space Time Volume
ComputerVision Trajectory Estimation
(a) collect detections(b) trajectory growing and selection
t
x
t
z
H1H2
Space Time Volume
ComputerVision Result
Input (Object Detections) “Tracking” Result
ComputerVision
ComputerVision More information helps…
• Articulated tracking– “walking” people
• 3D Information
Ground Plane Depth verification
ComputerVision
[Gammeter et al. ECCV’08]
ComputerVision Towards Scene Interpretation
ComputerVision
Combining Tracking and Tracking and
Detection
ComputerVision Refining an object model
• Limit drift
Current Model
Fix (initial) Model
[Gra
bn
er
et
al.
EC
CV
’08
]
ComputerVision Recover from Drift
ComputerVision Drifting
CLICK HERE TO START
ComputerVision Combination: KLT & TbD
• Use a KLT Tracker to explore
• Learn an object detector on the fly.
[Ka
lal
et
al.
CV
PR
’10
]
ComputerVision
ComputerVision
ComputerVision
ComputerVision
Contextin Trackingin Tracking
ComputerVision I’m Carl – Track me…
[Gra
bn
er
et
al.
CV
PR
’10
]
ComputerVision Tracking Carl
ComputerVision SUPPORTERS…
• … came with different strength.
• … change over time.
ComputerVision SUPPORTERS…
• … came with different strength.
• … change over time.
ComputerVision SUPPORTERShelp Tracking of…
• … objects which change there appearance very quickly.
• … occluded objects or object outside the image.
• … small and/or low textured objects or even “virtual points”.
ComputerVision ETH-Cup Sequenze
ComputerVision ETH-Cup: Humans
ComputerVision Of the Web Tracker
ComputerVision ETH-Cup: Supporters
ComputerVision Beyond the Image
Supporters
ComputerVision Coupled Motion
Supporters
ComputerVision Changing Supporters
Supporters
ComputerVision
Obviously, there are failure cases……. and magician knows that.
Supporters
ComputerVision
Tracking IssuesTracking Issues
ComputerVision Tracking Requirements
• Strongly depends on the application!
Robust, Accurate, Fast,…
• Constrain the tracking task!
Information about the object, dynamics, environment,…
ComputerVision Tracking Issues
• Initialization
object position
Time t = 0
object position
ComputerVision Tracking Issues
• Prediction vs. Correction– If the dynamics model is too strong, will end up
ignoring the data
– If the observation model is too strong, tracking is reduced to repeated detectionis reduced to repeated detection
http://www.ethlife.ethz.ch/archive_articles/091008_kalman_per/index
ComputerVision Tracking Issues
• Obtaining observation…– Generative: “render” the state on top of
the image and compare
– Discriminative: classifier or detector scorescore
• …and dynamics model– specify using domain knowledge
– learn (very difficult)
ComputerVision Tracking Issues
• Nonlinear dynamics– Sometimes needed
to keep multiple trackers in paralleltrackers in parallel
– E.g., for abrupt direction changes („Persons“)
Wrong prediction
Correctprediction
ComputerVision Tracking Issues
• Data Association - Multiple Object Tracking– What if we don’t know which
measurements to associate with which tracks?tracks?
ComputerVision Tracking Issues
• Data Association – Fast Motion
ComputerVision Tracking Issues
• Data Association – Background / Appearance Change– Cluttered Background
– Changes in shape, orientation, color,…– Changes in shape, orientation, color,…
ComputerVision Tracking Issues
• Data Association – Occlusions / Self Occlusions
ComputerVision Tracking Issues
• Model- vs. Model-free-Tracking
ComputerVision Tracking Issues
• Drift– Errors caused by dynamical model,
observation model, and data association tend to accumulate over time
ComputerVision End.
Top Related