Silhouette Lookup for Automatic Pose Tracking, Nick Howe

Post on 15-Jan-2016

Silhouette Lookup for Automatic Pose Tracking

NICK HOWE

Goal: 3D Pose Tracking

Full 3D “motion capture” from 2D video: single camera, unmarked video

Difficulties: 3D ambiguity, self-occlusion, foreshortening, appearance changes, shadowing

↑(Uses hand-entered data)

The “Old” Way:

Incremental Tracking

Loop: previous frame (2D pose + appearance) → compare with image → refine 2D pose (numerical optimization) → next frame

Creeping Error

Incremental errors accumulate and grow.

May be mitigated by: better motion models (more guidance); better appearance models (3D); better tracking (multiple hypotheses) [Sidenbladh et al.; Sminchisescu et al.]

Intrinsic problems still remain (initialization, error recovery).

Direct Pose Estimation

Consider human abilities: estimate pose from still photo; estimate pose from stick figure; estimate pose from silhouette

[Brand ’99; Rosales et al. ’01]

Recognition/Retrieval

Hypothesis: Humans can recognize pose by recalling similar examples. Pose recognition as retrieval.

New Approach:

1. Store many silhouettes with known poses

2. Given video, extract silhouettes

3. Retrieve best candidate matches

4. Look for plausible series of poses over time
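Step 3 above can be sketched as a nearest-neighbour lookup. This is only an illustration, not the talk's implementation: the library layout (a list of `(silhouette, pose)` pairs), the `distance` callback, and the name `retrieve_candidates` are all assumptions.

```python
import heapq

def retrieve_candidates(query, library, distance, k=3):
    """Return the k library entries closest to the query silhouette.

    library  -- hypothetical list of (silhouette, pose) pairs
    distance -- hypothetical function (silhouette, silhouette) -> float,
                lower meaning more similar
    """
    return heapq.nsmallest(k, library, key=lambda entry: distance(query, entry[0]))

# Toy usage: "silhouettes" are plain numbers so the example stays runnable.
toy_library = [(0.0, "stand"), (5.0, "sit"), (1.0, "walk")]
top_two = retrieve_candidates(0.9, toy_library, lambda a, b: abs(a - b), k=2)
```

Keeping several candidates per frame, rather than only the single best, is what lets step 4 choose among them over time.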

Some Related Work

Estimating Human Body Configuration Using Shape Context Matching. Mori & Malik, ECCV 2002

3D Tracking = Classification + Interpolation. Tomasi, Petrov, & Sastry, ICCV 2003

Temporal Integration of Multiple Silhouette-based Body-part Hypotheses. Kwatra, Bobick, & Johnson, CVPR 2001

3D Human Pose from Silhouettes by Relevance Vector Regression. Agarwal & Triggs, CVPR 2004

Silhouette Comparison

Turning angle (captures morphology)

Chamfer distance (captures overlap)

Combine using Belkin technique (score = sum of individual ranks)
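The three ingredients above can be sketched in a few lines of plain Python. This is a minimal illustration under assumptions, not the code behind the talk: silhouettes are taken to be polygons or point lists, the chamfer distance is a brute-force symmetric average rather than a distance-transform implementation, and all function names are hypothetical.

```python
import math

def turning_angles(polygon):
    """Signed exterior angle at each vertex of a closed polygon [(x, y), ...]."""
    n = len(polygon)
    angles = []
    for i in range(n):
        x0, y0 = polygon[i - 1]
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        heading_in = math.atan2(y1 - y0, x1 - x0)
        heading_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the change of heading into [-pi, pi).
        angles.append((heading_out - heading_in + math.pi) % (2 * math.pi) - math.pi)
    return angles

def chamfer_distance(a, b):
    """Symmetric chamfer distance between two non-empty point sets (brute force)."""
    def one_way(src, dst):
        # Average distance from each source point to its nearest destination point.
        return sum(min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in dst)
                   for p in src) / len(src)
    return 0.5 * (one_way(a, b) + one_way(b, a))

def rank_sum(score_lists):
    """Combine several distance lists (lower is better) by summing per-measure ranks."""
    n = len(score_lists[0])
    total = [0] * n
    for scores in score_lists:
        order = sorted(range(n), key=lambda i: scores[i])
        for rank, idx in enumerate(order):
            total[idx] += rank
    return total

# Toy usage: three library entries scored by two measures.
combined = rank_sum([[0.3, 0.1, 0.5], [2.0, 1.0, 4.0]])
```

Summing ranks rather than raw scores sidesteps the question of how to weight two measures with different units; the entry with the lowest combined rank is the best overall match.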

Sample Retrievals

(Hits from a small library of 1600 poses)

Coordination Between Frames

Need to pick from top matches at each frame: want a good image match at all frames, and a small change between frames.

Markov chain minimization!

Best local choices minimize global error

(Diagram: frame i-1, frame i, frame i+1, etc.)

Too Much Coffee?

Initial solution shows “twitches”

Smoothing it Out

Jitters in motion parameters smoothed via polynomial splines
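The effect of this kind of smoothing on one motion parameter can be sketched as follows. To stay self-contained, the sketch swaps in a sliding-window least-squares polynomial fit for the polynomial splines named on the slide; the window of 11 frames echoes the 11-frame splines mentioned later in the talk, while the quadratic degree and the function names are assumptions.

```python
def fit_polynomial(ts, ys, degree=2):
    """Least-squares coefficients c[0] + c[1]*t + ... via the normal equations.

    Needs at least degree + 1 distinct sample times.
    """
    m = degree + 1
    A = [[sum(t ** (i + j) for t in ts) for j in range(m)] for i in range(m)]
    b = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs

def smooth(values, window=11, degree=2):
    """Replace each sample by a polynomial fitted to its surrounding window."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        ts = [t - i for t in range(lo, hi)]  # time axis centred on frame i
        out.append(fit_polynomial(ts, values[lo:hi], degree)[0])  # value at t = 0
    return out

# Toy usage: a steadily increasing joint angle with one "twitch" at frame 10.
angles = [float(t) for t in range(21)]
angles[10] += 5.0
smoothed = smooth(angles)
```

A single-frame twitch is pulled most of the way back toward its neighbours, while a trajectory that is already smooth passes through essentially unchanged.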

Making it Match

Problem: poor overlap between observed silhouette & smoothed solution. Work with 11-frame splines.

Optimize spline parameters to reduce chamfer distance

Result: better match to observations, still smooth

Walking Sequence Result

Re-rendering

Same scene, different viewpoint.

Another Example

Tracked using library of ballet poses

Incremental Tracking

Markov chain is best for offline use. But: convergence after ~10 frames.

Incremental tracking with latency

Key Points

Silhouette lookup provides set of potential poses for each frame

Markov chain selects best temporal pose sequence (HMM)

Smoothing & optimization based upon temporal splines

Result: simple tracker, tolerates errors

Thank you! Questions?

Continuing Challenges

Mistakes in rotational direction; no data for parts not on silhouette: incorporate optical flow

Some unrealistic motions generated: incorporate motion model

Correct pose not always retrieved: improve library coverage, retrieval

Future Research

People carrying objects; multiple overlapping people (sports)

Time considerations: optimization slow; chaining currently slow

Holy Grail: real-time tracking

Markov Chain Minimization

(Diagram: Frame 1 with States 1A, 1B, 1C; Frame 2 with States 2A, 2B, 2C; ...; Frame n with States nA, nB, nC)

1. Compute least expense to reach each state from previous frame (cost = estimate of plausibility)

2. Identify best (least expensive) result

3. Backtrack, picking out path that gave best result.
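The three steps amount to a standard dynamic-programming (Viterbi-style) minimization, sketched below. The cost layout is an assumption made for illustration: `match_cost[t][k]` stands for how poorly candidate k fits the image at frame t, and `transition_cost(t, j, k)` for how implausible the change from candidate j at frame t-1 to candidate k at frame t is; neither name comes from the talk.

```python
def best_pose_sequence(match_cost, transition_cost):
    """Pick one candidate per frame so that the total cost is minimal.

    match_cost         -- hypothetical list of per-frame lists of candidate costs
    transition_cost    -- hypothetical function (t, j, k) -> cost of j -> k
    Returns (path, total) where path[t] is the chosen candidate index.
    """
    best = [list(match_cost[0])]  # least expense to reach each state so far
    back = []                     # back[t-1][k]: best predecessor of state k
    # Step 1: least expense to reach each state from the previous frame.
    for t in range(1, len(match_cost)):
        prev = best[-1]
        row, ptr = [], []
        for k, cost_k in enumerate(match_cost[t]):
            j = min(range(len(prev)), key=lambda j: prev[j] + transition_cost(t, j, k))
            row.append(prev[j] + transition_cost(t, j, k) + cost_k)
            ptr.append(j)
        best.append(row)
        back.append(ptr)
    # Step 2: identify the least expensive final state.
    k = min(range(len(best[-1])), key=lambda k: best[-1][k])
    total = best[-1][k]
    # Step 3: backtrack along the stored predecessors.
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return path, total

# Toy usage: two candidates per frame, described by a single pose value each.
poses = [[0, 10], [0, 10], [0, 10]]
costs = [[1, 5], [4, 1], [1, 6]]
path, total = best_pose_sequence(
    costs, lambda t, j, k: abs(poses[t][k] - poses[t - 1][j]))
```

In the toy data, choosing the best image match at every frame independently would give candidates 0, 1, 0 and pay for two large jumps; the chain instead stays on candidate 0 throughout, for a total cost of 6.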

Silhouette Extraction

Many candidate approaches: moving & fixed camera.

This work: static camera, graph-based segmentation.

Making it Match

Solution doesn’t match exactly yet.