CVPR2017勉強会 Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
11.3 Pose Estimation16385/s17/Slides/11.3_Pose_Estimation.pdf · Pose Estimation known estimate 3D...
Transcript of 11.3 Pose Estimation16385/s17/Slides/11.3_Pose_Estimation.pdf · Pose Estimation known estimate 3D...
Pose Estimation16-385 Computer Vision (Kris Kitani)
Carnegie Mellon University
Structure(scene geometry)
Motion (camera geometry) Measurements
Pose Estimation known estimate 3D to 2D correspondences
Triangulation estimate known 2D to 2D coorespondences
Reconstruction estimate estimate 2D to 2D coorespondences
Given a single image, estimate the exact position of the photographer
Pose Estimation
Pose estimation for digital display
Given a set of matched points
and camera model
Find the (pose) estimate of
parametersprojection model
point in 3D space
point in the image
3D Pose Estimation(Resectioning, Calibration, Perspective n-Point)
{Xi,xi}
x = f(X;p) = PXCamera matrix
P
Recall: Camera Models (projections)
Weak perspective
Perspective
X
�
Z̄fC
Parallel (Orthographic)
world point
We’ll use a perspective camera model for pose estimation
What is Pose Estimation?
X world point
Ocamera
Oimage
Oworld
Given
X world point
f = ?
Ocamera
Oimage
py = ?
R = ?
t = ?
Oworld
Estimate
What is Pose Estimation?
extrinsic parameters
intrinsic parameters
Same setup as homography estimation using DLT (slightly different derivation here)
2
4x
y
z
3
5 =
2
4p1 p2 p3 p4
p5 p6 p7 p8
p9 p10 p11 p12
3
5
2
664
X
Y
Z
1
3
775
x
0 =p>1 X
p>3 X
y0 =p>2 X
p>3 X
Inhomogeneous coordinates
(non-linear correlation between coordinates)
Mapping between 3D point and image points
How can we make these relations linear?
2
4x
y
z
3
5 =
2
4—– p>
1 —–—– p>
2 —–—– p>
3 —–
3
5
2
4 X
3
5
What are the unknowns?
x
0 =p>1 X
p>3 X
y0 =p>2 X
p>3 X
p>2 X � p>
3 Xy0 = 0
p>1 X � p>
3 Xx
0 = 0
Make them linear with algebraic manipulation…
Now you can setup a system of linear equations with multiple point correspondences
(this is just DLT for different dimensions)
How can we make these relations linear?
p>2 X � p>
3 Xy0 = 0
p>1 X � p>
3 Xx
0 = 0
In matrix form …
X> 0 �x
0X>
0 X> �y
0X>
�2
4p1
p2
p3
3
5 = 0
For N points … 2
666664
X>1 0 �x
0X>1
0 X>1 �y
0X>1
......
...X>
N 0 �x
0X>N
0 X>N �y
0X>N
3
777775
2
4p1
p2
p3
3
5 = 0
Solve for camera matrix by
ˆ
x = argmin
x
kAxk2 subject to kxk2 = 1
A =
2
666664
X>1 0 �x
0X>1
0 X>1 �y
0X>1
......
...X>
N 0 �x
0X>N
0 X>N �y
0X>N
3
777775x =
2
4p1
p2
p3
3
5
SVD!
Solve for camera matrix by
ˆ
x = argmin
x
kAxk2 subject to kxk2 = 1
A =
2
666664
X>1 0 �x
0X>1
0 X>1 �y
0X>1
......
...X>
N 0 �x
0X>N
0 X>N �y
0X>N
3
777775x =
2
4p1
p2
p3
3
5
A = U⌃V>
=9X
i=1
�iuiv>i
Solution x is the column of V corresponding to smallest singular
value of
Solve for camera matrix by
ˆ
x = argmin
x
kAxk2 subject to kxk2 = 1
A =
2
666664
X>1 0 �x
0X>1
0 X>1 �y
0X>1
......
...X>
N 0 �x
0X>N
0 X>N �y
0X>N
3
777775x =
2
4p1
p2
p3
3
5
Equivalently, solution x is the Eigenvector corresponding to
smallest Eigenvalue of A>A
P =
2
4p1 p2 p3 p4p5 p6 p7 p8p9 p10 p11 p12
3
5
How do you get the intrinsic and extrinsic parameters from the projection matrix?
Almost there …
P =
2
4p1 p2 p3 p4p5 p6 p7 p8p9 p10 p11 p12
3
5
Decomposition of the Camera Matrix
P =
2
4p1 p2 p3 p4p5 p6 p7 p8p9 p10 p11 p12
3
5
P = K[R|t]= K[R|�Rc]
= [M|�Mc]
Decomposition of the Camera Matrix
P =
2
4p1 p2 p3 p4p5 p6 p7 p8p9 p10 p11 p12
3
5
P = K[R|t]= K[R|�Rc]
= [M|�Mc]
Decomposition of the Camera Matrix
P =
2
4p1 p2 p3 p4p5 p6 p7 p8p9 p10 p11 p12
3
5
P = K[R|t]= K[R|�Rc]
= [M|�Mc]
Decomposition of the Camera Matrix
Find the camera center C Find intrinsic K and rotation R
P =
2
4p1 p2 p3 p4p5 p6 p7 p8p9 p10 p11 p12
3
5
P = K[R|t]= K[R|�Rc]
= [M|�Mc]
Decomposition of the Camera Matrix
Find the camera center C Find intrinsic K and rotation R
Pc = 0
P =
2
4p1 p2 p3 p4p5 p6 p7 p8p9 p10 p11 p12
3
5
P = K[R|t]= K[R|�Rc]
= [M|�Mc]
SVD of P!c is the Eigenvector corresponding to
smallest Eigenvalue
Decomposition of the Camera Matrix
Find the camera center C Find intrinsic K and rotation R
Pc = 0
P =
2
4p1 p2 p3 p4p5 p6 p7 p8p9 p10 p11 p12
3
5
M = KR
P = K[R|t]= K[R|�Rc]
= [M|�Mc]
SVD of P!c is the Eigenvector corresponding to
smallest Eigenvalue
Decomposition of the Camera Matrix
Find the camera center C Find intrinsic K and rotation R
Pc = 0
right upper triangle
orthogonal
P =
2
4p1 p2 p3 p4p5 p6 p7 p8p9 p10 p11 p12
3
5
M = KR
P = K[R|t]= K[R|�Rc]
= [M|�Mc]
RQ decomposition
SVD of P!c is the Eigenvector corresponding to
smallest Eigenvalue
Decomposition of the Camera Matrix
Find the camera center C Find intrinsic K and rotation R
Pc = 0
Simple AR program
1. Compute point correspondences (2D and AR tag)
2. Estimate the pose of the camera P
3. Project 3D content to image plane using P
3D locations of planar marker features are known in advance
3D content prepared in advance
(0,0,0)
(10,10,0)
(0,0,0)(10,10,0)
Do you need computer vision to do this?
Structure(scene geometry)
Motion (camera geometry) Measurements
Pose Estimation known estimate 3D to 2D correspondences
Triangulation estimate known 2D to 2D coorespondences
Reconstruction estimate estimate 2D to 2D coorespondences