CV Reading Group: Putting Objects in Perspective
![Page 1: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/1.jpg)
CV Reading Group: Putting Objects in Perspective
Fujiyoshi Laboratory, 土屋成光, July 1, 2008
![Page 2: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/2.jpg)
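Background
Generic object recognition / image scene recognition
– Low resolution
– Differences in appearance
– Size differences caused by depth
⇒ Local recognition methods do not work
Humans exploit the relationships between objects
– Model the 3D structure
– Make local recognition methods more accurate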
![Page 3: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/3.jpg)
Putting Objects in Perspective
Derek Hoiem, Alexei A. Efros, Martial Hebert
Carnegie Mellon University Robotics Institute
CVPR 2006
![Page 4: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/4.jpg)
Understanding an Image
![Page 5: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/5.jpg)
Today: Local and Independent
![Page 6: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/6.jpg)
Detection results
![Page 7: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/7.jpg)
Local Object Detection
True Detections
Missed
False Detections
Local Detector: [Dalal-Triggs 2005]
![Page 8: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/8.jpg)
Object Support
![Page 9: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/9.jpg)
Surface Estimation
Image Support Vertical Sky
V-Left V-Center V-Right V-Porous V-Solid
[Hoiem, Efros, Hebert ICCV 2005]
Software available online
Object Surface?
Support?
![Page 10: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/10.jpg)
Object Size in the Image
Image World
![Page 11: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/11.jpg)
Input Image
Object Size ↔ Camera Viewpoint
Loose Viewpoint Prior
![Page 12: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/12.jpg)
Object Size ↔ Camera Viewpoint
Input Image Loose Viewpoint Prior
![Page 13: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/13.jpg)
Object Position/Sizes Viewpoint
Object Size ↔ Camera Viewpoint
![Page 14: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/14.jpg)
Object Position/Sizes Viewpoint
Object Size ↔ Camera Viewpoint
![Page 15: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/15.jpg)
Object Position/Sizes Viewpoint
Object Size ↔ Camera Viewpoint
![Page 16: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/16.jpg)
Object Size ↔ Camera Viewpoint
Object Position/Sizes Viewpoint
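The link between object size and camera viewpoint shown on these slides can be sketched numerically. The snippet below is a minimal illustration, assuming a level camera above a flat ground plane, in which case an object's real-world height is approximately its image height times the camera height, divided by the image distance between the horizon and the object's base. The function name, argument names, and the example numbers are hypothetical and do not come from the slides.

```python
def world_height(image_height, bottom_v, horizon_v, camera_height):
    """Approximate world height of an object standing on the ground.

    Assumptions (not stated on the slides): level camera, flat ground,
    image coordinates measured upward from the bottom of the image and
    normalized by the image height.

    image_height  -- height of the object in the image
    bottom_v      -- vertical position of the object's base
    horizon_v     -- vertical position of the horizon line
    camera_height -- camera height above the ground (e.g. in meters)
    """
    distance_to_horizon = horizon_v - bottom_v
    if distance_to_horizon <= 0:
        raise ValueError("object base must lie below the horizon")
    return image_height * camera_height / distance_to_horizon


if __name__ == "__main__":
    # Hypothetical detection: base at 0.3, horizon at 0.5, box height 0.2,
    # camera at 1.67 m. A box whose top reaches the horizon is as tall
    # as the camera is high.
    print(world_height(0.2, 0.3, 0.5, 1.67))
```

Read in the other direction, the same relation says that a detection of known typical size (a car, a pedestrian) constrains where the horizon can be, which is why object positions and sizes inform the viewpoint.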
![Page 17: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/17.jpg)
Efficient from surface and viewpoint
Image
P(object) P(object | surfaces)
P(surfaces) P(viewpoint)
P(object | viewpoint)
![Page 18: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/18.jpg)
Image
P(object | surfaces, viewpoint)
Efficient from surface and viewpoint
P(object)
P(surfaces) P(viewpoint)
![Page 19: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/19.jpg)
Scene Parts Are All Interconnected
Objects
3D Surfaces
Camera Viewpoint
![Page 20: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/20.jpg)
Input to Algorithm
Surface Estimates Viewpoint Prior
Surfaces: [Hoiem-Efros-Hebert 2005]
Local Car Detector
Local Ped Detector
Object Detection
Local Detector: [Dalal-Triggs 2005]
![Page 21: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/21.jpg)
Approximate Model
Objects
3D Surfaces
Viewpoint
![Page 22: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/22.jpg)
θ
o1 … on
s1 … sn
Local Object Evidence
Local Surface Evidence
Local Object Evidence
Local Surface Evidence
Viewpoint
Objects
Local Surfaces
Inference over Tree
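Because the model is a tree with the viewpoint at the root and one branch per object candidate, exact inference reduces to passing one message from each object to the viewpoint and back. The sketch below is a simplified illustration under several assumptions: the viewpoint is discretized into a few values, each object is a binary variable (present or not), and the local object and surface evidence are folded into a single likelihood pair per candidate. All names and numbers are hypothetical.

```python
from math import prod


def tree_inference(theta_prior, obj_given_theta, obj_likelihood):
    """Exact inference in a star-shaped tree (viewpoint root, object leaves).

    theta_prior     -- list of K prior probabilities over viewpoints
    obj_given_theta -- per object, list of K values P(o_i = 1 | theta_k)
    obj_likelihood  -- per object, pair (L(e_i | o_i = 0), L(e_i | o_i = 1))
    Returns (posterior over viewpoints, per-object P(o_i = 1 | all evidence)).
    """
    K = len(theta_prior)
    n = len(obj_given_theta)

    # Message from each object to the viewpoint: sum out the object state.
    msgs = [
        [(1 - p1[k]) * l0 + p1[k] * l1 for k in range(K)]
        for p1, (l0, l1) in zip(obj_given_theta, obj_likelihood)
    ]

    # Viewpoint posterior: prior times all incoming messages, normalized.
    joint = [theta_prior[k] * prod(m[k] for m in msgs) for k in range(K)]
    z = sum(joint)
    theta_post = [j / z for j in joint]

    # Object marginals: combine the prior with the messages from the
    # *other* objects, then multiply in this object's own terms.
    obj_post = []
    for i in range(n):
        num = 0.0
        for k in range(K):
            others = prod(msgs[j][k] for j in range(n) if j != i)
            num += (theta_prior[k] * obj_given_theta[i][k]
                    * obj_likelihood[i][1] * others)
        obj_post.append(num / z)
    return theta_post, obj_post


if __name__ == "__main__":
    # Two candidate viewpoints; the first detection is confident and
    # fits viewpoint 0, the second is weak and would fit viewpoint 1.
    theta_post, obj_post = tree_inference(
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [(0.1, 0.9), (0.6, 0.4)],
    )
    print(theta_post, obj_post)
```

The cost is linear in the number of object candidates for each viewpoint value, which is what makes the tree approximation attractive compared with modeling all pairwise object interactions.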
![Page 23: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/23.jpg)
Viewpoint estimation
Viewpoint Prior
[Figure axes: Likelihood vs. Horizon, Likelihood vs. Height]
Viewpoint Final
![Page 24: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/24.jpg)
Object Identities
Local detector
![Page 25: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/25.jpg)
Surface Geometry
Probability map
![Page 26: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/26.jpg)
Object detection
4 TP / 2 FP
3 TP / 2 FP
4 TP / 1 FP
Ped Detection
Car Detection
Local Detector: [Dalal-Triggs 2005]
4 TP / 0 FP
Car: TP / FP
Ped: TP / FP
Initial (Local) Final (Global)
![Page 27: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/27.jpg)
Experiments on LabelMe Dataset
Testing with LabelMe dataset: 422 images
– 923 Cars at least 14 pixels tall
– 720 Peds at least 36 pixels tall
![Page 28: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/28.jpg)
Each piece of evidence improves performance
Local Detector from [Murphy-Torralba-Freeman 2003]
Car Detection Pedestrian Detection
![Page 29: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/29.jpg)
Can be used with any detector that outputs confidences
Local Detector: [Dalal-Triggs 2005] (SVM-based)
Car Detection Pedestrian Detection
![Page 30: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/30.jpg)
Accurate Horizon Estimation
Median Error: 8.5% 4.5% 3.0%
90% Bound:
[Murphy-Torralba-Freeman 2003]
[Dalal-Triggs 2005]
Horizon Prior
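The slide reports a median error and a 90% bound for the horizon estimate. The snippet below shows one plausible way to compute such statistics; it assumes the error is the absolute difference between estimated and true horizon rows expressed as a fraction of image height, and that the 90% bound is the nearest-rank 90th percentile. The slide does not state the exact definitions, and the sample values are hypothetical.

```python
import math
import statistics


def horizon_error_stats(estimated_rows, true_rows, image_height):
    """Return (median error, 90% bound) as fractions of image height.

    The 90% bound uses the nearest-rank definition: the smallest error
    value such that at least 90% of the samples are at or below it.
    """
    errors = sorted(
        abs(est - true) / image_height
        for est, true in zip(estimated_rows, true_rows)
    )
    if not errors:
        raise ValueError("need at least one image")
    rank = math.ceil(len(errors) * 9 / 10)  # 1-based nearest rank
    return statistics.median(errors), errors[rank - 1]


if __name__ == "__main__":
    # Hypothetical horizon rows (pixels) for ten images of height 100.
    estimated = [50, 51, 52, 53, 54, 55, 56, 57, 58, 70]
    truth = [50] * 10
    median_err, bound_90 = horizon_error_stats(estimated, truth, 100)
    print(f"median error: {median_err:.1%}, 90% bound: {bound_90:.1%}")
```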
![Page 31: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/31.jpg)
Qualitative Results
Initial: 2 TP / 3 FP Final: 7 TP / 4 FP
Local Detector from [Murphy-Torralba-Freeman 2003]
Car: TP / FP Ped: TP / FP
![Page 32: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/32.jpg)
Qualitative Results
Local Detector from [Murphy-Torralba-Freeman 2003]
Car: TP / FP Ped: TP / FP
Initial: 1 TP / 14 FP Final: 3 TP / 5 FP
![Page 33: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/33.jpg)
Qualitative Results
Car: TP / FP Ped: TP / FP
Local Detector from [Murphy-Torralba-Freeman 2003]
Initial: 1 TP / 23 FP Final: 0 TP / 10 FP
![Page 34: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/34.jpg)
Qualitative Results
Local Detector from [Murphy-Torralba-Freeman 2003]
Car: TP / FP Ped: TP / FP
Initial: 0 TP / 6 FP Final: 4 TP / 3 FP
![Page 35: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/35.jpg)
Geometric Context
Estimate surface
ground: green, sky: blue, vertical: red, o: porous, x: solid
![Page 36: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/36.jpg)
Geometric Cues
Color
Location
Texture
Perspective
![Page 37: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/37.jpg)
Robust Spatial Support
RGB Pixels Superpixels
[Felzenszwalb and Huttenlocher 2004]
oversegmentation
![Page 38: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/38.jpg)
Multiple Segmentations
Superpixels
…
Multiple Segmentations
A single segmentation may contain segmentation errors ⇒ segment the image with several different numbers of segments
![Page 39: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/39.jpg)
Labeling Segments
…
…
Integrate the results from each segmentation
![Page 40: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/40.jpg)
Learn from training images
Preparation
– Compute the multiple segmentations
– Compute the label of each segment: ground, vertical, sky, or “mixed”
Density estimation with boosted decision trees
– 8 nodes per tree
– Logistic regression version of Adaboost [Collins, Schapire, and Singer 2002]
Label Likelihood
Homogeneity Likelihood
![Page 41: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/41.jpg)
Image Labeling
…
Labeled Segmentations
Labeled Pixels
Learned from training images
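The step from labeled segmentations to labeled pixels can be sketched as follows. This is an illustrative reading rather than the authors' code: it assumes each pixel's confidence for a label is the sum, over the segmentations, of the label likelihood of the segment containing the pixel weighted by that segment's homogeneity likelihood, normalized per pixel. The data layout and all names are hypothetical.

```python
def pixel_label_confidence(segment_ids, label_lik, homog_lik,
                           labels=("ground", "vertical", "sky")):
    """Combine several labeled segmentations into per-pixel confidences.

    segment_ids -- per segmentation, a list giving each pixel's segment id
    label_lik   -- per segmentation, {segment id: {label: P(label | segment)}}
    homog_lik   -- per segmentation, {segment id: P(segment is homogeneous)}
    Returns a list (one entry per pixel) of {label: normalized confidence}.
    """
    n_pixels = len(segment_ids[0])
    result = []
    for p in range(n_pixels):
        conf = {lab: 0.0 for lab in labels}
        for seg_of_pixel, ll, hl in zip(segment_ids, label_lik, homog_lik):
            s = seg_of_pixel[p]
            for lab in labels:
                # Segments judged "mixed" (low homogeneity) contribute little.
                conf[lab] += ll[s][lab] * hl[s]
        total = sum(conf.values())
        if total == 0:
            raise ValueError(f"pixel {p} has no supporting segment")
        result.append({lab: c / total for lab, c in conf.items()})
    return result


if __name__ == "__main__":
    # Two pixels, two segmentations: a coarse one that merges both pixels
    # and a finer one that separates them.
    segment_ids = [[0, 0], [0, 1]]
    label_lik = [
        {0: {"ground": 0.6, "vertical": 0.3, "sky": 0.1}},
        {0: {"ground": 0.9, "vertical": 0.1, "sky": 0.0},
         1: {"ground": 0.1, "vertical": 0.8, "sky": 0.1}},
    ]
    homog_lik = [{0: 0.5}, {0: 1.0, 1: 1.0}]
    for conf in pixel_label_confidence(segment_ids, label_lik, homog_lik):
        print(max(conf, key=conf.get), conf)
```

Weighting by homogeneity is what lets a poor segmentation be outvoted: the coarse segment that straddles ground and vertical gets a low weight, so the finer segmentation decides the label of each pixel.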
![Page 42: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/42.jpg)
Summary & Future Work
[Figure: Ped and Car positions, both axes in meters]
Reasoning in 3D:
• Object to object
• Scene label
• Object segmentation
![Page 43: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/43.jpg)
Conclusion
Image understanding is a 3D problem
– Must be solved jointly
This paper is a small step
– Much remains to be done
![Page 44: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/44.jpg)
![Page 45: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/45.jpg)
![Page 46: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/46.jpg)
CV Reading Group: Recovering Occlusion Boundaries from a Single Image, Closing the Loop in Scene Interpretation
Fujiyoshi Laboratory, 土屋成光, August 26, 2008
![Page 47: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/47.jpg)
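Background
Generic object recognition / image scene recognition
– Low resolution
– Differences in appearance
– Size differences caused by depth
⇒ Local recognition methods do not work
Humans exploit the relationships between objects
– Model the 3D structure
– Make local recognition methods more accurate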
![Page 48: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/48.jpg)
Recovering Occlusion Boundaries from a Single Image
Derek Hoiem, Andrew N. Stein, Alexei A. Efros, Martial Hebert
Carnegie Mellon University Robotics Institute
ICCV’07
![Page 49: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/49.jpg)
Understanding occlusion from a single image
Understanding occlusions and boundaries
– Essential when searching for objects
– Estimated from edge, region, and depth cues
![Page 50: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/50.jpg)
Method overview
1. Segment the image into about a thousand regions: watershed with Pb soft boundaries
2. Compute region, boundary, and 3D cues; depth: horizon + junction to ground
3. Compute the boundaries: conditional random field (CRF)
4. Segment further using the boundaries
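The depth cue in step 2 (horizon plus the junction with the ground) can be illustrated with basic pinhole geometry. The sketch below assumes a level camera over a flat ground plane and a known focal length in pixels; the slides do not give the exact formulation used in the paper, so the function and its parameters are hypothetical.

```python
def ground_contact_depth(contact_row, horizon_row, focal_px, camera_height_m):
    """Depth (meters) of the point where a region touches the ground.

    Assumes a level pinhole camera above flat ground, with image rows
    increasing downward. A ground point that appears `rows_below` pixels
    under the horizon lies at depth focal_px * camera_height_m / rows_below.
    """
    rows_below = contact_row - horizon_row
    if rows_below <= 0:
        raise ValueError("ground contact point must be below the horizon")
    return focal_px * camera_height_m / rows_below


if __name__ == "__main__":
    # Hypothetical camera: focal length 800 px, 1.6 m above the ground,
    # horizon at row 240. Contact points lower in the image are closer.
    for row in (260, 320, 400):
        print(row, ground_contact_depth(row, 240, 800.0, 1.6))
```

Two regions whose ground contact points give clearly different depths are good candidates for being separated by an occlusion boundary, which is one reason a 3D cue helps beyond edges and region appearance.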
![Page 51: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/51.jpg)
Results
Boundary
Object popout
![Page 52: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/52.jpg)
Closing the Loop in Scene Interpretation
Derek Hoiem, Alexei A. Efros, Martial Hebert
Carnegie Mellon University Robotics Institute
CVPR’08
![Page 53: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/53.jpg)
Putting Objects in Perspective
4 TP / 2 FP
3 TP / 2 FP
4 TP / 1 FP
Ped Detection
Car Detection
Local Detector: [Dalal-Triggs 2005]
4 TP / 0 FP
Car: TP / FP
Ped: TP / FP
Initial (Local) Final (Global)
![Page 54: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/54.jpg)
Scene Parts Are All Interconnected
Objects
3D Surfaces
Camera Viewpoint
![Page 55: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/55.jpg)
with Occlusions
Generic object recognition framework: Putting Objects in Perspective
Scene structure recognition: Automatic Photo Pop-up
Use of occlusion and boundary information
![Page 56: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/56.jpg)
Relationship model
The components depend on one another
![Page 57: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/57.jpg)
Application to Putting Objects
Using the information mutually gives higher accuracy
Initial: Dalal-Triggs, Iter 1: Hoiem et al., Final: This paper
Car: up, Ped: down
Problem: the accuracy of boundaries within crowds
![Page 58: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/58.jpg)
Application to Photo Pop-up
Accuracy improves by using occlusion and object information
![Page 59: CV 輪講 Putting Objects in Perspective](https://reader035.fdocuments.net/reader035/viewer/2022081505/56815b22550346895dc8e336/html5/thumbnails/59.jpg)
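Summary
Computing occlusions/boundaries
– Computed from a single image using geometry, depth, and other cues
– High-accuracy segmentation
Using occlusions/boundaries
– Reduces errors caused by segmentation
– Useful for generic object recognition
Open issue:
– More accurate boundaries for crowds and similar scenes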