Generic Object and Action Detection with LARK (Locally Adaptive Regression Kernels)
Haejong Seo
University of California, Santa Cruz
Mentor: Gary Bradski
Haejong Seo (summer Internship) (1)
Motivation
Human-Robot Interaction
Robot-Robot Interaction
1) Where to look?
2) Is there any motion?
3) Where is the bottle of beer located? How big is this bottle? What pose is it in?
4) Is he/she waving at me? Is another PR2 approaching me?
Motivation
DOT (dominant orientation templates) by Stefan
1) can handle object detection (task 3)
2) runs with a single webcam
3) is pretty fast
However, cannot deal with tasks 1, 2, and 4
BiGGPy (binarized gradients grid pyramid) by Gary
LARK can tackle all of these problems
Motivation & Goal
1) static saliency
2) space-time saliency
3) object detection
4) action detection
Develop fast and robust open-source detection systems
Outline
1) LARK Overview
2) Saliency Detection
3) Object Detection
4) Space-time Saliency Detection
5) Action Detection
6) Conclusion
LARK (locally adaptive regression kernels)
● Euclidean distance vs. geodesic distance
Image as a Surface Embedded in the Euclidean 3‐space
Arclength on the surface
Chain rule
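The equations on this slide did not survive extraction. As a sketch of the standard argument the three headings point to (the symbols z, x_1, x_2, Δx and K are assumptions; C11, C12, C22 follow the notation of the speedup slide), treat the image z(x_1, x_2) as the surface (x_1, x_2, z(x_1, x_2)) and expand the arclength with the chain rule:

```latex
\begin{aligned}
ds^2 &= dx_1^2 + dx_2^2 + dz^2 \\
dz   &= \frac{\partial z}{\partial x_1}\,dx_1 + \frac{\partial z}{\partial x_2}\,dx_2
        \qquad \text{(chain rule)} \\
ds^2 &= \Delta\mathbf{x}^{T}\,(\mathbf{C} + \mathbf{I})\,\Delta\mathbf{x},
\qquad
\mathbf{C} =
\begin{bmatrix}
z_{x_1}^2 & z_{x_1} z_{x_2} \\
z_{x_1} z_{x_2} & z_{x_2}^2
\end{bmatrix}
=
\begin{bmatrix}
C_{11} & C_{12} \\
C_{12} & C_{22}
\end{bmatrix},
\qquad
\Delta\mathbf{x} = (dx_1, dx_2)^{T} \\
K(\Delta\mathbf{x}) &= \exp\!\left(-\Delta\mathbf{x}^{T}\,\mathbf{C}\,\Delta\mathbf{x}\right)
\end{aligned}
```

The last line keeps only the data-dependent part of the arclength; in practice C is usually estimated from gradients collected over a small window rather than from a single pixel.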
LARK → self-similarity
LARK (example)
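A minimal NumPy sketch of how such kernels can be computed, not the code behind the slide. The function names, the 3x3 windows and the `smoothing` parameter are assumptions, and the covariance regularisation used in published LARK variants is left out.

```python
import numpy as np

def box_mean(a, size):
    """Mean over a size x size neighbourhood (odd size), edges replicated."""
    r = size // 2
    p = np.pad(a, r, mode="edge")
    h, w = a.shape
    total = np.zeros((h, w))
    for dy in range(size):
        for dx in range(size):
            total += p[dy:dy + h, dx:dx + w]
    return total / (size * size)

def lark_2d(image, window=3, cov_window=3, smoothing=1.0):
    """Return an (H, W, window*window) array of normalised local kernels."""
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)  # derivatives along rows and along columns
    r = window // 2
    h, w = img.shape
    # Entries of the local gradient covariance C, padded so neighbours can be read.
    c11, c12, c22 = (np.pad(box_mean(g, cov_window), r, mode="edge")
                     for g in (gx * gx, gx * gy, gy * gy))
    kernels = []
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            # C is read at the neighbour x + (dx, dy), not at the centre pixel.
            sl = (slice(r + dy, r + dy + h), slice(r + dx, r + dx + w))
            q = c11[sl] * dx * dx + 2 * c12[sl] * dx * dy + c22[sl] * dy * dy
            kernels.append(np.exp(-q / (2 * smoothing ** 2)))
    k = np.stack(kernels, axis=-1)
    return k / k.sum(axis=-1, keepdims=True)
```

On a flat image every weight is equal; next to a vertical edge the weights stay high along the edge and drop across it, which is the self-similarity behaviour the kernels are meant to capture.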
LARK (speedup)
Step 1: downsample by a factor of 4
Step 2: compute C = [C11, C12, C22] at the lower scale, then interpolate it back up
Figure panels: C11, C22, C12
0.02 sec (70 times faster)
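The two steps can be sketched as follows. This is an illustration under assumptions, not the implementation that was timed: block averaging, separable linear interpolation and the 1/factor² rescaling of the gradient products are choices made here, and all function names are hypothetical.

```python
import numpy as np

def downsample(img, factor=4):
    """Block-average by `factor`; sides are cropped to a multiple of it."""
    h, w = (s // factor * factor for s in img.shape)
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def upsample_linear(a, factor=4):
    """Separable linear interpolation back to `factor` times the size."""
    h, w = a.shape
    # Full-resolution pixel centres expressed in low-resolution coordinates.
    ys = (np.arange(h * factor) + 0.5) / factor - 0.5
    xs = (np.arange(w * factor) + 0.5) / factor - 0.5
    rows = np.array([np.interp(xs, np.arange(w), r) for r in a])
    return np.array([np.interp(ys, np.arange(h), c) for c in rows.T]).T

def fast_covariance(image, factor=4):
    """Return (C11, C12, C22) at full resolution, computed at low resolution."""
    low = downsample(np.asarray(image, dtype=float), factor)
    gy, gx = np.gradient(low)
    # One low-res pixel spans `factor` full-res pixels, so gradient products
    # are rescaled by 1 / factor**2 (an assumption of this sketch).
    c_low = np.stack([gx * gx, gx * gy, gy * gy]) / factor ** 2
    return np.stack([upsample_linear(c, factor) for c in c_low])
```

Only the three covariance maps are interpolated; the kernels themselves would still be evaluated at full resolution from the upsampled C.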
Saliency Detection
LARK self-resemblance → saliency map → thresholding
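A rough sketch of this pipeline, assuming each pixel already has a descriptor vector (for example a LARK). Saliency is taken as the inverse of the summed similarity between a pixel and its surroundings; the cosine similarity, the `radius` and `sigma` defaults and the function names are assumptions rather than values from the slides.

```python
import numpy as np

def self_resemblance(features, radius=1, sigma=0.07):
    """features: (H, W, D) descriptors. Returns an (H, W) saliency map."""
    f = np.asarray(features, dtype=float)
    f = f / (np.linalg.norm(f, axis=-1, keepdims=True) + 1e-12)
    h, w, _ = f.shape
    p = np.pad(f, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    total = np.zeros((h, w))
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            # Cosine similarity between each pixel and one of its neighbours.
            rho = np.sum(f * p[dy:dy + h, dx:dx + w], axis=-1)
            total += np.exp((rho - 1.0) / sigma ** 2)
    # Pixels that resemble their surroundings get a large sum, so low saliency.
    return 1.0 / total

def salient_mask(saliency, threshold):
    """Binary mask of pixels whose saliency exceeds the threshold."""
    return saliency > threshold
```

A pixel whose descriptor differs from all of its neighbours scores close to 1, while a pixel in a uniform region scores about 1 over the number of neighbours compared.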
Saliency Detection (video)
Object Detection
Template → compute LARK
Image → compute LARK
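The slide names only the descriptor step. One plausible way to compare the two descriptor sets, shown here as an assumption, is to slide the template over the image and score each window with a normalised inner product (matrix cosine similarity) mapped through ρ²/(1 − ρ²). The function names are hypothetical.

```python
import numpy as np

def matrix_cosine_similarity(a, b):
    """Normalised Frobenius inner product of two same-shape arrays."""
    rho = np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return min(float(rho), 1.0)

def resemblance_map(template_desc, image_desc):
    """Score every placement of the template descriptors over the image.

    template_desc: (th, tw, D), image_desc: (H, W, D).
    Returns an (H - th + 1, W - tw + 1) map; higher means more similar.
    """
    th, tw = template_desc.shape[:2]
    h, w = image_desc.shape[:2]
    out = np.zeros((h - th + 1, w - tw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            window = image_desc[y:y + th, x:x + tw]
            rho = matrix_cosine_similarity(template_desc, window)
            # Stretch values near 1 so strong matches stand out.
            out[y, x] = rho ** 2 / (1.0 - rho ** 2 + 1e-12)
    return out
```

Detections would then be the peaks of the map above a threshold, one threshold per object being something the deck lists as future work.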
Object Detection (speedup)
Use saliency to reduce search space
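One simple reading of this idea, given as an assumption: evaluate the template only at windows whose centre pixel is salient. The function name and the centre-pixel rule are illustrative choices.

```python
import numpy as np

def candidate_windows(saliency, win_shape, threshold):
    """Top-left corners of windows whose centre pixel exceeds the threshold."""
    th, tw = win_shape
    h, w = saliency.shape
    mask = saliency > threshold
    return [(y, x)
            for y in range(h - th + 1)
            for x in range(w - tw + 1)
            if mask[y + th // 2, x + tw // 2]]
```

For a 10x10 map with a 2x2 salient patch and a 3x3 window, this keeps 4 of the 64 possible placements.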
Face Detection (video)
Videos: one template; three templates
Object Detection (video)
Videos: door knob, PR2, drawing, small robot
Three templates
3D Object Detection (speedup)
Pyramid search
Tree structure for templates
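A coarse-to-fine search can be sketched as below. To keep the example short it matches raw intensities with a sum of squared differences instead of LARK descriptors, and it does not cover the template tree; the function names, the factor-2 pyramid and the `margin` parameter are assumptions.

```python
import numpy as np

def downsample2(a):
    """Halve each side by 2x2 block averaging."""
    h, w = (s // 2 * 2 for s in a.shape)
    return a[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def best_match(image, template, ys, xs):
    """Best top-left position among the candidates (smallest SSD)."""
    th, tw = template.shape
    best, best_pos = np.inf, None
    for y in ys:
        for x in xs:
            if 0 <= y <= image.shape[0] - th and 0 <= x <= image.shape[1] - tw:
                d = np.sum((image[y:y + th, x:x + tw] - template) ** 2)
                if d < best:
                    best, best_pos = d, (y, x)
    return best_pos

def pyramid_search(image, template, levels=2, margin=2):
    """Search exhaustively at the coarsest level, then refine level by level."""
    if levels == 0:
        h, w = image.shape
        th, tw = template.shape
        return best_match(image, template, range(h - th + 1), range(w - tw + 1))
    cy, cx = pyramid_search(downsample2(image), downsample2(template),
                            levels - 1, margin)
    # Only a small neighbourhood around the upscaled coarse hit is re-checked.
    ys = range(2 * cy - margin, 2 * cy + margin + 1)
    xs = range(2 * cx - margin, 2 * cx + margin + 1)
    return best_match(image, template, ys, xs)
```

The full search runs only on the smallest image; each finer level checks at most (2 * margin + 1)² positions.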
3D Object Detection (video)
Videos: CD case, mouse, naked, organizer
3D Object Detection (video)
Videos: two objects; three objects
3D LARK → self-similarity in 3D
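A sketch of the space-time extension, assuming the video is a (T, H, W) array and the kernel window is 3x3 in space by 5 in time (the size quoted on the action-detection speedup slide). Here C is the 3x3 outer product of the (x, y, t) gradients at each voxel, with no local averaging or regularisation; the function name and the `smoothing` parameter are hypothetical.

```python
import numpy as np

def lark_3d(video, space=3, time=5, smoothing=1.0):
    """video: (T, H, W). Returns (T, H, W, time*space*space) normalised kernels."""
    v = np.asarray(video, dtype=float)
    gt, gy, gx = np.gradient(v)  # derivatives along time, rows, columns
    g = (gx, gy, gt)
    rs, rt = space // 2, time // 2
    t, h, w = v.shape

    def pad(a):
        return np.pad(a, ((rt, rt), (rs, rs), (rs, rs)), mode="edge")

    # The six distinct entries of the symmetric 3x3 matrix C at every voxel.
    c = {(i, j): pad(g[i] * g[j]) for i in range(3) for j in range(i, 3)}
    kernels = []
    for dt in range(-rt, rt + 1):
        for dy in range(-rs, rs + 1):
            for dx in range(-rs, rs + 1):
                d = (dx, dy, dt)
                sl = (slice(rt + dt, rt + dt + t),
                      slice(rs + dy, rs + dy + h),
                      slice(rs + dx, rs + dx + w))
                # Quadratic form d^T C d; off-diagonal terms count twice.
                q = sum((1 if i == j else 2) * c[i, j][sl] * d[i] * d[j]
                        for (i, j) in c)
                kernels.append(np.exp(-q / (2 * smoothing ** 2)))
    k = np.stack(kernels, axis=-1)
    return k / k.sum(axis=-1, keepdims=True)
```

For a clip whose brightness changes only over time, the weights are constant within each frame of the window and fall off with temporal distance, so motion shapes the kernel the same way spatial edges do in 2D.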
Space-time Saliency Detection
Space-time Saliency (video)
Action Detection
Template
Input video
LARKs (30 to 35 frames)
Action Detection (speedup)
Space-time saliency
Pyramid search
5 frames, 5 frames
35 frames
7 frames of 3D LARK (3 x 3 (space) x 5 (time))
Action Detection (video)
4 actions
sitting down
moving closer
boxing
waving
Code Availability
Package larks: service that trains object templates and detects object locations and poses (available now)
→ stacks/object_recognition_experimental/larks
Package saliency: service that provides salient regions in images and videos (will be available)
→ cturtle/wgrospkgunreleased/sandbox/saliency
Package actiondetection: service that detects generic human actions in videos (will be available)
→ cturtle/wgrospkgunreleased/sandbox/actiondetection
Discussion & Future Work
Use parts-based detection to deal with occlusion (Steve Gould)
Use a tracking algorithm to avoid blinking effects
Improve scalability → build a common tree for all the objects
Learn threshold values for each object and action
Use LARK as a post-filter for BiGGPy
Thank you!
Any Questions?