Zhiwen Fang, Research Fellow, BeingTogether Centre, IMI
Understanding Human-Object Interaction in RGB-D videos for Human Robot Interaction
Motivation
Human-robot interaction (HRI) [1,2,3]
[Diagram labels: Human; Verbal language; Non-verbal language; Facial expression; Body gesture; Object; Social robot; …]
[1] Yang Xiao, Zhijun Zhang, Aryel Beck, Junsong Yuan, and Daniel Thalmann. 2014. Human–robot interaction by understanding upper body gestures. Presence: Teleoperators and Virtual Environments 23, 2 (2014), 133–154.
[2] Isibor Kennedy Ihianle, Usman Naeem, and Abdel-Rahman Tawil. 2016. Recognition of activities of daily living from topic model. Procedia Computer Science 98 (2016), 24–31.
[3] Marina Pérez-Jiménez, Borja Bordel Sánchez, and Ramón Alcarria. 2016. T4AI: A system for monitoring people based on improved wearable devices. Research Briefs on Information & Communication Technology Evolution (ReBICTE) 2 (2016), 1–16.
Motivation
Understand the human's intention from the object information:
A cell phone held in the hand and close to the ear may indicate that the person is making a call.
A cup held in the hand and close to the mouth may indicate that the person is drinking.
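The two examples above can be read as simple rules that combine the detected object with where the hand is. The sketch below is only an illustration of that idea, not the system described in these slides: the `RULES` table, the `landmarks` dictionary (3D positions of the ear and mouth in meters), and the `max_dist` threshold of 0.15 m are all hypothetical.

```python
import math

# Hypothetical rule table: object label -> (body landmark, implied intention).
RULES = {
    "cell phone": ("ear", "making a call"),
    "cup": ("mouth", "drinking"),
}

def infer_intention(object_label, hand_xyz, landmarks, max_dist=0.15):
    """Return the implied intention, or None if no rule fires.

    hand_xyz and the landmark positions are 3D points in meters;
    max_dist is an assumed "close to" threshold, not a value from the slides.
    """
    rule = RULES.get(object_label)
    if rule is None:
        return None
    landmark, intention = rule
    if math.dist(hand_xyz, landmarks[landmark]) <= max_dist:
        return intention
    return None

# Example with made-up positions.
landmarks = {"ear": (0.10, 1.60, 2.00), "mouth": (0.00, 1.55, 1.95)}
near_mouth = infer_intention("cup", (0.02, 1.50, 1.95), landmarks)
far_away = infer_intention("cup", (0.30, 1.00, 2.00), landmarks)
```

Either rule only becomes usable once the hand-held object itself has been detected reliably.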
How to detect hand-held objects?
Outline
1 Introduction
2 Method
3 System overview
4 Results
5 Conclusions
Introduction
Wearable sensors & Radio Frequency Identification (RFID) tags [1]
Thermal band images [2]
Computer vision methods based on an RGB camera [3][4]
[1] K. P. Fishkin, M. Philipose, and A. Rea. 2005. Hands-on RFID: wireless wearables for detecting use of objects. In IEEE International Symposium on Wearable Computers, 2005. Proceedings. 38–43.
[2] Cigdem Beyan and Alptekin Temizel. 2015. A multimodal approach for individual tracking of people and their belongings. The Imaging Science Journal 63, 4 (2015), 192–202.
[3] Chaitanya Desai, Deva Ramanan, and Charless Fowlkes. 2010. Discriminative models for static human-object interactions. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2010 IEEE Computer Society Conference on. IEEE, 9–16.
[4] Zhaozhuo Xu, Yuan Tian, Xinjue Hu, and Fangling Pu. 2015. Dangerous human event understanding using human-object interaction model. In Signal Processing, Communications and Computing (ICSPCC), 2015 IEEE International Conference on. IEEE, 1–5.
Introduction
Research problems in hand-held object detection
(1) Relationship between objects and a person
(2) Hand-held objects are often very small
(3) Targets are lost because of appearance changes and/or partial occlusion in the sequence
[Example images: chair, bottle, cell phone, keyboard…; bottle at about 5 meters; partially occluded cell phone]
Method
Human contextual information
1. Skeleton data (25 body joint positions)
2. Local patch around the hand joint
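A local patch around the hand joint can be cut out once the joint has been projected into image coordinates. The following is a minimal sketch under that assumption; the function name `crop_hand_patch` and the default `half_size` of 64 pixels are hypothetical and not taken from the slides.

```python
import numpy as np

def crop_hand_patch(image, hand_xy, half_size=64):
    """Return a square patch centred on the hand joint and its top-left offset.

    image     : H x W (x C) array
    hand_xy   : (x, y) pixel position of the hand joint (assumed already
                projected from skeleton space into the RGB image)
    half_size : assumed half side length of the patch, in pixels
    The patch is clipped at the image border, so it may be smaller than
    2 * half_size near the edges.
    """
    h, w = image.shape[:2]
    x, y = int(round(hand_xy[0])), int(round(hand_xy[1]))
    x0, x1 = max(0, x - half_size), min(w, x + half_size)
    y0, y1 = max(0, y - half_size), min(h, y + half_size)
    return image[y0:y1, x0:x1], (x0, y0)

# Example on a blank 640 x 480 frame.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
center_patch, center_offset = crop_hand_patch(frame, (320, 240))
corner_patch, corner_offset = crop_hand_patch(frame, (10, 470))
```

Returning the offset makes it possible to map detections found inside the patch back to full-image coordinates.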
[Figure: RGB image and person index map]
Estimate the probability of belonging to a person
1. Object detection in the local patch
2. Estimate the probability using the person index map
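The slides do not spell out how the probability is computed from the person index map. One plausible reading, shown here only as a sketch, is to measure how much of the neighbourhood of a detected box is labelled with the person's id. The map layout (one integer id per pixel), the `BACKGROUND` value of 255, the `margin`, and the function name are all assumptions.

```python
import numpy as np

BACKGROUND = 255  # assumed label for pixels that belong to no person

def belonging_probability(person_index, box, person_id, margin=10):
    """Fraction of pixels around a detected box that carry person_id.

    person_index : H x W integer map, one person id per pixel
    box          : (x0, y0, x1, y1) detection box in pixel coordinates
    margin       : assumed number of pixels by which the box is expanded
    """
    h, w = person_index.shape
    x0, y0, x1, y1 = box
    x0, y0 = max(0, x0 - margin), max(0, y0 - margin)
    x1, y1 = min(w, x1 + margin), min(h, y1 + margin)
    region = person_index[y0:y1, x0:x1]
    if region.size == 0:
        return 0.0
    return float(np.mean(region == person_id))

# Example: person 0 occupies a 40 x 40 square of a 100 x 100 map.
index_map = np.full((100, 100), BACKGROUND, dtype=np.uint8)
index_map[20:60, 20:60] = 0
inside = belonging_probability(index_map, (30, 30, 50, 50), 0, margin=0)
partial = belonging_probability(index_map, (50, 50, 70, 70), 0, margin=0)
```

A detection whose score falls below some chosen cut-off could then be treated as a background object rather than a hand-held one.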
Method
Object detection in a local patch by YOLO [1, 2]
(1) resize the image to 544 × 544
(2) run a convolutional network on the resized image
(3) output the detections according to the confidence scores of the network
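The three steps above can be sketched as a small wrapper. This is not the YOLO API: `network` is a hypothetical callable standing in for the trained model, assumed to return `(label, confidence, box)` tuples in 544-pixel coordinates; nearest-neighbour resizing and the 0.5 threshold are likewise assumptions chosen to keep the example self-contained.

```python
import numpy as np

INPUT_SIZE = 544  # network input resolution stated on the slide

def resize_nearest(image, size=INPUT_SIZE):
    """Step 1: resize to size x size (nearest neighbour, for illustration)."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows[:, None], cols]

def detect(image, network, conf_threshold=0.5):
    """Steps 2 and 3: run the network, keep confident detections.

    Boxes are mapped back to the coordinates of the original image and
    returned sorted by decreasing confidence.
    """
    h, w = image.shape[:2]
    raw = network(resize_nearest(image))
    kept = []
    for label, conf, (x0, y0, x1, y1) in raw:
        if conf >= conf_threshold:
            box = (x0 * w / INPUT_SIZE, y0 * h / INPUT_SIZE,
                   x1 * w / INPUT_SIZE, y1 * h / INPUT_SIZE)
            kept.append((label, conf, box))
    return sorted(kept, key=lambda d: d[1], reverse=True)

def fake_network(resized):
    """Stand-in for a trained detector; returns fixed, made-up outputs."""
    assert resized.shape[:2] == (INPUT_SIZE, INPUT_SIZE)
    return [("cell phone", 0.6, (272, 272, 544, 544)),
            ("bottle", 0.2, (0, 0, 100, 100)),
            ("cup", 0.9, (0, 0, 272, 272))]

patch = np.zeros((100, 200, 3), dtype=np.uint8)
dets = detect(patch, fake_network)
```

In practice `fake_network` would be replaced by whatever inference call the chosen YOLO implementation provides.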
[1] J. Redmon, S. Divvala, R. Girshick, et al. 2016. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 779–788.
[2] J. Redmon and A. Farhadi. 2017. YOLO9000: better, faster, stronger. arXiv preprint.
Method
Object tracking based on correlation filter [1]
(1) dense sampling by modeling all possible translations of the base sample in a search window as circulant shifts
(2) learning the correlation filter by solving a ridge regression problem in the Fourier domain
[1] J. F. Henriques, R. Caseiro, P. Martins, et al. 2015. High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence 37, 3 (2015), 583–596.
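To illustrate the two steps, here is a minimal sketch of a linear, single-channel correlation filter. It is a simplification of, not a substitute for, the kernelized filter in [1]: because all circulant shifts of the base sample are diagonalized by the Fourier transform, the ridge regression reduces to one division per frequency. The function names, the Gaussian target width `sigma`, and the regularizer `lam` are assumptions.

```python
import numpy as np

def gaussian_target(shape, sigma=2.0):
    """Desired response: a Gaussian peak at the origin, wrapped circularly."""
    h, w = shape
    ys = np.minimum(np.arange(h), h - np.arange(h))[:, None]
    xs = np.minimum(np.arange(w), w - np.arange(w))[None, :]
    return np.exp(-(ys ** 2 + xs ** 2) / (2.0 * sigma ** 2))

def train_filter(x, y, lam=1e-2):
    """Ridge regression over all circular shifts of x, solved per frequency.

    Minimizes |H * X - Y|^2 + lam * |H|^2 independently for every Fourier
    coefficient, which gives H = conj(X) * Y / (|X|^2 + lam).
    """
    x_hat = np.fft.fft2(x)
    y_hat = np.fft.fft2(y)
    return np.conj(x_hat) * y_hat / (np.abs(x_hat) ** 2 + lam)

def locate(filter_hat, z):
    """Apply the filter to a new patch and return the peak as (dy, dx).

    The peak position is the circular shift of the target relative to the
    training patch; shifts beyond half the window wrap around.
    """
    response = np.real(np.fft.ifft2(filter_hat * np.fft.fft2(z)))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    return int(dy), int(dx)

# Example: train on a random patch, then find it after a known circular shift.
rng = np.random.default_rng(0)
base = rng.standard_normal((32, 32))
filt = train_filter(base, gaussian_target(base.shape))
shifted = np.roll(base, shift=(3, 5), axis=(0, 1))
shift_found = locate(filt, shifted)
```

A full tracker would also update the filter over time and handle windowing and multiple feature channels, which this sketch leaves out.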
System overview
[Diagram labels: Speech recognition; Natural Language Processing; Object detection; Hand-held object detection; Human and robot interaction; Language interaction; Object exchange]
Results
Detection rate of different methods in three categories (i.e. bottle, cup, cell phone).
* w/o represents the method without human contextual information
Conclusions
To provide intelligent human-robot interaction, it is critical to understand the interaction between the human and everyday objects, so that the human's intention can be analyzed.
Using an RGB-D sensor, we provide a method to detect hand-held objects.
Human contextual information is introduced to improve the performance of hand-held object detection.
THANK YOU!
Q & A