Visual Memorability for Egocentric Cameras
Marc Carné Herrera
Advisors: Xavier Giró-i-Nieto and Cathal Gurrin
Outline
➔ Introduction
➔ Contributions
  ◆ Annotation tool for visual memorability
  ◆ EgoMemNet: visual memorability adaptation to egocentric images
  ◆ Visual memorability and physiological signals
➔ Conclusions
Introduction
“Brain is designed to forget in order to survive”
● Lifelogger → a person who captures their daily life in order to create a virtual, digital memory.
● Wearable cameras → capture first-person vision.
● Big data → 1,400 - 2,000 images/day.
● Challenge → retrieval!
What do we want to remember?
Image set
Relevant images with low-level features from a CNN
Relevant images with object detection, face detection… (based on content)
Why an annotation tool?
Input (image) → Convolutional Neural Network → Output (label)
Image + label to train the model
Annotation tool for visual memorability
● Inspired by MIT research work [1]
● Visual memory game:
  ○ Simple task → press ‘d’ when a repeated image is found
  ○ Duration: 9 minutes
  ○ Output: text file with detections
[1] A. Khosla, A. S. Raju, A. Torralba and A. Oliva. Understanding and Predicting Image Memorability at a Large Scale. ICCV 2015.
UTEgocentric
Insight Center for Data Analytics
Annotation tool implementation
Why use Docker?
● Docker:
  ○ Container with an operating system and the required software.
  ○ Always runs the same in any environment.
● Simple implementation → Dockerfile
First Docker implementation in GPI for research
Annotation tool results
● Memorability score → [0,1]
● Result:
  ○ Dataset → 50 annotated images (25 users)
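One way to turn the game's detections into a score in [0,1] is a per-image hit rate: the fraction of users who pressed ‘d’ when the image was repeated. The sketch below assumes this simple definition; the slides do not state the exact formula, and the record format (`user_id`, `image_id` pairs) and the function name are hypothetical.

```python
from collections import defaultdict


def memorability_scores(detections, repeats_shown):
    """Hit rate per image (assumed definition, not confirmed by the slides).

    detections: iterable of (user_id, image_id), one per 'd' key press on a repeat.
    repeats_shown: dict image_id -> number of users who were shown the repeat.
    """
    hits = defaultdict(set)
    for user_id, image_id in detections:
        hits[image_id].add(user_id)  # a user counts at most once per image
    return {
        image_id: len(hits[image_id]) / shown
        for image_id, shown in repeats_shown.items()
        if shown > 0
    }


# Toy input: u1 pressed 'd' twice on img_a, which still counts as one hit.
detections = [("u1", "img_a"), ("u2", "img_a"), ("u1", "img_b"), ("u1", "img_a")]
repeats_shown = {"img_a": 4, "img_b": 4, "img_c": 4}
scores = memorability_scores(detections, repeats_shown)
```

An image nobody detected gets a score of 0, and one detected by every user who saw its repeat gets 1.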
Convolutional neural network: definition
● Machine learning paradigm inspired by how the human brain works
● Interconnected neurons that work together to generate an output stimulus or activation
EgoMemNet: visual memorability adaptation to egocentric images
● MemNet → CNN for memorability prediction
  ○ 5 conv layers + 2 fully connected layers + linear regression
MemNet CNN [Khosla, ICCV 2015]
Structure: AlexNet
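To make the "5 conv layers + 2 fully connected layers" structure concrete, the sketch below traces feature-map sizes through the convolutional part. It assumes the commonly used AlexNet hyperparameters (227x227 input, kernel sizes, strides, and channel counts as listed in the code); the slides do not give these values, so treat them as an illustration rather than the exact MemNet configuration.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, kernel, stride, pad, output channels); None means channels are unchanged.
# Values are the usual AlexNet settings, assumed here.
LAYERS = [
    ("conv1", 11, 4, 0, 96), ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256), ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256), ("pool5", 3, 2, 0, None),
]


def trace_shapes(input_size=227, channels=3):
    """Return (layer name, channels, spatial size) after each layer."""
    shapes = []
    size = input_size
    for name, kernel, stride, pad, out_channels in LAYERS:
        size = conv_out(size, kernel, stride, pad)
        channels = out_channels or channels
        shapes.append((name, channels, size))
    return shapes


shapes = trace_shapes()
_, last_channels, last_size = shapes[-1]
flattened = last_channels * last_size * last_size  # input to the first FC layer
```

Under these assumptions the flattened vector feeds the two fully connected layers, and the final regression outputs a single memorability score.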
Data augmentation strategies
● No augmentation
● Spatial data augmentation → common method
● Temporal data augmentation → egocentric feature
Spatial data augmentation | Temporal data augmentation
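The two augmentation strategies could be sketched as follows. Spatial augmentation here uses corner and center crops plus horizontal flips, a common recipe. For temporal augmentation the slides give no details, so this sketch assumes that frames captured just before and after an annotated image inherit its memorability label. All function names and the `radius` parameter are hypothetical.

```python
def horizontal_flip(image):
    """Mirror an image stored as a list of pixel rows."""
    return [list(reversed(row)) for row in image]


def corner_and_center_crops(image, size):
    """Four corner crops and one center crop, each size x size."""
    h, w = len(image), len(image[0])
    offsets = [(0, 0), (0, w - size), (h - size, 0), (h - size, w - size),
               ((h - size) // 2, (w - size) // 2)]
    return [[row[x:x + size] for row in image[y:y + size]] for y, x in offsets]


def spatial_augment(image, label, size):
    """Each crop and its mirrored copy keep the original label."""
    samples = []
    for crop in corner_and_center_crops(image, size):
        samples.append((crop, label))
        samples.append((horizontal_flip(crop), label))
    return samples


def temporal_augment(frames, annotated_index, label, radius=1):
    """Assumption: neighbouring frames in the sequence share the label."""
    lo = max(0, annotated_index - radius)
    hi = min(len(frames), annotated_index + radius + 1)
    return [(frames[i], label) for i in range(lo, hi)]


image = [[r * 4 + c for c in range(4)] for r in range(4)]
spatial_samples = spatial_augment(image, 0.7, size=2)
temporal_samples = temporal_augment(["f0", "f1", "f2", "f3"], 0, 0.7, radius=1)
```

Spatial augmentation multiplies each annotated image by ten in this setup, while temporal augmentation depends on how many neighbouring frames the wearable camera captured.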
Quantitative results
Spearman’s rank correlation
Measures how closely the positions of the same images agree across two ranked lists.
Memorability rank | Ground truth rank
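For reference, with no tied scores Spearman's rank correlation is ρ = 1 − 6·Σd² / (n·(n² − 1)), where d is the difference between an image's position in the two rankings. A minimal sketch under that no-ties assumption (function names are illustrative):

```python
def ranks(values):
    """Rank 1 for the largest value, n for the smallest; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(predicted, ground_truth):
    """Spearman's rank correlation between two score lists without ties."""
    n = len(predicted)
    if n != len(ground_truth) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(predicted), ranks(ground_truth)))
    return 1 - 6 * d2 / (n * (n * n - 1))


same_order = spearman_rho([0.9, 0.5, 0.1], [0.8, 0.6, 0.2])
reversed_order = spearman_rho([0.9, 0.5, 0.1], [0.2, 0.6, 0.8])
```

A value of 1 means the predicted memorability ranking matches the ground truth ranking exactly, and −1 means it is fully reversed.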
Memorability maps
● Heat maps that highlight the most memorable regions.
● Methods:
  ○ Grid-and-forward → obtain a memorability score per patch
  ○ EgoMemNet → fully convolutional version
Grid-and-forward | EgoMemNet
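The grid-and-forward method can be sketched as a sliding window: cut the image into patches, forward each patch through the scoring model, and store one score per grid cell. In this sketch `score_fn` stands in for the CNN, and `mean_intensity` is a hypothetical placeholder so the example runs without a trained model. Patch size and stride are also assumptions.

```python
def grid_and_forward(image, patch, stride, score_fn):
    """Score every patch on a regular grid; returns a 2D heat map."""
    h, w = len(image), len(image[0])
    heat_map = []
    for y in range(0, h - patch + 1, stride):
        row_scores = []
        for x in range(0, w - patch + 1, stride):
            crop = [row[x:x + patch] for row in image[y:y + patch]]
            row_scores.append(score_fn(crop))
        heat_map.append(row_scores)
    return heat_map


def mean_intensity(crop):
    """Placeholder scorer (not a memorability model): mean pixel value."""
    values = [v for row in crop for v in row]
    return sum(values) / len(values)


image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
heat_map = grid_and_forward(image, patch=2, stride=2, score_fn=mean_intensity)
```

Each patch needs its own forward pass here, which is the cost a fully convolutional version avoids by scoring all locations in a single pass.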
Memorability vs. saliency maps
Original image | Saliency map (SalNet CNN) [Pan, CVPR 2016] | Memorability map (EgoMemNet* CNN)
Binarized maps with learned threshold:
● In green, regions shared between the saliency and memorability maps.
● In blue, memorable regions that are not salient.
● In red, salient regions that are not memorable.
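The colour coding can be reproduced by binarizing both maps and comparing them cell by cell. In this sketch the thresholds are plain parameters with made-up values; the slides say the threshold is learned but do not say how, and the function names are hypothetical.

```python
def binarize(heat_map, threshold):
    """True where the map value reaches the threshold."""
    return [[value >= threshold for value in row] for row in heat_map]


def compare_maps(memorability, saliency, mem_threshold, sal_threshold):
    """Label each cell: green (both), blue (memorable only), red (salient only)."""
    mem = binarize(memorability, mem_threshold)
    sal = binarize(saliency, sal_threshold)
    colours = []
    for mem_row, sal_row in zip(mem, sal):
        row = []
        for m, s in zip(mem_row, sal_row):
            if m and s:
                row.append("green")
            elif m:
                row.append("blue")
            elif s:
                row.append("red")
            else:
                row.append(None)  # neither memorable nor salient
        colours.append(row)
    return colours


# Toy 2x2 maps, not taken from the dataset.
memorability = [[0.9, 0.8], [0.1, 0.2]]
saliency = [[0.7, 0.1], [0.9, 0.0]]
overlay = compare_maps(memorability, saliency, mem_threshold=0.5, sal_threshold=0.5)
```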
The Insight dataset
● Multimodal homemade dataset:
  ○ Images
  ○ Memorability score
  ○ Heart rate value (during image acquisition)
  ○ Galvanic skin response (during image acquisition)
Publicly available!
Detect snap points
● Prior approach → efficient capture without image processing
Linear Regression
Adding physiological signals for memorability prediction
● Post approach
Linear Regression
EgoMemNet score
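A linear regression that combines the EgoMemNet score with the physiological signals could look like the sketch below. It fits ordinary least squares through the normal equations in plain Python. The feature order (EgoMemNet score, heart rate, galvanic skin response), the function names, and all numbers are assumptions: the data is synthetic, generated from known weights only to check the fit, and is not taken from the Insight dataset.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_linear_regression(features, targets):
    """Least-squares weights; weights[0] is the intercept."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, feature):
    """Intercept plus weighted sum of the features."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], feature))


# Synthetic rows: (EgoMemNet score, heart rate in bpm, galvanic skin response).
TRUE_WEIGHTS = [0.1, 0.5, 0.002, 0.1]
features = [(0.2, 60, 0.1), (0.8, 70, 0.3), (0.5, 90, 0.2),
            (0.4, 80, 0.5), (0.9, 65, 0.4)]
targets = [predict(TRUE_WEIGHTS, f) for f in features]
weights = fit_linear_regression(features, targets)
```

Dropping the EgoMemNet score column and keeping only the physiological features gives the same kind of model for the prior approach, where no image processing is available at capture time.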
Conclusions
● The new annotation tool makes it possible to create novel datasets for egocentric memorability.
● Egocentric (first-person vision) dataset containing 50 annotated images.
● EgoMemNet, a model adapted for memorability prediction on egocentric images, shows an improvement over MemNet, a convolutional neural network model trained with human-taken images.
● Physiological signals for memorability prediction.
Extended abstract
Carné-Herrera M, Giró-i-Nieto X, Gurrin C. EgoMemNet: Visual Memorability Adaptation to Egocentric Images. 4th Workshop on Egocentric (First-Person) Vision, CVPR 2016, Las Vegas, NV, USA.
Spotlight
Full spotlight on YouTube!
https://www.youtube.com/watch?v=qwM5NNW37YE