YFCC100M HybridNet fc6 Deep Features for Content-Based Image Retrieval
Giuseppe Amato, Fabrizio Falchi, Claudio Gennaro and Fausto Rabitti
YFCC100M HYBRIDNET FC6 DEEP FEATURES
FOR CONTENT-BASED IMAGE RETRIEVAL
Multimedia COMMONS Workshop at ACM Multimedia 2016
Amsterdam, The Netherlands, October 15-19
WHERE WE COME FROM AND MOTIVATIONS
CoPhIR – Content-based Photo Image Retrieval
http://cophir.isti.cnr.it
• Flickr 106M Photos (not all CC)
• title, description, author, tags, comments, notes, GPS coordinates, the number of views and the number of users considering the photo a favorite
• MPEG-7 Visual Features
• mainly used by the Similarity Search community (144 citations and about 100 requests)
Similarity Search: The Metric Space Approach. Zezula, Amato, Dohnal, Batko
2008
MAJOR RELATED EVENTS
Deep Learning explosion
YFCC100M
The Multimedia Commons Initiative
CONTRIBUTIONS
• HybridNet fc6 Deep Features for YFCC100M images: multimediacommons.wordpress.com/
• CBIR Systems on the YFCC100M
o MI-File: mifile.deepfeatures.org
o Lucene Quantization: melisandre.deepfeatures.org
• Ground-truth Results for evaluating Approximate k-NN (k=10,001): www.deepfeatures.org/
o For 3 types of processing of the neuron activations (features)
o For subsets of the whole collection at each 1M step
HYBRIDNET
• Trained on 3.5 million images from 1,183 categories:
o ImageNet-ILSVRC: about 1 million images from 978 categories (after removing the categories that duplicate Places 205)
o Places 205: about 2.5 million images from 205 categories
Learning Deep Features for Scene Recognition using Places Database. Zhou, Lapedriza, Xiao, Torralba, Oliva, NIPS 2014
WHY HYBRIDNET FC6?
A Practical Guide to CNNs and Fisher Vectors for Image Instance Retrieval. V. Chandrasekhar, J. Lin, O. Morère, H. Goh, A. Veillard, Signal Processing, 2016, Elsevier
DEEP FEATURES PROCESSING
• We generated 3 distinct features from the fc6 activations:
o Raw (no ReLu) + L2Norm.
o ReLu + L2Norm.
o Binary: a simple binarization of deep features was shown to lead to a negligible performance drop for both classification and detection (PASCAL-CLS in particular).
$b_i = \begin{cases} 1 & \text{if } f_i > 0 \\ 0 & \text{otherwise} \end{cases}$
Analyzing the performance of multilayer neural networks for object recognition. P. Agrawal, R. Girshick, and J. Malik (ECCV 2014)
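The three processing variants listed on this slide can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names are hypothetical, the toy input has 4 components instead of a real fc6 activation vector, and the small `eps` guard against division by zero is an addition not mentioned in the slides.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Divide each row by its Euclidean norm (eps avoids division by zero; assumption).
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def raw_l2(fc6):
    # "Raw (no ReLU) + L2Norm": keep negative activations, then normalize.
    return l2_normalize(fc6)

def relu_l2(fc6):
    # "ReLU + L2Norm": clamp negative activations to zero, then normalize.
    return l2_normalize(np.maximum(fc6, 0.0))

def binarize(fc6):
    # "Binary": b_i = 1 if f_i > 0, else 0.
    return (fc6 > 0).astype(np.uint8)

# Toy activation vector (hypothetical values).
fc6 = np.array([[3.0, -4.0, 0.0, 0.0]])
print(raw_l2(fc6))    # components 0.6, -0.8, 0, 0
print(relu_l2(fc6))   # components 1, 0, 0, 0
print(binarize(fc6))  # components 1, 0, 0, 0
```

Note that ReLU is applied before normalization, so the ReLU variant is both sparse and unit-length.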
GT RESULTS www.deepfeatures.org
GT RESULTS (SEQUENTIAL SCANNING)
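Ground-truth k-NN results of this kind are obtained by comparing the query with every object in the collection. A minimal sketch of such a sequential scan follows, assuming Euclidean distance on L2-normalized vectors (the slides do not state the distance function). The function name and the toy data are hypothetical.

```python
import numpy as np

def sequential_scan_knn(query, database, k):
    # Exact k-NN: compute the distance to every database vector, keep the k smallest.
    dists = np.linalg.norm(database - query, axis=1)
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]

# Toy 2-d collection (hypothetical values).
database = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
query = np.array([0.9, 0.1])
ids, dists = sequential_scan_knn(query, database, k=2)
print(ids.tolist())  # [1, 0]
```

The cost grows linearly with the collection size, which is why the result is useful as a reference for approximate methods rather than as an online search strategy.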
THE CBIR ONLINE SYSTEMS
• MI-File
o Permutation-based method
o Uses Inverted Files
MI-File: using inverted files for scalable approximate similarity search
G Amato, C Gennaro, P Savino (Multimedia tools and applications)
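The general idea behind permutation-based indexing with inverted files can be sketched as follows. This is a simplified illustration, not the MI-File implementation: each object is represented by the order in which it sees a set of reference objects (pivots), two objects are compared through the Spearman footrule distance between their permutations, and one posting list per pivot stores (object id, rank) pairs. All names, the choice of Euclidean distance, and the toy data are assumptions.

```python
import numpy as np

def permutation(obj, pivots):
    # ranks[p] = position of pivot p when pivots are sorted by distance from obj.
    dists = np.linalg.norm(pivots - obj, axis=1)
    order = np.argsort(dists, kind="stable")   # pivot ids, closest first
    ranks = np.empty(len(pivots), dtype=int)
    ranks[order] = np.arange(len(pivots))
    return ranks

def spearman_footrule(ranks_a, ranks_b):
    # Sum of absolute differences between pivot positions.
    return int(np.abs(ranks_a - ranks_b).sum())

def build_inverted_file(objects, pivots):
    # One posting list per pivot, holding (object id, rank of that pivot).
    index = {p: [] for p in range(len(pivots))}
    for obj_id, obj in enumerate(objects):
        for p, rank in enumerate(permutation(obj, pivots)):
            index[p].append((obj_id, int(rank)))
    return index

# Toy data (hypothetical values).
pivots = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
a, b, c = np.array([1.0, 0.5]), np.array([1.2, 0.4]), np.array([0.5, 3.5])

print(spearman_footrule(permutation(a, pivots), permutation(b, pivots)))  # 0
print(spearman_footrule(permutation(a, pivots), permutation(c, pivots)))  # 4
index = build_inverted_file([a, b, c], pivots)
print(index[0])  # [(0, 0), (1, 0), (2, 1)]
```

Objects that are close to each other (a and b) see the pivots in the same order, while a distant object (c) yields a different permutation.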
• Lucene Quantization
o Exploits the sparsity of deep features (ReLu -> 25% non zeros)
o Quantization approach to allow text encoding
o Also able to perform text and combined search
Large scale indexing and searching deep convolutional neural network features
G. Amato, F. Debole, F. Falchi, C. Gennaro, and F. Rabitti (DaWaK 2016)
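One way to read "quantization approach to allow text encoding" is the following sketch: each non-zero component of a sparse, non-negative feature vector becomes a term that is repeated a number of times proportional to its value, so that a text search engine's term frequencies approximate the original weights. The quantization factor `q`, the term prefix, and the function name are hypothetical, and the encoding used in the cited paper may differ in its details.

```python
import numpy as np

def surrogate_text(vector, q, prefix="f"):
    # Repeat the term for component i floor(q * value) times; zeros produce no term.
    terms = []
    for i, value in enumerate(vector):
        reps = int(np.floor(q * value))
        terms.extend([f"{prefix}{i}"] * reps)
    return " ".join(terms)

# Toy sparse vector (hypothetical values).
vec = np.array([0.0, 0.5, 0.0, 0.25])
print(surrogate_text(vec, q=4))  # f1 f1 f3
```

Because zero components produce no terms, the sparsity induced by ReLU keeps the surrogate documents short, and the same index can also hold ordinary text fields for combined search.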
MI-FILE (INDEXING BINARY FEATURES)
LUCENE QUANTIZATION (INDEXING RELU L2NORM.)
MI-FILE (COMPARED TO GT FOR RELU-L2NORM)
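A comparison against the ground truth can be quantified, for example, with recall at k: the fraction of the exact k nearest neighbours that also appear in the approximate result. The slides do not state which quality measure was used, so this is only one common choice, and the function name and ids below are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids, k):
    # Fraction of the exact top-k that the approximate top-k also contains.
    exact = set(exact_ids[:k])
    found = exact.intersection(approx_ids[:k])
    return len(found) / k

# Toy result lists (hypothetical ids).
exact_ids = [7, 3, 9, 1]
approx_ids = [7, 9, 4, 3]
print(recall_at_k(approx_ids, exact_ids, k=4))  # 0.75
print(recall_at_k(approx_ids, exact_ids, k=2))  # 0.5
```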
ONGOING WORKS
IMAGE ANNOTATION
CROSS MEDIA RETRIEVAL (RESULTS ON MS-COCO)
• Text queries are translated into HybridNet fc6 visual vectors by a neural network
Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions. Fabio Carrara, Andrea Esuli, Tiziano Fagni, Fabrizio Falchi, Alejandro Moreo Fernández. https://arxiv.org/abs/1606.07287
CROSS MEDIA RETRIEVAL (RESULTS ON YFCC100M)
CONCLUSIONS AND FUTURE WORK
Contributions:
• HybridNet fc6 Deep Features
• CBIR Systems for YFCC100M:
o MI-File mifile.deepfeatures.org
o Lucene Quantization melisandre.deepfeatures.org
• GT k-NN results for evaluating Approximate Search www.deepfeatures.org/
Ongoing and future works:
• HybridNet fc6 PCA256
• Image annotation based on the YFCC100M metadata
• Extracting new features, e.g.: Deep Image Retrieval: Learning Global Representations for Image Search. Albert Gordo, Xerox Research; Jon Almazan, XRCE; Jerome Revaud, Xerox Research; Diane Larlus, Xerox
• Cross-media retrieval: Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions. Fabio Carrara, Andrea Esuli, Tiziano Fagni, Fabrizio Falchi, Alejandro Moreo Fernández. https://arxiv.org/abs/1606.07287
THANKS!
Questions are welcome
Fabrizio Falchi