Post on 21-Jan-2018
Building a Deep Learning-powered Search Engine
Koby Karp
Deep Learning Paris Meetup #7
I’m Koby - Data Scientist @ Equancy
★ Robotics Engineer (2007-2011)
★ Computer Visioner (2011-2012)
★ Data Scientist, Data Engineer, Data Miner, Data Analyst, ... (2011-2016)
★ Deep Learner (2016-)
★ ?
E-Commerce ♥ Images
★ Catalogue
★ Social Network
★ Marketplace
Three use cases for FASHION:
★ Visual Search Engine
★ Fashion Object Detection
★ Data Quality
Three use cases for FASHION:
★ Visual Search Engine
➹ Take pictures with your phone
➹ Search through catalogue using your images
➹ Return most similar or exact products
Big City Life = High Exposure to Fashion Daily
Visual Search Engine at a glance
★ Batch Phase: Build
➢ Describe - Encode image into a numeric description (vector)
➢ Index - Apply transformation to all images and store in a DB
★ Online Phase: Deploy
➢ Measure Distance - Apply a distance metric between DB and a new (unseen) image
➢ Ranking - Sort by distance and return first N results
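The four steps above can be sketched end to end in a few lines. This is a minimal illustration, not the system presented in the talk: `toy_describe` is a hypothetical stand-in for a real image descriptor, the "images" are short lists of intensities, the "DB" is a plain dictionary, and Euclidean distance is just one possible choice of metric.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]


def build_index(images: Dict[str, Sequence[float]],
                describe: Callable[[Sequence[float]], Vector]) -> Dict[str, Vector]:
    """Batch phase: describe every catalogue image and store the vectors."""
    return {image_id: describe(pixels) for image_id, pixels in images.items()}


def euclidean(a: Vector, b: Vector) -> float:
    """Distance metric between two descriptors (assumed choice)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(index: Dict[str, Vector], query_vector: Vector,
           n: int = 5) -> List[Tuple[str, float]]:
    """Online phase: measure distance to every indexed vector, sort, keep first n."""
    scored = [(image_id, euclidean(vec, query_vector))
              for image_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:n]


def toy_describe(pixels: Sequence[float]) -> Vector:
    """Hypothetical descriptor: mean and max intensity of the 'image'."""
    return [sum(pixels) / len(pixels), max(pixels)]


# Tiny made-up catalogue, for illustration only.
catalogue = {
    "shirt": [0.1, 0.2, 0.3],
    "shoe": [0.8, 0.9, 0.7],
    "bag": [0.5, 0.5, 0.6],
}
index = build_index(catalogue, toy_describe)
results = search(index, toy_describe([0.75, 0.9, 0.8]), n=2)
print([image_id for image_id, _ in results])  # ['shoe', 'bag']
```

Swapping `toy_describe` for a stronger descriptor changes the quality of the results but not the structure of the pipeline.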
Describe: Encode image into a numeric description (vector)
[Figure: Catalogue Image → Numerical Representation (0.672, 0.510, 0.741, ..., 0.919)]
Index: Apply transformation to all images and store in a DB
[Figure: Catalogue Images → matrix of numeric vectors, one per image]
Measure Distance: Apply a distance metric between DB and a new (unseen) image
[Figure: User’s Image → vector (0.672, 0.510, 0.741, ..., 0.919), compared against the DB]
Ranking: Sort by distance and return first N results
[Figure: User’s Image → Top 5 results]
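At catalogue scale, measuring distance and ranking are usually done against a matrix of vectors rather than one image at a time. The sketch below assumes NumPy, cosine distance, and an in-memory matrix with one row per catalogue image. None of these choices come from the talk, and the random data only stands in for real descriptors.

```python
import numpy as np


def l2_normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.maximum(norms, 1e-12)


def top_n(index_matrix: np.ndarray, query: np.ndarray, n: int = 5):
    """index_matrix: (num_images, dim) with L2-normalized rows; query: (dim,)."""
    distances = 1.0 - index_matrix @ l2_normalize(query)  # cosine distance
    n = min(n, len(distances))
    # argpartition finds the n smallest distances without sorting everything.
    candidates = np.argpartition(distances, n - 1)[:n]
    order = candidates[np.argsort(distances[candidates])]
    return order, distances[order]


# Made-up data: 1000 "catalogue" vectors and a noisy copy of item 42 as the query.
rng = np.random.default_rng(0)
raw = rng.random((1000, 128))
catalogue = l2_normalize(raw)
query = raw[42] + 0.01 * rng.standard_normal(128)
ids, dists = top_n(catalogue, query, n=5)
print(ids[0])  # 42
```

Normalizing once in the batch phase keeps the online phase down to a single matrix-vector product plus a partial sort.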
Focus on the Describe step
Three attributes that we need to describe
Shape Color Texture
How is it done with “classic” Computer Vision?
★ Edge Detectors
★ Image Moments
★ HOG / HOF / SIFT
★ Fourier / Wavelet
★ Color Histograms
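As a concrete example of a "classic" descriptor, here is a minimal color histogram in plain Python. It is only a sketch: the pixel format (a list of RGB tuples) and the `bins_per_channel` parameter are assumptions made for illustration, and that parameter is exactly the kind of knob that has to be tuned by hand.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel],
                    bins_per_channel: int = 4) -> List[float]:
    """Quantize each channel into bins and count joint RGB occurrences."""
    histogram = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        r_bin = r * bins_per_channel // 256
        g_bin = g * bins_per_channel // 256
        b_bin = b * bins_per_channel // 256
        flat = (r_bin * bins_per_channel + g_bin) * bins_per_channel + b_bin
        histogram[flat] += 1.0
    total = float(len(pixels)) or 1.0  # avoid dividing by zero on empty input
    return [count / total for count in histogram]


# Made-up 4-pixel "image": three red pixels and one blue pixel.
patch = [(250, 10, 10)] * 3 + [(10, 10, 250)]
hist = color_histogram(patch)
print(len(hist), hist[48], hist[3])  # 64 0.75 0.25
```

The result describes color only. Shape and texture need separate descriptors, each with its own parameters.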
Problems with this approach:
1. Too many parameters (difficult to tune)
2. Multiple methods (how to weigh?)
3. Slow (many transformations)
4. Does not generalize well
Solution: Pre-Trained Convolutional Neural Network (CNN)
Entering: Convolutional Neural Network (CNN)
AlexNet (2012)
1. “The Beatles of the CNNs” -Me
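2. Trained on the ImageNet ILSVRC subset (about 1.2 million training images, out of the roughly 15 million in the full ImageNet dataset)
3. Used for classification of 1000 categories (Animals, Plants, Urban - No Fashion)
4. Trained with data augmentation (translations and horizontal reflections), which makes it more robust to those transformations
5. We also tried other models, such as VGG16.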
Convolutional Neural Network (CNN)
AlexNet (simplified visualization)
❖ We remove the last fully connected layer (the 1000-class Soft-Max classifier)
❖ We feed our images through the network and generate CNN codes of size 4096
❖ The weights of the trained CNN contain the feature engineering mapping that was necessary to discriminate between the 1000 classes
❖ We use the network as a general-purpose descriptor.
Test Time ...
Dataset
M. Manfredi, C. Grana, S. Calderara, R. Cucchiara, "A complete system for garment segmentation and color classification", Machine Vision and Applications, vol. 25, pp. 955-969, 2014
Mix of various clothing items and accessories:
❖ 60,000 items
❖ Medium Quality
❖ Grey background
❖ Used as a benchmark for garment classification
Image Clustering
❖ Using t-SNE to reduce the CNN codes to 2D
❖ Selected a random 10% of the items for visualization
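A rough sketch of the sampling and 2D projection, assuming scikit-learn's `TSNE` and random vectors in place of real CNN codes. The perplexity heuristic and the helper name `sample_and_embed` are illustrative choices, not taken from the talk.

```python
import numpy as np
from sklearn.manifold import TSNE


def sample_and_embed(codes: np.ndarray, fraction: float = 0.10, seed: int = 0):
    """Pick a random fraction of the rows and project them to 2D with t-SNE."""
    rng = np.random.default_rng(seed)
    n_keep = max(2, int(len(codes) * fraction))
    keep = rng.choice(len(codes), size=n_keep, replace=False)
    # t-SNE requires perplexity to be smaller than the number of samples.
    perplexity = min(30.0, (n_keep - 1) / 3)
    tsne = TSNE(n_components=2, perplexity=perplexity,
                init="pca", random_state=seed)
    return keep, tsne.fit_transform(codes[keep])


# Made-up stand-in for CNN codes: 300 vectors of dimension 64.
codes = np.random.default_rng(1).random((300, 64)).astype(np.float32)
keep, points_2d = sample_and_embed(codes)
print(points_2d.shape)  # (30, 2)
```

Each row of `points_2d` can then be plotted with the thumbnail of the matching catalogue image (`keep` holds the original row indices).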
Image Clustering: example clusters
❖ Jewelry & Accessories
❖ T-Shirts
❖ Shoes
❖ Shorts
❖ Jeans, Khakis & Chinos
❖ Trousers
❖ Bags
❖ Jackets
❖ Funky Tops
Search Results ...
Equancy R&D Initiative
Equancy R&D Program:
★ Cutting-Edge Topics: Equancy selects several topics we consider worth investigating for our yearly program
★ Leveraging smart data: Selected topics look for an innovative way of using existing data
★ For industrial applications: Topics must lead to real, operational applications, with added value for the business
★ Built with our customers: We invite our customers to collaborate, using their data, to build a first prototype
★ Co-investment: Depending on how speculative we judge each topic, Equancy will bear significant consultant time costs
Thanks! You were great :)
Equancy is recruiting:
❖ Data Scientist Intern
❖ Data Engineer
kkarp@equancy.com