Learning from Instagram
Problem Formulation
• GOAL: Predict how many likes the picture a user is about to post will receive.
• MEANS: Instagram Data, Deep and Shallow learning techniques
Data Collection
We used the Instagram API to collect photos and their attributes:
– likes count
– comments
– hashtags
– geotags
– etc.
We also recorded the likes count for each photo at 6, 12 and 24 hours.
Result: 300k photos (in 4 days) — thanks to Kirill Potekhin
Data Analysis
We used a reduced, pre-trained ImageNet deep network to extract features from each photo.
tSNE
Dark = liked by more than 20% of subscribers
Azure = liked by 10-20%
Yellow = liked by <10%
No meaningful data here…
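The colour legend above amounts to bucketing each photo by the share of the author's subscribers who liked it. A minimal sketch of that bucketing follows. The function name and the handling of the exact 10% and 20% boundaries are assumptions, since the slide only gives the ranges.

```python
def like_rate_bucket(likes: int, subscribers: int) -> str:
    """Return the legend colour for a photo (hypothetical helper).

    Thresholds follow the slide legend; treating exactly 10% and 20%
    as "azure" is an assumption.
    """
    if subscribers <= 0:
        raise ValueError("subscribers must be positive")
    rate = likes / subscribers  # share of subscribers who liked the photo
    if rate > 0.20:
        return "dark"
    if rate >= 0.10:
        return "azure"
    return "yellow"


if __name__ == "__main__":
    # Toy (likes, subscribers) pairs, not taken from the collected data.
    for likes, subscribers in [(50, 200), (30, 200), (5, 200)]:
        print(likes, subscribers, like_rate_bucket(likes, subscribers))
```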
Architecture
• We designed our own deep network that takes a photo as input and returns its «score». The «score» is a value that represents how many likes the photo is expected to receive.
• Network structure: 32CONV3-MP2-64CONV3-MP2-128CONV3-MP2-256CONV3-MP2-1000FC-1FC
• «Score» formulas we have tried:
And neither worked.
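The network structure string above can be unpacked mechanically. The sketch below reads `NCONVk` as a convolution with N filters and a k×k kernel, `MPk` as k×k max pooling, and `NFC` as a fully connected layer with N units. That reading, the 64×64 RGB input size, the "same" padding and the non-overlapping pooling are all assumptions, because the slides do not state them.

```python
import re

SPEC = "32CONV3-MP2-64CONV3-MP2-128CONV3-MP2-256CONV3-MP2-1000FC-1FC"


def parse_spec(spec):
    """Turn the dash-separated spec into a list of layer tuples."""
    layers = []
    for token in spec.split("-"):
        conv = re.fullmatch(r"(\d+)CONV(\d+)", token)
        pool = re.fullmatch(r"MP(\d+)", token)
        dense = re.fullmatch(r"(\d+)FC", token)
        if conv:
            layers.append(("conv", int(conv.group(1)), int(conv.group(2))))
        elif pool:
            layers.append(("maxpool", int(pool.group(1))))
        elif dense:
            layers.append(("fc", int(dense.group(1))))
        else:
            raise ValueError(f"unrecognised layer token: {token!r}")
    return layers


def trace_shapes(layers, height, width, channels):
    """Return the output shape after each layer.

    Assumes stride-1 convolutions with "same" padding and
    non-overlapping pooling; fully connected layers flatten the input.
    """
    shape = (channels, height, width)
    shapes = []
    for layer in layers:
        if layer[0] == "conv":
            shape = (layer[1], shape[1], shape[2])
        elif layer[0] == "maxpool":
            shape = (shape[0], shape[1] // layer[1], shape[2] // layer[1])
        else:  # fully connected
            shape = (layer[1],)
        shapes.append(shape)
    return shapes


if __name__ == "__main__":
    layers = parse_spec(SPEC)
    # 64x64 RGB input is a hypothetical size chosen for illustration.
    for layer, shape in zip(layers, trace_shapes(layers, 64, 64, 3)):
        print(layer, "->", shape)
```

Under these assumptions the final layer has a single output, which matches the idea of one «score» per photo.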
Pivot!
• Learned scores were not significantly different between photos.
• The probable reason is that the «likeability» of a photo is not captured by the features the ImageNet network extracts.
• Trendiness? Right place, right time? Mere luck?
• «Can’t predict likes… Predict hashtags! #deeplearning #YOLO»
New Architecture
Pipeline:
• choose the most significant and distinct hashtags
GOOD: #food, #flowers, #sunset
BAD: #cool, #like4like, #follow4follow
• form photo datasets based on these hashtags in the photo comments (* some cherry-picking is required)
• extract features using the reduced ImageNet network without its last 2 layers (N=4096)
• train an SVM on those features
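The first two pipeline steps can be sketched as follows: pull hashtags out of each photo comment and form one dataset per chosen hashtag. The record layout, the function names and the one-vs-rest labelling (a photo without the tag counts as a negative example) are assumptions made for illustration. The feature extraction and SVM steps are not covered here.

```python
import re


def extract_hashtags(comment):
    """Return the lower-cased hashtags found in a photo comment."""
    return {tag.lower() for tag in re.findall(r"#(\w+)", comment)}


def build_datasets(photos, target_tags):
    """Group photo ids into positive/negative sets for each target hashtag.

    `photos` is an iterable of (photo_id, comment) pairs, a hypothetical
    record layout. Hashtags outside `target_tags` (for example
    #like4like) are simply ignored.
    """
    datasets = {tag: {"positive": [], "negative": []} for tag in target_tags}
    for photo_id, comment in photos:
        tags = extract_hashtags(comment)
        for tag in target_tags:
            key = "positive" if tag in tags else "negative"
            datasets[tag][key].append(photo_id)
    return datasets


if __name__ == "__main__":
    # Toy records, not taken from the collected data.
    photos = [
        ("p1", "Lunch time #food #like4like"),
        ("p2", "#sunset over the bay #cool"),
        ("p3", "#Flowers in the park"),
        ("p4", "#follow4follow"),
    ]
    for tag, split in build_datasets(photos, ["food", "flowers", "sunset"]).items():
        print(tag, split)
```

Automatic grouping like this would still need the manual cherry-picking mentioned above, since a hashtag in a comment does not guarantee the photo shows that subject.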
Results
We tested the quality of the trained classifiers on the remainder of our dataset:
#food: ~94%
#flowers: ~85%
#sunset: ~92%
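Testing on the part of the dataset not used for training is a hold-out evaluation. The sketch below shows one way to make such a split and score predictions. The slides do not name the metric, so reading «quality» as plain accuracy is an assumption, as are the function names, the split fraction and the seed.

```python
import random


def holdout_split(items, test_fraction=0.2, seed=0):
    """Shuffle reproducibly and return (train, test) lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_fraction)
    return items[n_test:], items[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


if __name__ == "__main__":
    train, test = holdout_split(range(10), test_fraction=0.5)
    print("train:", train, "test:", test)
    # Toy labels: three of four predictions are correct.
    print(accuracy([1, 1, 0, 0], [1, 0, 0, 0]))
```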
Choosing these hashtags by hand could be avoided by using the tSNE representation.
Application
• Hashtag Prediction: When a user uploads a photo, they are presented with a list of the most relevant hashtags (those that are popular now)
• Relevant Instagram Search: Separate meaningful hashtags from «like-hunters'» posts.
• Hashtag-Specific Geosearch: Extract photos by their location and find the location's current characteristics in terms of hashtags: food, parties, etc.
Conclusion
• Worked on estimating the likeability of a photo with RDL techniques.
• Successfully extracted hashtags from photos with RDL techniques.
• Put the knowledge of Deep and Shallow Learning to practice.