Real time analytics, L'Oreal

click log heat map, demographic targeting, long tail

Transcript of Real time analytics, L'Oreal

  • 1. Case studies w/ analytics, real time, DM/ML in a hackathon. L'Oreal, 8/27/2013. Not Hadoop.
  • 2. Agenda. Problem statement: digital and retail behavior analysis; long tail problem similarities. Propensity marketing: what is the propensity for a consumer to respond to a promotion? Cover DM/ML demographics presentation. Profitability marketing: who are the most profitable customers? Obvious answer: select * from customers join orders order by amt desc; Promotion modeling: what drives order values, and who should receive promotions?
  • 3. What do I do? Work: tech lead Google, ~10y; architect Absolute SW. Teach, mentor others on Big Data, Hadoop, DM/ML ngEvents/.
  • 4. Review. Theory: what is the long tail? Long tail success case studies. Demographic targeting/modeling and prediction. ML/DM success case studies. Data analysis strategies/structure.
  • 5. What is the Long Tail? Originated from search engines/Google. Don't focus on the top 20% of queries; focus on the bottom 50% first. Why? The bottom 50% was the hardest: LP&SB. The top 20% was automatic.
  • 6. Long Tail Example, keywords
  • 7. Keyword Lift/Complementary Strategies. 70% of the keywords are not used frequently. Page Rank/feature selection/spam reduction. Most data is inaccurate (demographics, eBay problem). Quality of features enables ML/DM modeling. Identify these words first using simple SQL queries, then run a model and use A/B testing to iterate to better results. Example of ML/DM later. Case study of data visualisation for search query length.
  • 8. Complete solution not possible. A complete solution to the long tail is not possible via a hackathon. Examples of complete solutions. Example: Symantec uses modified page rank to see if virus files are safe/not safe. Viruses are different; all are unique. You can't rely on past examples. >90% accuracy rate. Uses people feedback. Example: Yahoo content system matching users to content, ~100 attributes->1k attributes. Most users only go to Yahoo news for a few stories. MM guides this.
  • 9. Another long tail on search query length
  • 10. Long Tail. Obvious: longer queries imply the user wants a more precise result. Precision vs. recall. Obvious: these users are more valuable b/c the directed intent is more focused. Seeing users enter queries with more precision is very valuable for shopping and other applications with focused, directed intent. The above case results in a $50.00 click to Google for Salesforce/SAP ads (e.g. home financing/mortgages). Best way to see this is in a demo: move the mouse over dots which are close to each other. DEMO!!!!!
  • 11. Example: real time applied to previous example. We looked at search keywords and search phrase length. Visualizations as a substitute for Machine Learning algorithms. Much faster to implement. Some students > Recall
  • 14. UI: mouse over a stream of dots
  • 15. Mouse over a dot which is part of a group which looks like a snake. Can see what the user typed in as queries one after another; here is one example: How to fix car -> What is a fuel filter -> How to replace a fuel filter. This is valuable in adding additional features for the user who asked this. Can't get this from SQL queries easily, or at all.
  • 16. What is the lesson here? Viewing data in real time has value. At minimum, it helps clear the thinking for the next step. Use as an alerting system/QC process to show if ML/DM is running correctly (proprietary in Google/Yahoo). Every business has these. Key: visible to everybody w/o running a SQL query.
  • 17. Wisdom gained matches across 2 hackathons. One of the most surprising pieces of work was a unique data visualization from the DM hackathon. None of these positive results were defined in the problem statement. Required creativity. Careful
  • 18. Review ML/DM. Review a small subset of these slides: s-andweblogtargeting-10757778 Agenda: review a case study of the Motley Fool and how to create/target promotions to likely subscribers for problem #2, propensity marketing. Case study of a past hackathon. My role: I seed the ideas, Mike Bowles, Nick Kolegraff
  • 19. ML/DM Slides. DO NOT INSERT SLIDES; cover the original so we don't limit the scope of audience questions.
  • 20. ML/DM and Hackathons. Done 2 as examples: Motley Fool, cosponsored by Kaggle (Mike Bowles); Best Buy, paid Kaggle (Nick Kolegraff@Accenture/DM SIG, we sought him out). These events require guidance/very successful; both are still receptive to more DM/ML events. Careful: an algorithm doesn't mean you have a production process or something someone can manage via a paid analyst headcount. Why aren't there more? Time investment to clean data, tech talk to guide participants, min 3 months work.
  • 21. What do I do for others which may help you? Seed the ideas; should add a structure to this. NDA. Run SQL queries. Current case study: starting to do the prep work for another real time analytics example, teaching from this. Nick/Mike did this for the other 2 hackathons. Match the strategy w/structure. Take time off work to build an engineering prototype (Twitter Storm in old slide deck). Not covering this here. Strategy: first display the data in a real time dashboard, then iterate the visualizations, then add DM/ML algorithms after the A/B testing framework is complete.
  • 22. One example, web page heat maps
  • 23. Amazon Web Page
  • 24. Google Shopping Example/Reversed/Why?
  • 25. Upper Left hand corner
  • 26. Example of Kiehl's
  • 27. Kiehl's Example. Put in offers w/($ amount, product desc, click url) customized per user; A/B test layouts and placement; store data for customization and measure lift. Measure Facebook ads via page rank. Predict missing links application: a-social-graph-my-solution-to-facebooks-user-recommendation-contest-on-kaggle/ Careful, don't copy. Example only. Generalize to hackathon. Many other ideas. Your answer is different from Yahoo & Google. This isn't a roadmap.
  • 28. Structure has to match Strategy. Partner w/Macy's? Develop a structure to work with retail partners to increase their sales, e.g. customized shopkick. Don't just release API.
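
Slides 5 to 7 suggest finding the rarely used keywords with simple counting before any model is run. A minimal sketch of that first pass follows. It is an illustration only, not the presenter's method: the function name `split_long_tail`, the `head_share` parameter, and the sample queries are all hypothetical, and the 0.2 default only mirrors the "top 20%" mentioned on slide 5.

```python
from collections import Counter

# Hypothetical sample data, loosely based on the query chain on slide 15.
SAMPLE_QUERIES = [
    "how to fix car",
    "what is a fuel filter",
    "how to replace a fuel filter",
    "car fuel filter price",
    "fuel filter",
]


def split_long_tail(queries, head_share=0.2):
    """Split distinct keywords into a frequent head and a long tail.

    head_share is the fraction of distinct keywords treated as the head.
    The 0.2 default is an assumption, not a rule from the slides.
    Returns (head_keywords, tail_keywords, keyword_counts).
    """
    # Count every whitespace-separated keyword, ignoring case.
    counts = Counter(word for q in queries for word in q.lower().split())
    # Keywords ordered from most to least frequent.
    ranked = [word for word, _ in counts.most_common()]
    # Always keep at least one keyword in the head.
    cut = max(1, int(len(ranked) * head_share))
    return ranked[:cut], ranked[cut:], counts


if __name__ == "__main__":
    head, tail, counts = split_long_tail(SAMPLE_QUERIES)
    print("head:", head)
    print("tail:", [(word, counts[word]) for word in tail])
```

The tail list is the candidate set to inspect by hand or feed into a later model; in practice the same count could come from a GROUP BY query against the click log.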
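
Slide 27 mentions A/B testing layouts and measuring lift without saying how lift is computed. The sketch below assumes one common definition, the relative difference in conversion rate between variant and control, plus a standard pooled two-proportion z statistic as a rough check that the difference is not noise. The function names and the visitor/conversion numbers are hypothetical and do not come from the slides.

```python
import math


def conversion_rate(conversions, visitors):
    """Fraction of visitors who converted."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def relative_lift(control_conv, control_n, variant_conv, variant_n):
    """Relative lift of the variant over the control, e.g. 0.2 means +20%."""
    control = conversion_rate(control_conv, control_n)
    variant = conversion_rate(variant_conv, variant_n)
    if control == 0:
        raise ValueError("control rate is zero; relative lift is undefined")
    return (variant - control) / control


def two_proportion_z(control_conv, control_n, variant_conv, variant_n):
    """Pooled two-proportion z statistic for variant minus control."""
    control = conversion_rate(control_conv, control_n)
    variant = conversion_rate(variant_conv, variant_n)
    pooled = (control_conv + variant_conv) / (control_n + variant_n)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    if std_err == 0:
        raise ValueError("standard error is zero; z is undefined")
    return (variant - control) / std_err


if __name__ == "__main__":
    # Hypothetical counts: 1000 visitors per arm, 50 vs 60 conversions.
    print("lift:", relative_lift(50, 1000, 60, 1000))
    print("z:", two_proportion_z(50, 1000, 60, 1000))
```

With these made-up counts the variant shows a positive lift but a z statistic below 1, which is the kind of result that argues for collecting more traffic before changing the layout.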