8/12/2019 AMPLab Yahoo
Yahoo & AMPLab
Celebration of our partnership
April 16, 2013
Yahoo!'s core business
  Make the world's daily habits inspiring and entertaining
  Put brands in the center of people's daily habits
Yahoo! Confidential & Proprietary. 2 4/18/2013
[Diagram: Yahoo!, Users, Advertisers (Adv), Publishers (Publ)]
What problems do we solve?
Matching content to users
  Personalized
  Responsive
Matching ads to users
  Maximize yield
  Maximize return on investment
  Maintain positive user experience
What do we need?
Science
Data
Platforms that analyze data at scale and close the loop
  On-grid solution
  Horizontally scalable and fault tolerant
  Interactive
  Easy to describe sophisticated data mining tasks
  Quick to prototype, and easy to productionize
  Few knobs to turn
How do we do this today?
How the AMPLab-Yahoo! relationship started
How to cut down ETL and query on the grid directly?
  Inspired by Dremel; enhance with in-memory techniques
Matei's talk on Shark at Hadoop Summit 2012
Shark Server
Further small enhancements and bug fixes
Meeting with Ion and Mike at AMPLab
Where are we headed?
End of Q2: Shark will be available on a 50-node cluster (100GB RAM) for advertising analytics.
One customer-facing analytics optimization feature is planned on top of Shark.
Shark/Spark packaged and available to auto-deploy on any cluster within Yahoo!
Mid Q2: start work on a 4000-node cluster
Productionize the YARN patch
Bug fixes, memory leak fixes, and features like Column Pruning, Map Join, etc. will be checked back into the Shark/Spark main branch.
Upcoming work includes further join optimization, query optimizations specific to analytic workloads, compression, etc.
Longer-term roadmap: enhance on-disk performance as well
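For readers unfamiliar with the two optimizations named in the roadmap, here is a minimal sketch of the general ideas in plain Python. It is not Shark's or Spark's implementation; the function names (`prune_columns`, `map_join`) and the toy ad data are hypothetical, chosen only for illustration. Column pruning drops fields a query never references before any further work is done, and a map join builds an in-memory hash table on the small table so the large table can be streamed past it without a shuffle.

```python
from collections import defaultdict


def prune_columns(rows, needed):
    """Keep only the columns a query actually references."""
    return [{col: row[col] for col in needed} for row in rows]


def map_join(big_rows, small_rows, key):
    """Hash the small side in memory, then stream the large side past it."""
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)

    joined = []
    for row in big_rows:
        # .get avoids creating empty entries for keys with no match
        for match in lookup.get(row[key], []):
            joined.append({**match, **row})
    return joined


# Hypothetical toy data: a large fact table and a small dimension table.
impressions = [
    {"ad_id": 1, "user": "u1", "ts": 100, "clicks": 0},
    {"ad_id": 2, "user": "u2", "ts": 101, "clicks": 1},
    {"ad_id": 1, "user": "u3", "ts": 102, "clicks": 1},
]
ads = [{"ad_id": 1, "brand": "A"}, {"ad_id": 2, "brand": "B"}]

# The query only needs ad_id and clicks, so user and ts are pruned first.
pruned = prune_columns(impressions, ["ad_id", "clicks"])
result = map_join(pruned, ads, "ad_id")
```

In a distributed engine the small table would be shipped to every worker holding a partition of the large table; the sketch above shows only the per-partition logic.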
Future Architecture