Keynote at Big Data Tech Con SF 2014
![Page 1: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/1.jpg)
Building Data Products: The Right Order of Things
Gloria Lau VP of Data, Timeful
Keynote @ Big Data Tech Con
![Page 2: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/2.jpg)
http://www.linkedin.com/in/gloriatlau/ @gloriatlau
![Page 3: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/3.jpg)
What do they have in common?
![Page 4: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/4.jpg)
Right order of things
def __init__(self):
    data infrastructure
for x in range(3):
    offline modeling
    online data product
    user feedback
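The slide's pseudocode reads as "build the data infrastructure once, then loop through modeling, product, and feedback." A minimal runnable sketch of that ordering follows; the class name `DataProduct`, the `iterate` method, and the `steps` log are hypothetical names chosen for illustration and do not come from the talk.

```python
class DataProduct:
    """Hypothetical illustration of the slide's ordering; names are assumptions."""

    def __init__(self):
        # Done once, up front, before any modeling starts.
        self.steps = ["data infrastructure"]

    def iterate(self, rounds=3):
        # Each round: model offline, ship the product, then learn from users.
        for _ in range(rounds):
            self.steps.append("offline modeling")
            self.steps.append("online data product")
            self.steps.append("user feedback")
        return self.steps


product = DataProduct()
steps = product.iterate()
```

The point of the sketch is only the shape: one setup step followed by a repeated three-stage loop.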
![Page 5: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/5.jpg)
Model Product
![Page 6: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/6.jpg)
Model Product
![Page 7: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/7.jpg)
The challenge
Exception: tracking code missing/overloaded!
Debug: Power user computation takes forever!
def __init__(self):
    data infrastructure
for x in range(3):
    offline modeling
    online data product
    user feedback
![Page 8: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/8.jpg)
The challenge
Data viz --> ID'ed new data potential --> Yet another data product
Sparse data --> Crappy model --> Need to nudge users for *more* data
Non-standardized data --> Crappy model --> Need to standardize
def __init__(self):
    data infrastructure
for x in range(3):
    offline modeling
    online data product
    user feedback
![Page 9: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/9.jpg)
![Page 10: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/10.jpg)
• Four diseases have broken out in the world and it is up to a team of specialists in various fields to find cures for these diseases before mankind is wiped out ... the diseases are breaking out fast and time is running out: the team must try to stem the tide of infection in diseased areas while also working towards cures. A truly cooperative game where you all win or you all lose.
• How do you win?
• Optimally deploy minimal resources in the right order
![Page 11: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/11.jpg)
• What is optimal?
• Do you fix that tracking issue first?
• Do you optimize your power user computation?
• Do you double down on standardization?
• Relevant classifications
• P0 vs P1
• big company vs small company
![Page 12: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/12.jpg)
2 Questions to ask
1 Quote answers them all
![Page 13: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/13.jpg)
“Premature optimization is the root of all evil.”
–Donald Knuth
![Page 14: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/14.jpg)
What is the one metric that your data product will move?
• Retention. Growth. Engagement. Money. Etc.
• Find it, and focus
![Page 15: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/15.jpg)
If each user spends one minute a day on your product, how would you spend that minute?
• Data scientists love data. The more the merrier.
• More data solves your data scientist's problem. It does not solve your user's problem.
![Page 16: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/16.jpg)
Do you fix that tracking issue first?
• Q1: Is it in the critical path of measuring that metric?
• Q2: Are you throwing away users' time?
![Page 17: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/17.jpg)
Do you optimize your power user computation?
• Q1: Are power users your key user metric to lift?
• Q2: What fraction of total users' time is affected by this?
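Q2 can be answered with a quick back-of-the-envelope calculation. The sketch below assumes you can get per-user usage minutes and a set of users hit by the slow computation; the function name, the user names, and all numbers are made up for illustration.

```python
def fraction_of_time_affected(minutes_by_user, affected_users):
    """Share of total usage time that belongs to the affected users."""
    total = sum(minutes_by_user.values())
    if total == 0:
        return 0.0
    affected = sum(
        minutes for user, minutes in minutes_by_user.items()
        if user in affected_users
    )
    return affected / total


# Hypothetical daily usage in minutes; "alice" is the only power user.
usage = {"alice": 120, "bob": 5, "carol": 5, "dave": 10, "erin": 10}
share = fraction_of_time_affected(usage, {"alice"})
```

In this made-up data one user out of five accounts for most of the total time, which is the kind of result that would argue for optimizing the power-user path. With a different distribution the same calculation could argue against it.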
![Page 18: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/18.jpg)
Do you double down on standardization?
• Q1: Peel the onion. How will an x% increase in standardization rate affect your current and projected metric?
• Q2: Does it add friction to the funnel?
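The two questions pull in opposite directions, so it can help to put them in one toy calculation. The model below is an assumption made for illustration, not something from the talk: only users who complete the funnel count, and a standardized record is worth more to the metric than a raw one. All parameter names and numbers are hypothetical.

```python
def projected_metric(users, completion_rate, standardized_share,
                     value_standardized=1.0, value_raw=0.5):
    """Toy model: completed users contribute; standardized records count more."""
    completed = users * completion_rate
    per_user = (standardized_share * value_standardized
                + (1 - standardized_share) * value_raw)
    return completed * per_user


# Baseline: 50% finish the funnel, 60% of records are standardized.
baseline = projected_metric(1000, completion_rate=0.50, standardized_share=0.60)

# Standardization push: more standardized records, but added friction
# lowers funnel completion (both effects are assumed, not measured).
with_push = projected_metric(1000, completion_rate=0.45, standardized_share=0.80)

net_change = with_push - baseline
```

With these made-up inputs the metric moves from roughly 400 to roughly 405: most of the gain from standardization is eaten by the friction it adds. Real numbers would have to come from your own funnel data.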
![Page 19: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/19.jpg)
“Premature optimization is the root of all evil.”
–Donald Knuth
![Page 20: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/20.jpg)
• Right order:
• talent first
• assimilation
• the 3%; fail fast
![Page 21: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/21.jpg)
“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail.”
–Donald Knuth
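The quote's advice to identify the critical code with measurement tools before optimizing can be tried with Python's standard-library profiler. This is a minimal sketch: the functions `slow_part`, `fast_part`, and `workload` are placeholders invented for illustration.

```python
import cProfile
import io
import pstats


def slow_part(n):
    # Deliberately heavy: a Python-level loop over n items.
    return sum(i * i for i in range(n))


def fast_part(n):
    # Deliberately cheap: constant-time arithmetic.
    return n * 2


def workload():
    slow_part(200_000)
    fast_part(200_000)


# Measure first, then decide what deserves optimization.
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Collect the report as text, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
stats.print_stats(5)
report = buffer.getvalue()
```

Printing `report` lists the most expensive calls first, so the candidate for the "critical 3%" comes from measurement rather than from a guess.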
![Page 22: Keynote at Big Data Tech Con SF 2014](https://reader030.fdocuments.net/reader030/viewer/2022032502/55b8673abb61eb78578b470d/html5/thumbnails/22.jpg)
It's an art.