intro data science at NYT 2015-01-22
chris-wiggins
data science @ The New York Times
and how a 163-year-old content company became data-driven
[email protected], @nytimes.com, @chrishwiggins
1. the path
biology: 1892 vs. 1995
biology changed for good.
genetics: 1837 vs. 2012
ML toolset; data science mindset
becoming a data science culture
- drew conway, 2010
data science: web scale
example:
163 yr old
bit.ly/nyt-interactive-2013
example:
millions of views per hour
data science: the web
- is your “online presence”
- is a microscope
- is an experimental tool
- is an optimization tool
1. the path
learnings
- supervised learning
- unsupervised learning
- reinforcement learning
supervised learning, e.g.,
“the funnel”
interpretable supervised learning
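A minimal sketch of what "interpretable" can mean here, assuming a logistic regression whose weights are read directly. The feature names, the toy rows, and the `fit_logistic` helper are all made up for illustration; nothing below is taken from the talk or from NYT data.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def fit_logistic(rows, labels, lr=0.1, steps=2000):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            p = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
            err = p - y  # gradient of the log loss w.r.t. the score
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w


def predict(w, x):
    """Predicted probability of the positive class for one row."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))


# Hypothetical funnel data: (visits_last_week, arrived_via_newsletter) -> subscribed?
FEATURES = ["visits_last_week", "arrived_via_newsletter"]
rows = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 0), (6, 1), (7, 1), (8, 1)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights = fit_logistic(rows, labels)
for name, w in zip(["bias"] + FEATURES, weights):
    print(f"{name:>24}: {w:+.2f}")
```

Each weight is a change in log-odds per unit of its feature, which is what makes the model easy to explain to someone outside the data team.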
super cool stuff
supervised learning, e.g.,
“logistics”
unsupervised learning, e.g.,
“segments”
argmax_z p(z|x) = 14
“baby boomer”
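The argmax_z p(z|x) step can be sketched as follows, assuming a one-dimensional Gaussian mixture with already-fitted parameters. The mixture weights, means, and the single feature are invented for illustration. The index returned is arbitrary; a name like “baby boomer” would come from a person inspecting the segment afterwards.

```python
import math


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(x, components):
    """Return p(z | x) for each component z of a 1-D Gaussian mixture."""
    joint = [c["weight"] * gaussian_pdf(x, c["mean"], c["std"]) for c in components]
    total = sum(joint)
    return [j / total for j in joint]


def assign_segment(x, components):
    """Hard assignment: argmax_z p(z | x)."""
    probs = posterior(x, components)
    return max(range(len(probs)), key=probs.__getitem__)


# Made-up mixture over one feature (say, articles read per week).
components = [
    {"weight": 0.5, "mean": 2.0, "std": 1.0},
    {"weight": 0.3, "mean": 10.0, "std": 3.0},
    {"weight": 0.2, "mean": 30.0, "std": 8.0},
]

for x in (1.0, 12.0, 40.0):
    probs = [round(p, 3) for p in posterior(x, components)]
    print(f"x={x}: segment z={assign_segment(x, components)}, p(z|x)={probs}")
```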
reinforcement learning
aka “A/B testing”; RCT
e.g., multi-armed bandits
img: MSR SV (RIP)
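One way to picture a multi-armed bandit is an epsilon-greedy loop over variants with unknown click rates. This is a sketch under stated assumptions, not the method used in the talk: the true rates, the `epsilon_greedy` name, and the parameters are hypothetical, and epsilon-greedy is only one of several bandit strategies.

```python
import random


def epsilon_greedy(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate Bernoulli arms; return (pulls, wins) per arm."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms
    wins = [0] * n_arms
    for _ in range(rounds):
        if 0 in pulls or rng.random() < epsilon:
            # explore: pick an arm uniformly at random
            arm = rng.randrange(n_arms)
        else:
            # exploit: pick the arm with the best observed win rate so far
            arm = max(range(n_arms), key=lambda a: wins[a] / pulls[a])
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
    return pulls, wins


# Hypothetical click-through rates for three headline variants.
pulls, wins = epsilon_greedy([0.05, 0.10, 0.20])
for arm, (p, w) in enumerate(zip(pulls, wins)):
    print(f"arm {arm}: pulled {p} times, observed rate {w / p:.3f}")
```

Unlike a fixed-split A/B test, the loop shifts traffic toward the better-looking arm while the experiment is still running.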
data science:
- people
- process
- technology
2. data science:
“data”:
“metrics”, “business analytics”, “Excel”, “reporting”
Reporting: business as usual
Learning (esp. supervised)
supervised learning, e.g.,
“the funnel”
Test, aka “A/B testing”
Some of the most recognizable personalization in our service is the collection of “genre” rows. … Members connect with these rows so well that we measure an increase in member retention by placing the most tailored rows higher on the page instead of lower.
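Measuring "an increase in retention" from a test like this can be sketched with a two-proportion z-test. The counts below are invented, and `two_proportion_ztest` is a hypothetical helper; the source does not say which test was actually used.

```python
import math


def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Return (lift, z, two-sided p-value) for rate_b - rate_a."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value under the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return p_b - p_a, z, p_value


# Made-up counts: retained members in control (A) vs. tailored-rows-higher (B).
lift, z, p_value = two_proportion_ztest(8000, 10000, 8150, 10000)
print(f"lift={lift:.4f}, z={z:.2f}, p={p_value:.4f}")
```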
Optimizing (i.e. reinforcement learning)
Explore, aka “segmenting”
“segments”
“z=14”
“baby boomer”
Reporting
Learning
Optimizing
tech company
Reporting
“model” company
Reporting
fake company
Reporting
Learning
Test
Optimizing
Explore
startups:
“a startup is a temporary organization in search of a repeatable and scalable business model” —Steve Blank
every publisher is now a startup
data science:
- people
- process
- technology
[email protected], @nytimes.com, @chrishwiggins
your questions?