intro data science at NYT 2015-01-22

data science @ The New York Times and how a 163-year-old content company became data-driven [email protected] [email protected] @chrishwiggins


Page 1: intro data science at NYT 2015-01-22

data science @ The New York Times

and how a 163-year-old content company became data-driven

[email protected] [email protected] @chrishwiggins

Page 2: intro data science at NYT 2015-01-22

1. the path

Page 3: intro data science at NYT 2015-01-22

biology: 1892 vs. 1995

biology changed for good.

Page 4: intro data science at NYT 2015-01-22

genetics: 1837 vs. 2012

ML toolset; data science mindset

Page 5: intro data science at NYT 2015-01-22

becoming a data science culture

- drew conway, 2010

Page 6: intro data science at NYT 2015-01-22

data science: web scale

Page 7: intro data science at NYT 2015-01-22

example:

163 yr old

Page 8: intro data science at NYT 2015-01-22

bit.ly/nyt-interactive-2013

Page 9: intro data science at NYT 2015-01-22
Page 10: intro data science at NYT 2015-01-22
Page 11: intro data science at NYT 2015-01-22

example:

millions of views per hour

Page 12: intro data science at NYT 2015-01-22
Page 13: intro data science at NYT 2015-01-22
Page 14: intro data science at NYT 2015-01-22

data science: the web

Page 15: intro data science at NYT 2015-01-22

data science: the web

is your “online presence”

Page 16: intro data science at NYT 2015-01-22

data science: the web

is a microscope

Page 17: intro data science at NYT 2015-01-22

data science: the web

is an experimental tool

Page 18: intro data science at NYT 2015-01-22

data science: the web

is an optimization tool

Page 19: intro data science at NYT 2015-01-22

1. the path

Page 20: intro data science at NYT 2015-01-22

learnings

- supervised learning
- unsupervised learning
- reinforcement learning

Page 21: intro data science at NYT 2015-01-22

supervised learning, e.g.,

Page 22: intro data science at NYT 2015-01-22

supervised learning, e.g.,

“the funnel”

Page 23: intro data science at NYT 2015-01-22

interpretable supervised learning

super cool stuff

Page 24: intro data science at NYT 2015-01-22

supervised learning, e.g.,

“logistics”

Page 25: intro data science at NYT 2015-01-22

unsupervised learning, e.g.,

“segments”

Page 26: intro data science at NYT 2015-01-22

unsupervised learning, e.g.,

“segments”

Page 27: intro data science at NYT 2015-01-22

unsupervised learning, e.g.,

“segments”

argmax_z p(z|x)=14
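
The segment label on this slide is a hard assignment: compute the posterior p(z|x) over segments and take the most probable one. A minimal sketch of that step, assuming a one-dimensional Gaussian mixture; the weights, means, standard deviations, and the "visits per week" feature are all hypothetical and not taken from the talk:

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior(x, weights, means, sds):
    """Return p(z | x) for every segment z via Bayes' rule."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]

def assign_segment(x, weights, means, sds):
    """Hard assignment: argmax_z p(z | x)."""
    post = posterior(x, weights, means, sds)
    return max(range(len(post)), key=post.__getitem__)

# Hypothetical three-segment model over one feature (e.g. visits per week).
WEIGHTS = [0.5, 0.3, 0.2]
MEANS = [1.0, 5.0, 12.0]
SDS = [1.0, 2.0, 3.0]

segment = assign_segment(11.0, WEIGHTS, MEANS, SDS)
print("most probable segment index:", segment)
```

The output is only an index such as z=14; a human-readable name like “baby boomer” has to be attached afterwards by inspecting who ends up in the segment.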

Page 28: intro data science at NYT 2015-01-22

unsupervised learning, e.g.,

“segments”

“baby boomer”

Page 29: intro data science at NYT 2015-01-22

reinforcement learning

Page 30: intro data science at NYT 2015-01-22

reinforcement learning

aka “A/B testing”; RCT
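
One common way to read out an A/B test (a randomized controlled trial with two arms) is a two-proportion z-test on conversion counts. The sketch below is an illustration under that assumption, not the method used in the talk; the function name and the counts are hypothetical:

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate between arms A and B."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 10,000 readers per arm.
z, p_value = two_proportion_z_test(conv_a=200, n_a=10000, conv_b=260, n_b=10000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

The random assignment of readers to arms is what licenses a causal reading of the difference; the test only quantifies how surprising the difference would be by chance.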

Page 31: intro data science at NYT 2015-01-22

reinforcement learning

Page 32: intro data science at NYT 2015-01-22

reinforcement learning

img: MSR SV (RIP)

e.g., multi-armed bandits
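
A multi-armed bandit trades off exploring alternatives against exploiting the one that currently looks best. A minimal sketch using the epsilon-greedy strategy on simulated Bernoulli arms; epsilon-greedy is one standard choice picked here for brevity, and the click-through rates, round count, and seed are hypothetical:

```python
import random

def epsilon_greedy(true_rates, n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over arms with Bernoulli rewards."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms      # pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=values.__getitem__)   # exploit
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the mean reward for the pulled arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values

# Hypothetical click-through rates for three variants.
counts, values = epsilon_greedy([0.05, 0.10, 0.30])
print("pulls per arm:", counts)
print("estimated rates:", [round(v, 3) for v in values])
```

Unlike a fixed-split A/B test, the allocation shifts toward the better-performing arm while the experiment is still running.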

Page 33: intro data science at NYT 2015-01-22

data science:

- people
- process
- technology

Page 34: intro data science at NYT 2015-01-22

2. data science:

Page 35: intro data science at NYT 2015-01-22

“data”:

Page 36: intro data science at NYT 2015-01-22

“data”:

“metrics”

“business analytics”

“Excel”

“reporting”

Page 37: intro data science at NYT 2015-01-22

Reporting

Page 38: intro data science at NYT 2015-01-22

Reporting

business as usual

Page 39: intro data science at NYT 2015-01-22

Reporting

Learning

business as usual

Page 40: intro data science at NYT 2015-01-22

Reporting

Learning

(esp. supervised)

business as usual

Page 41: intro data science at NYT 2015-01-22

supervised learning, e.g.,

“the funnel”

Page 42: intro data science at NYT 2015-01-22

Reporting

Learning

Test

business as usual

(esp. supervised)

Page 43: intro data science at NYT 2015-01-22

Reporting

Learning

Test

aka “A/B testing”;

business as usual

(esp. supervised)

Page 44: intro data science at NYT 2015-01-22

Reporting

Learning

Test

aka “A/B testing”;

business as usual

(esp. supervised)

Some of the most recognizable personalization in our service is the collection of “genre” rows. … Members connect with these rows so well that we measure an increase in member retention by placing the most tailored rows higher on the page instead of lower.

Page 45: intro data science at NYT 2015-01-22

Reporting

Learning

Test

aka “A/B testing”;

business as usual

(esp. supervised)

Page 46: intro data science at NYT 2015-01-22

Reporting

Learning

Test

Optimizing

aka “A/B testing”;

(esp. supervised)

business as usual

Page 47: intro data science at NYT 2015-01-22

Reporting

Learning

Test

Optimizing

aka “A/B testing”;

(i.e. reinforcement learning)

(esp. supervised)

business as usual

Page 48: intro data science at NYT 2015-01-22

Reporting

Learning

Test

Optimizing

Explore

aka “A/B testing”;

(i.e. reinforcement learning)

(esp. supervised)

business as usual

Page 49: intro data science at NYT 2015-01-22

Reporting

Learning

Test

Optimizing

Explore

aka “A/B testing”;

aka “segmenting”

(i.e. reinforcement learning)

(esp. supervised)

business as usual

Page 50: intro data science at NYT 2015-01-22

Reporting

Learning

Test

Optimizing

Explore

aka “A/B testing”;

aka “segmenting”

(i.e. reinforcement learning)

(esp. supervised)

business as usual

Page 51: intro data science at NYT 2015-01-22

“segments”

Explore

aka “segmenting”

Page 52: intro data science at NYT 2015-01-22

“segments”

“z=14”

Explore

aka “segmenting”

Page 53: intro data science at NYT 2015-01-22

“segments”

“baby boomer”

Explore

aka “segmenting”

Page 54: intro data science at NYT 2015-01-22

Reporting

Learning

Optimizing

tech company

Page 55: intro data science at NYT 2015-01-22

Reporting

“model” company

Page 56: intro data science at NYT 2015-01-22

Reporting

fake company

Page 57: intro data science at NYT 2015-01-22

Reporting

Learning

Test

Optimizing

Explore

startups:

Page 58: intro data science at NYT 2015-01-22

“a startup is a temporary organization in search of a repeatable and scalable business model” —Steve Blank

Page 59: intro data science at NYT 2015-01-22

every publisher is now a startup

Page 60: intro data science at NYT 2015-01-22

data science:

- people
- process
- technology

Page 61: intro data science at NYT 2015-01-22

[email protected] [email protected] @chrishwiggins

your questions?