Devoxx Real-Time Learning

description

An expanded description of real-time learning, including system designs, that Ted Dunning presented at Devoxx France in March 2013.

Transcript of Devoxx Real-Time Learning

Page 1: Devoxx Real-Time Learning

1©MapR Technologies - Confidential

Real-time Learning

Page 2: Devoxx Real-Time Learning

whoami – Ted Dunning

Chief Application Architect, MapR Technologies

Committer, member, Apache Software Foundation
– particularly Mahout, Zookeeper and Drill

(we’re hiring)

Contact me: [email protected], [email protected], @ted_dunning

Page 3: Devoxx Real-Time Learning

Slides and such (available late tonight):
– http://www.mapr.com/company/events/devoxx-3-29-2013

Hash tags: #mapr #devoxxfr

Page 4: Devoxx Real-Time Learning

Agenda

What is real-time learning?

A sample problem

Philosophy, statistics and the nature of the knowledge

A solution

System design

Page 5: Devoxx Real-Time Learning

What is Real-time Learning?

Training data arrives one record at a time

The system improves a mathematical model based on a small amount of training data

We retain at most a fixed amount of state

Each learning step takes O(1) time and memory
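
As a minimal sketch of these four properties, the following uses one stochastic-gradient step of logistic regression per record. The class name `OnlineLogistic`, the learning rate, and the synthetic data are assumptions made for illustration, not something taken from the talk. The only state is a fixed-size weight vector, and the cost of a step depends on the number of features, not on how many records have been seen.

```python
import math
import random


class OnlineLogistic:
    """Logistic model updated one record at a time with fixed-size state."""

    def __init__(self, n_features, learning_rate=0.1):
        self.w = [0.0] * n_features          # the only state we retain
        self.learning_rate = learning_rate

    def predict(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def learn(self, x, y):
        """One gradient step; its cost does not grow with records seen."""
        error = y - self.predict(x)
        for j, xj in enumerate(x):
            self.w[j] += self.learning_rate * error * xj


rng = random.Random(0)
model = OnlineLogistic(n_features=2)
for _ in range(5000):
    x = [1.0, rng.choice([0.0, 1.0])]        # bias term plus one binary feature
    p_true = 0.8 if x[1] == 1.0 else 0.2     # made-up ground truth for the demo
    y = 1.0 if rng.random() < p_true else 0.0
    model.learn(x, y)                        # the record is discarded afterwards
```

After the loop, `model.predict([1.0, 1.0])` should be noticeably higher than `model.predict([1.0, 0.0])`, even though no training record was ever stored.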

Page 6: Devoxx Real-Time Learning

We have a product to sell … from a web-site

Page 7: Devoxx Real-Time Learning

What picture?

What tag-line?

What call to action?

Page 8: Devoxx Real-Time Learning

The Challenge

Design decisions affect probability of success
– Cheesy web-sites don’t even sell cheese

The best designers do better when allowed to fail
– Exploration juices creativity

But failing is expensive
– If only because we could have succeeded
– But also because offending or disappointing customers is bad

Page 9: Devoxx Real-Time Learning

A Quick Diversion

You see a coin
– What is the probability of heads?
– Could it be larger or smaller than that?

I flip the coin and while it is in the air ask again

I catch the coin and ask again

I look at the coin (and you don’t) and ask again

Why does the answer change?
– And did it ever have a single value?

Page 10: Devoxx Real-Time Learning

A Philosophical Conclusion

Probability as expressed by humans is subjective and depends on information and experience

Page 11: Devoxx Real-Time Learning

So now you understand Bayesian probability

Page 12: Devoxx Real-Time Learning

Another Quick Diversion

Let’s play a shell game

This is a special shell game

It costs you nothing to play

The pea has constant probability of being under each shell (trust me)

How do you find the best shell?

How do you find it while maximizing the number of wins?

Page 13: Devoxx Real-Time Learning

Pause for short con-game

Page 14: Devoxx Real-Time Learning

Conclusions

Can you identify winners or losers without trying them out? No

Can you ever completely eliminate a shell with a bad streak? No

Should you keep trying apparent losers? Yes, but at a decreasing rate

Page 15: Devoxx Real-Time Learning

So now you understand multi-armed bandits

Page 16: Devoxx Real-Time Learning

Is there an optimum strategy?

Page 17: Devoxx Real-Time Learning

Thompson Sampling

Select each shell according to the probability that it is the best

Probability that it is the best can be computed using posterior

But I promised a simple answer

Page 18: Devoxx Real-Time Learning

Thompson Sampling – Take 2

Sample θ

Pick i to maximize reward

Record result from using i
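
In the usual notation for Thompson sampling, the three steps can be written as follows. The symbols D (the data observed so far) and r (the reward) are assumptions chosen for this sketch and are not taken from the slide.

```latex
\begin{align*}
\theta &\sim P(\theta \mid D) \\
i &= \arg\max_{j} \; \mathbb{E}[\, r \mid \theta, j \,] \\
D &\leftarrow D \cup \{(i, r_i)\}
\end{align*}
```

The first line draws one plausible set of parameters from the posterior, the second acts as if that draw were the truth, and the third adds the observed outcome to the data before the next round.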

Page 19: Devoxx Real-Time Learning

Nearly Forgotten until Recently

Citations for Thompson sampling

Page 20: Devoxx Real-Time Learning

Bayesian Bandit for the Shells

Compute distributions based on data so far

Sample p1, p2 and p3 from these distributions

Pick shell i where i = argmax_i p_i

Lemma 1: The probability of picking shell i will match the probability it is the best shell

Lemma 2: This is as good as it gets
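
The shell game can be sketched in a few lines. This version assumes a Beta posterior with a uniform prior for each shell's win probability, which is the standard choice for win/lose outcomes but is not named on the slide. The class name `BetaBandit` and the hidden probabilities in `true_p` are made up for the demonstration.

```python
import random


class BetaBandit:
    """Bayesian bandit: one Beta(wins + 1, losses + 1) posterior per shell."""

    def __init__(self, n_arms, seed=None):
        self.wins = [0] * n_arms
        self.losses = [0] * n_arms
        self.rng = random.Random(seed)

    def choose(self):
        # Sample one plausible win probability per shell, then pick the argmax.
        samples = [self.rng.betavariate(w + 1, l + 1)
                   for w, l in zip(self.wins, self.losses)]
        return max(range(len(samples)), key=samples.__getitem__)

    def record(self, arm, won):
        if won:
            self.wins[arm] += 1
        else:
            self.losses[arm] += 1


true_p = [0.10, 0.30, 0.60]      # hidden from the bandit; made up for the demo
world = random.Random(1)
bandit = BetaBandit(n_arms=3, seed=2)
pulls = [0, 0, 0]
for _ in range(3000):
    i = bandit.choose()
    pulls[i] += 1
    bandit.record(i, world.random() < true_p[i])
```

By the end of the run, `pulls` should be heavily concentrated on the last shell, while the other two still receive an occasional try. That matches the earlier conclusion that apparent losers are tried at a decreasing rate.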

Page 21: Devoxx Real-Time Learning

And it works!

Page 22: Devoxx Real-Time Learning

Video Demo

Page 23: Devoxx Real-Time Learning

The Basic Idea

We can encode a distribution by sampling

Sampling allows unification of exploration and exploitation

Can be extended to more general response models

Page 24: Devoxx Real-Time Learning

The Original Problem

[Figure: the web page annotated with design variables x1, x2, x3]

Page 25: Devoxx Real-Time Learning

Mathematical Statement

Logistic or probit regression
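
For reference, a common way to write the two models is shown below. It assumes a binary conversion outcome r, a design vector x = (x1, x2, x3), and a weight vector θ. This notation is an assumption for illustration and is not reproduced from the slide.

```latex
\begin{align*}
\text{logistic:}\quad & P(r = 1 \mid x, \theta) = \frac{1}{1 + e^{-\theta^{\top} x}} \\
\text{probit:}\quad & P(r = 1 \mid x, \theta) = \Phi(\theta^{\top} x)
\end{align*}
```

Here Φ is the cumulative distribution function of the standard normal distribution. Both forms map a linear score to a probability between 0 and 1 and differ mainly in the shape of the tails.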

Page 26: Devoxx Real-Time Learning

Same Algorithm

Sample θ

Pick design x to maximize reward

Page 27: Devoxx Real-Time Learning

Context Variables

[Figure: the web page annotated with design variables x1, x2, x3]

y1 = user.geo
y2 = env.time
y3 = env.day_of_week
y4 = env.weekend

Page 28: Devoxx Real-Time Learning

Two Kinds of Variables

The web-site design – x1, x2, x3
– We can change these
– Different values give different web-site designs

The environment or context – y1, y2, y3, y4
– We can’t change these
– They can change themselves

Our model should include interactions between x and y
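
One simple way to give a linear model access to such interactions is to add every product of a design variable and a context variable as an extra feature. The function name `features` and the numeric encodings below are hypothetical, and the talk does not say how its variables were encoded.

```python
from itertools import product


def features(x, y):
    """Build [bias, x..., y..., x_i * y_j ...] from design and context vectors."""
    interactions = [xi * yj for xi, yj in product(x, y)]
    return [1.0] + list(x) + list(y) + interactions


x = [1.0, 0.0, 1.0]          # hypothetical encoding of one web-site design
y = [0.0, 1.0, 0.0, 1.0]     # hypothetical encoding of the current context
f = features(x, y)           # 1 bias + 3 design + 4 context + 12 interactions
```

With the interaction terms present, the model can learn that a design element works well in one context and poorly in another, which plain additive terms cannot express.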

Page 29: Devoxx Real-Time Learning

Same Algorithm, More Greek Letters

Sample θ, π, φ

Pick design x to maximize reward; the y’s are held constant

This looks very fancy, but is actually pretty simple

Page 30: Devoxx Real-Time Learning

Surprises

We cannot record a non-conversion until we wait

We cannot record a conversion until we wait for the same time

Learning from conversions requires delay

We don’t have to wait very long
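
A sketch of the bookkeeping this implies: every impression is held for a fixed window, and its training record is released only once that window has elapsed, whether or not a conversion arrived. Conversions and non-conversions therefore wait the same amount of time. The class `DelayedLabeler`, the window length, and the identifiers are all assumptions made for illustration.

```python
class DelayedLabeler:
    """Holds each impression for a fixed window before releasing its label."""

    def __init__(self, window):
        self.window = window
        self.pending = {}    # impression id -> [design, shown_at, converted]

    def shown(self, impression_id, design, now):
        self.pending[impression_id] = [design, now, False]

    def converted(self, impression_id):
        entry = self.pending.get(impression_id)
        if entry is not None:        # conversions after release are ignored
            entry[2] = True

    def release(self, now):
        """Emit (design, label) for every impression whose window has elapsed."""
        ready = [k for k, (_, shown_at, _) in self.pending.items()
                 if now - shown_at >= self.window]
        records = []
        for k in ready:
            design, _, did_convert = self.pending.pop(k)
            records.append((design, 1 if did_convert else 0))
        return records


labeler = DelayedLabeler(window=30)
labeler.shown("a", "design-1", now=0)
labeler.shown("b", "design-2", now=5)
labeler.converted("a")
early = labeler.release(now=10)    # nothing has waited long enough yet
first = labeler.release(now=30)    # "a" is released as a conversion
later = labeler.release(now=35)    # "b" is released as a non-conversion
```

The pending table grows with the traffic inside one window, so a short window keeps both the learning delay and the extra state small.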

Page 31: Devoxx Real-Time Learning

Page 32: Devoxx Real-Time Learning

Page 33: Devoxx Real-Time Learning

Page 34: Devoxx Real-Time Learning

Page 35: Devoxx Real-Time Learning

Required Steps

Learn distribution of parameters from data
– Logistic regression or probit regression (can be on-line!)
– Need Bayesian learning algorithm

Sample from posterior distribution
– Generally included in Bayesian learning algorithm

Pick design
– Simple sequential search

Record data
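
The four steps can be put together in a short sketch. The learner below keeps an independent Gaussian (mean and precision) for each weight and updates it from one record at a time. That is a simplified stand-in chosen for brevity and is an assumption, not the algorithm used in the talk. The names `DiagonalGaussianLogistic`, `pick_design`, `encode`, and `true_rate`, and the conversion rates, are likewise made up.

```python
import math
import random
from itertools import product


class DiagonalGaussianLogistic:
    """Online logistic regression with an independent Gaussian per weight."""

    def __init__(self, n_features, prior_precision=1.0, seed=None):
        self.mean = [0.0] * n_features
        self.precision = [prior_precision] * n_features
        self.rng = random.Random(seed)

    def sample(self):
        """Draw one weight vector from the approximate posterior."""
        return [self.rng.gauss(m, 1.0 / math.sqrt(q))
                for m, q in zip(self.mean, self.precision)]

    def learn(self, f, converted):
        """One approximate Bayesian update from a single record."""
        z = sum(m * fj for m, fj in zip(self.mean, f))
        p = 1.0 / (1.0 + math.exp(-z))
        for j, fj in enumerate(f):
            self.precision[j] += p * (1.0 - p) * fj * fj
            self.mean[j] += (converted - p) * fj / self.precision[j]


def pick_design(model, designs, encode):
    """Sample once, then search the designs sequentially for the best score."""
    weights = model.sample()
    return max(designs,
               key=lambda d: sum(w * fj for w, fj in zip(weights, encode(d))))


def encode(design):
    return [1.0] + [float(v) for v in design]    # bias plus three 0/1 choices


def true_rate(design):
    return 0.05 + 0.15 * sum(design)             # made up; hidden from the learner


designs = list(product([0, 1], repeat=3))        # all 8 candidate designs
model = DiagonalGaussianLogistic(n_features=4, seed=1)
world = random.Random(2)
counts = {d: 0 for d in designs}
for _ in range(5000):
    d = pick_design(model, designs, encode)      # sample, then pick
    counts[d] += 1
    converted = 1.0 if world.random() < true_rate(d) else 0.0
    model.learn(encode(d), converted)            # record and learn
```

With only eight designs the sequential search is trivial. A real system would also pass context variables to `encode` and would release `converted` only after the waiting window discussed earlier.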

Page 36: Devoxx Real-Time Learning

Required system design

Page 37: Devoxx Real-Time Learning

Hadoop is Not Very Real-time

[Figure: timeline along t ending at "now", with regions labeled "Fully processed", "Latest full period" and "Unprocessed Data"; annotation: "Hadoop job takes this long for this data"]

Page 38: Devoxx Real-Time Learning

Real-time and Long-time together

[Figure: timeline along t ending at "now"; labels: "Hadoop works great back here", "Storm works here", and "Blended view" (three times)]
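
The idea of a blended view can be sketched with plain in-memory structures standing in for the two systems. Batch results cover everything before a cutoff time, a stream keeps the events since that cutoff, and a query adds the two together. The class `BlendedCounts` and its method names are hypothetical and do not correspond to any Hadoop or Storm API.

```python
from collections import Counter


class BlendedCounts:
    """Combines batch totals (old data) with streamed events (recent data)."""

    def __init__(self):
        self.batch = Counter()     # stands in for the periodic batch output
        self.batch_cutoff = 0      # batch covers events with time < cutoff
        self.recent = []           # (time, key) events seen by the stream

    def on_event(self, time, key):
        self.recent.append((time, key))

    def on_batch_result(self, counts, cutoff):
        """Install new batch totals and drop events the batch now covers."""
        self.batch = Counter(counts)
        self.batch_cutoff = cutoff
        self.recent = [(t, k) for t, k in self.recent if t >= cutoff]

    def view(self):
        blended = Counter(self.batch)
        blended.update(k for t, k in self.recent if t >= self.batch_cutoff)
        return blended


counts = BlendedCounts()
counts.on_event(1, "design-1")
counts.on_event(2, "design-2")
counts.on_event(12, "design-1")
before = counts.view()
counts.on_batch_result({"design-1": 1, "design-2": 1}, cutoff=10)
after = counts.view()
```

In this sketch `before` and `after` hold the same totals. The batch result only changes which side holds each event, so readers of the view never see a gap or a double count.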

Page 39: Devoxx Real-Time Learning

Traditional Hadoop Design

Can use Kafka cluster to queue log lines

Can use Storm cluster to do real time learning

Can host web site on NAS

Can use Flume cluster to import data from Kafka to Hadoop

Can record long-term history on Hadoop Cluster

How many clusters?

Page 40: Devoxx Real-Time Learning

[Diagram: Users, Web Site, Web Service, NAS, Kafka API, Kafka (three Kafka Cluster boxes), Storm, Design Targeting, Flume, Hadoop, HDFS Data]

Page 41: Devoxx Real-Time Learning

That is a lot of moving parts!

Page 42: Devoxx Real-Time Learning

Alternative Design

Can host log catcher on MapR via NFS

Storm can read data directly from queue

Can host web server directly on cluster

Only one cluster needed
– Total instances drops by 3x
– Admin burden massively decreased

Page 43: Devoxx Real-Time Learning

[Diagram: Users, http, Web-server, Web Data, Catcher, Topic Queue, Storm, MapR]

Page 44: Devoxx Real-Time Learning

You can do this yourself!

Page 45: Devoxx Real-Time Learning

Contact Me!

We’re hiring at MapR in US and Europe

MapR software available for research use

Contact me at [email protected] or @ted_dunning

Share news with @apachemahout

Tweet #devoxxfr #mapr #mahout @ted_dunning