Getting Started with Machine Learning

116
Getting started with Machine Learning

Transcript of Getting Started with Machine Learning

Page 1: Getting Started with Machine Learning

Gettingstarted with Machine Learning

Page 2: Getting Started with Machine Learning

HELLO!

Thomas PaulaSoftware [email protected]

Humberto MarcheziSoftware [email protected]

Page 3: Getting Started with Machine Learning

How could we teach machines in a way

that they could learn from experience?

Page 4: Getting Started with Machine Learning

“If you invent a breakthrough in artificial intelligence so

machines can learn, that is worth 10 Microsofts.”

Bill Gates Quoted in NY Times, Monday March 3, 2004

Page 5: Getting Started with Machine Learning

1.INTRODUCTION Machine what?

Page 6: Getting Started with Machine Learning

“▸A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Tom M. Mitchell

Page 7: Getting Started with Machine Learning

“▸A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Tom M. Mitchell

Page 8: Getting Started with Machine Learning

“▸A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Tom M. Mitchell

Page 9: Getting Started with Machine Learning

“▸A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Tom M. Mitchell

Page 10: Getting Started with Machine Learning

Example

Task T: Classify an e-mail as spam or not spam.

Performance measure P: Percentage of e-mails correctly classified.

Training experience E: E-mails manually labeled by humans.

Page 11: Getting Started with Machine Learning

Learning Paradigms

Supervised Learning

Unsupervised Learning

Page 12: Getting Started with Machine Learning

Supervised Learning

Page 13: Getting Started with Machine Learning

Supervised Learning

This is a pawn! This is a knight!

This is a king!

Page 14: Getting Started with Machine Learning

Supervised Learning

▸Labeled data;▸Has a “supervisor”;▸Classifies unseen instances (classification) or predicts values (regression).

Page 15: Getting Started with Machine Learning

Unsupervised Learning

Page 16: Getting Started with Machine Learning

Smaller than others, with a

ball on the top: group 0!

Not conic, nothing on the top: group 1!

Unsupervised LearningHigher than others,

with cross on the top: group 2!

Page 17: Getting Started with Machine Learning

Group 0 Group 1

Unsupervised Learning

Page 18: Getting Started with Machine Learning

Unsupervised Learning

▸No labeled data;▸No “supervisor”;▸Finds the regularities in the input.

Page 19: Getting Started with Machine Learning

Task Categories

Classification

Regression

Clustering

Page 20: Getting Started with Machine Learning

Classification

▸Classifies a given element in a category;▸Output as discrete labels (categories);▸Email spam classification.

Category A

Category B

Category C

Category D

Input Data

Page 21: Getting Started with Machine Learning

Classification

Not spam

Spam

From: [email protected]: [email protected]

Fake Watches Advertising

Don’t miss the chance !Buy fake watches right now !

Click here

Page 22: Getting Started with Machine Learning

Classification

Human

Tree

Car

Page 23: Getting Started with Machine Learning

Regression

Input Data

▸Forecast/projection of given element▸Output as continuous values▸Predict house price given its features

What is the value?

Page 24: Getting Started with Machine Learning

Regression

House featuresArea crime rate: 2.5Bedrooms: 5Age: 45Dist. to business district: 8

House valueUSD 303,365.00

Page 25: Getting Started with Machine Learning

Clustering

▸Find groups from a dataset of elements▸Related news in websites

Page 26: Getting Started with Machine Learning

Clustering

Page 27: Getting Started with Machine Learning

Clustering

Page 28: Getting Started with Machine Learning

2.ALGORITHMS The fun part ☺

Page 29: Getting Started with Machine Learning

Linear Regression

▸Supervised learning;▸Regression.

Page 30: Getting Started with Machine Learning

Linear Regression

1 2 3 4 5 6 7 8 9 10

1234

56

78 Find a line to fit

the point cloud

sales

customers

Page 31: Getting Started with Machine Learning

Linear Regression

1 2 3 4 5 6 7 8 9 10

1234

56

78

sales

customers

Page 32: Getting Started with Machine Learning

Linear Regression

1 2 3 4 5 6 7 8 9 10

1234

56

78

sales

customers

Page 33: Getting Started with Machine Learning

Linear Regression

1 2 3 4 5 6 7 8 9 10

1234

56

78

sales

customers

Page 34: Getting Started with Machine Learning

Linear Regression

1 2 3 4 5 6 7 8 9 10

1234

56

78

sales

customers

Page 35: Getting Started with Machine Learning

Linear Regression

1 2 3 4 5 6 7 8 9 10

1234

56

78

sales

customers

Page 36: Getting Started with Machine Learning

Linear Regression

1 2 3 4 5 6 7 8 9 10

1234

56

78

sales

customers

Page 37: Getting Started with Machine Learning

Linear Regression

1 2 3 4 5 6 7 8 9 10

1234

56

78

Project the value of hypothetical point.

sales

customers

Page 38: Getting Started with Machine Learning

Polynomial Regression

4 5

200K

price

bedrooms321

300K

400K

500K

Page 39: Getting Started with Machine Learning

Polynomial Regression

4 5

200K

price

bedrooms321

300K

400K

500K

Line doesn´t fit well in the point cloud

Page 40: Getting Started with Machine Learning

Polynomial Regression

4 5

200K

price

bedrooms321

300K

400K

500K

f(x) = ax³ + bx² + cx + d

Page 41: Getting Started with Machine Learning

K-means

▸Unsupervised learning;▸Clustering;▸Uses K to specify the number of clusters.

Page 42: Getting Started with Machine Learning

K-means

1 2 3 4 5 6 7 8 9 10

1234

56

78 Find K clusters.

For K = 3

Page 43: Getting Started with Machine Learning

K-means

1 2 3 4 5 6 7 8 9 10

1234

56

78 Step 1

Randomly choose K initial samples as cluster centroids

Page 44: Getting Started with Machine Learning

K-means

1 2 3 4 5 6 7 8 9 10

1234

56

78 Step 2

For every sample, - calculate

distance to centroids

- pick cluster with closest centroid

shortest path

Page 45: Getting Started with Machine Learning

K-means

1 2 3 4 5 6 7 8 9 10

1234

56

78 Step 2

For every sample, - calculate

distance to centroids

- pick cluster with closest centroid

Page 46: Getting Started with Machine Learning

K-means

1 2 3 4 5 6 7 8 9 10

1234

56

78 Resulting

clusters

Page 47: Getting Started with Machine Learning

K-means

1 2 3 4 5 6 7 8 9 10

1234

56

78 Step 3

Recalculate centroids from cluster members

Page 48: Getting Started with Machine Learning

K-means

1 2 3 4 5 6 7 8 9 10

1234

56

78 Step 4

If centroids move is bigger than distance D, go back to step 2 if new centroids

Page 49: Getting Started with Machine Learning

K-means

1 2 3 4 5 6 7 8 9 10

1234

56

78 2nd iteration

Page 50: Getting Started with Machine Learning

K-means

1 2 3 4 5 6 7 8 9 10

1234

56

78 3rd iteration

Page 51: Getting Started with Machine Learning

K-means

1 2 3 4 5 6 7 8 9 10

1234

56

78 4th iteration

Page 52: Getting Started with Machine Learning

K-means

1 2 3 4 5 6 7 8 9 10

1234

56

78 Find K clusters.

For K = 3

http://www.naftaliharris.com/blog/visualizing-k-means-clustering/

Page 53: Getting Started with Machine Learning

K-Nearest Neighbors (K-NN)

▸Supervised learning;▸Classification and regression;▸Lazy learning;▸Uses the k closest instances to classify the unknown one;▸Easy to implement.

Page 54: Getting Started with Machine Learning

K-NN Example

1 2 3 4 5 6 7 8 9 10

1234

56

78 For k=1, what is

the class of the unknown element ( )?

Page 55: Getting Started with Machine Learning

K-NN Example

1 2 3 4 5 6 7 8 9 10

1234

56

78 For k=1, what is

the class of the unknown element ( )?

It’s a blue square!

Page 56: Getting Started with Machine Learning

K-NN Example

1 2 3 4 5 6 7 8 9 10

1234

56

78 For k=3, what is

the class of the unknown element ( )?

Page 57: Getting Started with Machine Learning

K-NN Example

1 2 3 4 5 6 7 8 9 10

1234

56

78 For k=3, what is

the class of the unknown element ( )?

It’s a red circle!

Page 58: Getting Started with Machine Learning

Neural Network

Page 59: Getting Started with Machine Learning

Neural Network

"Convolutional Neural Networks for Visual Recognition" (http://cs231n.github.io/convolutional-networks/)

Page 60: Getting Started with Machine Learning

Neuron model

+1

Input Layer Output Layer

a 1

Page 61: Getting Started with Machine Learning

Neuron model

+1

Input Layer Output Layer

a 1

Weights

Page 62: Getting Started with Machine Learning

Neuron model

+1

Input Layer Output Layer

a 1

Inputs

Page 63: Getting Started with Machine Learning

Neuron model

+1

Input Layer Output Layer

a 1

Bias

Page 64: Getting Started with Machine Learning

Neuron model

+1

Input Layer Output Layer

a 1

Output

Page 65: Getting Started with Machine Learning

Sigmoid

Page 66: Getting Started with Machine Learning

Neural Network Example

Create a neural network that behaves as an Logic OR.

+1

Input Layer Output Layer

a 1

Page 67: Getting Started with Machine Learning

Neural Network Example

+1Input Output

0 0 0

0 1 1

1 0 1

1 1 1

a 1

Page 68: Getting Started with Machine Learning

Neural Network Example

+1Input Output

0 0 0

0 1 1

1 0 1

1 1 1

a 1

Page 69: Getting Started with Machine Learning

Neural Network Example

+1Input Output

0 0 0

0 1 1

1 0 1

1 1 1

a 1

a = -10*1 + 20*0 + 20*0 = -101

Page 70: Getting Started with Machine Learning

Input Output

0 0 0

0 1 1

1 0 1

1 1 1

Neural Network Example

Page 71: Getting Started with Machine Learning

Input Output

0 0 0

0 1 1

1 0 1

1 1 1

Neural Network Example

Page 72: Getting Started with Machine Learning

Neural Network Example 2

+1Input Output

0 0 0

0 1 1

1 0 1

1 1 1

a 1

a = -10*1 + 20*0 + 20*1 = 101

Page 73: Getting Started with Machine Learning

Input Output

0 0 0

0 1 1

1 0 1

1 1 1

Neural Network Example 2

Page 74: Getting Started with Machine Learning

Input Output

0 0 0

0 1 1

1 0 1

1 1 1

Neural Network Example 2

Page 75: Getting Started with Machine Learning

Neural Network Example (OR)

So far, linearly separable.What if the problem is not linearly separable?

Input Output

0 0 0

0 1 1

1 0 1

1 1 1

Page 76: Getting Started with Machine Learning

Neural Network Example 3 (XOR)

Input Output

0 0 0

0 1 1

1 0 1

1 1 0

What could we do in this case?

Page 77: Getting Started with Machine Learning

Neural Network Example 3 (XOR)

Input Output

0 0 0

0 1 1

1 0 1

1 1 0

What could we do in this case?

Page 78: Getting Started with Machine Learning

Multi-Layer Perceptron

+1

Input Layer Hidden Layer

a 1

(2)

a 2

(2)

a 0

(2)

Output Layer

a 1

(3)

Hidden Layer

Page 79: Getting Started with Machine Learning

Why does it work?

Page 80: Getting Started with Machine Learning

Why does it work?

Page 81: Getting Started with Machine Learning

Why does it work?

Page 82: Getting Started with Machine Learning

“So what can a MLP solve?

▸Can approximate any boolean function, with one hidden layer;▸Can approximate any function with two hidden layers.

Simon Haykin

Page 83: Getting Started with Machine Learning

3.DEEP LEARNING

Page 84: Getting Started with Machine Learning

“▸A machine learning approach that attempts to extract high-level and hierarchical features from raw data.

Page 85: Getting Started with Machine Learning

Deep Neural Network

Input Layer

Hidden Layer Hidden Layer Hidden Layer

Output Layer

Page 86: Getting Started with Machine Learning

Where is it used?

▸Speech Recognition;▸Music Classification;▸Recommendation;▸Image Classification;▸...

Page 87: Getting Started with Machine Learning

The “creators”

Yann LeCun Geoffrey Hinton Yoshua Bengio

Page 88: Getting Started with Machine Learning

AI Winter and the Canadian Mafia

Page 89: Getting Started with Machine Learning

So… why now?

▸Better hardware▹GPUs enable ~9x faster training compared to CPUs.

▸High-quality datasets;▸New activation functions;▸Regularization methods.

Page 90: Getting Started with Machine Learning

Different Implementations

▸Convolutional Neural Networks (CNN);▸Deep Belief Networks;▸Autoencoders;▸Long Short Term Memory (LSTM)▸...

Page 91: Getting Started with Machine Learning

3.1DEEP LEARNINGEXAMPLE

Page 92: Getting Started with Machine Learning

Image Classification Problem

"Convolutional Neural Networks for Visual Recognition" (http://cs231n.github.io/convolutional-networks/)

Page 93: Getting Started with Machine Learning

Image Classification - Challenges

Page 94: Getting Started with Machine Learning

Traditional way

"Deep Learning Methods for Vision" (Honglak Lee, 2012)

Page 95: Getting Started with Machine Learning

Hand-crafted features

▸Time consuming;▸Demand expert knowledge;▸Sometimes meaningful for a very specific scenario.

Page 96: Getting Started with Machine Learning

But what if we could learn feature extractors instead?

Traditional

Hand-crafted feature extractor

Trainable feature extractor

Trainable Classifier

Trainable Classifier

Deep Learning

Page 97: Getting Started with Machine Learning

Architecture Example

Page 98: Getting Started with Machine Learning

Architectures Overview

Page 99: Getting Started with Machine Learning

4.HOW COULD WE APPLY IT AT OUR WORK ?

Page 100: Getting Started with Machine Learning

But my work is not about Machine Learning...

Page 101: Getting Started with Machine Learning

Steps

Start with the problem

Understand your data

Try simple hypotheses first

Evaluating and presenting results

Page 102: Getting Started with Machine Learning

Start with the problem

Talk to your business contact

Get insights about your problem domain

Ask about hypothetical features

Be clear about your intentions

Do not commit with results!

Page 103: Getting Started with Machine Learning

Where is the data ?▸Application logs▸User inputs

Search things like:▸Repetitive tasks by user▸Customer habits▸Patterns of system usage by the user

Start with the problem

Page 104: Getting Started with Machine Learning

Understand your data

▸Explore your data▸Plot your data▸Start with 2 or 3 attributes▸Try to spot some pattern

Page 105: Getting Started with Machine Learning

Understand your data

1 tablespoon extra-virgin olive oil 1 tablespoon chopped garlic 1/2 teaspoon kosher salt

Heat 1 tablespoon butter and olive oil together in a large skillet over medium heat; cook and stir zucchini noodles (zoodles),

▸Explore your data▸Plot your data▸Start with 2 or 3 attributes▸Try to spot some pattern

Page 106: Getting Started with Machine Learning

Try simple hypotheses first

▸Start with simple hypotheses▹Naive Bayes▹Linear regression▹Logistic regression▹K-means▹KNN

▸Try more complex hypotheses later▹SVM▹Random Forest▹Neural Networks

Page 107: Getting Started with Machine Learning

Evaluating and presenting results

▸Define a metric based on the business goal▹Error rate▹F1 score

▸Demonstrate results to business unit▸Use functional prototypes when possible

Page 108: Getting Started with Machine Learning

Examples

Market Analysis Case

Page 109: Getting Started with Machine Learning

Market Analysis Case

▸Optimize subscription plans▸Find customer groups▸Use and analyze the application log

application logs:[2015-01-05 09:12:34] user id=543 clicked buy button for app id=12[2015-01-06 08:10:31] user id=143 clicked buy button for book id=786[2015-01-06 09:22:06] user id=563 clicked buy button for music id=900[2015-01-06 13:12:34] user id=543 clicked buy button for music id=34[2015-01-07 15:45:22] user id=890 clicked more details button for movie id=189[2015-01-07 17:21:14] user id=897 clicked more details button for app id=231[2015-01-08 08:22:02] user id=239 clicked buy button for book id=786[2015-01-09 09:00:11] user id=253 clicked buy button for music id=373[2015-01-09 17:12:56] user id=389 clicked more details button for app id=262[2015-01-09 20:12:04] user id=141 clicked buy button for app id=145

Page 110: Getting Started with Machine Learning

Market Analysis Case

application logs:[2015-01-05 09:12:34] user id=543 clicked buy button for app id=12[2015-01-06 08:10:31] user id=143 clicked buy button for book id=786[2015-01-06 09:22:06] user id=563 clicked buy button for music id=900[2015-01-06 13:12:34] user id=543 clicked buy button for music id=34[2015-01-07 15:45:22] user id=890 clicked more details button for movie id=189[2015-01-07 17:21:14] user id=897 clicked more details button for app id=231[2015-01-08 08:22:02] user id=239 clicked buy button for book id=786[2015-01-09 09:00:11] user id=253 clicked buy button for music id=373[2015-01-09 17:12:56] user id=389 clicked more details button for app id=262[2015-01-09 20:12:04] user id=141 clicked buy button for app id=145

▸Optimize subscription plans▸Find customer groups▸Use and analyze the application log

Page 111: Getting Started with Machine Learning

Market Analysis Case

Group 1: App Buyers+- 75% Applications+- 15% Movies+- 5% Music+- 5% Books

Group 2: Media Users+- 10% Applications +- 40% Movies+- 40% Music+- 10% Books

Group 3: Book Readers +- 5% Applications+- 2.5% Movies+- 7.5% Music+- 90% Books

▸Clustering algorithm can be used▸Present results back to business unit

Page 112: Getting Started with Machine Learning

5.BOOKS & COURSES

Page 113: Getting Started with Machine Learning

Books

Machine LearningTom Mitchell

Introduction to Data MiningPang-Ning Tan

Page 114: Getting Started with Machine Learning

MOOCs

Page 115: Getting Started with Machine Learning

Other

Page 116: Getting Started with Machine Learning

THANK YOU!

Any questions?