Machine Learning Department, School of Computer Science
Transcript of "Machine Learning in Practice + k-Nearest Neighbors" (mgormley/courses/10601-s17/slides/)
Machine Learning in Practice + k-Nearest Neighbors

Intro Readings: Mitchell 1; HTF 1, 2; Murphy 1; Bishop 1
KNN Readings: Mitchell 8.2; HTF 13.3; Murphy ---; Bishop 2.5.2
10-601 Introduction to Machine Learning
Matt Gormley, Lecture 2
January 23, 2017
Machine Learning Department, School of Computer Science, Carnegie Mellon University
Reminders
• Background Test
  – Tue, Jan. 24 at 6:30pm
  – Your test location depends on your registration status; see Piazza for details
• Background Exercises (Homework 1)
  – Released: Tue, Jan. 24 after the test
  – Due: Mon, Jan. 30 at 5:30pm
Machine Learning & Ethics

What ethical responsibilities do we have as machine learning experts?
If our search results for news are optimized for ad revenue, might they reflect gender / racial / socio-economic biases?
Should restrictions be placed on intelligent agents that are capable of interacting with the world?
How do autonomous vehicles make decisions when all of the outcomes are likely to be negative?
http://vizdoom.cs.put.edu.pl/
http://bing.com/
http://arstechnica.com/
Some topics that we won't cover probably deserve an entire course.
Outline

• Defining Learning Problems
  – Artificial Intelligence (AI)
  – Mitchell's definition of learning
  – Example learning problems
  – Data annotation
  – The Machine Learning framework
• Classification
  – Binary classification
  – 2D examples
  – Decision rules / hypotheses
• k-Nearest Neighbors (KNN)
  – KNN for binary classification
  – Distance functions
  – Special cases
  – Choosing k
(Covered next lecture)
DEFINING LEARNING PROBLEMS
This section is based on Chapter 1 of (Mitchell, 1997)
Artificial Intelligence

The basic goal of AI is to develop intelligent machines. This consists of many sub-goals:
• Perception
• Reasoning
• Control / Motion / Manipulation
• Planning
• Communication
• Creativity
• Learning
[Figure: Machine Learning shown as a subfield within Artificial Intelligence]
Amazon Go
https://www.amazon.com/b?node=16008589011
https://www.youtube.com/watch?v=NrmMk1Myrxc
Artificial Intelligence (AI): Example Tasks
– Identify objects in an image
– Translate from one human language to another
– Recognize speech
– Assess risk (e.g. in loan application)
– Make decisions (e.g. in loan application)
– Assess potential (e.g. in admission decisions)
– Categorize a complex situation (e.g. medical diagnosis)
– Predict outcome (e.g. medical prognosis, stock prices, inflation, temperature)
– Predict events (default on loans, quitting school, war)
– Plan ahead under perfect knowledge (chess)
– Plan ahead under partial knowledge (Poker, Bridge)
© Roni Rosenfeld, 2016
Slide from Roni Rosenfeld
Well-Posed Learning Problems

Three components:
1. Task, T
2. Performance measure, P
3. Experience, E
Mitchell’s definition of learning:A computer program learns if its performance at tasks in T, as measured by P, improves with experience E.
Definition from (Mitchell, 1997)
Example Learning Problems (historical perspective)
1. Learning to recognize spoken words
“…the SPHINX system (e.g. Lee 1989) learns speaker-specific strategies for recognizing the primitive sounds (phonemes) and words from the observed speech signal…neural network methods…hidden Markov models…”
(Mitchell, 1997)
THEN
Source: https://www.stonetemple.com/great-knowledge-box-showdown/#VoiceStudyResults
NOW
Example Learning Problems (historical perspective)
2. Learning to drive an autonomous vehicle
“…the ALVINN system (Pomerleau 1989) has used its learned strategies to drive unassisted at 70 miles per hour for 90 miles on public highways among other cars…”
(Mitchell, 1997)
THEN
waymo.com
NOW
Example Learning Problems (historical perspective)
2. Learning to drive an autonomous vehicle
“…the ALVINN system (Pomerleau 1989) has used its learned strategies to drive unassisted at 70 miles per hour for 90 miles on public highways among other cars…”
(Mitchell, 1997)
THEN
https://www.geek.com/wp-content/uploads/2016/03/uber.jpg
NOW
Example Learning Problems (historical perspective)
3. Learning to beat the masters at board games
“…the world’s top computer program for backgammon, TD-GAMMON (Tesauro, 1992, 1995), learned its strategy by playing over one million practice games against itself…”
(Mitchell, 1997)
Example Learning Problems

3. Learning to beat the masters at chess
  1. Task, T:
  2. Performance measure, P:
  3. Experience, E:
Example Learning Problems

4. Learning to respond to voice commands (Siri)
  1. Task, T:
  2. Performance measure, P:
  3. Experience, E:
Capturing the Knowledge of Experts

Solution #1: Expert Systems
• Over 20 years ago, we had rule-based systems
• Ask the expert to:
  1. Obtain a PhD in Linguistics
  2. Introspect about the structure of their native language
  3. Write down the rules they devise
Give me directions to Starbucks
  If: "give me directions to X" Then: directions(here, nearest(X))
How do I get to Starbucks?
  If: "how do i get to X" Then: directions(here, nearest(X))
Where is the nearest Starbucks?
  If: "where is the nearest X" Then: directions(here, nearest(X))
I need directions to Starbucks
  If: "I need directions to X" Then: directions(here, nearest(X))
Is there a Starbucks nearby?
  If: "Is there an X nearby" Then: directions(here, nearest(X))
Starbucks directions
  If: "X directions" Then: directions(here, nearest(X))
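These If/Then templates could be encoded directly as pattern-matching rules. The sketch below is illustrative only: the `interpret` function, the regex encoding, and the tuple-valued action are invented for the example, not taken from any actual system.

```python
import re

# One regex per hand-written rule, mirroring the If/Then templates above.
RULES = [
    r"give me directions to (?P<X>.+)",
    r"how do i get to (?P<X>.+?)\??$",
    r"where is the nearest (?P<X>.+?)\??$",
    r"i need directions to (?P<X>.+)",
    r"is there an? (?P<X>.+?) nearby\??$",
    r"(?P<X>.+?) directions$",
]

def interpret(utterance):
    """Return directions(here, nearest(X)) for the first matching rule, else None."""
    text = utterance.strip().lower()
    for pattern in RULES:
        m = re.match(pattern, text)
        if m:
            return ("directions", "here", ("nearest", m.group("X")))
    return None  # unmatched phrasing: the brittleness of hand-written rules

print(interpret("Give me directions to Starbucks"))       # matches the first rule
print(interpret("Could you point me toward Starbucks?"))  # no rule covers this
```

Every new phrasing needs yet another rule, which is exactly the scaling problem that Solution #2 addresses.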
Solution #2: Annotate Data and Learn
• Experts:
  – Very good at answering questions about specific cases
  – Not very good at telling HOW they do it
• 1990s: So why not just have them tell you what they do on SPECIFIC CASES, and then let MACHINE LEARNING tell you how to come to the same decisions that they did?
1. Collect raw sentences {x1, …, xn}
2. Experts annotate their meaning {y1, …, yn}
x1: How do I get to Starbucks?
y1: directions(here, nearest(Starbucks))
x2: Show me the closest Starbucks
y2: map(nearest(Starbucks))
x3: Send a text to John that I'll be late
y3: txtmsg(John, I'll be late)
x4: Set an alarm for seven in the morning
y4: setalarm(7:00AM)
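In code, the annotated examples above are nothing more than a list of (input, correct_output) pairs; this sketch just restates the slide's four examples (the plain-string encoding of the meanings is an illustrative choice):

```python
# Expert-annotated (sentence, meaning) pairs, as on the slide.
training_data = [
    ("How do I get to Starbucks?", "directions(here, nearest(Starbucks))"),
    ("Show me the closest Starbucks", "map(nearest(Starbucks))"),
    ("Send a text to John that I'll be late", "txtmsg(John, I'll be late)"),
    ("Set an alarm for seven in the morning", "setalarm(7:00AM)"),
]

# A learning algorithm consumes these specific cases and must generalize
# to new sentences -- no expert-written general rules required.
for x, y in training_data:
    print(x, "->", y)
```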
Example Learning Problems

4. Learning to respond to voice commands (Siri)
  1. Task, T: predicting action from speech
  2. Performance measure, P: percent of correct actions taken in a user pilot study
  3. Experience, E: examples of (speech, action) pairs
The Machine Learning Framework

• Formulate a task as a mapping from input to output
  – Task examples will usually be pairs: (input, correct_output)
• Formulate performance as an error measure
  – or, more generally, as an objective function (aka loss function)
• Examples:
  – Medical diagnosis: mapping input to one of several classes/categories → Classification
  – Predict tomorrow's temperature: mapping input to a number → Regression
  – Chance of survival, from patient data to p(survive ≥ 5 years): mapping input to a probability → Density estimation
  – Driving recommendation: mapping input into a plan → Planning
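The four formulations differ only in the type of the learned mapping. The Python type aliases below sketch this; the alias names and the `List[float]` input representation are invented for illustration:

```python
from typing import Callable, List

Input = List[float]  # some feature representation of the input

Classifier = Callable[[Input], str]          # classification: input -> one of several categories
Regressor = Callable[[Input], float]         # regression: input -> a number
DensityEstimator = Callable[[Input], float]  # density estimation: input -> a probability in [0, 1]
Planner = Callable[[Input], List[str]]       # planning: input -> a plan (sequence of actions)
```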
© Roni Rosenfeld, 2016
Slide from Roni Rosenfeld
Often, the same task can be formulated in more than one way:
• Ex. 1: Loan applications
  – creditworthiness/score (regression)
  – probability of default (density estimation)
  – loan decision (classification)
• Ex. 2: Chess
  – Nature of available training examples/experience:
    • expert advice (painful to experts)
    • games against experts (less painful, but limited and not much control)
    • experts' games (almost unlimited, but only "found data" – no control)
    • games against self (unlimited, flexible, but can you learn this way?)
  – Choice of target function: board → move vs. board → score

© Roni Rosenfeld, 2016
Slide from Roni Rosenfeld
Choices in ML Formulation
How to Approach a Machine Learning Problem

1. Consider your goal → definition of task T
  – E.g. make good loan decisions, win chess competitions, …
2. Consider the nature of available (or potential) experience E
  – How much data can you get? What would it cost (in money, time, or effort)?
3. Choose the type of output O to learn
  – (Numerical? Category? Probability? Plan?)
4. Choose the performance measure P (error/loss function)
5. Choose a representation for the input X
6. Choose a set of possible solutions H (hypothesis space)
  – a set of functions h: X → O
  – (often, by choosing a representation for them)
7. Choose or design a learning algorithm
  – for using examples (E) to converge on a member of H that optimizes P

© Roni Rosenfeld, 2016
Slide from Roni Rosenfeld
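The seven steps can be instantiated end-to-end on a toy problem. Everything in this sketch is an illustrative choice, not something the slides prescribe: 1-D inputs, binary output, threshold hypotheses, 0/1 loss, and brute-force search as the "learning algorithm":

```python
# Step 1 (task T): label a number as small (0) or large (1).
# Step 2 (experience E): a handful of labeled examples.
examples = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]

# Steps 3 & 5: output O = {0, 1}; input representation X = a single float.

def h(t, x):
    # Step 6 (hypothesis space H): threshold functions h_t : X -> O.
    return 1 if x >= t else 0

def loss(t):
    # Step 4 (performance measure P): number of mistakes on E (0/1 loss).
    return sum(h(t, x) != y for x, y in examples)

# Step 7 (learning algorithm): brute-force search over candidate thresholds.
best_t = min((x for x, _ in examples), key=loss)
print(best_t, loss(best_t))  # -> 3.0 0
```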
CLASSIFICATION
Fisher Iris Dataset

Fisher (1936) used 150 measurements of flowers from 3 different species: Iris setosa (0), Iris virginica (1), Iris versicolor (2), collected by Anderson (1936).
Full dataset: https://en.wikipedia.org/wiki/Iris_flower_data_set
| Species | Sepal Length | Sepal Width | Petal Length | Petal Width |
|---------|--------------|-------------|--------------|-------------|
| 0       | 4.3          | 3.0          | 1.1          | 0.1          |
| 0       | 4.9          | 3.6          | 1.4          | 0.1          |
| 0       | 5.3          | 3.7          | 1.5          | 0.2          |
| 1       | 4.9          | 2.4          | 3.3          | 1.0          |
| 1       | 5.7          | 2.8          | 4.1          | 1.3          |
| 1       | 6.3          | 3.3          | 4.7          | 1.6          |
| 1       | 6.7          | 3.0          | 5.0          | 1.7          |
Fisher Iris Dataset
Classification
Whiteboard:
– Binary classification
– 2D examples
– Decision rules / hypotheses
K-NEAREST NEIGHBORS
k-Nearest Neighbors

Whiteboard:
– KNN for binary classification
– Distance functions
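A minimal sketch of KNN for binary classification with Euclidean distance and a majority vote, trained on the seven Iris rows from the earlier table (the function names and the choice k = 3 are illustrative):

```python
import math
from collections import Counter

# (sepal length, sepal width, petal length, petal width) -> species label,
# taken from the Fisher Iris rows shown earlier.
train = [
    ((4.3, 3.0, 1.1, 0.1), 0),
    ((4.9, 3.6, 1.4, 0.1), 0),
    ((5.3, 3.7, 1.5, 0.2), 0),
    ((4.9, 2.4, 3.3, 1.0), 1),
    ((5.7, 2.8, 4.1, 1.3), 1),
    ((6.3, 3.3, 4.7, 1.6), 1),
    ((6.7, 3.0, 5.0, 1.7), 1),
]

def euclidean(a, b):
    """One common distance function; Manhattan distance etc. also work."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_predict(x, k=3):
    """Label x by majority vote among its k nearest training points."""
    neighbors = sorted(train, key=lambda ex: euclidean(x, ex[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

print(knn_predict((5.0, 3.5, 1.3, 0.2)))  # -> 0 (near the short-petaled rows)
print(knn_predict((6.0, 3.0, 4.5, 1.5)))  # -> 1 (near the long-petaled rows)
```

The distance function matters: swapping `euclidean` for a different metric changes which training points count as "nearest", and with it the decision boundary.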
Takeaways
• Learning Problems
  – Defining a learning problem is tricky
  – Formalizing it exposes the many possibilities
• k-Nearest Neighbors
  – KNN is an extremely simple algorithm for classification