Sharif University of...
Transcript of Sharif University of...
![Page 1: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/1.jpg)
Course Overview and IntroductionCE-717 : Machine LearningSharif University of Technology
M. Soleymani
Fall 2016
![Page 2: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/2.jpg)
Instructor: Mahdieh Soleymani
Email: [email protected]
Lectures: Sun-Tue (13:30-15:00)
Website: http://ce.sharif.edu/cources/95-96/1/ce717-2
Course Info
2
![Page 3: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/3.jpg)
Text Books
Pattern Recognition and Machine Learning, C. Bishop, Springer, 2006.
Machine Learning,T. Mitchell, MIT Press,1998.
Additional readings: will be made available when appropriate.
Other books:
The elements of statistical learning, T. Hastie, R. Tibshirani, J. Friedman,
Second Edition, 2008.
Machine Learning: A Probabilistic Perspective, K. Murphy, MIT Press,
2012.
3
![Page 4: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/4.jpg)
Midterm Exam: 25%
Final Exam: 30%
Project: 5-10%
Homeworks (written & programming) : 20-25%
Mini-exams: 15%
Marking Scheme
4
![Page 5: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/5.jpg)
Machine Learning (ML) and Artificial Intelligence (AI)
ML appears first as a branch of AI
ML is now also a preferred approach to other subareas of
AI
ComputerVision, Speech Recognition,…
Robotics
Natural Language Processing
ML is a strong driver in ComputerVision and NLP
5
![Page 6: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/6.jpg)
A Definition of ML
Tom Mitchell (1998):Well-posed learning problem
“A computer program is said to learn from experience E
with respect to some task T and some performance
measure P, if its performance on T, as measured by P,
improves with experience E”.
Using the observed data to make better decisions
Generalizing from the observed data
6
![Page 7: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/7.jpg)
ML Definition: Example
Consider an email program that learns how to filter spam
according to emails you do or do not mark as spam.
T: Classifying emails as spam or not spam.
E: Watching you label emails as spam or not spam.
P: The number (or fraction) of emails correctly classified as
spam/not spam.
7
![Page 8: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/8.jpg)
The essence of machine learning
8
A pattern exist
We do not know it mathematically
We have data on it
![Page 9: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/9.jpg)
Example: Home Price
Housing price prediction
0
100
200
300
400
0 500 1000 1500 2000 2500
Price ($)
in 1000’s
Size in feet2
9
Figure adopted from slides of Andrew Ng,
Machine Learning course, Stanford.
![Page 10: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/10.jpg)
Example: Bank loan
10
Applicant form as the input:
Output: approving or denying the request
![Page 11: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/11.jpg)
Components of (Supervised) Learning
11
Unknown target function: 𝑓: 𝒳 → 𝒴
Input space: 𝒳
Output space: 𝒴
Training data: 𝒙1, 𝑦1 , 𝒙2, 𝑦2 , … , (𝒙𝑁, 𝑦𝑁)
Pick a formula 𝑔: 𝒳 → 𝒴 that approximates the target
function 𝑓
selected from a set of hypotheses ℋ
![Page 12: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/12.jpg)
Training data: Example
𝑥1 𝑥2 𝑦
0.9 2.3 1
3.5 2.6 1
2.6 3.3 1
2.7 4.1 1
1.8 3.9 1
6.5 6.8 -1
7.2 7.5 -1
7.9 8.3 -1
6.9 8.3 -1
8.8 7.9 -1
9.1 6.2 -1x1
x2
12
Training data
![Page 13: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/13.jpg)
Components of (Supervised) Learning
13
Learning
model
![Page 14: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/14.jpg)
Solution Components
14
Learning model composed of:
Learning algorithm
Hypothesis set
Perceptron example
![Page 15: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/15.jpg)
Perceptron classifier
15
Input 𝒙 = 𝑥1, … , 𝑥𝑑
Classifier:
If 𝑖=1𝑑 𝑤𝑖𝑥𝑖 > threshold then output 1
else output −1
The linear formula 𝑔 ∈ ℋ can be written:
𝑔 𝒙 = sign 𝑖=1
𝑑
𝑤𝑖𝑥𝑖 − threshold
If we add a coordinate 𝑥0 = 1 to the input:
𝑔 𝒙 = sign 𝑖=0
𝑑
𝑤𝑖𝑥𝑖
+ 𝑤0
𝑔 𝒙 = sign 𝒘𝑇𝒙
Vector form
x1
x2
![Page 16: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/16.jpg)
Perceptron learning algorithm:
linearly separable data
16
Give the training data 𝒙 1 , 𝑦 1 , … , (𝒙 𝑁 , 𝑦(𝑁))
Misclassified data 𝒙 𝑛 , 𝑦 𝑛 :
sign(𝒘𝑇𝒙 𝑛 ) ≠ 𝑦(𝑛)
Repeat
Pick a misclassified data 𝒙 𝑛 , 𝑦 𝑛 from training data and
update 𝒘:
𝒘 = 𝒘 + 𝑦(𝑛)𝒙(𝑛)
Until all training data points are correctly classified by 𝑔
![Page 17: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/17.jpg)
Perceptron learning algorithm:
Example of weight update
17 x1
x2
x1
x2
![Page 18: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/18.jpg)
Experience (E) in ML
18
Basic premise of learning:
“Using a set of observations to uncover an underlying
process”
We have different types of (getting) observations in
different types or paradigms of ML methods
![Page 19: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/19.jpg)
Paradigms of ML
Supervised learning (regression, classification)
predicting a target variable for which we get to see examples.
Unsupervised learning
revealing structure in the observed data
Reinforcement learning
partial (indirect) feedback, no explicit guidance
Given rewards for a sequence of moves to learn a policy and
utility functions
Other paradigms: semi-supervised learning, active learning,
online learning, etc.
19
![Page 20: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/20.jpg)
Supervised Learning:
Regression vs. Classification
Supervised Learning
Regression: predict a continuous target variable
E.g., 𝑦 ∈ [0,1]
Classification: predict a discrete target variable
E.g.,𝑦 ∈ {1,2, … , 𝐶}
20
![Page 21: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/21.jpg)
Data in Supervised Learning
Data are usually considered as vectors in a 𝑑 dimensional
space
Now, we make this assumption for illustrative purpose
We will see it is not necessary
21
𝑦(Target)
𝑥𝑑...𝑥2𝑥1
Sample1
Sample
2
…
Sample
n-1
Sample
n
Columns:Features/attributes/dimensions
Rows: Data/points/instances/examples/samples
Y column:Target/outcome/response/label
![Page 22: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/22.jpg)
Regression: Example
Housing price prediction
0
100
200
300
400
0 500 1000 1500 2000 2500
Price ($)
in 1000’s
Size in feet2
22
Figure adopted from slides of Andrew Ng
![Page 23: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/23.jpg)
Classification: Example
Weight (Cat, Dog)
23
1(Dog)
0(Cat)
weight
weight
![Page 24: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/24.jpg)
Supervised Learning vs. Unsupervised
Learning
Supervised learning
Given:Training set
labeled set of 𝑁 input-output pairs 𝐷 = 𝒙 𝑖 , 𝑦 𝑖𝑖=1
𝑁
Goal: learning a mapping from 𝒙 to 𝑦
Unsupervised learning
Given:Training set
𝒙 𝑖𝑖=1
𝑁
Goal: find groups or structures in the data
Discover the intrinsic structure in the data
24
![Page 25: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/25.jpg)
Supervised Learning: Samples
25
x1
x2
Classification
![Page 26: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/26.jpg)
Unsupervised Learning: Samples
26
x1
x2
Clustering
Type I
Type II
Type III
![Page 27: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/27.jpg)
Sample Data in Unsupervised Learning
Unsupervised Learning:
Columns:
Features/attributes/dimensions
Rows:
Data/points/instances/examples/s
amples
𝑥𝑑...𝑥2𝑥1
Sample1
Sample
2
…
Sample
n-1
Sample
n
27
![Page 28: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/28.jpg)
Unsupervised Learning: Example
Applications
Clustering docs based on their similarities
Grouping new stories in the Google news site
Market segmentation: group customers into different
market segments given a database of customer data.
Social network analysis
28
![Page 29: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/29.jpg)
Reinforcement
29
Provides only an indication as to whether an action is
correct or not
Data in supervised learning:
(input, correct output)
Data in Reinforcement Learning:
(input, some output, a grade of reward for this output)
![Page 30: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/30.jpg)
Reinforcement Learning
30
Typically, we need to get a sequence of decisions
it is usually assumed that reward signals refer to the entire sequence
![Page 31: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/31.jpg)
Is learning feasible?
31
Learning an unknown function is impossible.
The function can assume any value outside the data we have.
However, it is feasible in a probabilistic sense.
![Page 32: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/32.jpg)
Example
32
![Page 33: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/33.jpg)
Generalization
33
We don’t intend to memorize data but need to figure out
the pattern.
A core objective of learning is to generalize from the
experience.
Generalization: ability of a learning algorithm to perform
accurately on new, unseen examples after having experienced.
![Page 34: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/34.jpg)
Components of (Supervised) Learning
34
Learning
model
![Page 35: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/35.jpg)
Main Steps of Learning Tasks
Selection of hypothesis set (or model specification)
Which class of models (mappings) should we use for our data?
Learning: find mapping 𝑓 (from hypothesis set) based on the
training data
Which notion of error should we use? (loss functions)
Optimization of loss function to find mapping 𝑓
Evaluation: how well 𝑓 generalizes to yet unseen examples
How do we ensure that the error on future data is minimized?
(generalization)
35
![Page 36: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/36.jpg)
Some Learning Applications
Face, speech, handwritten character recognition
Document classification and ranking in web search
engines
Photo tagging
Self-customizing programs (recommender systems)
Database mining (e.g., medical records)
Market prediction (e.g., stock/house prices)
Computational biology (e.g., annotation of biological
sequences)
Autonomous vehicles
36
![Page 37: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/37.jpg)
ML in Computer Science
Why ML applications are growing?
Improved machine learning algorithms
Availability of data (Increased data capture, networking, etc)
Demand for self-customization to user or environment
Software too complex to write by hand
37
![Page 38: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/38.jpg)
Handwritten Digit Recognition Example
38
Data: labeled samples
0
1
2
3
4
5
6
7
8
9
![Page 39: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/39.jpg)
Example: Input representation
39
![Page 40: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/40.jpg)
Example: Illustration of features
40
![Page 41: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/41.jpg)
Example: Classification boundary
41
![Page 42: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/42.jpg)
Supervised learning
Regression
Classification (our main focus)
Learning theory
Unsupervised learning
Reinforcement learning
Some advanced topics & applications
Main Topics of the Course
42
Most of the lectures
are on this topic
![Page 43: Sharif University of Technologymaktabfile.ir/wp-content/uploads/2017/11/Introduction-overview.pdf · Paradigms of ML Supervised learning (regression,classification) predicting a target](https://reader033.fdocuments.net/reader033/viewer/2022042913/5f499d154463972c4b463638/html5/thumbnails/43.jpg)
Resource
43
Yaser S. Abu-Mostafa, Malik Maghdon-Ismail, and Hsuan
Tien Lin,“Learning from Data”, 2012.