ELEG/CISC 867 Advanced Machine Learning
Transcript of ELEG/CISC 867 Advanced Machine Learning
Xiugang Wu
Feb 14, 2019
Dept of Electrical & Computer Engineering https://www.eecis.udel.edu/~xwu
Agenda
• What is machine learning?
• What can we expect to learn from this course?
• Logistics
What is machine learning?
Example from the course cs231n at Stanford.
Image Classification
Given an input image, assign it one label from a set of discrete labels {cat, dog, frog, …}; for this example image, the correct answer is "cat".
The Problem: Semantic Gap
What the computer sees: a big grid of numbers in [0, 255], e.g. 800 x 600 x 3 (3 RGB channels).
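The "grid of numbers" view is easy to reproduce in code; a minimal sketch, using a synthetic random image in place of a real photo:

```python
import numpy as np

# A synthetic 800 x 600 RGB "image": just a grid of integers in [0, 255].
# (Real photos loaded with e.g. PIL or OpenCV have exactly this structure.)
height, width, channels = 600, 800, 3
image = np.random.randint(0, 256, size=(height, width, channels), dtype=np.uint8)

print(image.shape)   # (600, 800, 3): rows x columns x RGB channels
print(image[0, 0])   # a single pixel: three numbers, one per channel
```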
Challenges: Viewpoint Variation
All pixels will change!
Challenges: Illumination
Challenges: Deformation
Challenges: Background Clutter
An Image Classifier
• Unlike, e.g., sorting a list of numbers
• No obvious way to code the algorithm for classification
Rule-Based Approach
(Canny, 1986)
Data-Driven Approach

1) Data Collection: Collect a dataset of images and labels
2) Training: Use machine learning to train a classifier
3) Testing: Apply the classifier to new images
First Classifier: Nearest Neighbor
- Training: memorize all training images and their labels
- Prediction: output the label of the most similar training image
How to Measure Similarity
ℓ1 distance: d(A, B) = Σ_i |A_i − B_i|
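In code the ℓ1 distance is a single vectorized operation; a minimal NumPy sketch (the 2 x 2 arrays below are made-up toy "images"):

```python
import numpy as np

def l1_distance(a, b):
    # d(A, B) = sum over pixels i of |A_i - B_i|.
    # Cast to a signed type first so uint8 subtraction cannot wrap around.
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

a = np.array([[1, 2], [3, 4]], dtype=np.uint8)
b = np.array([[2, 2], [5, 1]], dtype=np.uint8)
print(l1_distance(a, b))  # |1-2| + |2-2| + |3-5| + |4-1| = 6
```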
Example on Dataset CIFAR10
10 classes; 50,000 training images; 10,000 test images; image size: 32 x 32
Test images and nearest neighbors
Complexity of Nearest Neighbor Classifier
Q: With N training examples, how fast are training and prediction?
A: Training O(1); prediction O(N)

This is bad:
- Slow training would be OK
- But we want classifiers that are fast at prediction
Alternatives: Support Vector Machine, Neural Network…
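The whole classifier follows from the two steps above (memorize, then scan); a minimal sketch with toy 4-pixel "images" and ℓ1 distance:

```python
import numpy as np

class NearestNeighbor:
    """1-NN classifier with l1 distance: memorize, then scan (a sketch)."""

    def train(self, X, y):
        # Training is pure memorization: O(1) work beyond storing the data.
        self.X_train = X
        self.y_train = y

    def predict(self, x):
        # Prediction compares x against all N training examples: O(N).
        distances = np.abs(self.X_train - x).sum(axis=1)
        return self.y_train[np.argmin(distances)]

# Toy data: four 4-pixel "images", two per class.
X = np.array([[0, 0, 0, 0], [1, 0, 1, 0], [9, 9, 9, 9], [8, 9, 8, 9]])
y = np.array([0, 0, 1, 1])

clf = NearestNeighbor()
clf.train(X, y)
print(clf.predict(np.array([1, 1, 0, 0])))  # nearest training image has label 0
```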
Summary
• Machine Learning Approach to Image Classification
- Start with a training set of (image, label) pairs
- Predict labels on the test set

• Nearest Neighbor Classifier
- Training: memorize all (image, label) pairs
- Prediction: output the label of the nearest neighbor
ML Framework

Learning: {(X_i, Y_i)}_{i=1}^n → Learner → f
Inference: X → Inferrer f → Ŷ = f(X)

The common theme of ML is a prediction problem:
- Learn a predictor f from the training data {(X_i, Y_i)}_{i=1}^n
- Apply f at the inference stage to predict Y based on X
In image classification:
- {(X_i, Y_i)}_{i=1}^n = (image, label) pairs
- f = classifier
- X = new image
- Y = true label of the new image
- Ŷ = predicted label
In regression:
- X, Y are real vectors
- f is a predictor
- Ŷ is the predicted output
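The same learn-then-infer pattern shows up directly in code; a minimal regression sketch (synthetic data and a least-squares fit, both of my choosing):

```python
import numpy as np

# Training data {(X_i, Y_i)}: noisy samples of a hypothetical line y = 3x + 1.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=50)
Y = 3.0 * X + 1.0 + 0.01 * rng.standard_normal(50)

# Learning: the "Learner" fits f(x) = w*x + b by least squares.
A = np.column_stack([X, np.ones_like(X)])
w, b = np.linalg.lstsq(A, Y, rcond=None)[0]

# Inference: apply the learned f to a new input X to produce Y-hat.
x_new = 0.5
y_hat = w * x_new + b
print(round(w, 1), round(b, 1))  # recovers roughly (3.0, 1.0)
```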
What can we learn from this course?
Top-Down Approach: Theory ⇒ Algorithm ⇒ Implementation

• Part I: Foundation
• Part II: From Theory to Algorithms
• Part III: Advanced Topics
Part I: Foundation
How much data do we need to learn?

We have to restrict f to a hypothesis class F (f ∈ F) to avoid overfitting. Why?

What kind of F is good, i.e., learnable? What is not learnable?
- No free lunch theorem
- PAC (Probably Approximately Correct) learning framework
- VC theory
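Why restricting f matters can be seen in a toy experiment (my own illustration, not from the slides): fit polynomials of low and high degree to a few noisy points and compare training error with error on fresh inputs.

```python
import numpy as np

rng = np.random.default_rng(1)

# 10 noisy training points from a smooth underlying function.
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(10)

# Fresh test inputs from the same range, with noise-free targets.
x_test = np.linspace(0.05, 0.95, 100)
y_test = np.sin(2 * np.pi * x_test)

def errors(degree):
    # "F" = polynomials of the given degree; fit by least squares.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

# Degree 9 interpolates the 10 training points (near-zero training error),
# but that says little about its error on new inputs: overfitting.
for degree in (3, 9):
    print(degree, errors(degree))
```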
Part II: From Theory to Algorithms
Various choices of F lead to various learning models (f ∈ F):
- Linear predictor and boosting
- Support vector machine
- Decision trees
Part III: Advanced Topics
Can we learn even if f is not restricted?
- Yes, minimax learning.
Can we learn in real time when training data is progressively given?
- Yes, online learning.
Why do deep neural networks work?
- An information-theoretic perspective

Other forms of learning?
- Unsupervised learning, reinforcement learning...
Logistics
Lecture and Office Hour
• Lecture
- TR 9:30-10:45 AM
- Gore 308

• Office hour
- TR 11:00 AM-12:00 PM
- Evans 314

• Course website
- https://www.eecis.udel.edu/~xwu/class/ELEG867/
Prerequisite
• Undergraduate-level probability theory and linear algebra; mathematical maturity in general
• Previous exposure to machine learning is preferred but not mandatory
Textbook

• Free PDF version available online
• Lecture notes on the course website
Other Recommended Books

• Text for "Statistical Learning" by Prof. Gonzalo Arce
• The classics "ISL" (An Introduction to Statistical Learning) and "ESL" (The Elements of Statistical Learning), both free to download
Grading
• Attendance: 10 points
- Six random samples (after enrollment has stabilized)
- Two points per attendance (5 out of 6 gives you the max 10 points)

• Project (presentation and report): 40 points
- Can be either theoretical or experimental
- Topics will be posted towards midterm

• Final: 50 points + 10 bonus points
- Closed book, with one letter-size aid sheet allowed
- Homework (three assignments, one per part) is not graded but crucial for after-exam happiness
Questions?