![Page 1: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/1.jpg)
IN5490 – Advanced Topics in Artificial
Intelligence for Intelligent Systems
Md. Zia Uddin
16/10/2018
![Page 2: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/2.jpg)
![Page 3: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/3.jpg)
Principal Components Analysis
![Page 4: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/4.jpg)
Principal Component Analysis (PCA)
PCA is a way of identifying patterns in data, and expressing the data
in such a way as to highlight their similarities and differences. It’s a
powerful tool to analyze data.
Main advantage
Compression of the data by reducing the number of dimensions,
without much loss of information.
This technique is used in image compression, as we will see later.
![Page 5: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/5.jpg)
[Figure: scatter of Original Variable A versus Original Variable B, with the axes PC 1 and PC 2 overlaid]
• Orthogonal directions of greatest variance in data
• Projections along PC1 discriminate the data most along any one axis
Principal Component Analysis (PCA)
![Page 6: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/6.jpg)
Principal Components Analysis (PCA)
16.10.2017
![Page 7: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/7.jpg)
• First principal component is the direction of greatest variability (variance) in the data
• Second is the next orthogonal (uncorrelated) direction of greatest variability
• So first remove all the variability along the first component, and then find the next direction of greatest variability
• And so on …
Principal Component Analysis (PCA)
![Page 8: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/8.jpg)
Principal Component Analysis (PCA)
![Page 9: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/9.jpg)
Principal Components
![Page 10: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/10.jpg)
Principal Components
![Page 11: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/11.jpg)
Reconstruction Using PCA
![Page 12: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/12.jpg)
Silhouettes
![Page 13: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/13.jpg)
Top 150 Eigenvalues of eigenvectors
Eigenvalues
![Page 14: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/14.jpg)
Principal Components
![Page 15: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/15.jpg)
1) Convert each image to a row vector
2) Calculate the mean
3) Subtract the mean
4) Calculate covariance matrix
5) Eigenvalue decomposition
6) Choose top eigenvectors based on eigenvalues
7) Project each image vector to the PCA space
Principal Component Analysis (PCA) Steps
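The seven steps above can be sketched in NumPy as follows. This is a minimal illustration rather than the lecture's implementation: the random 8x8 "images", the function names, and the choice of k = 5 are assumptions made for the example.

```python
import numpy as np

def pca_fit(X, k):
    """X: (n_images, n_pixels) array, one flattened image per row."""
    mean = X.mean(axis=0)                      # step 2: mean image
    Xc = X - mean                              # step 3: subtract the mean
    cov = Xc.T @ Xc / (X.shape[0] - 1)         # step 4: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # step 5: eigendecomposition (ascending)
    order = np.argsort(eigvals)[::-1][:k]      # step 6: keep the k largest eigenvalues
    return mean, eigvecs[:, order], eigvals[order]

def pca_project(X, mean, components):
    return (X - mean) @ components             # step 7: coordinates in PCA space

def pca_reconstruct(Z, mean, components):
    return Z @ components.T + mean             # approximate images from k coefficients

rng = np.random.default_rng(0)
images = rng.random((20, 8, 8))                # hypothetical stand-in for silhouettes
X = images.reshape(len(images), -1)            # step 1: each image becomes a row vector
mean, comps, vals = pca_fit(X, k=5)
Z = pca_project(X, mean, comps)
X_hat = pca_reconstruct(Z, mean, comps)
```

Keeping more components lowers the reconstruction error; with 20 centered images, 19 components already reproduce the data exactly.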
![Page 16: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/16.jpg)
Linear Discriminant Analysis
![Page 17: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/17.jpg)
▪ LDA seeks directions along which the classes are best separated.
▪ It takes into consideration the scatter within-classes SW but also the scatter between-classes SB.
▪ LDA computes a transformation that maximizes the between-class scatters while minimizing the within-class scatters.
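▪ It can be solved by $S_W^{-1} S_B D = \Lambda D$, where $\Lambda$ is the eigenvalues of $S_W^{-1} S_B$.
Linear Discriminant Analysis (LDA)
$$S_B = \sum_{i=1}^{c} J_i\,(\bar{m}_i - \bar{\bar{m}})(\bar{m}_i - \bar{\bar{m}})^T$$
$$S_W = \sum_{i=1}^{c} \sum_{m_k \in C_i} (m_k - \bar{m}_i)(m_k - \bar{m}_i)^T$$
$$D_{LDA} = \arg\max_{D} \frac{|D^T S_B D|}{|D^T S_W D|}$$
Here $J_i$ is the number of vectors in class $C_i$, $\bar{m}_i$ is the mean of class $C_i$, and $\bar{\bar{m}}$ is the mean of all vectors.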
![Page 18: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/18.jpg)
[Figure: 3-D plot of LDA of the binary silhouettes of different activities. Axes: LDC1, LDC2, LDC3. Classes: Walking, Running, Skipping, Right hand waving, Both hand waving.]
[Figure: All activity binary silhouettes]
Linear Discriminant Analysis(LDA)
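A minimal NumPy sketch of the LDA computation described above: build the within-class and between-class scatter matrices, then take the leading eigenvectors of the product of the inverse within-class scatter and the between-class scatter. The synthetic three-class data, the function name `lda_fit`, and the use of a pseudo-inverse are assumptions for illustration.

```python
import numpy as np

def lda_fit(X, y, k):
    """Return the k leading discriminant directions as columns of a (d, k) matrix."""
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    S_W = np.zeros((d, d))                     # within-class scatter
    S_B = np.zeros((d, d))                     # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        S_W += (Xc - mean_c).T @ (Xc - mean_c)
        diff = (mean_c - overall_mean)[:, None]
        S_B += len(Xc) * (diff @ diff.T)       # weighted by the class size
    # pinv is used instead of inv in case S_W is singular (an assumption of this sketch)
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1][:k]
    return eigvecs[:, order].real

rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0], [3, 0, 0, 0], [0, 3, 0, 0]], dtype=float)
X = np.vstack([rng.normal(c, 0.5, size=(30, 4)) for c in centers])  # hypothetical features
y = np.repeat(np.arange(3), 30)
D = lda_fit(X, y, k=2)
Z = X @ D                                      # samples in the discriminant space
```

With c classes there are at most c - 1 useful discriminant directions, which is why k = 2 is chosen for three classes here.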
![Page 19: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/19.jpg)
Independent Components Analysis
![Page 20: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/20.jpg)
What is ICA?
“Independent component analysis (ICA) is a method for finding underlying factors or components from multivariate (multi-dimensional) statistical data. What distinguishes ICA from other methods is that it looks for components that are both statistically independent, and nonGaussian.”
A. Hyvärinen, J. Karhunen, E. Oja
‘Independent Component Analysis’
![Page 21: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/21.jpg)
ICA
Blind Signal Separation (BSS) or Independent Component Analysis (ICA) is the
identification & separation of mixtures of sources with little prior information.
• Applications include:
• Audio Processing
• Medical data
• Finance
• Array processing (beamforming)
• Coding
• … and most applications where Factor Analysis and PCA are currently used.
• While PCA seeks directions that represent the data best in a $\sum \|x_0 - x\|^2$ sense, ICA seeks directions that are most independent of each other.
Often used on time series for the separation of multiple targets
![Page 22: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/22.jpg)
The simple “Cocktail Party” Problem
Sources
Observations
s1
s2
x1
x2
Mixing matrix A
x = As
n sources, m=n observations
![Page 23: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/23.jpg)
ICA
[Figure: four time-series panels, V1 to V4, each over samples 0 to 250, labeled "Observing signals" and "Original source signal"]
![Page 24: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/24.jpg)
Motivation
Two independent sources, mixture at two mics
$$x_1(t) = a_{11} s_1 + a_{12} s_2$$
$$x_2(t) = a_{21} s_1 + a_{22} s_2$$
$a_{ij}$ ... depend on the distances of the microphones from the speakers
![Page 25: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/25.jpg)
ICA Model
• Use a statistical "latent variables" system
• Random variable $s_k$ instead of time signal
• $x_j = a_{j1} s_1 + a_{j2} s_2 + \dots + a_{jn} s_n$, for all j
x = As
• The ICs s are latent variables and are unknown, and the mixing matrix A is also unknown
• Task: estimate A and s using only the observable random vector x
• Let's assume that the number of ICs equals the number of observable mixtures, and that A is square and invertible
• So after estimating A, we can compute $W = A^{-1}$ and hence
$s = Wx = A^{-1}x$
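The model can be checked numerically in a few lines. In this sketch the mixing matrix A is known, which is exactly what ICA does not assume; it only illustrates that once A is available, the sources follow from the inverse. The two source signals and the entries of A are hypothetical.

```python
import numpy as np

t = np.linspace(0, 1, 500)
s = np.vstack([np.sin(2 * np.pi * 5 * t),            # source s1: sine wave
               np.sign(np.sin(2 * np.pi * 3 * t))])  # source s2: square wave
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])                           # hypothetical mixing matrix
x = A @ s                                            # observations: x = As
W = np.linalg.inv(A)                                 # un-mixing matrix W = A^{-1}
s_rec = W @ x                                        # recovered sources: s = Wx
```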
![Page 26: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/26.jpg)
Illustration of ICA with 2 signals
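[Figure: scatter plots of the original sources s (axes s1, s2) and of the mixed signals (axes x1, x2), with the mixing directions a1 and a2]
$$x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$$
$$x_2(t) = a_{21} s_1(t) + a_{22} s_2(t), \qquad t = 1:T$$
Step 1: Sphering
Step 2: Rotation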
![Page 27: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/27.jpg)
ICA
Fixed Point Algorithm
Input: X
Random init of W
Iterate until convergence:
$$S = W^T X$$
$$W = X\, g(S)^T$$
$$W = W (W^T W)^{-1/2}$$
Output: W, S
![Page 28: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/28.jpg)
Basic steps of ICA
• Collect data matrix
• Whitening, using the eigenvector and eigenvalue matrices
• Initialize the un-mixing matrix W randomly.
• Apply the iterative procedure on each vector of the un-mixing matrix W on Y to approximate the corresponding basis S until it converges.
Enhanced ICA
▪ Apply PCA first.
▪ Apply ICA on the PCs
▪ Project the silhouette features on the IC feature space
ICA Model
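The whitening and fixed-point iteration outlined above might look like the following sketch. It uses the common symmetric FastICA update with a tanh nonlinearity and re-orthogonalizes W after every step; the nonlinearity, the convergence test, the function names, and the test signals are assumptions of this example, not details taken from the slides.

```python
import numpy as np

def whiten(X):
    """Center X (n_signals, T) and rescale it to identity covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    eigvals, E = np.linalg.eigh(np.cov(Xc))
    return (E / np.sqrt(eigvals)).T @ Xc       # Lambda^{-1/2} E^T Xc

def sym_decorrelate(W):
    """W <- (W W^T)^{-1/2} W, which makes the rows of W orthonormal."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W

def fast_ica(X, n_iter=200, tol=1e-8, seed=0):
    Z = whiten(X)
    n, T = Z.shape
    W = sym_decorrelate(np.random.default_rng(seed).normal(size=(n, n)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)                     # g applied to the current estimates
        # fixed-point step per row w: E[z g(w^T z)] - E[g'(w^T z)] w
        W_new = G @ Z.T / T - np.diag((1 - G ** 2).mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        converged = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1)) < tol
        W = W_new
        if converged:
            break
    return W @ Z, W                            # estimated sources, un-mixing matrix

rng = np.random.default_rng(1)
t = np.linspace(0, 100, 4000)
s = np.vstack([np.sin(2 * t), rng.uniform(-1, 1, t.size)])   # hypothetical sources
A = np.array([[1.0, 0.5], [0.3, 1.0]])                       # hypothetical mixing matrix
S_est, W = fast_ica(A @ s)
```

ICA recovers the sources only up to order, sign, and scale, so an estimate should be compared with the true sources through absolute correlation rather than equality.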
![Page 29: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/29.jpg)
ICs
![Page 30: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/30.jpg)
▪ The ICA looks for statistically independent basis images.
▪ ICA focuses on the local feature information.
Ten ICs from all activity silhouettes
ICA on Binary Silhouettes
All activity binary silhouettes
![Page 31: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/31.jpg)
Solve pixel correspondence problem
– given a pixel in $I_{t-1}$, look for nearby pixels of the same color in $I_t$
Key assumptions
color constancy: a point in $I_{t-1}$ looks the same in $I_t$
For grayscale images, this is brightness constancy
Optical Flows
How to estimate pixel motion from image $I_{t-1}$ to image $I_t$?
I(x, y, t-1) I(x, y, t) = I(x+u, y+v,t-1)
Displacement (u, v): the pixel at (x, y) moves to (x+u, y+v)
![Page 32: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/32.jpg)
Optical Flow Features
Once optical flows of the silhouettes from two consecutive activity frames are obtained, the flow region is divided into 256 sub-blocks of size 4x4 each, and the average flow vector of each sub-block is calculated as
$$\bar{K}_p = \frac{1}{n} \sum_{(i,j)\,\in\,p\text{-th sub-block}} (K_{px}, K_{py}), \qquad n = 16,\ 1 \le p \le 256$$
The flows are augmented and represented as
$$\bar{K}_1, \bar{K}_2, \ldots, \bar{K}_{256}$$
Finally, the averaged optical flow features are extended by PCA and LDA.
[Figure: Sample optical flows from two (a) walking and (b) running frames.]
[Figure: 3-D plot of LDA on the optical flows of different activities. Classes: walking, running, skipping, sitting down, standing up.]
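The sub-block averaging can be sketched with a NumPy reshape. The example assumes a 64x64 flow field, which is what 256 sub-blocks of 4x4 imply, and fills it with a synthetic flow; the function name and the flow values are hypothetical.

```python
import numpy as np

def block_average_flow(flow, block=4):
    """flow: (H, W, 2) array of (u, v) vectors, with H and W divisible by block."""
    H, W, _ = flow.shape
    blocks = flow.reshape(H // block, block, W // block, block, 2)
    return blocks.mean(axis=(1, 3)).reshape(-1, 2)   # one mean (u, v) per sub-block

flow = np.zeros((64, 64, 2))
flow[:, :32, 0] = 1.0           # hypothetical: left half moves one pixel in x
flow[:, 32:, 1] = -2.0          # hypothetical: right half moves two pixels in -y
K = block_average_flow(flow)    # 256 averaged flow vectors, row-major block order
feature = K.reshape(-1)         # flows concatenated into one 512-dimensional vector
```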
![Page 33: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/33.jpg)
Local Binary Pattern (LBP)
▪ LBP features are local binary patterns based on the intensity values of the pixels surrounding a center pixel. The LBP pattern at the given pixel $(x_c, y_c)$ can be represented as an ordered set of binary comparisons:
$$LBP(x_c, y_c) = \sum_{i=0}^{7} f(g_i - g_e)\, 2^i, \qquad f(a) = \begin{cases} 1, & a \ge 0 \\ 0, & a < 0 \end{cases}$$
▪ where $g_e$ represents the intensity of the given pixel and $g_i$ the intensities of the surrounding pixels.
![Page 34: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/34.jpg)
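[Figure: LBP Operator. A 3 x 3 neighborhood (intensities shown: 26, 85, 53, 60, 45, 41, 43, 25) is thresholded against its center pixel into 0/1 bits, giving the binary code 11011110 = 222.]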
Local Binary Pattern (LBP)
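A minimal sketch of the LBP operator on one 3x3 patch. The order in which the eight neighbors are visited (clockwise from the top-left here) is an assumption, since the slides do not fix it, and the patch values are hypothetical.

```python
def lbp_code(patch):
    """patch: 3x3 list of intensities; returns the 8-bit LBP code of the center pixel."""
    center = patch[1][1]
    # assumed neighbor order g_0..g_7: clockwise walk starting at the top-left pixel
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for i, (r, c) in enumerate(offsets):
        if patch[r][c] - center >= 0:       # f(g_i - g_center) = 1
            code += 2 ** i
    return code

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]                         # hypothetical intensities
code = lbp_code(patch)                      # bits 0, 4, 5, 6, 7 are set
```

A different starting neighbor or direction permutes the bits, so the same order has to be used for every pixel of every image.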
![Page 35: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/35.jpg)
A depth activity image is divided into small regions, and the regions' LBP histograms are concatenated to represent the features for one image.
LBP Features
▪ To reduce the high dimensionality, PCA is applied to the LBP features.
Local Binary Pattern (LBP)
![Page 36: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/36.jpg)
▪ The Local Directional Pattern (LDP) assigns an eight-bit binary code to each pixel of an input depth image.
▪ The Kirsch edge detector detects the edges considering all eight neighbors.
▪ Given a central pixel in the image, the eight directional edge response values {mk}, k=0,1,..,7 are computed by Kirsch masks Mk in eight different orientations centered on its position.
Local Directional Pattern (LDP)
![Page 37: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/37.jpg)
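$$M_0\,(\text{East}) = \begin{bmatrix} -3 & -3 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & 5 \end{bmatrix} \quad M_1\,(\text{North East}) = \begin{bmatrix} -3 & 5 & 5 \\ -3 & 0 & 5 \\ -3 & -3 & -3 \end{bmatrix} \quad M_2\,(\text{North}) = \begin{bmatrix} 5 & 5 & 5 \\ -3 & 0 & -3 \\ -3 & -3 & -3 \end{bmatrix} \quad M_3\,(\text{North West}) = \begin{bmatrix} 5 & 5 & -3 \\ 5 & 0 & -3 \\ -3 & -3 & -3 \end{bmatrix}$$
$$M_4\,(\text{West}) = \begin{bmatrix} 5 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & -3 & -3 \end{bmatrix} \quad M_5\,(\text{South West}) = \begin{bmatrix} -3 & -3 & -3 \\ 5 & 0 & -3 \\ 5 & 5 & -3 \end{bmatrix} \quad M_6\,(\text{South}) = \begin{bmatrix} -3 & -3 & -3 \\ -3 & 0 & -3 \\ 5 & 5 & 5 \end{bmatrix} \quad M_7\,(\text{South East}) = \begin{bmatrix} -3 & -3 & -3 \\ -3 & 0 & 5 \\ -3 & 5 & 5 \end{bmatrix}$$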
Kirsch edge masks in eight directions
Local Directional Pattern (LDP)
![Page 38: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/38.jpg)
▪ It is interesting to know the p most prominent directions in order to generate the LDP feature for a pixel.
▪ Here, the top-p directional bit responses bk are set to 1. The remaining bits of 8-bit LDP pattern are set to 0.
$$LDP_p = \sum_{k=0}^{7} B_k(m_k - m_p)\, 2^k, \qquad B_k(a) = \begin{cases} 1, & a \ge 0 \\ 0, & a < 0 \end{cases}$$
Local Directional Pattern (LDP)
![Page 39: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/39.jpg)
Edge response to eight directions
m3 m2 m1
m4 X m0
m5 m6 m7
LDP binary bit positions
B3 B2 B1
B4 X B0
B5 B6 B7
Local Directional Pattern (LDP)
![Page 40: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/40.jpg)
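LDP feature example for a pixel considering top 4 positions
Image patch:
85 32 26
53 50 10
60 38 45
Edge responses $m_k$ from the Kirsch masks $\{M_i\}$:
313 97 503
537 X 393
161 97 161
Binary pattern:
0 0 1
1 X 1
0 0 0
LDP Binary Code = 00010011, LDP Decimal Code = 19
[Figure: second example grid (values shown: 90, 60, 414, 518, 122, 338, 562, 146, 82, 318)]
LDP Binary Code = 00011011, LDP Decimal Code = 27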
Local Directional Pattern (LDP)
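The LDP code of a single pixel can be sketched in plain Python as below. It assumes the standard Kirsch masks in the order East, North East, North, North West, West, South West, South, South East for bits 0 to 7, ranks the responses by absolute value, and uses the 3x3 patch 85 32 26 / 53 50 10 / 60 38 45 from the example as read from the slide. Function names are hypothetical.

```python
KIRSCH = [
    [[-3, -3, 5], [-3, 0, 5], [-3, -3, 5]],    # M0 east
    [[-3, 5, 5], [-3, 0, 5], [-3, -3, -3]],    # M1 north east
    [[5, 5, 5], [-3, 0, -3], [-3, -3, -3]],    # M2 north
    [[5, 5, -3], [5, 0, -3], [-3, -3, -3]],    # M3 north west
    [[5, -3, -3], [5, 0, -3], [5, -3, -3]],    # M4 west
    [[-3, -3, -3], [5, 0, -3], [5, 5, -3]],    # M5 south west
    [[-3, -3, -3], [-3, 0, -3], [5, 5, 5]],    # M6 south
    [[-3, -3, -3], [-3, 0, 5], [-3, 5, 5]],    # M7 south east
]

def edge_responses(patch):
    """Absolute Kirsch responses m_0..m_7 for the center of a 3x3 patch."""
    return [abs(sum(m[r][c] * patch[r][c] for r in range(3) for c in range(3)))
            for m in KIRSCH]

def ldp_code(patch, p):
    """Set bit k when m_k is among the p largest responses (ties set extra bits)."""
    m = edge_responses(patch)
    m_p = sorted(m, reverse=True)[p - 1]       # p-th largest response
    return sum(2 ** k for k in range(8) if m[k] - m_p >= 0)

patch = [[85, 32, 26],
         [53, 50, 10],
         [60, 38, 45]]
responses = edge_responses(patch)
code_top3 = ldp_code(patch, 3)
code_top4 = ldp_code(patch, 4)
```

Computed directly, the east and south-east responses come out as 399 and 303 (the figure on the slide appears to show 393 and 161), but the resulting codes are the same: 19 for p = 3 and 27 for p = 4.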
![Page 41: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/41.jpg)
A depth expression image is divided into small regions and the regions' LDP histograms are concatenated to represent features for one image
LDP Features
Local Directional Pattern (LDP)
![Page 42: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/42.jpg)
The image textural feature is represented by the histogram of the LDP map, whose bins are defined as follows, where normally n = 256 for an image I.
$$T_q = \sum_{x,y} I\{LDP(x, y) = q\}, \qquad q = 0, 1, \ldots, n-1$$
The histogram of the LDP map for a region is represented as below.
$$H = (T_0, T_1, \ldots, T_{n-1})$$
Finally, the whole LDP feature F is expressed as a concatenated sequence of the histograms of all regions as below, where s is the number of regions.
$$F = (H_1, H_2, \ldots, H_s)$$
Local Directional Pattern (LDP)
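The histogram feature can be sketched as follows: count the codes in each region, then concatenate the regional histograms. The tiny 4x4 code map, the 2x2 region grid, and the function names are hypothetical; the same routine would apply to an LBP code map.

```python
def region_histogram(codes, n=256):
    """T_q for q = 0..n-1: how many pixels of the region carry code q."""
    hist = [0] * n
    for row in codes:
        for q in row:
            hist[q] += 1
    return hist

def concat_region_histograms(code_map, rows, cols, n=256):
    """Split the code map into rows x cols regions and concatenate their histograms."""
    height, width = len(code_map), len(code_map[0])
    rh, cw = height // rows, width // cols
    feature = []
    for i in range(rows):
        for j in range(cols):
            region = [r[j * cw:(j + 1) * cw] for r in code_map[i * rh:(i + 1) * rh]]
            feature.extend(region_histogram(region, n))
    return feature

code_map = [[19, 19, 27, 27],
            [19, 19, 27, 27],
            [0, 0, 255, 255],
            [0, 0, 255, 255]]       # hypothetical LDP codes, one per pixel
F = concat_region_histograms(code_map, rows=2, cols=2)
```

The feature length is s times n (4 x 256 = 1024 here), which is why a dimensionality reduction step such as PCA typically follows.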
![Page 43: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/43.jpg)
Support Vector Machines (SVM): Background
SVM is used for extreme classification cases.
CAT DOG
?
![Page 44: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/44.jpg)
• Intro. to Support Vector Machines (SVM)
• Properties of SVM
• Applications
➢ Gene Expression Data Classification
➢ Text Categorization (if time permits)
• Discussion
Support Vector Machines (SVM)
![Page 45: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/45.jpg)
Linear Classifiers
f(x, w, b) = sign(w x + b)
How would you classify this data?
w x + b < 0
w x + b > 0
![Page 46: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/46.jpg)
Maximum Margin
denotes +1
denotes -1
The maximum margin linear classifier is the linear classifier with the, um, maximum margin.
This is the simplest kind of SVM (called an LSVM)
Support Vectors are those datapoints that the margin pushes up against
![Page 47: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/47.jpg)
Support Vector Machines (SVM)
◼ Goal: 1) Correctly classify all training data
$w x_i + b \ge 1$ if $y_i = +1$
$w x_i + b \le -1$ if $y_i = -1$
$y_i (w x_i + b) \ge 1$ for all i
2) Maximize the margin $M = \frac{2}{|w|}$
same as minimize $\frac{1}{2} w^t w$
◼ We can formulate a Quadratic Optimization Problem and solve for w and b
◼ Minimize $\Phi(w) = \frac{1}{2} w^t w$
subject to $y_i (w x_i + b) \ge 1$ for all i
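To make the objective concrete, the sketch below checks the constraints and compares the margin for two candidate separators on four hand-made points. It does not solve the quadratic program; the data and both candidate (w, b) pairs are hypothetical and chosen by hand.

```python
import math

def satisfies_constraints(w, b, X, y):
    """True when y_i (w . x_i + b) >= 1 holds for every training point."""
    return all(yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 1
               for xi, yi in zip(X, y))

def margin(w):
    """Margin M = 2 / |w| of a separator that satisfies the constraints."""
    return 2 / math.sqrt(sum(wj * wj for wj in w))

X = [(2, 2), (3, 3), (0, 0), (-1, 0)]      # hypothetical training points
y = [+1, +1, -1, -1]
w_a, b_a = (0.5, 0.5), -1.0                # candidate A: (2, 2) and (0, 0) lie on the margin
w_b, b_b = (1.0, 1.0), -2.0                # candidate B: feasible, but with a larger |w|

feasible_a = satisfies_constraints(w_a, b_a, X, y)
feasible_b = satisfies_constraints(w_b, b_b, X, y)
margin_a, margin_b = margin(w_a), margin(w_b)
```

Both candidates classify the data correctly, but candidate A has the smaller value of w^t w and therefore the wider margin, which is what the optimization prefers.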
![Page 48: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/48.jpg)
Non-linear SVMs
◼ Datasets that are linearly separable with some noise
work out great:
◼ But what are we going to do if the dataset is just too hard?
◼ How about… mapping data to a higher-dimensional
space:
[Figure: 1-D datasets plotted along the x axis, and the same data mapped to the plane with axes x and x2]
![Page 49: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/49.jpg)
Non-linear SVMs: Feature spaces
◼ General idea: the original input space can always be
mapped to some higher-dimensional feature space
where the training set is separable:
Φ: x→ φ(x)
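A small illustration of the idea, using the mapping from x to (x, x^2): 1-D points whose positive class lies on both sides of the negative class cannot be split by a single threshold on x, but a line separates them after the mapping. The data and the hand-picked separator are assumptions for the example.

```python
def phi(x):
    """Map a 1-D input to the 2-D feature space (x, x^2)."""
    return (x, x * x)

xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [+1, +1, -1, -1, -1, +1, +1]   # outer points positive: no single threshold on x works

w, b = (0.0, 1.0), -2.5             # hypothetical linear separator in feature space

def predict(x):
    z = phi(x)
    return 1 if w[0] * z[0] + w[1] * z[1] + b > 0 else -1

predictions = [predict(x) for x in xs]
```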
![Page 50: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/50.jpg)
Binary to multiclass
• One-vs-all
• All-vs-all
![Page 51: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/51.jpg)
1. One-vs-all classification
• Assumption: Each class individually separable from all the others
• Learning: Given a dataset $D = \{\langle x_i, y_i \rangle\}$
  Note: $x_i \in \Re^n$, $y_i \in \{1, 2, \ldots, K\}$
• Decompose into K binary classification tasks
• For class k, construct a binary classification task as:
  • Positive examples: Elements of D with label k
  • Negative examples: All other elements of D
• Train K binary classifiers $w_1, w_2, \ldots, w_K$ using any learning algorithm we have seen
• Decision: $\arg\max_i w_i^T x$
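The decision rule can be sketched as below. Training is omitted; the three weight vectors are hypothetical stand-ins for classifiers that have already been trained.

```python
def score(w, x):
    """Linear score w^T x."""
    return sum(wj * xj for wj, xj in zip(w, x))

def predict_one_vs_all(weights, x):
    """Return the label k in 1..K whose classifier gives the highest score."""
    scores = [score(w, x) for w in weights]
    return 1 + max(range(len(weights)), key=lambda i: scores[i])

weights = [(1.0, 0.0), (-1.0, 1.0), (-1.0, -1.0)]   # hypothetical w_1, w_2, w_3

label_a = predict_one_vs_all(weights, (2.0, 0.0))
label_b = predict_one_vs_all(weights, (-1.0, 2.0))
label_c = predict_one_vs_all(weights, (-1.0, -2.0))
```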
![Page 52: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/52.jpg)
Visualizing One-vs-all
From the full dataset, construct three binary classifiers, one for each class
$w_{blue}^T x > 0$ for blue inputs
$w_{red}^T x > 0$ for orange inputs
$w_{green}^T x > 0$ for gray inputs
Winner Take All will predict the right answer. Only the correct label will have a positive score.
Notation: $w_{blue}^T x$ is the score for the blue label
![Page 53: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/53.jpg)
One-vs-all may not always work
Black points are not separable with a single binary classifier
The decomposition will not work for these cases!
$w_{blue}^T x > 0$ for blue inputs
$w_{red}^T x > 0$ for orange inputs
$w_{green}^T x > 0$ for gray inputs
???
![Page 54: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/54.jpg)
2. All-vs-all classification
• Assumption: Every pair of classes is separable
• Learning: Given a dataset $D = \{\langle x_i, y_i \rangle\}$
  Note: $x_i \in \Re^n$, $y_i \in \{1, 2, \ldots, K\}$
• For every pair of labels (j, k), create a binary classifier with:
  • Positive examples: All examples with label j
  • Negative examples: All examples with label k
• Train $\frac{K(K-1)}{2}$ classifiers in all
• Prediction: More complex, each label gets K-1 votes
• How to combine the votes? Many methods
  • Majority: Pick the label with maximum votes
  • Organize a tournament between the labels
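Majority voting over all pairs can be sketched as follows. The pairwise classifiers here are hypothetical stand-ins (each one simply prefers the nearer of two fixed class centers); any trained binary classifier returning a signed score could take their place.

```python
from collections import Counter
from itertools import combinations

K = 3
centers = {1: (2.0, 0.0), 2: (-1.0, 2.0), 3: (-1.0, -2.0)}   # hypothetical class centers

def make_pair_classifier(j, k):
    """Positive output votes for label j, negative output votes for label k."""
    cj, ck = centers[j], centers[k]
    def clf(x):
        dist_j = sum((a - b) ** 2 for a, b in zip(x, cj))
        dist_k = sum((a - b) ** 2 for a, b in zip(x, ck))
        return dist_k - dist_j
    return clf

# one classifier per unordered pair of labels: K(K-1)/2 in total
pair_classifiers = {(j, k): make_pair_classifier(j, k)
                    for j, k in combinations(range(1, K + 1), 2)}

def predict_all_vs_all(x):
    votes = Counter()
    for (j, k), clf in pair_classifiers.items():
        votes[j if clf(x) > 0 else k] += 1
    return votes.most_common(1)[0][0]       # majority: label with the most votes
```

Ties are possible with majority voting; this sketch breaks them by whichever tied label received its first vote earliest, which is one arbitrary choice among the many combination methods.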
![Page 55: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/55.jpg)
Support Vector Machines (SVM): SVM Examples
The SVM learning a linearly separable dataset (top row) and a dataset that needs two straight lines to separate in 2D (bottom row), with the linear kernel (left), the polynomial kernel of degree 3 (middle), and the RBF kernel (right)
![Page 56: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/56.jpg)
Convolutional Neural Network (CNN)
• We know it is good to learn a small model.
• From this fully connected model, do we really need all the edges?
• Can some of these be shared?
![Page 57: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/57.jpg)
A Convolutional Layer
A filter
A CNN is a neural network with some convolutional layers
(and some other layers). A convolutional layer has a number
of filters that perform the convolution operation.
Beak detector
![Page 58: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/58.jpg)
Convolution
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0
6 x 6 image
1 -1 -1
-1 1 -1
-1 -1 1
Filter 1
-1 1 -1
-1 1 -1
-1 1 -1
Filter 2
……
These are the network
parameters to be learned.
Each filter detects a
small pattern (3 x 3).
![Page 59: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/59.jpg)
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0
6 x 6 image
1 -1 -1
-1 1 -1
-1 -1 1
Filter 1
stride = 1
Dot product of Filter 1 with the first two 3 x 3 windows: 3, -1
Convolution
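The sliding dot product can be written out in plain Python using the 6 x 6 image and Filter 1 from the slide. The function name `convolve2d` is hypothetical, and as in most CNN descriptions the filter is applied without flipping.

```python
def convolve2d(image, kernel, stride=1):
    """Slide the kernel over the image and take the dot product at each position."""
    k = len(kernel)
    out = []
    for r in range(0, len(image) - k + 1, stride):
        row = []
        for c in range(0, len(image[0]) - k + 1, stride):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(k) for j in range(k)))
        out.append(row)
    return out

image = [[1, 0, 0, 0, 0, 1],
         [0, 1, 0, 0, 1, 0],
         [0, 0, 1, 1, 0, 0],
         [1, 0, 0, 0, 1, 0],
         [0, 1, 0, 0, 1, 0],
         [0, 0, 1, 0, 1, 0]]
filter1 = [[1, -1, -1],
           [-1, 1, -1],
           [-1, -1, 1]]

feature_map = convolve2d(image, filter1)             # 4 x 4 output with stride 1
feature_map_s2 = convolve2d(image, filter1, stride=2)  # 2 x 2 output with stride 2
```

The value 3 appears wherever the diagonal pattern of Filter 1 matches the image exactly, here in the top-left and bottom-left corners of the feature map.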
![Page 60: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/60.jpg)
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0
image
convolution
-1 1 -1
-1 1 -1
-1 1 -1
1 -1 -1
-1 1 -1
-1 -1 1
x1, x2, ……, x36
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0
Fully-connected
Convolution & Fully Connected
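One way to read this comparison: a convolution is a fully connected layer in which most weights are zero and the rest are shared. The sketch below builds that equivalent weight matrix for Filter 1, assuming stride 1, no padding and row-major flattening of the image; the variable names are illustrative.

```python
import numpy as np

image = np.array([
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
])
filter_1 = np.array([
    [ 1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
])

# Flatten the image into the 36 inputs x1 ... x36 (row-major order assumed).
x = image.reshape(-1)

# 16 output neurons (one per filter position), 36 inputs each.
weights = np.zeros((16, 36), dtype=int)
for i in range(4):            # output row
    for j in range(4):        # output column
        for di in range(3):   # offset inside the 3 x 3 filter
            for dj in range(3):
                weights[i * 4 + j, (i + di) * 6 + (j + dj)] = filter_1[di, dj]

fc_output = weights @ x

# Each neuron sees only 9 of the 36 inputs, and all 16 neurons reuse the
# same 9 filter values, so there are 9 parameters instead of 16 * 36 = 576.
nonzero_per_neuron = np.count_nonzero(weights, axis=1)
print(fc_output.reshape(4, 4))
print(nonzero_per_neuron)
```

Reshaping `fc_output` to 4 x 4 gives the same feature map as sliding Filter 1 over the image.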
![Page 61: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/61.jpg)
Convolution -> Max Pooling -> Convolution -> Max Pooling -> Flattened -> Fully Connected Feedforward network -> cat, dog, ……
(Convolution and Max Pooling can repeat many times)
![Page 62: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/62.jpg)
Max Pooling
Filter 1
1 -1 -1
-1 1 -1
-1 -1 1
Output of Filter 1
3 -1 -3 -1
-3 1 0 -3
-3 -3 0 1
3 -2 -2 -1
Filter 2
-1 1 -1
-1 1 -1
-1 1 -1
Output of Filter 2
-1 -1 -1 -1
-1 -1 -2 1
-1 -1 -2 1
-1 0 -4 3
![Page 63: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/63.jpg)
Max Pooling
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0
6 x 6 image
Conv -> MaxPooling
2 x 2 image (new image but smaller); each filter is a channel
Channel from Filter 1
3 0
3 1
Channel from Filter 2
-1 1
0 3
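A minimal sketch of 2 x 2 max pooling on the two 4 x 4 filter outputs from the previous slides. The helper name `max_pool` is hypothetical, and the sketch assumes non-overlapping windows and a feature map whose height and width are divisible by the window size.

```python
import numpy as np

# Outputs of Filter 1 and Filter 2 on the 6 x 6 image (from the slides).
feature_map_1 = np.array([
    [ 3, -1, -3, -1],
    [-3,  1,  0, -3],
    [-3, -3,  0,  1],
    [ 3, -2, -2, -1],
])
feature_map_2 = np.array([
    [-1, -1, -1, -1],
    [-1, -1, -2,  1],
    [-1, -1, -2,  1],
    [-1,  0, -4,  3],
])

def max_pool(fmap, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    h, w = fmap.shape
    # Split rows and columns into (block index, offset inside block).
    blocks = fmap.reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Stack the pooled maps: one channel per filter, shape (2, 2, 2).
pooled = np.stack([max_pool(feature_map_1), max_pool(feature_map_2)])
print(pooled)
```

Each 4 x 4 map shrinks to 2 x 2, and stacking gives an image with as many channels as there are filters.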
![Page 64: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/64.jpg)
Convolution -> Max Pooling -> Convolution -> Max Pooling
(can repeat many times)
After each Convolution + Max Pooling: a new image, smaller than the original image.
The number of channels is the number of filters.
3 0
3 1
-1 1
0 3
![Page 65: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/65.jpg)
Convolution -> Max Pooling (a new image) -> Convolution -> Max Pooling (a new image) -> Flattened -> Fully Connected Feedforward network -> cat, dog, ……
![Page 66: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/66.jpg)
Flattening
3 0
3 1
-1 1
0 3
Flattened: 3, 0, 1, 3, -1, 1, 0, 3
-> Fully Connected Feedforward network
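Flattening and the step into the fully connected network might look like the sketch below. The weights, the two output classes and the softmax are illustrative assumptions, not values from the slides. NumPy flattens in row-major order here, so the element order can differ from the order drawn on the slide.

```python
import numpy as np

# Pooled 2 x 2 image with two channels (one per filter).
pooled = np.array([
    [[ 3, 0],
     [ 3, 1]],
    [[-1, 1],
     [ 0, 3]],
])

# Flatten: 2 channels * 2 * 2 = 8 values in one vector.
flat = pooled.reshape(-1)

# Hypothetical fully connected layer with 2 outputs (e.g. cat, dog).
rng = np.random.default_rng(0)
W = rng.normal(size=(2, flat.size))  # random placeholder weights
b = np.zeros(2)
logits = W @ flat + b

# Softmax turns the two scores into class probabilities.
exp = np.exp(logits - logits.max())
probs = exp / exp.sum()
print(flat)
print(probs)
```

In a trained network `W` and `b` would be learned together with the filters; they are random here only to make the sketch runnable.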
![Page 67: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/67.jpg)
Convolution -> Max Pooling -> Convolution -> Max Pooling -> Flattened -> Fully Connected Feedforward network -> cat, dog, ……
(Convolution and Max Pooling can repeat many times)
![Page 68: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/68.jpg)
LeNet-5
• Gradient-Based Learning Applied to Document Recognition - Y. LeCun, L. Bottou, Y. Bengio, P. Haffner; 1998
• Helped establish how we use CNNs today
• Replaced manual feature extraction
[LeCun et al., 1998]
![Page 69: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/69.jpg)
AlexNet
• ImageNet Classification with Deep Convolutional Neural Networks - Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton; 2012
• Facilitated by GPUs, a highly optimized convolution implementation, and large datasets (ImageNet)
• Has 60 million parameters, compared to the 60k parameters of LeNet-5
[Krizhevsky et al., 2012]
![Page 70: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/70.jpg)
AlexNet
227×227×3
-> conv, 11×11, s = 4, P = 0 -> 55×55×96
-> max pool, 3×3, s = 2 -> 27×27×96
-> conv, 5×5, s = 1, P = 2 -> 27×27×256
-> max pool, 3×3, s = 2 -> 13×13×256
-> conv, 3×3, s = 1, P = 1 -> 13×13×384
-> conv, 3×3, s = 1, P = 1 -> 13×13×384
-> conv, 3×3, s = 1, P = 1 -> 13×13×256
-> max pool, 3×3, s = 2 -> 6×6×256
-> . . .
Architecture: CONV1, MAX POOL1, NORM1, CONV2, MAX POOL2, NORM2, CONV3, CONV4, CONV5, MAX POOL3, FC6, FC7, FC8
[Krizhevsky et al., 2012]
This slide is taken from Andrew Ng
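The spatial sizes on this slide follow from the standard output-size formula, floor((n + 2P - f) / s) + 1, for input width n, filter width f, stride s and padding P. The sketch below checks the sequence from 227 down to 6. The helper name is hypothetical, and padding 0 for the max pool layers is an assumption, since the slide gives no P for them.

```python
def conv_output_size(n, f, s=1, p=0):
    """Output width for input width n, filter width f, stride s, padding p."""
    return (n + 2 * p - f) // s + 1

# (layer name, filter size f, stride s, padding P), taken from the slide.
# Pooling layers are assumed to use P = 0.
layers = [
    ("CONV1",     11, 4, 0),
    ("MAX POOL1",  3, 2, 0),
    ("CONV2",      5, 1, 2),
    ("MAX POOL2",  3, 2, 0),
    ("CONV3",      3, 1, 1),
    ("CONV4",      3, 1, 1),
    ("CONV5",      3, 1, 1),
    ("MAX POOL3",  3, 2, 0),
]

size = 227          # input is 227 x 227 x 3
sizes = [size]
for name, f, s, p in layers:
    size = conv_output_size(size, f, s, p)
    sizes.append(size)
    print(f"{name}: {size} x {size}")
```

The channel count is not computed here: it is set by the number of filters in each conv layer (96, 256, 384, 384, 256) and is left unchanged by pooling.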
![Page 71: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/71.jpg)
AlexNet
[Krizhevsky et al., 2012]
![Page 72: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/72.jpg)
AlexNet
![Page 73: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/73.jpg)
AlexNet
AlexNet was the coming-out party for CNNs in the computer vision community. This was the first time a model performed so well on the historically difficult ImageNet dataset. The paper illustrated the benefits of CNNs and backed them up with record-breaking performance in the competition.
[Krizhevsky et al., 2012]
![Page 74: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/74.jpg)
GoogLeNet
• 22 layers
• Efficient “Inception” module - strayed from
the general approach of simply stacking conv
and pooling layers on top of each other in a
sequential structure
• No FC layers
• Only 5 million parameters!
[Szegedy et al., 2014]
![Page 75: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/75.jpg)
GoogLeNet
Introduced the idea that CNN layers don’t always have to be stacked up sequentially. With the Inception module, the authors showed that a creative structuring of layers can lead to improved performance and computational efficiency.
[Szegedy et al., 2014]
![Page 76: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/76.jpg)
ResNet
• Deep Residual Learning for Image Recognition - Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun; 2015
• Extremely deep network – 152 layers
• Deeper neural networks are more difficult to train.
• Deep networks suffer from vanishing and exploding gradients.
• Present a residual learning framework to ease the training of networks that are substantially deeper than those used previously.
[He et al., 2015]
![Page 77: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/77.jpg)
ResNet
• ILSVRC’15 classification winner (3.57% top-5 error; humans generally hover around a 5-10% error rate)
• Swept all classification and detection competitions in ILSVRC’15 and COCO’15!
Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [He et al., 2015]
![Page 78: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/78.jpg)
ResNet
• Hypothesis: The problem is an optimization problem. Very deep networks are harder to optimize.
• Solution: Use network layers to fit residual mapping instead of directly trying to fit a desired underlying mapping.
• We will use skip connections allowing us to take the activation from one layer and feed it into another layer, much deeper into the network.
• Use layers to fit the residual F(x) = H(x) - x instead of H(x) directly
Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [He et al., 2015]
![Page 79: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/79.jpg)
ResNet
Residual Block
Input x goes through a conv-relu-conv series, which gives us F(x). That result is then added to the original input x. Let’s call that H(x) = F(x) + x. In traditional CNNs, H(x) would just be equal to F(x). So, instead of just computing that transformation (straight from x to F(x)), we’re computing the term that we have to add, F(x), to the input, x.
[He et al., 2015]
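A minimal sketch of the residual idea, H(x) = F(x) + x. To keep it short, the two conv layers are replaced by plain matrix multiplications, which is a simplification and not the layer type used in the paper. The names `relu`, `residual_block`, `w1` and `w2` are illustrative, and the ReLU that ResNet applies after the addition is left out.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

def residual_block(x, w1, w2):
    """Return H(x) = F(x) + x, with F(x) = w2 @ relu(w1 @ x)."""
    f_x = w2 @ relu(w1 @ x)  # the residual F(x): layer - relu - layer
    return f_x + x           # skip connection adds the input back

x = np.array([1.0, -2.0, 3.0])

# If the block's weights are all zero, F(x) = 0 and the block is the identity.
zero = np.zeros((3, 3))
h_identity = residual_block(x, zero, zero)

# With non-zero weights, the layers only have to model the difference H(x) - x.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(3, 3))
w2 = rng.normal(size=(3, 3))
h = residual_block(x, w1, w2)
print(h_identity)
print(h - x)  # this is F(x)
```

The zero-weight case shows why such blocks are easier to stack: a block that has nothing useful to add can fall back to passing its input through unchanged.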
![Page 80: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/80.jpg)
ResNet
Full ResNet architecture:
• Stack residual blocks
• Every residual block has two 3x3 conv layers
• Periodically, double # of filters and downsample spatially using stride 2 (in each dimension)
• Additional conv layer at the beginning
• No FC layers at the end (only FC 1000 to output classes)
• Total depths of 34, 50, 101, or 152 layers for ImageNet
Slide taken from Fei-Fei & Justin Johnson & Serena Yeung. Lecture 9. [He et al., 2015]
![Page 81: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/81.jpg)
ResNet
ResNet is the best CNN architecture that we currently have, and its idea of residual learning is a great innovation.
[He et al., 2015]
![Page 82: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/82.jpg)
Emotion Recognition via CNN (2 Classes)
![Page 83: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/83.jpg)
Emotion Recognition via CNN (4 Classes)
![Page 84: IN5490 Advanced Topics in Artificial Intelligence for ...€¦ · 3-D plot of LDA of the binary silhouettes of different activities.-0.2-0.1 0 0.1 0.2-0.2-0.1 0 0.1 0.2-0.05 0 0.05](https://reader034.fdocuments.net/reader034/viewer/2022052102/603c4be35ae7207a5f440ff7/html5/thumbnails/84.jpg)