Introduction
Frank Wood
Columbia University
July 6, 2010
Overview of Topics
Introduction
I Data mining is the search for patterns in large collections of data
I Pattern recognition is concerned with automatically finding patterns in data
I Machine learning is pattern recognition with concern for computational tractability
Example: Handwritten Digit Identification
Figure: Handwritten digits from the USPS
Approaches to Identifying Digits
Goal
I Build a machine that can identify digits automatically
Approaches
I Hand-craft a set of rules that separate each digit from the next
I Such rule sets invariably grow large and unwieldy and require many “exceptions”
I “Learn” a set of models for each digit automatically from labeled training data.
Formalism
I Each digit is a 28×28 pixel image
I Vectorized into a 784-entry vector x
Machine learning approach
I Obtain a set of N digits {x1, . . . , xN} called the training set.
I Label (by hand) the training set to produce a label or “target” t for each digit image x
I Learn a function y(x) which takes an image x as input and returns an output in the same “format” as the target vector.
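The three steps above can be sketched in code. This is a minimal illustration, not the method from the lecture: the data are random arrays standing in for USPS digits, the labels are made up, and y(x) is implemented as a simple nearest-neighbor lookup, one of the simplest functions that maps an input vector to a label in the same format as the targets.

```python
import numpy as np

# Hypothetical toy data standing in for USPS digits: each "image" is a
# 28x28 array, vectorized into a 784-entry vector x (as in the slides).
rng = np.random.default_rng(0)
train_images = rng.random((6, 28, 28))       # N = 6 training "images"
train_x = train_images.reshape(6, -1)        # vectorize: shape (6, 784)
train_t = np.array([0, 1, 2, 0, 1, 2])       # hand-assigned labels t

def y(x, train_x=train_x, train_t=train_t):
    """A minimal y(x): return the label of the nearest training vector."""
    dists = np.linalg.norm(train_x - x, axis=1)  # distance to each x_n
    return train_t[np.argmin(dists)]

# y(x) maps a new 784-entry vector to a label in the same format as t.
print(y(train_x[4]))  # a training point is its own nearest neighbor -> 1
```

A learned model would replace the nearest-neighbor rule with parameters fit to the training set, but the interface is the same: a function from an image vector x to a target-formatted output.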
Figure: Chapter 1, Figure 2
Figure: Chapter 1, Figure 3 (plot labels: x, t, xn, tn, y(xn, w))
Figure: Chapter 1, Figure 4a
Figure: Chapter 1, Figure 4b
Figure: Chapter 1, Figure 4c
Figure: Chapter 1, Figure 4d
Figure: Chapter 1, Figure 5 (Training and Test curves)
Figure: Chapter 1, Figure 6a
Figure: Chapter 1, Figure 6b
Figure: Chapter 1, Figure 7a
Figure: Chapter 1, Figure 7b
Figure: Chapter 1, Figure 8 (Training and Test curves)
Figure: Chapter 1, Figure 9
Figure: Chapter 1, Figure 10 (plot labels: ci, rj, yj, xi, nij)
Figure: Chapter 1, Figure 11a
Figure: Chapter 1, Figure 11b
Figure: Chapter 1, Figure 11c
Figure: Chapter 1, Figure 11d
Figure: Chapter 1, Figure 12 (plot labels: x, δx, p(x), P(x))
Figure: Chapter 1, Figure 13 (plot labels: N(x|µ, σ²), x, µ, 2σ)
Figure: Chapter 1, Figure 14 (plot labels: x, p(x), xn, N(xn|µ, σ²))
Figure: Chapter 1, Figure 15 (panels (a), (b), (c))
Figure: Chapter 1, Figure 16 (plot labels: t, x, x0, 2σ, y(x0, w), y(x, w), p(t|x0, w, β))
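The labels p(t|x0, w, β) and y(x, w) in Figure 16 correspond to the standard Gaussian noise model for curve fitting; reading them this way is an assumption here, but it matches the notation in the surrounding figures:

```latex
p(t \mid x, \mathbf{w}, \beta)
  = \mathcal{N}\!\left(t \mid y(x, \mathbf{w}),\ \beta^{-1}\right)
```

where β is the precision (inverse variance) of the noise, so the target t is modeled as the fitted curve value y(x, w) plus zero-mean Gaussian noise.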
Figure: Chapter 1, Figure 17
Figure: Chapter 1, Figure 18 (runs 1–4)
Figure: Chapter 1, Figure 19
Figure: Chapter 1, Figure 20
Figure: Chapter 1, Figure 21a (x1; D = 1)
Figure: Chapter 1, Figure 21b (x1, x2; D = 2)
Figure: Chapter 1, Figure 21c (x1, x2, x3; D = 3)
Figure: Chapter 1, Figure 22 (vertical axis: volume fraction)
Figure: Chapter 1, Figure 23
Figure: Chapter 1, Figure 24 (plot labels: R1, R2, x0, x̂, p(x, C1), p(x, C2), x)
Figure: Chapter 1, Figure 26 (plot labels: x, p(C1|x), p(C2|x), θ, reject region)
Figure: Chapter 1, Figure 27a (vertical axis: class densities)
Figure: Chapter 1, Figure 27b
Figure: Chapter 1, Figure 28 (plot labels: t, x, x0, y(x0), y(x), p(t|x0))
Figure: Chapter 1, Figure 29a
Figure: Chapter 1, Figure 29b
Figure: Chapter 1, Figure 29c
Figure: Chapter 1, Figure 29d
Figure: Chapter 1, Figure 30a (vertical axis: probabilities)
Figure: Chapter 1, Figure 30b (vertical axis: probabilities)
Figure: Chapter 1, Figure 31 (plot labels: a, b, xλ, chord, f(x))