![Page 1: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/1.jpg)
CMU SCS
15-826: Multimedia Databases and Data Mining
Lecture #18: SVD - part I (definitions)
C. Faloutsos
![Page 2: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/2.jpg)
15-826 Copyright: C. Faloutsos (2012) 2
Must-read Material
• Numerical Recipes in C ch. 2.6;
• MM Textbook Appendix D
![Page 3: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/3.jpg)
Outline
Goal: ‘Find similar / interesting things’
• Intro to DB
• Indexing - similarity search
• Data Mining
![Page 4: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/4.jpg)
Indexing - Detailed outline
• primary key indexing
• secondary key / multi-key indexing
• spatial access methods
• fractals
• text
• Singular Value Decomposition (SVD)
• multimedia
• ...
![Page 5: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/5.jpg)
SVD - Detailed outline
• Motivation
• Definition - properties
• Interpretation
• Complexity
• Case studies
• Additional properties
![Page 6: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/6.jpg)
SVD - Motivation
• problem #1: text - LSI: find ‘concepts’
• problem #2: compression / dim. reduction
![Page 7: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/7.jpg)
SVD - Motivation
• problem #1: text - LSI: find ‘concepts’
![Page 8: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/8.jpg)
SVD - Motivation
• Customer-product, for recommendation system:

| customers | bread | lettuce | tomatoes | beef | chicken |
|---|---|---|---|---|---|
| vegetarians | 1 | 1 | 1 | 0 | 0 |
| | 2 | 2 | 2 | 0 | 0 |
| | 1 | 1 | 1 | 0 | 0 |
| | 5 | 5 | 5 | 0 | 0 |
| meat eaters | 0 | 0 | 0 | 2 | 2 |
| | 0 | 0 | 0 | 3 | 3 |
| | 0 | 0 | 0 | 1 | 1 |
![Page 9: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/9.jpg)
SVD - Motivation
• problem #2: compress / reduce dimensionality
![Page 10: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/10.jpg)
Problem - specs
• ~10**6 rows; ~10**3 columns; no updates;
• random access to any cell(s) ; small error: OK
![Page 11: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/11.jpg)
SVD - Motivation
![Page 12: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/12.jpg)
SVD - Motivation
![Page 14: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/14.jpg)
SVD - Definition
(reminder: matrix multiplication)

$$
\underbrace{\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}}_{3 \times 2}
\times
\underbrace{\begin{bmatrix} 1 \\ -1 \end{bmatrix}}_{2 \times 1}
=
\underbrace{\begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}}_{3 \times 1}
$$
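The reminder above can be reproduced in a few lines. This is a minimal sketch in plain Python; the helper name `matmul` is made up for illustration and is not part of the lecture material.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must agree"
    # Entry (i, j) is the inner product of row i of a with column j of b.
    return [[sum(a[i][t] * b[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

A = [[1, 2], [3, 4], [5, 6]]   # 3 x 2
x = [[1], [-1]]                # 2 x 1
y = matmul(A, x)               # 3 x 1
print(y)  # [[-1], [-1], [-1]]
```

Each output entry is one row of the left matrix times the single column on the right, e.g. 1*1 + 2*(-1) = -1.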
![Page 19: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/19.jpg)
SVD - Definition
$$A_{[n \times m]} = U_{[n \times r]} \, \Lambda_{[r \times r]} \, (V_{[m \times r]})^T$$
• A: n x m matrix (e.g., n documents, m terms)
• U: n x r matrix (n documents, r concepts)
• Λ: r x r diagonal matrix (strength of each ‘concept’) (r: rank of the matrix)
• V: m x r matrix (m terms, r concepts)
![Page 20: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/20.jpg)
SVD - Definition
• $A = U \Lambda V^T$ - example:
![Page 21: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/21.jpg)
SVD - Properties
THEOREM [Press+92]: it is always possible to decompose matrix A into $A = U \Lambda V^T$, where
• U, V: unique (*)
• U, V: column orthonormal (i.e., columns are unit vectors, orthogonal to each other)
  – $U^T U = I$; $V^T V = I$ (I: identity matrix)
• Λ: singular values are positive, and sorted in decreasing order
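The properties in the theorem can be checked numerically. The sketch below assumes NumPy is available and uses `numpy.linalg.svd`, which returns the singular values as a 1-D array (here `s`) and $V^T$ rather than $V$; the small test matrix is an arbitrary choice.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # any real n x m matrix

# full_matrices=False gives the "thin" SVD: U is n x k, Vt is k x m, k = min(n, m)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = len(s)

cols_u_orthonormal = np.allclose(U.T @ U, np.eye(k))    # U^T U = I
cols_v_orthonormal = np.allclose(Vt @ Vt.T, np.eye(k))  # V^T V = I
sorted_decreasing = bool(np.all(s[:-1] >= s[1:]))       # largest singular value first
reconstructs = np.allclose(U @ np.diag(s) @ Vt, A)      # the product gives back A

print(cols_u_orthonormal, cols_v_orthonormal, sorted_decreasing, reconstructs)
```

Note that the thin SVD keeps min(n, m) columns, which equals the rank r only when the matrix has full rank; for rank-deficient matrices the trailing singular values are (numerically) zero.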
![Page 22: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/22.jpg)
SVD - Example
• $A = U \Lambda V^T$ - example (rows: documents, the first four from CS and the last three from MD; columns: the terms data, inf., retrieval, brain, lung):

$$
\underbrace{\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 5 & 5 & 5 & 0 & 0 \\ 0 & 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 3 & 3 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}}_{A}
=
\underbrace{\begin{bmatrix} 0.18 & 0 \\ 0.36 & 0 \\ 0.18 & 0 \\ 0.90 & 0 \\ 0 & 0.53 \\ 0 & 0.80 \\ 0 & 0.27 \end{bmatrix}}_{U}
\times
\underbrace{\begin{bmatrix} 9.64 & 0 \\ 0 & 5.29 \end{bmatrix}}_{\Lambda}
\times
\underbrace{\begin{bmatrix} 0.58 & 0.58 & 0.58 & 0 & 0 \\ 0 & 0 & 0 & 0.71 & 0.71 \end{bmatrix}}_{V^T}
$$

• U: doc-to-concept similarity matrix (first column: CS-concept; second column: MD-concept)
• 9.64: ‘strength’ of the CS-concept
• $V^T$: term-to-concept similarity matrix (first row: CS-concept)
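The numbers in this example can be recomputed. The following is a minimal sketch, assuming NumPy is available. NumPy returns min(n, m) = 5 singular triplets, so the sketch keeps only the numerically non-zero ones; the tolerance `1e-10` is an arbitrary choice, and the signs of the singular vectors may be flipped relative to the slide, hence the absolute values.

```python
import numpy as np

# Document-to-term matrix from the example (rows: 4 CS docs, then 3 MD docs)
A = np.array([
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [1, 1, 1, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 3, 3],
    [0, 0, 0, 1, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

r = int(np.sum(s > 1e-10))             # numerical rank (2 for this matrix)
U, s, Vt = U[:, :r], s[:r], Vt[:r, :]  # keep only the r 'concepts'

print(r)
print(np.round(s, 2))           # strengths of the two concepts
print(np.round(np.abs(U), 2))   # doc-to-concept similarities, up to sign
print(np.round(np.abs(Vt), 2))  # term-to-concept similarities, up to sign
```

The two strengths are the square roots of 93 and 28, which round to the 9.64 and 5.29 shown in the example.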
![Page 29: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/29.jpg)
SVD - Interpretation #1
‘documents’, ‘terms’ and ‘concepts’:
• U: document-to-concept similarity matrix
• V: term-to-concept similarity matrix
• Λ: its diagonal elements give the ‘strength’ of each concept
![Page 30: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/30.jpg)
SVD - Interpretation #1
‘documents’, ‘terms’ and ‘concepts’:
Q: if A is the document-to-term matrix, what is $A^T A$?
A: term-to-term ([m x m]) similarity matrix
Q: $A A^T$?
A: document-to-document ([n x n]) similarity matrix
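To make the two answers concrete, here is a small sketch (assuming NumPy) that forms both products for the 7 x 5 document-to-term matrix used in the lecture's running example. The variable names `term_sim` and `doc_sim` are illustrative only.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [1, 1, 1, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 3, 3],
    [0, 0, 0, 1, 1],
])  # n = 7 documents, m = 5 terms

term_sim = A.T @ A   # m x m: entry (i, j) is the inner product of term columns i and j
doc_sim = A @ A.T    # n x n: entry (i, j) is the inner product of document rows i and j

print(term_sim.shape, doc_sim.shape)   # (5, 5) (7, 7)
print(term_sim[0, 1], term_sim[0, 3])  # 31 0
print(doc_sim[0, 3], doc_sim[0, 4])    # 15 0
```

Terms that co-occur in the same documents get a large entry (31), while terms from different groups get 0; the same holds for documents that share terms (15) versus documents that share none.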
![Page 32: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/32.jpg)
Copyright: Faloutsos, Tong (2009)
SVD properties
• the columns of V are the eigenvectors of the covariance matrix $A^T A$ (strictly, $A^T A$ is the covariance matrix up to scaling only when the columns of A are zero-mean)
• the columns of U are the eigenvectors of the Gram (inner-product) matrix $A A^T$
Further reading:
1. Ian T. Jolliffe, Principal Component Analysis (2nd ed), Springer, 2002.
2. Gilbert Strang, Linear Algebra and Its Applications (4th ed), Brooks Cole, 2005.
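These two eigenvector properties can be verified numerically. The sketch below assumes NumPy and uses `numpy.linalg.eigh`, which is meant for symmetric matrices and returns eigenvalues in ascending order; the matrix is the lecture's running example.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [1, 1, 1, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 3, 3],
    [0, 0, 0, 1, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

evals_v, evecs_v = np.linalg.eigh(A.T @ A)   # m x m
evals_u, evecs_u = np.linalg.eigh(A @ A.T)   # n x n

# The two largest eigenvalues of both products are the squared singular values.
top_v = evals_v[::-1][:2]
top_u = evals_u[::-1][:2]
print(np.allclose(top_v, s[:2] ** 2), np.allclose(top_u, s[:2] ** 2))

# The leading eigenvectors match v1 and u1 up to sign (|inner product| = 1).
v1 = evecs_v[:, -1]
u1 = evecs_u[:, -1]
print(np.isclose(abs(v1 @ Vt[0]), 1.0), np.isclose(abs(u1 @ U[:, 0]), 1.0))
```

The sign ambiguity is why the comparison uses the absolute value of the inner product rather than element-wise equality.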
![Page 33: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/33.jpg)
SVD - Interpretation #2
• best axis to project on (‘best’ = min sum of squares of projection errors)
![Page 34: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/34.jpg)
SVD - Motivation
![Page 35: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/35.jpg)
SVD - interpretation #2
• minimum RMS error
• SVD: gives best axis to project
(figure: v1, the first singular vector)
![Page 36: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/36.jpg)
SVD - Interpretation #2
![Page 37: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/37.jpg)
SVD - Interpretation #2
• $A = U \Lambda V^T$ - example:

$$
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 5 & 5 & 5 & 0 & 0 \\ 0 & 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 3 & 3 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix} 0.18 & 0 \\ 0.36 & 0 \\ 0.18 & 0 \\ 0.90 & 0 \\ 0 & 0.53 \\ 0 & 0.80 \\ 0 & 0.27 \end{bmatrix}
\times
\begin{bmatrix} 9.64 & 0 \\ 0 & 5.29 \end{bmatrix}
\times
\begin{bmatrix} 0.58 & 0.58 & 0.58 & 0 & 0 \\ 0 & 0 & 0 & 0.71 & 0.71 \end{bmatrix}
$$

• v1: the first row of $V^T$
• 9.64: variance (‘spread’) on the v1 axis
• U gives the coordinates of the points in the projection axis
• More details
• Q: how exactly is dim. reduction done?
• A: set the smallest singular values to zero:
![Page 42: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/42.jpg)
SVD - Interpretation #2
Setting the smallest singular value (5.29) to zero:

$$
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 5 & 5 & 5 & 0 & 0 \\ 0 & 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 3 & 3 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}
\approx
\begin{bmatrix} 0.18 & 0 \\ 0.36 & 0 \\ 0.18 & 0 \\ 0.90 & 0 \\ 0 & 0.53 \\ 0 & 0.80 \\ 0 & 0.27 \end{bmatrix}
\times
\begin{bmatrix} 9.64 & 0 \\ 0 & 0 \end{bmatrix}
\times
\begin{bmatrix} 0.58 & 0.58 & 0.58 & 0 & 0 \\ 0 & 0 & 0 & 0.71 & 0.71 \end{bmatrix}
$$

The second column of U and the second row of $V^T$ no longer contribute, so they can be dropped:

$$
\approx
\begin{bmatrix} 0.18 \\ 0.36 \\ 0.18 \\ 0.90 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\times
\begin{bmatrix} 9.64 \end{bmatrix}
\times
\begin{bmatrix} 0.58 & 0.58 & 0.58 & 0 & 0 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 5 & 5 & 5 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
$$
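The truncation step can be sketched as follows, assuming NumPy. The variable `k` (number of singular values kept) and the name `A_k` are illustrative; the matrix is the lecture's running example.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [1, 1, 1, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 3, 3],
    [0, 0, 0, 1, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 1  # keep only the strongest singular value
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The rank-1 approximation keeps the first block and zeroes the last three rows.
expected = A.copy()
expected[4:, :] = 0.0
print(np.allclose(A_k, expected))

# The Frobenius-norm error equals the dropped singular value (about 5.29).
error = np.linalg.norm(A - A_k)
print(np.isclose(error, s[1]))
```

Slicing the factors is equivalent to zeroing the trailing singular values, but it stores only k columns of U and k rows of $V^T$, which is where the compression comes from.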
![Page 46: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/46.jpg)
SVD - Interpretation #2
Exactly equivalent: ‘spectral decomposition’ of the matrix:

$$
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 5 & 5 & 5 & 0 & 0 \\ 0 & 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 3 & 3 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix} u_1 & u_2 \end{bmatrix}
\times
\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}
\times
\begin{bmatrix} v_1^T \\ v_2^T \end{bmatrix}
$$

$$
A_{[n \times m]} = u_1 \lambda_1 v_1^T + u_2 \lambda_2 v_2^T + \dots
$$

• each $u_i$ is n x 1 and each $v_i^T$ is 1 x m
• r terms
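A short sketch of the spectral form, assuming NumPy: each term is the outer product of one column of U with one row of $V^T$, scaled by the matching singular value. The rank tolerance `1e-10` is an arbitrary choice, and the matrix is the lecture's running example.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [1, 1, 1, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 3, 3],
    [0, 0, 0, 1, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-10))  # numerical rank: number of non-zero terms

# One (n x 1)(1 x m) outer product per singular value
terms = [s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r)]

print(len(terms))                                  # r terms
print([t.shape for t in terms])                    # each term is n x m
print([int(np.linalg.matrix_rank(t)) for t in terms])  # each term has rank 1
print(np.allclose(sum(terms), A))                  # the r terms add up to A
```

For this matrix the first term reproduces the upper-left block and the second term the lower-right block, because the two blocks do not overlap.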
![Page 50: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/50.jpg)
SVD - Interpretation #2
approximation / dim. reduction: by keeping the first few terms (Q: how many?)

$$
A_{[n \times m]} = u_1 \lambda_1 v_1^T + u_2 \lambda_2 v_2^T + \dots
$$

assume: $\lambda_1 \ge \lambda_2 \ge \dots$
A (heuristic - [Fukunaga]): keep 80-90% of ‘energy’ (= sum of squares of the $\lambda_i$’s)
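The energy heuristic can be written as a small helper. This is a sketch in plain Python; the function name `rank_for_energy` and its `fraction` parameter are made up for illustration, and the singular values are the rounded ones from the lecture's example.

```python
def rank_for_energy(singular_values, fraction):
    """Smallest k such that the k largest singular values hold at least
    `fraction` of the total energy (sum of squared singular values)."""
    energies = [v * v for v in sorted(singular_values, reverse=True)]
    total = sum(energies)
    kept = 0.0
    for k, energy in enumerate(energies, start=1):
        kept += energy
        if kept >= fraction * total:
            return k
    return len(energies)

s = [9.64, 5.29]
print(rank_for_energy(s, 0.75))  # 1
print(rank_for_energy(s, 0.90))  # 2
```

Here the first singular value alone carries about 77% of the energy (92.93 out of 120.91), so an 80-90% target keeps both terms, while a 75% target would keep only the first.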
![Page 52: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/52.jpg)
Pictorially: matrix form of SVD
– Best rank-k approximation in L2
(figure: the n x m matrix A drawn next to its factors U, Λ and $V^T$)
![Page 53: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/53.jpg)
Pictorially: Spectral form of SVD
– Best rank-k approximation in L2
(figure: the n x m matrix A drawn next to the sum of rank-1 terms $\lambda_1 u_1 v_1^T + \lambda_2 u_2 v_2^T$)
![Page 54: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/54.jpg)
SVD - Detailed outline
• Motivation
• Definition - properties
• Interpretation
  – #1: documents/terms/concepts
  – #2: dim. reduction
  – #3: picking non-zero, rectangular ‘blobs’
• Complexity
• Case studies
• Additional properties
![Page 55: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/55.jpg)
SVD - Interpretation #3
• finds non-zero ‘blobs’ in a data matrix

$$
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 5 & 5 & 5 & 0 & 0 \\ 0 & 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 3 & 3 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix} 0.18 & 0 \\ 0.36 & 0 \\ 0.18 & 0 \\ 0.90 & 0 \\ 0 & 0.53 \\ 0 & 0.80 \\ 0 & 0.27 \end{bmatrix}
\times
\begin{bmatrix} 9.64 & 0 \\ 0 & 5.29 \end{bmatrix}
\times
\begin{bmatrix} 0.58 & 0.58 & 0.58 & 0 & 0 \\ 0 & 0 & 0 & 0.71 & 0.71 \end{bmatrix}
$$
![Page 57: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/57.jpg)
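SVD - Interpretation #3
• finds non-zero ‘blobs’ in a data matrix =
• ‘communities’ (bi-partite cores, here)

$$
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 5 & 5 & 5 & 0 & 0 \\ 0 & 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 3 & 3 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}
$$

(figure: bipartite graph with row nodes Row 1 ... Row 4 and Row 5 ... Row 7, and column nodes Col 1 ... Col 3 and Col 4 ...)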
![Page 58: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/58.jpg)
SVD - Interpretation #3
• Drill: find the SVD, ‘by inspection’!
• Q: rank = ??

$$
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}
= \; ?? \times ?? \times ??
$$

• A: rank = 2 (2 linearly independent rows/cols)
• first attempt, with 0/1 indicator vectors: orthogonal??

$$
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}
\times
\begin{bmatrix} ?? & 0 \\ 0 & ?? \end{bmatrix}
\times
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}
$$

• column vectors: are orthogonal - but not unit vectors; rescaled to unit length:

$$
\begin{bmatrix} 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix} 1/\sqrt{3} & 0 \\ 1/\sqrt{3} & 0 \\ 1/\sqrt{3} & 0 \\ 0 & 1/\sqrt{2} \\ 0 & 1/\sqrt{2} \end{bmatrix}
\times
\begin{bmatrix} ?? & 0 \\ 0 & ?? \end{bmatrix}
\times
\begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} & 0 & 0 \\ 0 & 0 & 0 & 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}
$$

• and the singular values are:

$$
\begin{bmatrix} ?? & 0 \\ 0 & ?? \end{bmatrix}
=
\begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}
$$

• Q: How to check we are correct?
![Page 64: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/64.jpg)
SVD - Interpretation #3
• A: SVD properties:
  – matrix product should give back matrix A
  – matrix U should be column-orthonormal, i.e., columns should be unit vectors, orthogonal to each other
  – ditto for matrix V
  – matrix Λ should be diagonal, with positive values
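The checklist above can be applied to the drill's hand-derived factors. The sketch below assumes NumPy; the variable names (`U`, `L`, `V`) are chosen here for illustration, with `L` standing for the diagonal matrix of singular values.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)

a, b = 1 / np.sqrt(3), 1 / np.sqrt(2)
U = np.array([[a, 0], [a, 0], [a, 0], [0, b], [0, b]])  # 5 x 2
V = U.copy()                                            # same block structure, 5 x 2
L = np.diag([3.0, 2.0])                                 # singular values found by inspection

gives_back_a = np.allclose(U @ L @ V.T, A)
u_orthonormal = np.allclose(U.T @ U, np.eye(2))
v_orthonormal = np.allclose(V.T @ V, np.eye(2))
positive_diagonal = bool(np.all(np.diag(L) > 0)) and np.allclose(L, np.diag(np.diag(L)))

print(gives_back_a, u_orthonormal, v_orthonormal, positive_diagonal)
```

As a cross-check, a library routine such as `np.linalg.svd(A, compute_uv=False)` should report the same two non-zero singular values, 3 and 2.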
![Page 66: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/66.jpg)
SVD - Complexity
• O(n * m * m) or O(n * n * m) (whichever is less)
• less work, if we just want singular values
• or if we want first k singular vectors
• or if the matrix is sparse [Berry]
• Implemented: in any linear algebra package (LINPACK, matlab, Splus/R, mathematica ...)
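As a rough illustration, the sketch below evaluates the operation-count bound for the matrix size from the problem specs (about 10^6 rows and 10^3 columns) and shows NumPy's option for computing singular values only. The helper `svd_cost` is hypothetical and counts only the leading term, ignoring constants; the random test matrix is an arbitrary stand-in, and sparse or top-k solvers are not shown.

```python
import numpy as np

def svd_cost(n, m):
    """Leading-order operation count: the smaller of n*m*m and n*n*m."""
    return min(n * m * m, n * n * m)

print(svd_cost(10**6, 10**3))  # 1000000000000

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))

# Singular values only: skips forming U and V^T
s_only = np.linalg.svd(A, compute_uv=False)

# Full thin SVD for comparison
U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(np.allclose(s_only, s))
```

For a tall matrix (n much larger than m) the n * m * m term is the smaller one, which is why the cost grows only linearly with the number of rows.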
![Page 67: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/67.jpg)
SVD - conclusions so far
• SVD: $A = U \Lambda V^T$: unique (*)
• U: document-to-concept similarities
• V: term-to-concept similarities
• Λ: strength of each concept
• dim. reduction: keep the first few strongest singular values (80-90% of ‘energy’)
  – SVD: picks up linear correlations
• SVD: picks up non-zero ‘blobs’
![Page 68: CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #18: SVD - part I (definitions) C. Faloutsos.](https://reader030.fdocuments.net/reader030/viewer/2022032516/56649c745503460f949273e2/html5/thumbnails/68.jpg)
References
• Berry, Michael: http://www.cs.utk.edu/~lsi/
• Fukunaga, K. (1990). Introduction to Statistical Pattern Recognition, Academic Press.
• Press, W. H., S. A. Teukolsky, et al. (1992). Numerical Recipes in C, Cambridge University Press.