Extensions of Non-negative Matrix Factorization to Higher Order data


Extensions of Non-negative Matrix Factorization to Higher Order data. Morten Mørup, Informatics and Mathematical Modeling, Intelligent Signal Processing, Technical University of Denmark. Sæby, May 22, 2006. Parts of the work done in collaboration with Sidse M. Arnfred, Dr. Med., PhD.


Informatics and Mathematical Modelling / Intelligent Signal Processing

Extensions of Non-negative Matrix Factorization to Higher Order data

Morten Mørup
Informatics and Mathematical Modeling, Intelligent Signal Processing
Technical University of Denmark


Parts of the work done in collaboration with:

Lars Kai Hansen, Professor, Department of Signal Processing, Informatics and Mathematical Modeling, Technical University of Denmark

Mikkel N. Schmidt, PhD student, Department of Signal Processing, Informatics and Mathematical Modeling, Technical University of Denmark

Sidse M. Arnfred, Dr. Med., PhD, Cognitive Research Unit, Hvidovre Hospital, University Hospital of Copenhagen

Sæby, May 22, 2006


Outline
– Non-negative Matrix Factorization (NMF)
– Sparse coding NMF (SNMF)
– Convolutive PARAFAC models (cPARAFAC)
– Higher Order Non-negative Matrix Factorization (an extension of NMF to the Tucker model)


NMF is based on Gradient Descent

NMF: $V \approx WH$ s.t. $W_{i,d}, H_{d,j} \geq 0$

Let C be a given cost function, then update the parameters according to:
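A generic gradient descent step for this setting can be written as below. This is a sketch of the standard form, not necessarily the exact expression on the slide; the step-size symbols $\eta_W$ and $\eta_H$ are assumptions introduced here for illustration.

```latex
W_{i,d} \leftarrow W_{i,d} - \eta_{W} \frac{\partial C}{\partial W_{i,d}},
\qquad
H_{d,j} \leftarrow H_{d,j} - \eta_{H} \frac{\partial C}{\partial H_{d,j}}
```

With a fixed step size such a step can push entries below zero, which is what the multiplicative form of the updates avoids.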


The idea behind multiplicative updates

[Equation: gradient of the cost, annotated with its positive term and its negative term]
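One common way to formalize this idea is sketched below, assuming the gradient of the cost splits into a positive and a negative term. The parameter symbol $\theta$ (standing for any entry of W or H), the step size $\eta$ and the bracket notation are assumptions made for illustration.

```latex
\frac{\partial C}{\partial \theta}
  = \left[\frac{\partial C}{\partial \theta}\right]^{+}
  - \left[\frac{\partial C}{\partial \theta}\right]^{-},
\qquad
\eta = \frac{\theta}{\left[\frac{\partial C}{\partial \theta}\right]^{+}}
\quad\Longrightarrow\quad
\theta \leftarrow \theta - \eta \frac{\partial C}{\partial \theta}
  = \theta \,
    \frac{\left[\frac{\partial C}{\partial \theta}\right]^{-}}
         {\left[\frac{\partial C}{\partial \theta}\right]^{+}}
```

Because both bracketed terms are non-negative, a non-negative $\theta$ stays non-negative after the update.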


Non-negative matrix factorization (NMF)

(Lee & Seung - 2001)

NMF gives a part-based representation (Lee & Seung, Nature 1999)
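As a minimal sketch of NMF with multiplicative updates, the code below uses the well-known least squares updates from Lee & Seung. The function name `nmf_ls`, its parameters and the small constant `eps` are assumptions chosen for this example, not part of the talk.

```python
import numpy as np

def nmf_ls(V, n_components, n_iter=200, eps=1e-9, seed=0):
    """Least squares NMF, V ~ W @ H, by multiplicative updates (illustrative names)."""
    rng = np.random.default_rng(seed)
    # Random non-negative initialization.
    W = rng.random((V.shape[0], n_components)) + eps
    H = rng.random((n_components, V.shape[1])) + eps
    errors = [np.linalg.norm(V - W @ H)]
    for _ in range(n_iter):
        # Each update multiplies by (negative gradient term) / (positive gradient term).
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors

# Synthetic non-negative data with an exact rank-3 structure.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))
W, H, errors = nmf_ls(V, n_components=3)
print("initial error:", errors[0], "final error:", errors[-1])
```

The factors never leave the non-negative orthant because every update is a product of non-negative quantities.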


The NMF decomposition is not unique

Simplicial cone: $V \approx WH = (WP)(P^{-1}H) = \tilde{W}\tilde{H}$

NMF is only unique when the data adequately span the positive orthant (Donoho & Stodden, 2004)
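The non-uniqueness can be illustrated numerically. In the sketch below, the matrix `P` and the strictly positive test factors are assumptions chosen so that both transformed factors stay non-negative; any invertible `P` with that property gives a second valid factorization.

```python
import numpy as np

rng = np.random.default_rng(0)
# Strictly positive factors (entries in [0.5, 1.5)), so the data do not touch
# the boundary of the positive orthant.
W = rng.random((6, 2)) + 0.5
H = rng.random((2, 8)) + 0.5
V = W @ H

# A mild mixing of the two components; its inverse has small negative entries.
P = np.array([[1.0, 0.1],
              [0.1, 1.0]])
W_tilde = W @ P
H_tilde = np.linalg.inv(P) @ H

print("same data:", np.allclose(W_tilde @ H_tilde, V))
print("still non-negative:", W_tilde.min() >= 0 and H_tilde.min() >= 0)
```

Both factor pairs reproduce V exactly and are non-negative, yet they differ by more than a scaling or permutation of the components.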


Sparse Coding NMF (SNMF)

(Eggert & Körner, 2004)
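A sparse coding cost of the kind typically used for SNMF adds a penalty on H to the least squares error. The sketch below assumes an L1-type penalty with weight $\lambda$ and a column-normalized $\tilde{W}$; both the symbol $\lambda$ and the tilde notation are assumptions for illustration, not taken from the slide.

```latex
C = \frac{1}{2} \sum_{i,j} \Big( V_{i,j} - \sum_{d} \tilde{W}_{i,d} H_{d,j} \Big)^{2}
    + \lambda \sum_{d,j} H_{d,j},
\qquad
\tilde{W}_{i,d} = \frac{W_{i,d}}{\sqrt{\sum_{i'} W_{i',d}^{2}}}
```

The normalization of W matters: without it, the penalty could be made arbitrarily small by shrinking H and growing W by the same factor.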


Illustration (the swimmer problem)

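[Figure: swimmer articulations, with panels True Expressions, NMF Expressions and SNMF Expressions]

$V^{(\text{Articulation} \times \text{pixel})} \approx W^{(\text{Articulation} \times \text{Expression})} H^{(\text{Expression} \times \text{pixel})}$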


Why sparseness?
– Ensures uniqueness
– Eases interpretability (in a sparse representation, factor effects pertain to fewer dimensions)
– Can work as model selection (sparseness can turn off excess factors by letting them become zero)
– Resolves overcomplete representations (when the model has many more free variables than data points)


PART I: Convolutive PARAFAC (cPARAFAC)


cPARAFAC means PARAFAC that is convolutive in at least one modality

[Equations: Regular PARAFAC, cPARAFAC and c2PARAFAC, each expressing $V_{i,j,k}$ through the factors $D$, $W$ and $H$]

Convolution can be in any combination of modalities: single convolutive, double convolutive, etc.

[Equation: $X_{i,j}$ expressed through the filter $A$ and the sources $S$]

Convolution: The process of generating X by convolving (sending) the sources S through the filter A

Deconvolution: The process of estimating the filter A from X and S
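The two definitions can be illustrated in one dimension. The sketch below is a plain least squares example under assumed names (`convolve_sources`, `estimate_filter`) and an assumed toy filter; it is not the non-negative, multi-way algorithm of the talk.

```python
import numpy as np

def convolve_sources(a, s):
    """Convolution: generate x by sending the source s through the filter a."""
    return np.convolve(s, a)  # full convolution, length len(s) + len(a) - 1

def estimate_filter(x, s, filter_len):
    """Deconvolution: least squares estimate of the filter a from x and s."""
    n = len(s) + filter_len - 1
    S = np.zeros((n, filter_len))
    for tau in range(filter_len):
        # Column tau holds the source delayed by tau samples.
        S[tau:tau + len(s), tau] = s
    a_hat, *_ = np.linalg.lstsq(S, x, rcond=None)
    return a_hat

rng = np.random.default_rng(0)
s = rng.random(50)                    # source signal
a = np.array([1.0, 0.0, 0.5, 0.25])   # direct path plus two echoes (toy filter)
x = convolve_sources(a, s)
a_hat = estimate_filter(x, s, filter_len=len(a))
print(np.round(a_hat, 6))
```

With noise-free data and a known source, the filter is recovered exactly; the harder problem addressed in the talk is estimating both factors at once.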


Relation to other models
– PARAFAC2 (Harshman, Kiers, Bro)
– Shifted PARAFAC (Hong and Harshman, 2003)

[Equation: $V_{i,j,k}$ expressed through the factors $D$, $W$ and $H$]

– cPARAFAC can account for echo effects
– cPARAFAC becomes shifted PARAFAC when the convolutive filter is sparse
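One plausible way to write the models is sketched below. The component index $d$, the lags $\tau$ and $\phi$, and the placement of the shifts are assumptions made for illustration; only the symbols $V$, $D$, $W$, $H$ and the indices $i, j, k$ come from the slides.

```latex
\begin{aligned}
\text{Regular PARAFAC:}\quad & V_{i,j,k} \approx \sum_{d} D_{k,d}\, W_{i,d}\, H_{d,j} \\
\text{cPARAFAC:}\quad & V_{i,j,k} \approx \sum_{d,\tau} D_{k,d}\, W^{\tau}_{i,d}\, H_{d,j-\tau} \\
\text{c2PARAFAC:}\quad & V_{i,j,k} \approx \sum_{d,\tau,\phi} D_{k,d}\, W^{\tau}_{i-\phi,d}\, H^{\phi}_{d,j-\tau}
\end{aligned}
```

Under this notation, several non-zero lags $\tau$ produce delayed copies (echoes), while a filter with a single non-zero lag $\tau_{i,d}$ for each $(i,d)$ collapses the sum over $\tau$ into a pure shift, $V_{i,j,k} \approx \sum_d D_{k,d} W_{i,d} H_{d,j-\tau_{i,d}}$.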


Application example of cPARAFAC

Transcription and separation of music


The ‘ideal’ Log-frequency Magnitude Spectrogram of an instrument

Different notes played by an instrument correspond, on a logarithmic frequency scale, to a translation of the same harmonic structure of a fixed temporal pattern

[Figure: log-frequency spectrograms, Time [s] versus Frequency [Hz] (200 to 3200 Hz), for Tchaikovsky: Violin Concerto in D Major and Mozart: Sonata no. 16 in C Major]
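The translation property follows from $\log(k f_0) = \log k + \log f_0$ and can be checked numerically. In the sketch below, the function name and the two fundamentals (440 Hz and 880 Hz) are assumptions chosen for the example.

```python
import numpy as np

def harmonic_log_positions(f0, n_harmonics=5):
    """log2 frequencies of the first harmonics of a note with fundamental f0 in Hz."""
    k = np.arange(1, n_harmonics + 1)
    return np.log2(k * f0)

low_note = harmonic_log_positions(440.0)
high_note = harmonic_log_positions(880.0)

# On the log axis every harmonic moves by the same amount (here one octave),
# so the harmonic pattern is translated, not stretched.
shift = high_note - low_note
print(shift)
```

On a linear frequency axis the spacing between harmonics would instead grow with the fundamental, so no single shift would map one note onto another.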


NMF 2D deconvolution (NMF2D¹): The Basic Idea

Model a log-spectrogram of polyphonic music by an extended type of non-negative matrix factorization:
– The frequency signature of a specific note played by an instrument has a fixed temporal pattern (echo), so the model is convolutive in time
– Different notes of the same instrument have the same time-log-frequency signature but vary in fundamental frequency (shift), so the model is convolutive in the log-frequency axis

(¹ Mørup & Schmidt, 2006)


NMF2D Model

NMF2D Model – extension of NMFD¹:

(¹ Smaragdis, 2004; Eggert et al., 2004; Fitzgerald et al., 2005)
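A minimal sketch of how such a doubly convolutive model can build a spectrogram is given below, assuming the form in which each time lag `tau` has its own basis `W[tau]`, each frequency shift `phi` has its own activation `H[phi]`, and the reconstruction sums the shifted products. The array layout and function names are assumptions for illustration.

```python
import numpy as np

def shift_down(M, n):
    """Shift the rows of M down by n (towards higher frequency bins), zero filled."""
    out = np.zeros_like(M)
    if n == 0:
        out[:] = M
    else:
        out[n:, :] = M[:-n, :]
    return out

def shift_right(M, n):
    """Shift the columns of M right by n (later in time), zero filled."""
    out = np.zeros_like(M)
    if n == 0:
        out[:] = M
    else:
        out[:, n:] = M[:, :-n]
    return out

def nmf2d_reconstruct(W, H):
    """W: (n_tau, n_freq, n_comp), H: (n_phi, n_comp, n_time) -> (n_freq, n_time)."""
    n_tau, n_freq, _ = W.shape
    n_phi, _, n_time = H.shape
    Lam = np.zeros((n_freq, n_time))
    for tau in range(n_tau):
        for phi in range(n_phi):
            Lam += shift_down(W[tau], phi) @ shift_right(H[phi], tau)
    return Lam

# One instrument: energy in frequency bin 1, followed by a weaker echo one frame later.
W = np.zeros((2, 6, 1))
W[0, 1, 0] = 1.0
W[1, 1, 0] = 0.5
# Two notes: unshifted pitch at time 0, pitch shifted up by 2 bins at time 3.
H = np.zeros((3, 1, 8))
H[0, 0, 0] = 1.0
H[2, 0, 3] = 1.0
Lam = nmf2d_reconstruct(W, H)
print(Lam)
```

The same time-frequency signature in W appears twice in the result, once per note, moved along the frequency axis by the shift stored in H.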


Understanding the NMF2D Model

[Figure: Time [s] versus Frequency [Hz] (200 to 3200 Hz)]


The NMF2D model has an inherent ambiguity between the structure in W and H

To resolve this ambiguity, sparsity is imposed on H to force the ambiguous structure onto W


Real music example of how imposing sparseness resolves the ambiguity between W and H

[Figure panels: NMF2D, SNMF2D]


Extension to multi channel analysis by the PARAFAC model

Factor analysis (Charles Spearman, ~1900): Not unique

PARAFAC (Harshman; Carroll and Chang, 1970): Unique!!


cPARAFAC: Sparse Non-negative Tensor Factor 2D deconvolution (SNTF2D)

(Extension of Fitzgerald et al. 2005, 2006 to form a sparse double deconvolution)


SNTF2D algorithms


Tchaikovsky: Violin Concerto in D Major; Mozart: Sonata no. 16 in C Major


Stereo recording of "Fog is Lifting" by Carl Nielsen

[Figure: Log-Spectrogram Channel 1 and Log-Spectrogram Channel 2 of Stereo Channel 1 and Stereo Channel 2, with the Estimated Harp and Estimated Flute; frequency axis 50 Hz to 22 kHz, time scale 25.9 ms; values shown on the figure: 6850, 0.7286, 0.4209, 0.9071]


Applications
– Source separation
– Music information retrieval
– Automatic music transcription (MIDI compression)
– Source localization (beam forming)


PART II: Higher Order NMF (HONMF)


Higher Order Non-negative Matrix Factorization (HONMF)

Motivation: Many of the data sets previously explored by the Tucker model are non-negative and could with good reason be decomposed under constraints of non-negativity on all modalities, including the core.

– Spectroscopy data (Smilde et al. 1999, 2004; Andersson & Bro 1998; Nørgaard & Ridder 1994): X (Strength) is Batch × Spectrum × Time
– Web mining (Sun et al., 2004): X (Click counts) is Users × Queries × Web pages
– Image Analysis (Vasilescu and Terzopoulos, 2002; Wang and Ahuja, 2003; Jian and Gong, 2005): X (Image Intensity) is People × Views × Illuminations × Expressions × Pixels
– Semantic Differential Data (Murakami and Kroonenberg, 2003): X (Grade) is Judges × Music Pieces × Scales
– And many more...


However, non-negative Tucker decompositions are not in general unique!

But imposing sparseness overcomes this problem!


The Tucker Model
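As a sketch of the three-way Tucker model, the code below reconstructs a tensor from a core and one factor matrix per modality, $X_{i,j,k} \approx \sum_{l,m,n} G_{l,m,n} A_{i,l} B_{j,m} C_{k,n}$. The names `G`, `A`, `B`, `C` and `tucker_reconstruct` are assumptions chosen for this example.

```python
import numpy as np

def tucker_reconstruct(G, A, B, C):
    """X[i,j,k] = sum over l,m,n of G[l,m,n] * A[i,l] * B[j,m] * C[k,n]."""
    return np.einsum('lmn,il,jm,kn->ijk', G, A, B, C)

rng = np.random.default_rng(0)
G = rng.random((2, 3, 2))   # core: interactions between components of each modality
A = rng.random((4, 2))      # factors of modality 1
B = rng.random((5, 3))      # factors of modality 2
C = rng.random((6, 2))      # factors of modality 3
X = tucker_reconstruct(G, A, B, C)

# Special case: a superdiagonal core gives the PARAFAC model.
G_diag = np.zeros((2, 2, 2))
G_diag[0, 0, 0] = G_diag[1, 1, 1] = 1.0
B2 = rng.random((5, 2))
X_parafac = tucker_reconstruct(G_diag, A, B2, C)
print(X.shape, X_parafac.shape)
```

The PARAFAC special case is why a sparse core can act as model selection: if the off-diagonal core entries are driven to zero, the fitted Tucker model is effectively a PARAFAC model.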


Algorithms for HONMF
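A minimal sketch of one possible algorithm is given below: least squares multiplicative updates for a three-way non-negative Tucker model, obtained by treating each factor and the core in turn as the unknown of an NMF-like subproblem. It omits the sparsity penalties discussed in the talk, and the function name `honmf_ls`, its parameters and the constant `eps` are assumptions for illustration.

```python
import numpy as np

def honmf_ls(X, core_shape, n_iter=100, eps=1e-9, seed=0):
    """Non-negative Tucker fit of a 3-way array by LS multiplicative updates (sketch)."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    L, M, N = core_shape
    A, B, C = rng.random((I, L)), rng.random((J, M)), rng.random((K, N))
    G = rng.random((L, M, N))
    errors = []
    for _ in range(n_iter):
        # Modality 1: X[i,j,k] ~ sum_l A[i,l] Z[l,j,k], with Z built from G, B, C.
        Z = np.einsum('lmn,jm,kn->ljk', G, B, C)
        A *= np.einsum('ijk,ljk->il', X, Z) / (A @ np.einsum('ljk,pjk->lp', Z, Z) + eps)
        # Modality 2: X[i,j,k] ~ sum_m B[j,m] Z[i,m,k].
        Z = np.einsum('lmn,il,kn->imk', G, A, C)
        B *= np.einsum('ijk,imk->jm', X, Z) / (B @ np.einsum('imk,ipk->mp', Z, Z) + eps)
        # Modality 3: X[i,j,k] ~ sum_n C[k,n] Z[i,j,n].
        Z = np.einsum('lmn,il,jm->ijn', G, A, B)
        C *= np.einsum('ijk,ijn->kn', X, Z) / (C @ np.einsum('ijn,ijp->np', Z, Z) + eps)
        # Core: project X and the current model onto the factors.
        num = np.einsum('ijk,il,jm,kn->lmn', X, A, B, C)
        den = np.einsum('lmn,pl,qm,rn->pqr', G, A.T @ A, B.T @ B, C.T @ C)
        G *= num / (den + eps)
        model = np.einsum('lmn,il,jm,kn->ijk', G, A, B, C)
        errors.append(np.linalg.norm(X - model))
    return G, A, B, C, errors

# Synthetic non-negative data generated from a small Tucker model.
rng = np.random.default_rng(1)
X = np.einsum('lmn,il,jm,kn->ijk', rng.random((2, 2, 2)),
              rng.random((8, 2)), rng.random((9, 2)), rng.random((10, 2)))
G, A, B, C, errors = honmf_ls(X, core_shape=(2, 2, 2))
print("error after first and last iteration:", errors[0], errors[-1])
```

A sparse variant would add a penalty on the chosen factors or the core to the cost; that step is not sketched here.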


Results

HONMF with sparseness, above imposed on the core, can be used for model selection, here indicating that the PARAFAC model is the appropriate model for the data. Furthermore, HONMF gives a more part-based, hence more easily interpretable, solution than the HOSVD.


Evaluation of uniqueness


Data of a Flow Injection Analysis (Nørgaard, 1994)

HONMF with sparse core and mixing captures, unsupervised, the true mixing and model order!


Conclusion
– HONMF is not in general unique; however, when imposing sparseness, uniqueness can be achieved.
– Algorithms devised for LS and KL are able to impose sparseness on any combination of modalities.
– The HONMF decompositions are more part-based, hence easier to interpret, than other Tucker decompositions such as the HOSVD.
– Imposing sparseness can work as model selection, turning off excess components.


Coming soon in a MATLAB implementation near You


References

Carroll, J. D. and Chang, J. J. Analysis of individual differences in multidimensional scaling via an N-way generalization of "Eckart-Young" decomposition. Psychometrika 35, pages 283-319, 1970

Eggert, J. and Körner, E. Sparse coding and NMF. In Neural Networks, volume 4, pages 2529-2533, 2004

Eggert, J. et al. Transformation-invariant representation and NMF. In Neural Networks, volume 4, pages 2535-2539, 2004

FitzGerald, D. et al. Non-negative tensor factorization for sound source separation. In Proceedings of the Irish Signals and Systems Conference, 2005

FitzGerald, D. and Coyle, E. C. Sound source separation using shifted non-negative tensor factorization. In ICASSP2006, 2006

FitzGerald, D. et al. Shifted non-negative matrix factorization for sound source separation. In Proceedings of the IEEE conference on Statistics in Signal Processing, 2005

Harshman, R. A. Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-modal factor analysis. UCLA Working Papers in Phonetics 16, pages 1-84, 1970

Harshman, R. A. and Hong, S. and Lundy, M. E. Shifted factor analysis, Part I: Models and properties. Journal of Chemometrics 17, pages 379-388, 2003

Kiers, H. A. L. and ten Berge, J. M. F. and Bro, R. PARAFAC2, Part I. A direct fitting algorithm for the PARAFAC2 model. Journal of Chemometrics 13, no. 3-4, pages 275-294, 1999

De Lathauwer, L. and De Moor, B. and Vandewalle, J. Multilinear singular value decomposition. SIAM J. Matrix Anal. Appl. 21, pages 1253-1278, 2000

Lee, D. D. and Seung, H. S. Algorithms for non-negative matrix factorization. In NIPS, pages 556-562, 2000

Lee, D. D. and Seung, H. S. Learning the parts of objects by non-negative matrix factorization. Nature, 1999

Murakami, T. and Kroonenberg, P. M. Three-Mode Models and Individual Differences in Semantic Differential Data. Multivariate Behavioral Research 38, no. 2, pages 247-283, 2003

Mørup, M. and Hansen, L. K. and Arnfred, S. M. Decomposing the time-frequency representation of EEG using nonnegative matrix and multi-way factorization. Technical report, Institute for Mathematical Modeling, Technical University of Denmark, 2006a

Mørup, M. and Schmidt, M. N. Sparse non-negative matrix factor 2-D deconvolution. Technical report, Institute for Mathematical Modeling, Technical University of Denmark, 2006b

Mørup, M. and Schmidt, M. N. Non-negative Tensor Factor 2D Deconvolution for multi-channel time-frequency analysis. Technical report, Institute for Mathematical Modeling, Technical University of Denmark, 2006c

Schmidt, M. N. and Mørup, M. Non-negative matrix factor 2D deconvolution for blind single channel source separation. In ICA2006, pages 700-707, 2006d

Mørup, M. and Hansen, L. K. and Arnfred, S. M. Algorithms for Sparse Higher Order Non-negative Matrix Factorization (HONMF). Technical report, Institute for Mathematical Modeling, Technical University of Denmark, 2006e

Nørgaard, L. and Ridder, C. Rank annihilation factor analysis applied to flow injection analysis with photodiode-array detection. Chemometrics and Intelligent Laboratory Systems 23, pages 107-114, 1994

Schmidt, M. N. and Mørup, M. Sparse Non-negative Matrix Factor 2-D Deconvolution for Automatic Transcription of Polyphonic Music. Technical report, Institute for Mathematical Modelling, Technical University of Denmark, 2005

Smaragdis, P. Non-negative Matrix Factor deconvolution; Extraction of multiple sound sources from monophonic inputs. International Symposium on Independent Component Analysis and Blind Source Separation (ICA)

Smilde, A. K. and Tauler, R. and Saurina, J. and Bro, R. Calibration methods for complex second-order data. Analytica Chimica Acta, pages 237-251, 1999

Sun, J.-T. and Zeng, H.-J. and Liu, H. and Lu, Y. and Chen, Z. CubeSVD: a novel approach to personalized Web search. WWW '05: Proceedings of the 14th international conference on World Wide Web, pages 382-390, 2005

Kolda, T. G. Multilinear operators for higher-order decompositions. Technical report SAND2006-2081, Sandia National Laboratories, 2006

Tucker, L. R. Some mathematical notes on three-mode factor analysis. Psychometrika 31, pages 279-311, 1966

Welling, M. and Weber, M. Positive tensor factorization. Pattern Recogn. Lett., 2001

Vasilescu, M. A. O. and Terzopoulos, D. Multilinear Analysis of Image Ensembles: TensorFaces. ECCV '02: Proceedings of the 7th European Conference on Computer Vision, Part I, 2002