Hybrid NMF APSIPA2014 invited


Transcript of Hybrid NMF APSIPA2014 invited

Page 1: Hybrid NMF APSIPA2014 invited

Hybrid Multichannel Signal Separation Using Supervised Nonnegative Matrix Factorization

Daichi Kitamura, (The University of Tokyo, Japan)

Hiroshi Saruwatari, (The University of Tokyo, Japan)

Satoshi Nakamura, (Nara Institute of Science and Technology, Japan)

Yu Takahashi, (Yamaha Corporation, Japan)

Kazunobu Kondo, (Yamaha Corporation, Japan)

Hirokazu Kameoka, (The University of Tokyo, Japan)

The University of Tokyo, YAMAHA

Page 2: Hybrid NMF APSIPA2014 invited

Outline
• 1. Research background
• 2. Conventional methods
  – Nonnegative matrix factorization
  – Supervised nonnegative matrix factorization
  – Multichannel NMF
• 3. Proposed method
  – SNMF with spectrogram restoration and its hybrid method
• 4. Experiments
  – Closed data experiment
  – Open data experiment
• 5. Conclusions

Page 4: Hybrid NMF APSIPA2014 invited

Research background
• Signal separation has received much attention.
  – Applications: automatic music transcription, 3D audio systems, etc.
• Music signal separation based on nonnegative matrix factorization (NMF) is a very active research area.
• Supervised NMF (SNMF) achieves the highest separation performance.
• To improve its performance, an SNMF-based multichannel signal separation method is required.

Separate the target signal from multichannel signals with high accuracy.

Page 6: Hybrid NMF APSIPA2014 invited

NMF [Lee, et al., 2001]
• NMF can extract significant spectral patterns.
  – The basis matrix contains the spectral patterns that appear frequently in the observed matrix.

[Figure: the observed matrix (spectrogram; frequency × time, amplitude) is approximated by the product of the basis matrix (spectral patterns; frequency × basis) and the activation matrix (time-varying gain; basis × time). The matrix sizes are given by the number of frequency bins, the number of time frames, and the number of bases.]
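The factorization on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the names `Y`, `F`, `G`, `nmf_kl`, and `kl_divergence` are assumptions (the slide's own symbols are not in the transcript), and the KL-divergence multiplicative updates are just one common choice of algorithm.

```python
import numpy as np

def kl_divergence(Y, X, eps=1e-12):
    """Generalized KL divergence between Y and X, summed over all entries."""
    return float(np.sum(Y * np.log((Y + eps) / (X + eps)) - Y + X))

def nmf_kl(Y, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Factorize a nonnegative matrix Y (freq x time) as F @ G.

    F (freq x bases) holds spectral patterns; G (bases x time) holds
    time-varying gains. Names and algorithm are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    F = rng.random((n_freq, n_bases)) + eps
    G = rng.random((n_bases, n_time)) + eps
    ones = np.ones_like(Y)
    for _ in range(n_iter):
        # Multiplicative updates keep F and G nonnegative.
        F *= ((Y / (F @ G + eps)) @ G.T) / (ones @ G.T + eps)
        G *= (F.T @ (Y / (F @ G + eps))) / (F.T @ ones + eps)
    return F, G

# Toy "spectrogram" built from two spectral patterns.
rng = np.random.default_rng(1)
Y = rng.random((6, 2)) @ rng.random((2, 8))
F, G = nmf_kl(Y, n_bases=2)
```

With real audio, `Y` would be an amplitude or power spectrogram obtained from a short-time Fourier transform.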

Page 7: Hybrid NMF APSIPA2014 invited

Supervised NMF [Smaragdis, et al., 2007]
• SNMF
  – Supervised spectral separation method
• Training process: a supervised basis matrix (spectral dictionary) is trained from sample sounds of the target signal.
• Separation process: the supervised basis matrix is fixed and the remaining matrices are optimized, so that the mixed signal is decomposed into the target signal and the other signal.
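The two processes can be sketched as follows. This is a simplified illustration under stated assumptions: the names `train_bases`, `separate`, `F`, `G`, `H`, and `U` are hypothetical, and plain KL-divergence multiplicative updates stand in for whatever optimization the cited work uses.

```python
import numpy as np

EPS = 1e-12

def train_bases(Y_train, n_bases, n_iter=200, seed=0):
    """Training process: learn the supervised basis matrix F from target samples."""
    rng = np.random.default_rng(seed)
    F = rng.random((Y_train.shape[0], n_bases)) + EPS
    G = rng.random((n_bases, Y_train.shape[1])) + EPS
    ones = np.ones_like(Y_train)
    for _ in range(n_iter):
        F *= ((Y_train / (F @ G + EPS)) @ G.T) / (ones @ G.T + EPS)
        G *= (F.T @ (Y_train / (F @ G + EPS))) / (F.T @ ones + EPS)
    return F

def separate(Y_mix, F, n_other, n_iter=200, seed=0):
    """Separation process: F stays fixed; only G, H, and U are optimized."""
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y_mix.shape[1])) + EPS
    H = rng.random((Y_mix.shape[0], n_other)) + EPS
    U = rng.random((n_other, Y_mix.shape[1])) + EPS
    ones = np.ones_like(Y_mix)
    for _ in range(n_iter):
        R = Y_mix / (F @ G + H @ U + EPS)
        G *= (F.T @ R) / (F.T @ ones + EPS)
        R = Y_mix / (F @ G + H @ U + EPS)
        H *= (R @ U.T) / (ones @ U.T + EPS)
        R = Y_mix / (F @ G + H @ U + EPS)
        U *= (H.T @ R) / (H.T @ ones + EPS)
    return F @ G, H @ U  # target estimate, other-signal estimate

# Toy data: the target uses two spectral patterns, the interference one.
rng = np.random.default_rng(1)
F_true = rng.random((8, 2)) + 0.1
Y_train = F_true @ rng.random((2, 12))
Y_mix = F_true @ rng.random((2, 10)) + (rng.random((8, 1)) + 0.1) @ rng.random((1, 10))
F = train_bases(Y_train, n_bases=2)
target, other = separate(Y_mix, F, n_other=1)
```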

Page 8: Hybrid NMF APSIPA2014 invited

Problems of SNMF
• SNMF is only for a single-channel signal.
  – For a multichannel signal, SNMF cannot use information between channels.
• When many interference sources exist, the separation performance of SNMF degrades markedly.

[Figure: separating a mixture with many interference sources leaves residual components.]

Page 9: Hybrid NMF APSIPA2014 invited

Multichannel NMF [Sawada, et al., 2013]
• Multichannel NMF
  – is a natural extension of NMF for a multichannel signal
  – uses spatial information for the clustering of bases to achieve the unsupervised separation task.
• Problems: Multichannel NMF depends strongly on initial values and lacks robustness.

[Figure: microphone array]

Page 10: Hybrid NMF APSIPA2014 invited

Outline
• 1. Research background
• 2. Conventional methods
  – Nonnegative matrix factorization
  – Supervised nonnegative matrix factorization
  – Multichannel NMF
• 3. Proposed method
  – Motivation and strategy
  – SNMF with spectrogram restoration and its hybrid method
• 4. Experiments
  – Closed data experiment
  – Open data experiment
• 5. Conclusions

Page 11: Hybrid NMF APSIPA2014 invited

Motivation and strategy
• Sawada's multichannel NMF
  – is a unified method for solving the spatial and spectral separations.
  – maximizes a likelihood: [equation; its annotated parts are the observed spectrograms, the spatial direction of the target signal, and the source components of all signals (target and other)]
  – For the supervised situation, the target spectral patterns are given.
  – Too difficult to solve (lacks robustness)
  – Computationally inefficient (long computation time)

Page 12: Hybrid NMF APSIPA2014 invited

Motivation and strategy
• Proposed hybrid method
  – divides the problem as follows: unsupervised spatial separation (classical D.O.A. estimation) and supervised spectral separation (SNMF-based method), as an approximation of the unified problem.
  – The spatial separation should be carried out with classical D.O.A. estimation methods.
    • These methods are very efficient and stable.
  – Divide-and-conquer method

Page 13: Hybrid NMF APSIPA2014 invited

Directional clustering [Araki, et al., 2007]
• Directional clustering
  – Unsupervised spatial separation method
  – k-means clustering (fast and stable)
• Problems
  – Artificial distortion arises owing to the binary masking.

[Figure: an input stereo signal with sources at the left, center, and right is separated by binary masking. Each time-frequency grid of the spectrogram is labeled L, C, or R, and the separated signal is the entry-wise product of the spectrogram and a binary mask (1 for grids assigned to the selected direction, 0 otherwise).]
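The masking step can be illustrated with a small sketch. It is a deliberate simplification: it clusters only an amplitude-panning angle with 1-D k-means, whereas the cited method may use richer spatial features. The function names and the deterministic centroid initialization are assumptions made for this example.

```python
import numpy as np

def directional_clustering(mag_left, mag_right, n_clusters=3, n_iter=50):
    """Label each time-frequency grid by direction with 1-D k-means.

    The feature is the panning angle arctan2(|R|, |L|) in [0, pi/2]:
    0 is hard left, pi/4 is center, pi/2 is hard right.
    """
    angle = np.arctan2(mag_right, mag_left)
    # Deterministic initialization: centroids spread evenly over the range.
    centroids = np.linspace(0.0, np.pi / 2, n_clusters)
    for _ in range(n_iter):
        # Assign every grid to its nearest centroid.
        labels = np.argmin(np.abs(angle[..., None] - centroids), axis=-1)
        # Move each centroid to the mean angle of its members.
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = angle[labels == k].mean()
    labels = np.argmin(np.abs(angle[..., None] - centroids), axis=-1)
    return labels, centroids

def binary_mask(labels, target_cluster):
    """1 where a grid belongs to the target cluster, 0 elsewhere."""
    return (labels == target_cluster).astype(float)

# Toy stereo magnitudes (2 frequency bins x 3 time frames).
left = np.array([[1.0, 1.0, 0.05], [0.9, 1.0, 0.1]])
right = np.array([[0.05, 1.0, 1.0], [0.1, 1.1, 1.0]])
labels, _ = directional_clustering(left, right)
center_mask = binary_mask(labels, target_cluster=1)
separated_left = center_mask * left  # entry-wise product
```

Grids assigned to other directions become zeros in `separated_left`; these are the lost components that cause the artificial distortion noted above.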

Page 14: Hybrid NMF APSIPA2014 invited

Proposed method: hybrid separation
• Hybrid separation method
  – Input stereo signal (L, R) → Spatial separation method (directional clustering) → SNMF-based separation method (SNMF with spectrogram restoration) → Separated signal

Page 15: Hybrid NMF APSIPA2014 invited

SNMF with spectrogram restoration
• The separated cluster contains spectral holes (lost components).
• The proposed SNMF treats these holes as unseen observations.
• The fittest bases of the supervised basis (dictionary of the target signal) are extrapolated to fix up the holes.

[Figure: separated cluster (frequency × time) with the holes marked.]

Page 16: Hybrid NMF APSIPA2014 invited

SNMF with spectrogram restoration

[Figure: frequency of source components versus direction (left, center, right) for (a) the input signal, (b) the signal after directional clustering (binary masking), and (c) the signal after super-resolution-based SNMF, where the extrapolated components of the target are added.]

[Figure: processing flow. Directional clustering turns the observed spectrogram (target and interference) into the separated cluster; SNMF with spectrogram restoration then extrapolates the supervised spectral bases to obtain the reconstructed data.]

Page 17: Hybrid NMF APSIPA2014 invited

Decomposition model and cost function
• The divergence is defined at all grids except for the holes by using the binary mask matrix.

Decomposition model: [equation], with the supervised bases fixed.

Cost function: [equation]

Defined alongside the equations: the entries of the matrices, the weighting parameters, the binary complement, the Frobenius norm, and the binary masking matrix obtained from directional clustering.

Page 20: Hybrid NMF APSIPA2014 invited

Decomposition model and cost function
• Annotations on the cost function:
  – Binary index to exclude the holes (the binary masking matrix in the main divergence term)
  – Regularization term
  – Penalty term [Kitamura, et al. 2014]
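The equations on this slide are not in the transcript. One plausible form, consistent with the annotated terms (a masked divergence, a regularization term with a weighting parameter, and a Frobenius-norm penalty), is sketched below. All symbols are assumptions made for illustration: Y is the separated cluster, F the fixed supervised bases, G, H, and U the variable matrices, I the binary mask with entries i, an overline the binary complement, and λ and μ the weighting parameters.

```latex
% Assumed decomposition model (F fixed; G, H, U optimized)
\boldsymbol{Y} \simeq \boldsymbol{F}\boldsymbol{G} + \boldsymbol{H}\boldsymbol{U}

% Assumed cost function: main term on observed grids,
% regularization term on the holes, penalty term on F and H
\mathcal{J} = \sum_{\omega,t} i_{\omega,t}\,
  \mathcal{D}\Bigl( y_{\omega,t} \,\Big\|\, \sum_{k} f_{\omega,k}\, g_{k,t} + \sum_{l} h_{\omega,l}\, u_{l,t} \Bigr)
  + \lambda \sum_{\omega,t} \overline{i_{\omega,t}}\,
  \mathcal{D}\Bigl( 0 \,\Big\|\, \sum_{k} f_{\omega,k}\, g_{k,t} \Bigr)
  + \mu \bigl\| \boldsymbol{F}^{\mathsf{T}} \boldsymbol{H} \bigr\|_{\mathrm{Fr}}^{2}
```

Read this way, the first term fits the model only where observations exist, the second discourages excessive extrapolation at the holes, and the third discourages the other bases H from resembling the supervised bases F.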

Page 21: Hybrid NMF APSIPA2014 invited

Generalized divergence: β-divergence
• β-divergence [Eguchi, et al., 2001]
  – β = 2: EUC-distance
  – β = 1: KL-divergence, the best criterion for signal separation [Kitamura, et al., 2014]
  – β = 0: IS-divergence
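For reference, the β-divergence in its widely used parameterization can be written as a small function. That the slides use exactly this parameterization is an assumption; the function name `beta_divergence` is hypothetical.

```python
import numpy as np

def beta_divergence(y, x, beta):
    """Entry-wise beta-divergence D_beta(y || x) for positive y and x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if beta == 1:
        # Limit beta -> 1: (generalized) KL-divergence.
        return y * np.log(y / x) - y + x
    if beta == 0:
        # Limit beta -> 0: IS-divergence.
        return y / x - np.log(y / x) - 1.0
    # General case; beta = 2 gives half the squared Euclidean distance.
    return (y**beta + (beta - 1) * x**beta - beta * y * x**(beta - 1)) / (beta * (beta - 1))

d_euc = beta_divergence(3.0, 1.0, 2)
d_kl = beta_divergence(3.0, 1.0, 1)
d_is = beta_divergence(3.0, 1.0, 0)
```

The KL and IS cases are the limits of the general expression, so values of β close to 1 or 0 give nearly the same result as the dedicated branches.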

Page 22: Hybrid NMF APSIPA2014 invited

Decomposition model and cost function
• We used two β-divergences with separate β values, one for the main cost and one for the regularization cost.

Decomposition model: [equation], with the supervised bases fixed.

Cost function: [equation]

Page 23: Hybrid NMF APSIPA2014 invited

Update rules
• We can obtain the update rules for the optimization of the three variable matrices.

Update rules: [equations]
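The update rules themselves are not in the transcript. As a hedged illustration of the idea, the sketch below fits the model only on observed grids and then reads the target estimate off at every grid, including the holes. It is not the authors' algorithm: it uses the KL-divergence only, omits the regularization and penalty terms, and all names (`snmf_restore`, `masked_kl`, `F`, `G`, `H`, `U`, `mask`) are assumptions.

```python
import numpy as np

EPS = 1e-12

def masked_kl(Y, X, mask):
    """KL divergence summed only over grids where mask == 1 (holes excluded)."""
    return float(np.sum(mask * (Y * np.log((Y + EPS) / (X + EPS)) - Y + X)))

def snmf_restore(Y, mask, F, n_other, n_iter=100, seed=0):
    """Fit G, H, U on observed grids only, keeping the supervised bases F fixed.

    Returns the target estimate F @ G, which is also defined at the holes,
    and the masked cost after every iteration.
    """
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y.shape[1])) + EPS
    H = rng.random((Y.shape[0], n_other)) + EPS
    U = rng.random((n_other, Y.shape[1])) + EPS
    costs = []
    for _ in range(n_iter):
        # Masked multiplicative updates: holes contribute to neither side.
        R = mask * Y / (F @ G + H @ U + EPS)
        G *= (F.T @ R) / (F.T @ mask + EPS)
        R = mask * Y / (F @ G + H @ U + EPS)
        H *= (R @ U.T) / (mask @ U.T + EPS)
        R = mask * Y / (F @ G + H @ U + EPS)
        U *= (H.T @ R) / (H.T @ mask + EPS)
        costs.append(masked_kl(Y, F @ G + H @ U, mask))
    return F @ G, costs

# Toy data: a target built from F plus a small residual, with random holes.
rng = np.random.default_rng(0)
F = rng.random((8, 3)) + 0.1
Y = F @ rng.random((3, 10)) + 0.05 * rng.random((8, 10))
mask = (rng.random((8, 10)) > 0.3).astype(float)
restored, costs = snmf_restore(Y, mask, F, n_other=2)
```

Because `F @ G` is evaluated at every grid, the activations learned from the observed grids extrapolate the supervised bases into the holes.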

Page 25: Hybrid NMF APSIPA2014 invited

Experimental condition
• The mixed signal includes four melodies (sources).
• Three compositions of instruments
  – We evaluated the average score of 36 patterns.
• Supervision signal: 24 notes that cover all the notes in the target melody

[Figure: spatial arrangement of the sources at the left, center, and right, with the target source indicated.]

Dataset   Melody 1   Melody 2   Midrange      Bass
No. 1     Oboe       Flute      Piano         Trombone
No. 2     Trumpet    Violin     Harpsichord   Fagotto
No. 3     Horn       Clarinet   Piano         Cello

Page 26: Hybrid NMF APSIPA2014 invited

Experimental result: closed data
• Signal-to-distortion ratio (SDR)
  – total quality of the separation, which includes the degree of separation and the absence of artificial distortion.

[Figure: SDR [dB] (0 to 14; higher is better) versus β_NMF (0 to 4, with KL-divergence and EUC-distance marked) for conventional SNMF (single-channel SNMF), the proposed hybrid method, directional clustering, and supervised multichannel NMF [Sawada].]
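As a rough illustration of what an SDR value measures, the sketch below computes the ratio of reference energy to error energy in decibels. This is a simplified stand-in: standard evaluation toolkits decompose the estimate into target, interference, and artifact components before taking the ratio, and the slides do not say which implementation was used. The function name `simple_sdr` is hypothetical.

```python
import numpy as np

def simple_sdr(reference, estimate):
    """Energy ratio between the reference and the estimation error, in dB.

    Every deviation from the reference counts as distortion here.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(error**2))

reference = np.array([1.0, 0.0, -1.0, 0.0])
estimate = 0.9 * reference  # a 10 % amplitude error
sdr_db = simple_sdr(reference, estimate)  # about 20 dB
```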

Page 27: Hybrid NMF APSIPA2014 invited

SNMF with spectrogram restoration
• SNMF with spectrogram restoration has two tasks: source separation and basis extrapolation.
• The optimal divergence for source separation is KL-divergence (β = 1).
• In contrast, a divergence with a higher β value is suitable for the basis extrapolation.

Page 28: Hybrid NMF APSIPA2014 invited

Trade-off: separation and restoration
• The optimal divergence for SNMF with spectrogram restoration and its hybrid method is based on the trade-off between separation and restoration abilities.

[Figure: two spectra, amplitude [dB] (−10 to 0) versus frequency [kHz] (0 to 5), one with strong sparseness and one with weak sparseness.]

[Figure: performance versus the divergence parameter (0 to 4) for separation, restoration, and the total performance of the hybrid method.]

Page 29: Hybrid NMF APSIPA2014 invited

Experimental condition
• Open data experiment
  – used different tone generators for the training and test signals
  – Supervision signal: 24 notes that cover all the notes in the target melody, provided by tone generator A
  – Test signal: provided by tone generator B (more realistic sound) + background noise (SNR = 10 dB)

[Figure: spatial arrangement of the sources at the left, center, and right, with the target source indicated.]

Page 30: Hybrid NMF APSIPA2014 invited

Experimental result: open data
• Signal-to-distortion ratio (SDR)
  – total quality of the separation, which includes the degree of separation and the absence of artificial distortion.

[Figure: SDR [dB] (−4 to 10; higher is better) versus β_NMF (0 to 4, with KL-divergence and EUC-distance marked) for conventional SNMF (single-channel SNMF), the proposed hybrid method, directional clustering, and supervised multichannel NMF [Sawada].]

Page 31: Hybrid NMF APSIPA2014 invited

Conclusions
• We proposed a hybrid multichannel signal separation method combining directional clustering and SNMF with spectrogram restoration.
• There is a trade-off between separation and restoration abilities.

Thank you for your attention!

Demonstration is available!