Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding •...

31
SIP - Sound and Image Processing Lab, EE, KTH Stockholm 1 FlexCode The Sensitivity Matrix Integrating Perception into the Flexcode Project Jan Plasberg Flexcode Seminar Lannion June 6, 2007

Transcript of Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding •...

Page 1: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 1

FlexCode

The Sensitivity Matrix

Integrating Perception into the Flexcode Project

Jan PlasbergFlexcode Seminar Lannion

June 6, 2007

Page 2: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 2

FlexCode Outline

• Flexcode in a Nutshell• The Sensitivity Matrix• Coding with the Sensitivity Matrix• Open Issues

Page 3: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 3

FlexCode Who?

France Telecom

RWTH Aachen

KTH

NokiaEricsson

Page 4: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 4

FlexCode The Problem

• Heterogeneity of networks increasing• Networks inherently variable (mobile users)

• But:– Coders not designed for specific environment– Coders inflexible (codebooks and FEC)– Feedback channel underutilized

Page 5: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 5

FlexCode Adaptation and Coding

set of models, fixed quantizer

flexible

rigid

CELP with FEC

fixed model

fixed quantizer

adaptive coding

AMR

FlexCode

PCM

Page 6: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 6

FlexCode The Tools

• Tools include– Models of source, channel, receiver– High-rate quantization theory– Multiple description coding (MDC)– Iterative source-channel decoding– Distortion measures using the sensitivity matrix

Page 7: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 7

FlexCode Outline

• Flexcode in a Nutshell• The Sensitivity Matrix

– Why, what and how?– What can it tell us?

• Coding with the Sensitivity Matrix• Open Issues

Page 8: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 8

FlexCode

• Simultaneous masking (frequency masking)

Human Auditory Masking

Page 9: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 9

FlexCode

• Non-simultaneous masking (time masking)

• Not accounted for by simple auditory models• Coding standards use (heuristic) work-arounds

Human Auditory Masking

Page 10: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 10

FlexCode

• Merge the worlds ofauditory modeling and audio coding

• Advanced auditory models too complex for coding

AM

AM+

x

x̂⋅ )ˆ,( xxd

perceptual distortion criterion

Motivation

Page 11: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 11

FlexCode The Sensitivity Matrix

• Sensitivity Matrix M from Taylor expansion

Complexity problem solved!

Analysis of model properties using linear algebra• Masking curve• Subspace-based analysis

0)ˆ,(),ˆ()()ˆ(21)ˆ,( →−−≈ xxxxxMxxxx dd x

H

xx

xxxM=

∂∂∂

2

, ˆˆ)ˆ,()]([

jijix xx

d

Page 12: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 12

FlexCode Application to an Auditory Model

• A typical auditory model (here: the Dau model)

• Distortion is sum of per-channel distortions

Filte

rban

k

Stage 1

Stage 2

Stage KL yx

channel m

Page 13: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 13

FlexCode Application to an Auditory Model

• Find sensitivity matrix for model output y per channel m

Filte

rban

kStage

1Stage

2Stage

KL yx

)ˆ()()ˆ(21)ˆ,( )()( yyyMyyyy −−= m

yHmd

IyM 2)()( =myhere

Page 14: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 14

FlexCode Application to an Auditory Model

• Each model stage linearized by respective Jacobian

Filte

rban

kStage

1Stage

2Stage

KL yx

z∂ y∂)(zJ z

Page 15: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 15

FlexCode Application to an Auditory Model

• Get sensitivity matrix for channel m in input signal domain from chain rule;

Filte

rban

kStage

1Stage

2Stage

KL yx

∏∏⎥⎥⎦

⎢⎢⎣

⎡=

k

mk

H

k

mk

mx x )()()( 2)( JJM

Page 16: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 16

FlexCode Application to an Auditory Model

• Sensitivity Matrix is sum of per-channel matrices

Filte

rban

kStage

1Stage

2Stage

KL yx

∑ ∏∏∑⎥⎥⎦

⎢⎢⎣

⎡==

m k

mk

H

k

mk

m

mxx

)()()( 2)()( JJxMxM

Page 17: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 17

FlexCode Validity of the Approximation

• Narrow-band music + white noise at different SNR (10-60 dB), correlation > 0.9

(dB)Daud

(dB)SMd coding possible

Page 18: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 18

FlexCode Analysis: Fourier Transform

• Sensitivity matrix for frequency domain vectors, with = DFT in matrix notation

)(XM X

HxX FxMFXM )()( =

F

)ˆ()ˆ( XXFxx −=− H

Page 19: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 19

FlexCode Analysis: Masking Curve

• Assume masking threshold is at • Masking curve at frequency i:

gain for unit distortion in DFT bin i

to approach masking threshold

iiXHi guXMu )(

211 =

[ ]Hi 010 LL=u

[ ] iiXig

,)(2XM

=⇔

ig

1)ˆ( =XX,d

pos. i

Page 20: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 20

FlexCode

0 500 1000 1500 2000 2500 3000 3500 40000

10

20

30

40

50

60

F

MA

SK

Example 1: Simultaneous Masking

• Masking curve for 50 dB sinusoid at 1000 Hz

dB

f (Hz)

- sensitivity matrix for Dau et al.- spectral model (v.d.Par, 2002)O human listeners (Zwicker, 1982)

Page 21: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 21

FlexCodeExample 2: Non-Simultaneous Masking

• Subspace analysis of M: low sensitivity space

signal

noise

projection

t (ms)

Page 22: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 22

FlexCodeF

200

700

F

200

700

F

T 50 100

200

700

A Spectro-Temporal Experiment

• Spectrograms for two sinusoids (200 Hz and 700 Hz) coded with an APCM coder

signal

WMSE-coder

Sens.-coder

f (H

z)

t (ms)

f (H

z)f (

Hz)

Page 23: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 23

FlexCode Outline

• Flexcode in a Nutshell• The Sensitivity Matrix• Coding with the Sensitivity Matrix• Open Issues

Page 24: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 24

FlexCode Problem

• Sensitivity Matrix generally not diagonal in typical coding domains (e.g., time, frequency, etc.)

– distortion not single-letter:

– no analytical solution known for optimal coding– so far only trained VQ available

(undoable for large dimensions)!

)ˆ()()ˆ(21)ˆ,( xxxMxxxx −−≈ x

Hd

Page 25: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 25

FlexCode Solution Outline

• Sensitivity Matrix diagonal in perceptual domain– transform signal vectors into perceptual domain– (W)MSE-optimal coding in perceptual domain results in

minimum perceptual distortion • Optimal Transform:

)()()( xMxFxF xoptT

opt c=′′

Page 26: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 26

FlexCode Making it Practical

• Problem: optimal transform depends on x– not known at decoder!

• Solution: Parameterize transform as DCT + diagonal weights

– combines DCT coding gain with perceptual transform!– Model-driven approach to temporal noise shaping (TNS)

tDCTf WTWxF =′ )(

Page 27: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 27

FlexCode Examples

• Original

• Freq. weights

• Time-freq. weights

DCTf TWxF =′ )(

tDCTf WTWxF =′ )(

Page 28: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 28

FlexCode Outline

• Flexcode in a Nutshell• The Sensitivity Matrix• Coding with the Sensitivity Matrix• Open Issues

Page 29: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 29

FlexCode Complexity

• Main complexity lies here:

• Can we make multiplications ‘sparse’?• Can we reduce the size of the matrices?

∑ ∏∏⎥⎥⎦

⎢⎢⎣

⎡=

m k

mk

H

k

mk

)()(2)( JJxM

Page 30: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 30

FlexCode Conclusion

• Using auditory models in coding is possible• General method• Direct applicability in AbS structures (CELP)• Solution outline for transform coding

Page 31: Integrating Perception into the Flexcode Project FlexCode...auditory modeling and audio coding • Advanced auditory models too complex for coding AM AM + x x ˆ ⋅ d ( , xx ˆ) perceptual

SIP - Sound and Image Processing Lab, EE, KTH Stockholm 31

FlexCode The End

Thank you!