
Wavelets and Sparse Signal Processing

Instructor: Hongkai Xiong (熊红凯) Distinguished Professor (特聘教授)

http://min.sjtu.edu.cn

Department of Electronic Engineering Department of Computer Science and Engineering

Shanghai Jiao Tong University

2019


Hongkai Xiong, Distinguished Professor Office : 1-309, No.1, SEIEE Bld. Email: [email protected] Web-page: http://min.sjtu.edu.cn

Wenrui Dai, Associate Professor Office : 1-304, No.1, SEIEE Bld. Email: [email protected]




Teaching Assistants: • Ph.D Candidate: Mr. Wen Fei Email: [email protected]

• Ph.D Candidate: Ms. Tianran Wu Email: [email protected]

Office : 1-304, No.1, SEIEE Bld.


Part I - fundamentals

• Continuous time Fourier transform
• Discrete time Fourier transform
• Discrete Fourier transform
• Z transform


Part II – wavelets and sparse signal processing

• Time meets frequency
• Wavelet frames
• Wavelet zoom
• Wavelet bases
• Multiscale geometric analysis
• Lifting wavelet and filter banks
• Sparse representation
• Scattering transform
• Graph signal processing


Textbooks and references

• Stéphane Mallat, A Wavelet Tour of Signal Processing: The Sparse Way, Third Edition, Elsevier, 2009
• Michael Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer, 2010
• Alan V. Oppenheim, Signals & Systems, Second Edition, Publishing House of Electronics Industry of China

Website: http://min.sjtu.edu.cn/courses/wt.htm







Related Sources

• “Sparse and Redundant Representations and Their Applications in Signal and Image Processing”: https://elad.cs.technion.ac.il/236862-course-webpage-winter-semester-2018-2019/
• “Wavelets in Signal Processing”: http://www.ifp.illinois.edu/~minhdo/teaching/wavelets.html
• “Wavelets, Filter Banks and Applications”: https://ocw.mit.edu/courses/mathematics/18-327-wavelets-filter-banks-and-applications-spring-2003/
• http://www.numerical-tours.com/








Requirements and grading

Homework and attendance (20%)

Projects (40%)

Final Examination (40%)


Projects (report + source code)

Harmonic analysis

Multi-scale geometry analysis

Wavelet and Filter bank design

Compressive sensing

Sparse coding, representation, dictionary learning

Generalized source coding, and subband coding

Multidimensional signal processing

Other relevant topics


Final Examination (online)

• 3 mandatory problems + 2 optional problems (3 days)

• Theoretical analysis

• Algorithm implementations

Check Yourself

• Computer generated music $f(t)$
• Listen to the following three manipulated signals, $f_1(t)$, $f_2(t)$, $f_3(t)$, and try to match each one to the correct answer:
  ▫ $-f(t)$
  ▫ $0.5 f(t)$
  ▫ $f(2t)$
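The three manipulations in this exercise can be tried numerically. The sketch below is only an illustration: the 440 Hz tone standing in for the music f(t), the sampling rate `fs`, and the helper `dominant_frequency` are all assumptions, not part of the course material.

```python
import numpy as np

fs = 8000                          # assumed sampling rate in Hz
t = np.arange(fs) / fs             # one second of sample times
f = np.sin(2 * np.pi * 440 * t)    # stand-in for f(t): a 440 Hz tone

f_neg = -f          # -f(t): sign flip, same magnitude spectrum
f_half = 0.5 * f    # 0.5 f(t): half the amplitude, same pitch
f_fast = f[::2]     # f(2t): keep every second sample, play back at the same fs

def dominant_frequency(x, fs):
    """Return the frequency (Hz) of the largest FFT magnitude peak."""
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    return freqs[np.argmax(spectrum)]

for name, sig in [("f", f), ("-f", f_neg), ("0.5f", f_half), ("f(2t)", f_fast)]:
    print(name, len(sig) / fs, "s", dominant_frequency(sig, fs), "Hz")
```

Under these assumptions, only the time-compressed version changes the pitch and the duration; the sign flip and the amplitude scaling leave the dominant frequency where it was.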


SCIENCE

Multimedia Signal

Genome

Computer Vision

Real-Time Detection and Recognition

Multi-person pose estimation

Multi-objects detection

3D object tracking

Biomedical Data

Biomedical Signal

Cell Segmentation Scale-space shape representation (Detail preservation)

Nuclei Tracking Multi-cell segmentation

and tracking

19/9/18 Hongkai Xiong (熊红凯)

Nuclei Segmentation and Tracking

[Figure: nuclei tracking plotted over X, Y, and Time, with snapshots at frames 7, 30, 48, and 84; AVI result with marked cell index]

Hongkai Xiong, Wang B, Zheng Y. F. “A Structured Learning-Based Graph Matching Method For Tracking Dynamic Multiple Objects,” IEEE Trans. Circuits and Systems for Video Technology, Mar. 2013.

Zebrafish Cell Tracking

Zebrafish Cell Segmentation

Biomedical Imaging 3D Axon cell rendering and manual effect

Biomedical Imaging Volume Modeling (e.g. Vessel Modeling)

Biomedical Imaging Shifted normal plane (TMI 2002) Minimum Cost (Optimization) Sphere-kernel Decomposition

Biomedical Imaging Sphere-kernel Current improvement

Light Field

3D Dynamic Model Reconstruction (PLEX Inc.)

Refocusing Images (Lytro Inc.)

…..

Applications:

The original Lytro camera Refocusing Collection of Light Field Information Reconstruction

Light-Field Camera: Refocusing

This video shows the refocusing results at different depths

Trillion Frames per Second Imaging

http://web.media.mit.edu/~raskar//trillionfps/

Fruit Bottle: light bullet

Ultrafast Camera

Femtosecond Laser 2 picosecond per frame femtosecond long laser pulse


Separation based on Time Resolved Images

Virtual Reality

Light Field Recording

Share


Point Cloud

Dictionary atoms: $\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5, \alpha_6, \dots, \alpha_p$

Ongoing:
• 3D hierarchical sparse representation
• Point cloud color compression

Multiscale Dictionary Learning for Hierarchical Sparse Representation

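$$\min_{\mathbf{D} \in \mathcal{C},\; \mathbf{A} \in \mathbb{R}^{p \times n}} \; \frac{1}{n} \sum_{i=1}^{n} \left( \frac{1}{2} \left\| \mathbf{x}_i - \mathbf{M}_i \mathbf{D} \boldsymbol{\alpha}_i \right\|_2^2 + \lambda \mathcal{H}(\boldsymbol{\alpha}_i) \right)$$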

Octree Decomposition

Input Voxel

Hierarchical Representation

Multiscale Dictionary


Signal Processing

H. Xiong, et al., “Scalable Video Compression Framework with Adaptive Orientational Multiresolution Transform and Nonuniform Directional Filterbank Design”, IEEE Trans. CSVT, 2011.

Reconstruction

Multiscale Multi-directional

Implementation Image Coding

Implementation Video Coding

Protein Phenotype DNA

(Genotype)

Purine Bases: Adenine (A); Guanine (G)
Pyrimidine Bases: Thymine (T); Cytosine (C)

RNA


Conceptual Diagram: Genome Sequence Genome Coding

Many signal processing techniques are based on transform methods

• Signal in original domain
• Fourier Transform: global basis, no location information
• Short-time Fourier Transform: uniform time-frequency resolution
• Wavelet Transform
• Multi-dimensional signal? 2D separable Wavelet Transform: undesirable bias toward coordinate-axis directions

Signal Processing Road

A large family of alternative multiscale transforms has been developed.
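The remark that the Fourier transform uses a global basis with no location information can be checked on a toy signal. The sketch below is only an illustration: the two tones, the sampling rate, and the helper `windowed_peak` (a single Hann-windowed frame, i.e. one column of a short-time Fourier transform) are assumptions chosen for the example.

```python
import numpy as np

fs = 1000                       # assumed sampling rate in Hz
t = np.arange(fs) / fs          # one second of sample times
half = len(t) // 2

# A low tone followed by a high tone, and the same signal played backwards.
low_then_high = np.concatenate([np.sin(2 * np.pi * 50 * t[:half]),
                                np.sin(2 * np.pi * 200 * t[half:])])
high_then_low = low_then_high[::-1]

# Global Fourier magnitudes: identical, although the events occur in opposite order.
mag_a = np.abs(np.fft.fft(low_then_high))
mag_b = np.abs(np.fft.fft(high_then_low))
same_spectrum = np.allclose(mag_a, mag_b)

def windowed_peak(x, start, length, fs):
    """Dominant frequency (Hz) inside one short Hann-windowed frame."""
    frame = x[start:start + length] * np.hanning(length)
    freqs = np.fft.rfftfreq(length, d=1 / fs)
    return freqs[np.argmax(np.abs(np.fft.rfft(frame)))]

# A short window at the start of each signal does tell them apart.
early_a = windowed_peak(low_then_high, 0, 200, fs)
early_b = windowed_peak(high_then_low, 0, 200, fs)
print(same_spectrum, early_a, early_b)
```

The window length is fixed here, which is exactly the uniform time-frequency tiling of the short-time Fourier transform; a wavelet transform would instead vary the window with scale.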


1928: H. Nyquist, an engineer at Bell Laboratories, first discovered the so-called sampling theorem.

1933: V. Kotelnikov, an information theory and radar astronomy pioneer from the Soviet Union, was the first to write down a precise statement of the sampling theorem.

1949: C. E. Shannon, an American mathematician and electronic engineer known as "the father of information theory", stated and proved the sampling theorem.

Theoretic Problem: Conventional signal processing system

Original continuous-time signal → Sampling → Transform → Coding/Compression → Storage/transmission → Reconstruction → Recovered discrete-time signal

Nyquist, Harry. "Certain topics in telegraph transmission theory", Trans. AIEE, vol. 47, pp. 617–644, Apr. 1928 Reprint as classic paper in: Proc. IEEE, Vol. 90, No. 2, Feb 2002.

C. E. Shannon, "Communication in the presence of noise", Proc. Institute of Radio Engineers, vol. 37, no. 1, pp. 10–21, Jan. 1949. Reprint as classic paper in: Proc. IEEE, vol. 86, no. 2, (Feb. 1998)

V. A. Kotelnikov, "On the carrying capacity of the ether and wire in telecommunications", Material for the First All-Union Conference on Questions of Communication, Izd. Red. Upr. Svyazi RKKA, Moscow, 1933 (Russian).

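Sparse Representation

Sparse representation: $x = \Psi\theta$, where

$$x \in \mathbb{R}^{N \times 1},\quad \Psi \in \mathbb{R}^{N \times L},\quad \theta \in \mathbb{R}^{L \times 1},\quad \|\theta\|_0 = K,\quad K < N < L$$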

Sparse Representation

“General” measurements instead of samples

[Candes, Romberg, & Tao `04, Donoho `06, Candes ‘06, Tsaig & Donoho `06]

Directly obtain compressed data ?

Nonlinear compressing

Linear sampling

with nonzero components

Abel Prize 2017

Yves Meyer École normale supérieure Paris-Saclay, France

“for his pivotal role in the development of the mathematical theory of wavelets.”

Gauss Prize 2018 David Donoho Stanford University, USA

“for his fundamental contributions to the mathematical, statistical and computational analysis of signal processing.”

Wavelet Transform: Fast 2D wavelet transform

Wavelet Transform: Inverse 2D wavelet transform
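To make the forward and inverse 2D wavelet transforms concrete, here is a minimal one-level sketch. It assumes the Haar wavelet, an even-sized image, and the hypothetical function names `haar2d` and `ihaar2d`; the slides do not say which filters are used, so this is not the course's reference implementation.

```python
import numpy as np

def haar2d(image):
    """One level of an orthonormal 2D Haar transform (even-sized input assumed)."""
    a = image.astype(float)
    # Filter along each row: pairwise sums (low-pass) and differences (high-pass).
    lo = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
    hi = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
    # Filter along each column of both outputs to get the four subbands.
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Invert haar2d by undoing the column step, then the row step."""
    rows, cols = ll.shape
    lo = np.empty((2 * rows, cols))
    hi = np.empty((2 * rows, cols))
    lo[0::2, :], lo[1::2, :] = (ll + lh) / np.sqrt(2), (ll - lh) / np.sqrt(2)
    hi[0::2, :], hi[1::2, :] = (hl + hh) / np.sqrt(2), (hl - hh) / np.sqrt(2)
    out = np.empty((2 * rows, 2 * cols))
    out[:, 0::2], out[:, 1::2] = (lo + hi) / np.sqrt(2), (lo - hi) / np.sqrt(2)
    return out

rng = np.random.default_rng(0)
img = rng.random((8, 8))
ll, lh, hl, hh = haar2d(img)
recon = ihaar2d(ll, lh, hl, hh)
print(np.allclose(recon, img))
```

Applying `haar2d` again to the `ll` subband would give the next coarser level, which is how the fast transform builds its multiscale pyramid.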

Multiscale Geometry Analysis: Ridgelet transform

Multiscale geometric analysis is an emerging area of high-dimensional signal processing and data analysis.

(a) Example ridgelet function $\psi_{a,b,\theta}(x_1, x_2)$ (b) Relations between transforms (c) Reconstructed image: Wavelet vs. Ridgelet

Ridgelet transform: applies a 1-D wavelet transform to the slices of the Radon transform

• Good at capturing line singularities
• Not good at handling curves

Multiscale Geometry Analysis: Curvelet transform

Curvelet transform: applies the ridgelet transform to small blocks (a curved edge is almost straight at sufficiently fine scales)

• Good at capturing curves
• Has no ideal discrete implementation

Graph Signal Processing: Spectrum of Graphs

Key elements of the Discrete Fourier Transform (DFT) and the Graph Fourier Transform (GFT):

• GFT basis: eigenvectors of the Laplacian matrix L
• GFT frequency index: eigenvalues of the Laplacian matrix L
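The role of the Laplacian eigenvectors as a Fourier basis can be sketched on a small graph. The example below assumes an unweighted ring graph with 8 nodes and the combinatorial Laplacian L = D − A; the helper names `gft` and `igft` are hypothetical.

```python
import numpy as np

n = 8
# Adjacency matrix of a ring graph with n nodes (an assumed toy graph).
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

D = np.diag(A.sum(axis=1))      # degree matrix
L = D - A                       # combinatorial graph Laplacian

# For a symmetric matrix, eigh returns eigenvalues in ascending order
# and orthonormal eigenvectors as the columns of U.
eigvals, U = np.linalg.eigh(L)

def gft(x):
    """Graph Fourier transform: project x onto the Laplacian eigenvectors."""
    return U.T @ x

def igft(x_hat):
    """Inverse graph Fourier transform."""
    return U @ x_hat

x = np.cos(2 * np.pi * np.arange(n) / n)    # a smooth signal on the ring
x_hat = gft(x)
print(np.round(eigvals, 3))
print(np.round(x_hat, 3))
```

Small eigenvalues play the role of low frequencies: the constant signal sits at eigenvalue 0, and the smooth cosine above lies entirely in the eigenspace of the next eigenvalue.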

Convolutional Neural Network

Fig. 2 Structure of AlexNet: feature extractor and classifier

We can divide CNN models into two parts: a feature extractor and a classifier.

The feature extractor reminds us of some commonly used techniques in signal processing, including filter banks and the operators used in edge detection.

The difference between convolutional kernels and the filters commonly used in signal processing is that the former are learned from huge datasets, whereas the latter are handcrafted.

So is it possible to construct interpretable models similar to deep CNNs with signal processing methods?

Interpretable CNN

Deep Convolutional Neural Networks have been widely applied since the breakthrough in 2012 ImageNet competition (Russakovsky et al., 2015) achieved by AlexNet (Krizhevsky et al., 2012).

Convolutional neural networks (CNNs) have achieved superior performance in many visual tasks, such as object classification and detection. However, the interpretability of the model is always an Achilles’ heel of neural networks.

Fig. 1 Top1 accuracies on ImageNet of different networks

Things we want to know:
• Why do CNNs perform so well?
• What knowledge do CNNs learn from huge datasets?
• What can we learn from CNNs to construct further signal processing tools?

Frequency

Salvador Dali “Gala Contemplating the Mediterranean Sea, which at 30 meters becomes the portrait of Abraham Lincoln”, 1976

Jean Baptiste Joseph Fourier (1768-1830)

• had crazy idea (1807):
  ▫ Any periodic function can be rewritten as a weighted sum of sines and cosines of different frequencies.
• Don’t believe it?
  ▫ Neither did Lagrange, Laplace, Poisson and other big wigs
  ▫ Not translated into English until 1878!
• But it’s true!
  ▫ called Fourier Series

Frequency Spectra

• Example: $g(t) = \sin(2\pi f t) + \frac{1}{3}\sin(2\pi (3f) t)$

[Figure: $g(t)$ shown as the sum of its two sinusoidal components]

Slides: Efros
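The frequency spectrum of the example g(t) can be computed directly. In the sketch below the fundamental `f = 5` Hz and the sampling rate `fs = 200` Hz are assumptions picked so that both components fall exactly on FFT bins.

```python
import numpy as np

f = 5.0                         # assumed fundamental frequency in Hz
fs = 200                        # assumed sampling rate in Hz
t = np.arange(fs) / fs          # one second, so FFT bins are 1 Hz apart

g = np.sin(2 * np.pi * f * t) + (1 / 3) * np.sin(2 * np.pi * 3 * f * t)

# Scale so that a unit-amplitude sinusoid produces a peak of height 1.
amplitude = 2 * np.abs(np.fft.rfft(g)) / len(g)
freqs = np.fft.rfftfreq(len(g), d=1 / fs)

peaks = freqs[amplitude > 0.1]
print(peaks, amplitude[amplitude > 0.1])
```

With these choices the spectrum has exactly two peaks, at f and 3f, whose heights match the weights 1 and 1/3 in the formula.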

Frequency Spectra

[Figure: a signal written as a sum of sinusoidal components]

$$A \sum_{k=1}^{\infty} \frac{1}{k} \sin(2\pi k t)$$

Deep Networks using Fourier Analysis

• DNNs can exploit the geometry of low-dimensional data manifolds to approximate complex functions that exist along the manifold with simple functions when seen with respect to the input space.
• The magnitude of a particular frequency component (k) of a deep ReLU network function decays at least as fast as O( ), with width and depth helping polynomially and exponentially (respectively) in modeling higher frequencies. This shows, for instance, why DNNs cannot perfectly memorize peaky delta-like functions.
• DNN parameters corresponding to functions with higher frequency components occupy a smaller volume in the parameter space.

Sparseland: A Formal Description

[Figure: signal $x$ (length $n$) = dictionary $\mathbf{D}$ ($n \times m$) times a sparse vector $\alpha$ (length $m$)]

• Every column in $\mathbf{D}$ (dictionary) is a prototype signal (atom)
• The vector $\alpha$ is generated with few non-zeros at arbitrary locations and values

$$\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad x = \mathbf{D}\alpha$$

$$\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad \|\mathbf{D}\alpha - y\|_2 \le \varepsilon$$

• L0: counts the number of non-zeros in the vector
• This is a projection onto the Sparseland model
• These problems are known to be NP-hard

Approximation algorithms:
• Greedy methods: Thresholding/OMP
• Relaxation methods: Basis Pursuit
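Because the L0 problems above are NP-hard, greedy methods such as OMP settle for an approximate solution. The sketch below is a minimal OMP under stated assumptions: the function name `omp`, the stopping tolerance, and the test dictionary (identity plus a normalized 16 × 16 Hadamard matrix, chosen because its low coherence lets OMP recover two-atom signals) are illustrative choices, not taken from the slides.

```python
import numpy as np

def omp(D, x, k, tol=1e-10):
    """Orthogonal matching pursuit: pick at most k unit-norm atoms of D to fit x."""
    residual = x.astype(float).copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(k):
        if np.linalg.norm(residual) <= tol:
            break                       # already an exact fit
        # Select the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        support.append(j)
        # Re-fit all selected coefficients by least squares (the orthogonal step).
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    alpha = np.zeros(D.shape[1])
    alpha[support] = coeffs
    return alpha

# Assumed test dictionary: identity atoms plus normalized Hadamard atoms (16 x 32).
H = np.array([[1.0]])
for _ in range(4):
    H = np.block([[H, H], [H, -H]])
D = np.hstack([np.eye(16), H / 4.0])

alpha_true = np.zeros(32)
alpha_true[[3, 20]] = [1.5, -2.0]       # a 2-sparse representation
x = D @ alpha_true

alpha_hat = omp(D, x, k=2)
print(np.flatnonzero(alpha_hat), alpha_hat[np.flatnonzero(alpha_hat)])
```

Basis Pursuit would replace the L0 count with an L1 norm and solve a convex program instead; that route is not shown here.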

Convolution Sparse Coding (CSC)

• When handling images, Sparseland is typically deployed on small overlapping patches due to the desire to train the model to fit the data better
• The model assumption is: each patch in the image is believed to have a sparse representation w.r.t. a common local dictionary
• What is the corresponding global model? This brings us to … the Convolutional Sparse Coding (CSC)

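Convolution Sparse Coding (CSC)

$$\mathbf{X} = \sum_{i=1}^{m} \mathbf{d}_i * \boldsymbol{\Gamma}_i$$

• $\mathbf{X}$: an image with $N$ pixels
• $\mathbf{d}_i$: the i-th filter, of small size $n$
• $\boldsymbol{\Gamma}_i$: the i-th feature map, an image of the same size as $\mathbf{X}$ holding the sparse representation related to the i-th filter
• The image is a sum of $m$ filters convolved with their sparse representations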
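The CSC synthesis model, a sum of m small filters convolved with sparse feature maps, can be sketched in one dimension. Everything numeric below (the sizes `N`, `n`, `m`, the random filters, two non-zeros per map) is an assumption for illustration, and the helper `conv_matrix` is hypothetical; it builds the banded matrix that turns the same model into a single global product.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, m = 64, 5, 3          # signal length, filter size, number of filters (assumed)

filters = rng.standard_normal((m, n))     # d_i: small local filters
feature_maps = np.zeros((m, N))           # Gamma_i: sparse maps, same size as X
for i in range(m):
    idx = rng.choice(N - n + 1, size=2, replace=False)
    feature_maps[i, idx] = rng.standard_normal(2)

# X = sum_i d_i * Gamma_i, truncated to N samples.
X = sum(np.convolve(feature_maps[i], filters[i])[:N] for i in range(m))

def conv_matrix(d, N):
    """N x N matrix whose p-th column holds filter d shifted to position p."""
    C = np.zeros((N, N))
    for p in range(N):
        seg = d[:N - p]                   # clip the filter at the signal boundary
        C[p:p + len(seg), p] = seg
    return C

# Equivalent global form: one big dictionary times one long sparse vector.
D = np.hstack([conv_matrix(filters[i], N) for i in range(m)])    # N x (N m)
Gamma = feature_maps.reshape(-1)                                  # length N m
print(np.allclose(D @ Gamma, X))
```

The global dictionary is only a concatenation of shifted copies of the m small filters, which is why the model stays local even though it describes the whole signal.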

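Why CSC?

[Figure: the global model $\mathbf{X} = \mathbf{D}\boldsymbol{\Gamma}$; each patch $\mathbf{R}_i\mathbf{X}$ of length $n$ equals the stripe-dictionary $\boldsymbol{\Omega}$, of size $n \times (2n-1)m$, times a stripe vector $\boldsymbol{\gamma}_i$ of length $(2n-1)m$]

$$\mathbf{R}_i\mathbf{X} = \boldsymbol{\Omega}\boldsymbol{\gamma}_i, \qquad \mathbf{R}_{i+1}\mathbf{X} = \boldsymbol{\Omega}\boldsymbol{\gamma}_{i+1}$$

• Every patch has a sparse representation w.r.t. the same local dictionary ($\boldsymbol{\Omega}$), just as assumed for images
• There is a rough analogy between CSC and CNN:
  1. Convolutional structure
  2. Data-driven model
  3. ReLU is a sparsifying operator
• We shall now propose a principled way to analyze CNN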

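From CSC to Multi-Layered CSC

• Convolutional sparsity (CSC) assumes an inherent structure is present in natural signals
• We propose to impose the same structure on the representations themselves
• This gives the Multi-Layer CSC (ML-CSC) model

$$\mathbf{X} \in \mathbb{R}^{N},\quad \mathbf{D}_1 \in \mathbb{R}^{N \times N m_1},\quad \boldsymbol{\Gamma}_1 \in \mathbb{R}^{N m_1},\quad \mathbf{D}_2 \in \mathbb{R}^{N m_1 \times N m_2}$$

[Figure: $\mathbf{D}_1$ is built from $m_1$ local filters of size $n_0$; $\mathbf{D}_2$ is built from $m_2$ local filters of size $n_1 m_1$]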

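Multi-Layer CSC

$$\mathbf{X} \in \mathbb{R}^{N},\quad \mathbf{D}_1 \in \mathbb{R}^{N \times N m_1},\quad \mathbf{D}_2 \in \mathbb{R}^{N m_1 \times N m_2}$$

• We can chain all the dictionaries into one effective dictionary: $\mathbf{D}_{\text{eff}} = \mathbf{D}_1\mathbf{D}_2\mathbf{D}_3 \cdots \mathbf{D}_K \;\rightarrow\; \mathbf{x} = \mathbf{D}_{\text{eff}}\boldsymbol{\Gamma}_K$
• This is a special Sparseland (indeed, a CSC) model
• However, a key property of this model is the sparsity of the intermediate representations
• The effective atoms: atoms → molecules → cells → tissue → body-parts …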
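Chaining the layer dictionaries into one effective dictionary can be checked numerically. The sketch below uses dense random matrices as stand-ins for the convolutional dictionaries, with assumed sizes and three layers, so it illustrates only the algebra of the chain and not the convolutional structure.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [16, 32, 48, 64]        # assumed: signal dimension, then dimensions of Gamma_1..Gamma_3

# D_k maps Gamma_k to Gamma_{k-1} (with Gamma_0 = x); random stand-ins here.
dictionaries = [rng.standard_normal((sizes[k], sizes[k + 1])) for k in range(3)]

gamma_K = np.zeros(sizes[-1])
gamma_K[[5, 40]] = [1.0, -2.0]  # deepest representation: two non-zeros

# Layer-by-layer synthesis, keeping every intermediate representation.
layers = [gamma_K]
for D in reversed(dictionaries):
    layers.append(D @ layers[-1])
x = layers[-1]

# One effective dictionary D_eff = D_1 D_2 D_3 gives the same signal.
D_eff = np.linalg.multi_dot(dictionaries)
print(D_eff.shape, np.allclose(D_eff @ gamma_K, x))
```

With dense random dictionaries the intermediate vectors in `layers` are not sparse. The ML-CSC model additionally requires each of them to be sparse, which is what distinguishes it from a plain Sparseland model built on the effective dictionary.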


• Scattering Convolutional Networks [S. Mallat, PAMI13]

A special type of CNN with pre-defined complex wavelet filters and a modulus operator.

• Network architecture: convolution, modulus, and average pooling
• Properties: translation invariance, deformation stability, energy propagation
• Geometric image priors: wavelet decomposition (stable to deformation)
• Not invertible!

[Figure: input signals pass through scaled and rotated wavelet filters (convolution), a modulus (U), and average pooling (S); convolutional networks compared with scattering networks]
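The convolution, modulus, and averaging steps of a scattering network can be sketched for a 1D signal. The example below is a first-order sketch under stated assumptions: the filters are Gaussian bumps in the Fourier domain (a stand-in for the complex wavelets of the actual scattering transform), the convolution is circular, the average is taken over the whole signal, and the names `bandpass_bank` and `scattering_first_order` are hypothetical.

```python
import numpy as np

def bandpass_bank(N, centers, bandwidth=0.03):
    """Analytic band-pass filters defined in the Fourier domain (assumed Gaussian bumps)."""
    freqs = np.fft.fftfreq(N)             # cycles per sample, in [-0.5, 0.5)
    return np.array([np.exp(-((freqs - c) ** 2) / (2 * bandwidth ** 2))
                     for c in centers])

def scattering_first_order(x, bank):
    """S[j] = time average of |x * psi_j|, using circular convolution via the FFT."""
    x_hat = np.fft.fft(x)
    U = np.abs(np.fft.ifft(x_hat[None, :] * bank, axis=1))   # modulus of filter responses
    return U.mean(axis=1)                                     # global average pooling

N = 256
rng = np.random.default_rng(0)
x = rng.standard_normal(N)
bank = bandpass_bank(N, centers=[0.05, 0.1, 0.2, 0.4])

S = scattering_first_order(x, bank)
S_shifted = scattering_first_order(np.roll(x, 17), bank)
print(np.allclose(S, S_shifted))
```

A circular shift of the input only shifts each modulus response, so the averaged coefficients do not change. The phase discarded by the modulus and the detail removed by the averaging cannot be read back from `S` alone.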

Sparse Auto-encoder

The auto-encoder tries to learn a function $h_{W,b}(x) \approx x$. In other words, it tries to learn an approximation to the identity function, so that the output $\hat{x}$ is similar to $x$.

Convolutional Autoencoder

Winner-Take-All Auto-encoders (Alireza et al. 2015): proposes the convolutional winner-take-all auto-encoder, which combines the benefits of convolutional architectures and auto-encoders for learning sparse representations.

Auto-encoder with adversarial training (Oren et al. 2017): introduces adversarial training in a convolutional auto-encoder, which makes it possible to produce pleasing reconstructions at very low bitrates.

Autoencoder with Recurrent Neural Networks (George et al. 2017)

The architecture consists of a recurrent neural network (RNN)-based encoder and decoder, a binarizer, and a neural network for entropy coding.

Ongoing Work

Introduce structural sparsity learning in a convolutional autoencoder

The sparsity penalty (group lasso) helps the architecture adaptively produce fewer feature maps while keeping structural information.

[Figure: encoder and decoder trained with a group-lasso penalty, a reconstruction loss, and a structural loss]
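A group-lasso penalty on convolutional filters can be written in a few lines. The sketch below is an assumption-heavy illustration: it treats each output channel of a weight tensor as one group, uses a mean-squared reconstruction loss, omits the structural loss, and the names `group_lasso_penalty`, `total_loss`, and `lam` are hypothetical. The slides do not specify the grouping or the loss weights.

```python
import numpy as np

def group_lasso_penalty(weights):
    """Sum over output channels of the L2 norm of each filter group.

    weights has shape (out_channels, in_channels, height, width); taking one
    output channel as one group is an assumed grouping.
    """
    flat = weights.reshape(weights.shape[0], -1)
    return np.sqrt((flat ** 2).sum(axis=1)).sum()

def total_loss(x, x_hat, weights, lam=1e-3):
    """Mean-squared reconstruction loss plus the weighted group-lasso penalty."""
    reconstruction = np.mean((x - x_hat) ** 2)
    return reconstruction + lam * group_lasso_penalty(weights)

# Four 2x3x3 filters, only one of which is active.
w = np.zeros((4, 2, 3, 3))
w[1] = 1.0
x = np.ones((8, 8))
print(group_lasso_penalty(w), total_loss(x, x, w, lam=0.5))
```

Because the penalty sums unsquared norms, it tends to drive whole groups to zero, which is the mechanism by which entire feature maps can be removed.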

Many Thanks

Q & A
