MEG Preprocessing

Magnetoencephalography Preprocessing and Noise Reduction Techniques, Eliezer Kanal, 2/20/2012, MEG Basics Course

Description

Slides from an invited talk I gave at the MEG Basics series in the winter of 2012. Covers the theory behind signal processing techniques used in magnetoencephalography (MEG), including:

- Signal Space Projection (SSP)
- Signal Space Separation (SSS)
- Temporally-extended Signal Space Separation (tSSS)
- Principal Component Analysis (PCA)
- Independent Component Analysis (ICA)

Transcript of MEG Preprocessing

Page 1: Meg preprocessing

Magnetoencephalography Preprocessing and Noise Reduction Techniques

Eliezer Kanal

2/20/2012, MEG Basics Course

Page 2: Meg preprocessing

About Me

• 2005 - 2009: University of Pittsburgh, PhD, Bioengineering

• 2009 - 2011: Carnegie Mellon University, Postdoctoral fellow, CNBC

• 2011 - current: PNC Financial Services, Quantitative Analyst, Risk Analytics

Page 3: Meg preprocessing

Dealing with Noisy Data

• Overview of MEG Noise

• Noise Reduction

- Averaging, thresholding, frequency filters

- SSP

- SSS/tSSS

• Source Extraction

- PCA

- ICA


Page 4: Meg preprocessing

MEG Noise


Page 5: Meg preprocessing

Breathing

Page 6: Meg preprocessing

Breathing

Page 7: Meg preprocessing

Frequency

Page 8: Meg preprocessing

Frequency

Page 9: Meg preprocessing

Time-Frequency

Page 10: Meg preprocessing

Biological Noise

Vigário, Jousmäki, Hämäläinen, Hari, & Oja (1997)

Page 11: Meg preprocessing

Line Noise

Subject

Empty Room

50 Hz line noise (60 Hz in the USA)

Page 12: Meg preprocessing

Bad Channels

Find the bad one:



Page 14: Meg preprocessing

Noise from nearby construction


Page 15: Meg preprocessing

Noise Reduction Techniques

• Averaging, thresholding, frequency filters

• SSP

• SSS/tSSS


Page 16: Meg preprocessing

Averaging

• Removes non-time-locked noise

• Requires:

- Time-locked block paradigm design

- Temporal or low-frequency analyses


Page 17: Meg preprocessing

Thresholding

• Discarding trials/channels with maximum signal intensity greater than some user-defined value

• Removes most “data blips”

• Rudimentary; a better technique is to simply examine each trial/channel
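The thresholding step above can be sketched in a few lines of NumPy; the trial dimensions, noise scale, and cutoff below are all made up for illustration:

```python
import numpy as np

# Hypothetical epoched data: (n_trials, n_channels, n_times), values in tesla
rng = np.random.default_rng(0)
data = rng.normal(0, 1e-13, size=(50, 10, 100))
data[3, 2, 40] = 5e-12          # inject an artifact "blip" into trial 3

threshold = 4e-12               # user-defined maximum signal intensity
peak = np.abs(data).max(axis=(1, 2))   # max |signal| per trial
keep = peak < threshold                # boolean mask of clean trials
clean = data[keep]                     # trial 3 is discarded, 49 remain
```

In practice you would tune the cutoff per channel type (magnetometers vs. gradiometers) rather than use a single global value.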

Page 18: Meg preprocessing

Frequency Filter

• Very good first step: remove data you won’t analyze (don’t waste time cleaning what you won’t examine)

• Use more advanced techniques for specific noise signals

Filter      Removes…
High-pass   Lower frequencies
Low-pass    Higher frequencies
Band-pass   Everything outside the specified band
Notch       Only the specified band
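As a minimal example of the band-pass case (using SciPy's Butterworth filter; the sampling rate, filter order, and band edges below are illustrative choices, not recommendations):

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 1000.0                      # assumed sampling rate in Hz
t = np.arange(0, 2.0, 1 / fs)
# Simulated channel: a 10 Hz "brain" rhythm plus 60 Hz line noise
signal = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)

# Band-pass 1-40 Hz: removes slow drift and the 60 Hz line component
b, a = butter(4, [1, 40], btype="bandpass", fs=fs)
filtered = filtfilt(b, a, signal)    # zero-phase filtering (no phase shift)

def amp_at(x, freq):
    """Spectral amplitude of x at the given frequency."""
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    return spec[np.argmin(np.abs(freqs - freq))]
```

`filtfilt` applies the filter forward and backward, which doubles the attenuation and avoids phase distortion; that matters if you later analyze latencies.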

Page 19: Meg preprocessing

Page 20: Meg preprocessing

Page 21: Meg preprocessing

Signal Space Projection


Page 22: Meg preprocessing

Signal Space Projection

• Overview: SSP uses the difference between source orientations and locations to differentiate distinct sources.

• Theory: Since the field pattern from a single source is (1) unique and (2) time-invariant, we can differentiate sources by examining the angle between their “signal space representations” and project noise signals out of the dataset.

Page 23: Meg preprocessing

Page 24: Meg preprocessing

Page 25: Meg preprocessing

Signal Space Projection

• In general,

$$m(t) = \sum_{i=1}^{M} a_i(t)\, s_i + n(t)$$

where $m(t)$ is the measured signal, $a_i(t)$ is the source amplitude, $s_i$ is source $i$, $n(t)$ is noise, and $M$ is the total number of channels.

Page 30: Meg preprocessing

Signal Space Projection

• In general,

$$m(t) = \sum_{i=1}^{M} a_i(t)\, s_i + n(t)$$

(measured signal $m(t)$, source amplitudes $a_i(t)$, sources $s_i$, noise $n(t)$; $M$ = total number of channels)

• SSP states that $s$ can be split in two:

- $s_\parallel$ = signals from known sources
- $s_\perp$ = signals from unknown sources

$$s_\parallel = P_\parallel m \qquad s_\perp = P_\perp m$$

where $m$ is the MEG signal and $P_\parallel$, $P_\perp$ are projection operators.

• Worth mentioning that $s_\parallel + s_\perp = s$.

Page 33: Meg preprocessing

Signal Space Projection: How do we find $P_\parallel$ and $P_\perp$?

• Ingenious application of the magic¹ technique of Singular Value Decomposition (SVD)

• Let $K = \{s_1, s_2, \ldots, s_k\}$, a matrix of all known sources. Using SVD, we find a basis for $s_\parallel$, and therefore $P_\parallel$.²

¹ Not really magic

² Let $K = U \Lambda V^T$. By the properties of the SVD, the first $k$ columns of $U$ form an orthonormal basis for the column space of $K$, so we can define

$$P_\parallel = U_k U_k^T \qquad P_\perp = I - P_\parallel$$

since $s_\parallel + s_\perp = P_\parallel m + P_\perp m = s$.
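The SVD construction above can be sketched directly in NumPy. The "known sources" matrix K below is a synthetic stand-in for a scanner's known noise topographies (which in practice come from, e.g., empty-room recordings):

```python
import numpy as np

rng = np.random.default_rng(1)
n_channels, n_times = 30, 500

# Hypothetical known noise topographies: columns = known source field patterns
K = rng.normal(size=(n_channels, 2))

# Orthonormal basis for the noise subspace via SVD, then the two projectors
U, _, _ = np.linalg.svd(K, full_matrices=False)
P_par = U @ U.T                          # projects ONTO the known-noise subspace
P_perp = np.eye(n_channels) - P_par      # projects the known noise OUT

# Simulated data: one "brain" topography plus activity from the known sources
brain = rng.normal(size=(n_channels, 1))
m = brain @ np.sin(np.linspace(0, 20, n_times))[None, :] \
    + K @ rng.normal(size=(2, n_times))

cleaned = P_perp @ m
# After projection, nothing remains in the noise subspace (up to round-off)
residual = np.abs(P_par @ cleaned).max()
```

Note the trade-off: any brain activity whose topography overlaps the noise subspace is removed along with the noise.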

Page 37: Meg preprocessing

Signal Space Projection

• Recall $m(t) = \sum_{i=1}^{M} a_i(t)\, s_i + n(t)$. To find $a(t)$, invert $s_\parallel$:

$$m(t) = a(t)\, s_\parallel$$
$$a(t) = s_\parallel^{-1} m(t)$$
$$a = V \Lambda^{-1} U^T m(t)$$

(recall that $K = \{s_1, s_2, \ldots, s_k\} = U \Lambda V^T$)

• In practice, $s_\parallel$ often consists of known noise signals specific to a particular MEG scanner. The final step is simply to project those out of $m(t)$, leaving only unknown (and presumably neural) sources in $s$.

Page 39: Meg preprocessing

Signal Space Separation (SSS)


Page 40: Meg preprocessing

Signal Space Separation

• Overview: Separate MEG signal into sources (1) outside and (2) inside the MEG helmet

• Theory: Analyzing the MEG data using a basis which expresses the magnetic field as a “gradient of the harmonic scalar potential” (defined below) allows the field to be separated into internal and external components.

By simply dropping the external component, we can significantly reduce the MEG signal noise.


Page 41: Meg preprocessing

MEG data – raw


Page 42: Meg preprocessing

MEG data – SSP


Page 43: Meg preprocessing

MEG data – SSS


Page 44: Meg preprocessing

Signal Space Separation

• Begin with Maxwell’s laws (magnetic fields $H$, $B$; sources $J$):

$$\nabla \times H = J \quad (1)$$
$$\nabla \times B = \mu_0 J \quad (2)$$
$$\nabla \cdot B = 0 \quad (3)$$

• Note that on the surface of the sensor array, $J = 0$ (i.e., no sources!). As such,

$$\nabla \times H = 0 \quad \text{on the array surface}$$

• Defining $H = \nabla\Psi$, we obtain the identity $\nabla \times \nabla\Psi = 0$ in (1). The term $\Psi$ is called the “scalar potential.”

- The scalar potential has no physical correlate.
- Often written with a negative sign ($-\nabla\Psi$) for convenience.
- $H = -\nabla\Psi \;\rightarrow\; B = -\mu_0\nabla\Psi$ … used interchangeably

• Substituting the scalar potential into (3), we obtain the Laplacian:

$$\nabla \cdot \nabla\Psi = \nabla^2\Psi = 0$$

Taulu et al., 2005

Page 49: Meg preprocessing

Signal Space Separation

• Substituting the scalar potential into (3), we obtain the Laplacian:

$$\nabla \cdot B = 0 \;\Rightarrow\; \nabla \cdot \nabla\Psi = \nabla^2\Psi = 0$$

• We can express the scalar potential using spherical coordinates ( $\Psi(r, \theta, \phi)$ ), separate the variables ( $\Psi(r, \theta, \phi) = R(r)\,\Theta(\theta)\,\Phi(\phi)$ ), and solve the harmonic equation

$$\frac{1}{r^2 \sin\theta}\left[\sin\theta\,\frac{\partial}{\partial r}\!\left(r^2 \frac{\partial}{\partial r}\right) + \frac{\partial}{\partial \theta}\!\left(\sin\theta\,\frac{\partial}{\partial \theta}\right) + \frac{1}{\sin\theta}\frac{\partial^2}{\partial \phi^2}\right]\Psi = 0$$

to obtain

$$B(r) = -\mu_0 \nabla \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \alpha_{lm}\,\frac{Y_{lm}(\theta,\phi)}{r^{l+1}} \;-\; \mu_0 \nabla \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \beta_{lm}\, r^l\, Y_{lm}(\theta,\phi) \;\equiv\; B_\alpha(r) + B_\beta(r)$$

where $B_\alpha(r)$ is the internal signal and $B_\beta(r)$ is the external signal.

Page 52: Meg preprocessing

Signal Space Separation


Page 53: Meg preprocessing

Temporally-extended Signal Space Separation (tSSS)

Page 54: Meg preprocessing

Temporally-extended Signal Space Separation

Conceptually very simple:

• Recall that the SSS algorithm ends with two signal components – $B_\alpha(r)$ and $B_\beta(r)$, or $B_{in}(r)$ and $B_{out}(r)$ – and we discard the $B_{out}(r)$ component

- Rationale: signals originating outside the MEG sensor helmet cannot be brain signal

• tSSS looks for correlations between $B_{out}(r)$ and $B_{in}(r)$ and projects those correlations out of $B_{in}(r)$

- Rationale: any internal signal correlated with the external noise component must represent noise that leaked into the $B_{in}(r)$ component
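The correlation-removal idea can be sketched as a toy: find the temporal subspace spanned by B_out and project it out of B_in. This is only the projection step, not the full tSSS algorithm, and the interference waveforms and dimensions are invented:

```python
import numpy as np

rng = np.random.default_rng(2)
n_ch, n_t = 20, 1000

# Two hypothetical external interference waveforms (e.g., line noise, traffic)
ext = np.vstack([np.sin(np.linspace(0, 50, n_t)),
                 np.sin(np.linspace(0, 37, n_t))])
brain = rng.normal(size=(n_ch, n_t))                  # uncorrelated "brain" activity

B_out = rng.normal(size=(5, 2)) @ ext                 # external SSS component
B_in = brain + rng.normal(size=(n_ch, 2)) @ ext * 3   # internal, with leakage

# Orthonormal basis for the temporal subspace of the external signal
U, sv, Vt = np.linalg.svd(B_out, full_matrices=False)
rank = int((sv > 1e-10 * sv[0]).sum())
Q = Vt[:rank].T                                       # (n_t, rank)

# Project the externally-correlated part out of the internal component
B_in_clean = B_in - (B_in @ Q) @ Q.T

corr_before = abs(np.corrcoef(B_in[0], ext[0])[0, 1])
corr_after = abs(np.corrcoef(B_in_clean[0], ext[0])[0, 1])
```

Real tSSS additionally intersects the temporal subspaces of the two reconstructions and works in short time windows; this sketch just shows why the leaked interference can be identified temporally.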

Page 57: Meg preprocessing

Temporally-extended Signal Space Separation

• From the original article:

Page 58: Meg preprocessing

Temporally-extended Signal Space Separation

• From the original article:


Page 59: Meg preprocessing

Temporally-extended Signal Space Separation

• Without tSSS:


Page 60: Meg preprocessing

Temporally-extended Signal Space Separation

• With tSSS:


Page 61: Meg preprocessing

Source Separation Algorithms


Page 62: Meg preprocessing

Principal Component Analysis (PCA)

Page 63: Meg preprocessing

• Ordinary Least Squares (OLS) regression of X to Y

Following five plots from http://stats.stackexchange.com/a/2700/2019


Page 64: Meg preprocessing

• Ordinary Least Squares (OLS) regression of Y to X


Page 65: Meg preprocessing

• Regression lines are different!


Page 66: Meg preprocessing

• PCA minimizes error orthogonal to the model line

(Yes, this is a different dataset)


Page 67: Meg preprocessing

Principal Component Analysis

• “Most accurate” regression line for the data

(Yes, this is another different dataset)

Page 68: Meg preprocessing

PCA – Formal Definition


Page 69: Meg preprocessing

PCA – Formal Definition

http://stat.ethz.ch/~maathuis/teaching/fall08/Notes3.pdf


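A minimal PCA-via-SVD sketch on synthetic 2-D data (the slope-2 line and noise level are invented), showing that the components are orthogonal and the first one captures the dominant variance:

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic 2-D data scattered around the line y = 2x
x = rng.normal(size=500)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.3, size=500)])

# PCA via SVD of the mean-centered data matrix
Xc = X - X.mean(axis=0)
U, sv, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt                        # rows = principal directions (orthonormal)
explained = sv**2 / (sv**2).sum()      # fraction of variance per component

pc1 = components[0]                    # the "most accurate" line from the plots
slope = pc1[1] / pc1[0]                # close to 2 (ratio is sign-invariant)
```

The orthogonality of the rows of `Vt` is exactly the PCA limitation noted on the next slide: every subsequent component is forced to be orthogonal to the ones already found.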

Page 71: Meg preprocessing

PCA shortcomings

• Will only detect orthogonal signals

• Cannot detect polymodal distributions

Appl. Environ. Microbiol. May 2007 vol. 73 no. 9 2878-2890

“A Tutorial on Principal Component Analysis”, Jonathon Shlens, April 2009

Page 72: Meg preprocessing

Independent Component Analysis (ICA)


Page 73: Meg preprocessing

Independent Component Analysis

• Assumptions: Each signal is…

1. Statistically independent

2. Non-gaussian

• Recall the Central Limit Theorem:

“Given independent random variables x + y = z, z is more gaussian than x or y.”

• Theory: We can find S by iteratively identifying and extracting the most independent and non-gaussian components of X

Page 74: Meg preprocessing

ICA in FieldTrip package


Page 75: Meg preprocessing

ICA – Mixing matrix

• Two sources $s_1$, $s_2$ are mixed into two sensor signals $x_1$, $x_2$:

$$x_1 = a_{11}s_1 + a_{12}s_2$$
$$x_2 = a_{21}s_1 + a_{22}s_2$$

or, equivalently, $x = As$.

• Goal: Separate $s_1$ and $s_2$ using information from $x_1$ and $x_2$

Page 80: Meg preprocessing

Independent Component Analysis

• Consider the general mixing equation (sources $s$, sensors $x$, mixing matrix $A$):

$$\left.\begin{aligned} x_1 &= a_{11}s_1 + \ldots + a_{1n}s_n \\ &\ \ \vdots \\ x_n &= a_{n1}s_1 + \ldots + a_{nn}s_n \end{aligned}\right\} \equiv x = As$$

• If we could find one of the rows of $A^{-1}$ (let’s call that vector $w$), we could reconstruct a row of $s$. Mathematically:

$$w^T x = \sum_i w_i x_i = y$$

where $y$ is one of the ICs (independent components) that make up $S$.

Page 85: Meg preprocessing

Independent Component Analysis

• Working through the math… recall $x = As$ (mixing matrix $A$) and $w^T x = \sum_i w_i x_i = y$ ($w$ is some row of $A^{-1}$), and let

$$z = A^T w$$

• So,

$$y = w^T x = w^T A s = z^T s$$

• $y$ (an IC) is a linear combination of $s$, with weights $z^T$.

• Recall the Central Limit Theorem:

“Given independent random variables x + y = z, z is more gaussian than x or y.”

So $z^T s$ is more gaussian than any of the $s_i$, and is least gaussian when equal to one of the $s_i$. We therefore want to take $w^T$ as the vector that maximizes the nongaussianity of $w^T x$, ensuring that $w^T x = z^T s$ equals one of the ICs.

Page 95: Meg preprocessing

Independent Component Analysis

• How can we find wT so as to maximize the nongaussianity of wTx?

• Numerous methods:

- Kurtosis

- Negentropy

- Approximations of Negentropy

• Once found, proceed similarly to PCA: find $w^T$, remove that component, find the next best $w^T$, remove, and repeat until no more sensors are available.
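A toy FastICA-style sketch of that loop’s core step, using kurtosis as the nongaussianity measure; the two sub-gaussian sources and the mixing matrix are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20000

# Two independent, non-gaussian sources: uniform noise and a square wave
s = np.vstack([rng.uniform(-1, 1, n),
               np.sign(np.sin(np.linspace(0, 400, n)))])
A = np.array([[1.0, 0.6], [0.4, 1.0]])   # hypothetical mixing matrix
x = A @ s                                 # observed "sensor" signals

# Whiten the mixtures (zero mean, identity covariance)
x = x - x.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(x))
xw = (E / np.sqrt(d)) @ E.T @ x

# Fixed-point iteration maximizing |kurtosis| of w^T xw (FastICA-style)
w = rng.normal(size=2)
w /= np.linalg.norm(w)
for _ in range(100):
    y = w @ xw
    w = (xw * y**3).mean(axis=1) - 3 * w  # gradient of the kurtosis contrast
    w /= np.linalg.norm(w)

y = w @ xw  # one recovered independent component (up to sign and scale)
corrs = [abs(np.corrcoef(y, s[i])[0, 1]) for i in range(2)]
```

Production ICA implementations (e.g., FastICA in scikit-learn, or `runica`/`fastica` in FieldTrip) use more robust contrast functions than raw kurtosis, but the extract-deflate loop is the same idea.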

Page 96: Meg preprocessing

ICA in FieldTrip (2)

Page 97: Meg preprocessing

Mantini, Franciotti, Romani, & Pizzella (2007)


Page 98: Meg preprocessing

Mantini, Franciotti, Romani, & Pizzella (2007)


Page 99: Meg preprocessing

Mantini, Franciotti, Romani, & Pizzella (2007)


Page 100: Meg preprocessing

ICA – Method Comparison

Zavala-Fernández, Sander, Burghoff, Orglmeister, & Trahms (2006)


Page 101: Meg preprocessing

Summary

• Examine your data in as many ways as possible

• Use SSS & tSSS to best clean the data

• Use ICA to find specific artifacts

• Always check your data!


Page 102: Meg preprocessing

Questions?