MEG Preprocessing

Magnetoencephalography Preprocessing and Noise Reduction Techniques, Eliezer Kanal, 2/20/2012, MEG Basics Course

Description

Slides from an invited talk I gave at the MEG Basics series in the winter of 2012. Covers the theory behind signal processing techniques used in magnetoencephalography (MEG), including:

- Signal Space Projection (SSP)
- Signal Space Separation (SSS)
- Temporally-extended Signal Space Separation (tSSS)
- Principal Component Analysis (PCA)
- Independent Component Analysis (ICA)

Transcript of MEG Preprocessing

Page 1: Meg preprocessing

Magnetoencephalography Preprocessing and Noise Reduction Techniques

Eliezer Kanal

2/20/2012, MEG Basics Course

Page 2: Meg preprocessing

About Me

• 2005 - 2009: University of Pittsburgh, PhD, Bioengineering

• 2009 - 2011: Carnegie Mellon University, Postdoctoral fellow, CNBC

• 2011 - current: PNC Financial Services, Quantitative Analyst, Risk Analytics

Page 3: Meg preprocessing

Dealing with Noisy Data

• Overview of MEG Noise

• Noise Reduction

- Averaging, thresholding, frequency filters

- SSP

- SSS/tSSS

• Source Extraction

- PCA

- ICA


Page 4: Meg preprocessing

MEG Noise


Page 5: Meg preprocessing

Breathing

Page 6: Meg preprocessing

Breathing

Page 7: Meg preprocessing

Frequency

Page 8: Meg preprocessing

Frequency

Page 9: Meg preprocessing

Time-Frequency

Page 10: Meg preprocessing

Biological Noise

Vigário, Jousmäki, Hämäläinen, Hari, & Oja (1997)

Page 11: Meg preprocessing

Line Noise

Subject

Empty Room

50 Hz line noise (60 Hz in the USA)

Page 12: Meg preprocessing

Bad Channels

Find the bad one:



Page 14: Meg preprocessing

Noise from nearby construction


Page 15: Meg preprocessing

Noise Reduction Techniques

• Averaging, thresholding, frequency filters

• SSP

• SSS/tSSS


Page 16: Meg preprocessing

Averaging

• Removes non-time-locked noise

• Requires:

- Time-locked block paradigm design

- Temporal or low-frequency analyses


Page 17: Meg preprocessing

Thresholding

• Discarding trials/channels with maximum signal intensity greater than some user-defined value

• Removes most “data blips”

• Rudimentary; a better technique is to simply examine each trial/channel
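The thresholding step above can be sketched in a few lines of NumPy; the trial dimensions, noise scale, and cutoff below are all made up for illustration:

```python
import numpy as np

# Hypothetical epoched data: (n_trials, n_channels, n_times), values in tesla
rng = np.random.default_rng(0)
data = rng.normal(0, 1e-13, size=(50, 10, 100))
data[3, 2, 40] = 5e-12          # inject an artifact "blip" into trial 3

threshold = 4e-12               # user-defined maximum signal intensity
peak = np.abs(data).max(axis=(1, 2))   # max |signal| per trial
keep = peak < threshold                # boolean mask of clean trials
clean = data[keep]                     # trial 3 is discarded, 49 remain
```

In practice you would tune the cutoff per channel type (magnetometers vs. gradiometers) rather than use a single global value.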

Page 18: Meg preprocessing

Frequency Filter

• Very good first step: remove data you won’t analyze (don’t waste time cleaning what you won’t examine)

• Use more advanced techniques for specific noise signals

Filter      Removes…
High-pass   Lower frequencies
Low-pass    Higher frequencies
Band-pass   Everything outside the specified band
Notch       Only the specified band
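As a minimal example of the band-pass case (using SciPy's Butterworth filter; the sampling rate, filter order, and band edges below are illustrative choices, not recommendations):

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 1000.0                      # assumed sampling rate in Hz
t = np.arange(0, 2.0, 1 / fs)
# Simulated channel: a 10 Hz "brain" rhythm plus 60 Hz line noise
signal = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)

# Band-pass 1-40 Hz: removes slow drift and the 60 Hz line component
b, a = butter(4, [1, 40], btype="bandpass", fs=fs)
filtered = filtfilt(b, a, signal)    # zero-phase filtering (no phase shift)

def amp_at(x, freq):
    """Spectral amplitude of x at the given frequency."""
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    return spec[np.argmin(np.abs(freqs - freq))]
```

`filtfilt` applies the filter forward and backward, which doubles the attenuation and avoids phase distortion; that matters if you later analyze latencies.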

Page 19: Meg preprocessing

Page 20: Meg preprocessing

Page 21: Meg preprocessing

Signal Space Projection


Page 22: Meg preprocessing

Signal Space Projection

• Overview: SSP uses the difference between source orientations and locations to differentiate distinct sources.

• Theory: Since the field pattern from a single source is (1) unique and (2) time-invariant, we can differentiate sources by examining the angle between their “signal space representations” and project noise signals out of the dataset.

Page 23: Meg preprocessing

Page 24: Meg preprocessing

Page 25: Meg preprocessing

Signal Space Projection

• In general,

$$m(t) = \sum_{i=1}^{M} a_i(t)\, s_i + n(t)$$

where $m(t)$ is the measured signal, $a_i(t)$ is the source amplitude, $s_i$ is source $i$, $n(t)$ is noise, and $M$ is the total number of channels.

Page 30: Meg preprocessing

Signal Space Projection

• In general,

$$m(t) = \sum_{i=1}^{M} a_i(t)\, s_i + n(t)$$

(measured signal $m(t)$, source amplitudes $a_i(t)$, sources $s_i$, noise $n(t)$; $M$ = total number of channels)

• SSP states that $s$ can be split in two:

- $s_\parallel$ = signals from known sources
- $s_\perp$ = signals from unknown sources

$$s_\parallel = P_\parallel m \qquad s_\perp = P_\perp m$$

where $m$ is the MEG signal and $P_\parallel$, $P_\perp$ are projection operators.

• Worth mentioning that $s_\parallel + s_\perp = s$.

Page 33: Meg preprocessing

Signal Space Projection: How do we find $P_\parallel$ and $P_\perp$?

• Ingenious application of the magic¹ technique of Singular Value Decomposition (SVD)

• Let $K = \{s_1, s_2, \ldots, s_k\}$, a matrix of all known sources. Using SVD, we find a basis for $s_\parallel$, and therefore $P_\parallel$.²

¹ Not really magic

² Let $K = U \Lambda V^T$. By the properties of the SVD, the first $k$ columns of $U$ form an orthonormal basis for the column space of $K$, so we can define

$$P_\parallel = U_k U_k^T \qquad P_\perp = I - P_\parallel$$

since $s_\parallel + s_\perp = P_\parallel m + P_\perp m = s$.
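The SVD construction above can be sketched directly in NumPy. The "known sources" matrix K below is a synthetic stand-in for a scanner's known noise topographies (which in practice come from, e.g., empty-room recordings):

```python
import numpy as np

rng = np.random.default_rng(1)
n_channels, n_times = 30, 500

# Hypothetical known noise topographies: columns = known source field patterns
K = rng.normal(size=(n_channels, 2))

# Orthonormal basis for the noise subspace via SVD, then the two projectors
U, _, _ = np.linalg.svd(K, full_matrices=False)
P_par = U @ U.T                          # projects ONTO the known-noise subspace
P_perp = np.eye(n_channels) - P_par      # projects the known noise OUT

# Simulated data: one "brain" topography plus activity from the known sources
brain = rng.normal(size=(n_channels, 1))
m = brain @ np.sin(np.linspace(0, 20, n_times))[None, :] \
    + K @ rng.normal(size=(2, n_times))

cleaned = P_perp @ m
# After projection, nothing remains in the noise subspace (up to round-off)
residual = np.abs(P_par @ cleaned).max()
```

Note the trade-off: any brain activity whose topography overlaps the noise subspace is removed along with the noise.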

Page 37: Meg preprocessing

Signal Space Projection

• Recall $m(t) = \sum_{i=1}^{M} a_i(t)\, s_i + n(t)$. To find $a(t)$, invert $s_\parallel$:

$$m(t) = a(t)\, s_\parallel$$
$$a(t) = s_\parallel^{-1} m(t)$$
$$a = V \Lambda^{-1} U^T m(t)$$

(recall that $K = \{s_1, s_2, \ldots, s_k\} = U \Lambda V^T$)

• In practice, $s_\parallel$ often consists of known noise signals specific to a particular MEG scanner. The final step is simply to project those out of $m(t)$, leaving only unknown (and presumably neural) sources in $s$.

Page 39: Meg preprocessing

Signal Space Separation (SSS)


Page 40: Meg preprocessing

Signal Space Separation

• Overview: Separate MEG signal into sources (1) outside and (2) inside the MEG helmet

• Theory: Analyzing the MEG data using a basis which expresses the magnetic field as a “gradient of the harmonic scalar potential” (defined below) allows the field to be separated into internal and external components.

By simply dropping the external component, we can significantly reduce the MEG signal noise.


Page 41: Meg preprocessing

MEG data – raw


Page 42: Meg preprocessing

MEG data – SSP


Page 43: Meg preprocessing

MEG data – SSS


Page 44: Meg preprocessing

Signal Space Separation

• Begin with Maxwell’s laws (magnetic fields $H$, $B$; sources $J$):

$$\nabla \times H = J \quad (1)$$
$$\nabla \times B = \mu_0 J \quad (2)$$
$$\nabla \cdot B = 0 \quad (3)$$

• Note that on the surface of the sensor array, $J = 0$ (i.e., no sources!). As such,

$$\nabla \times H = 0 \quad \text{on the array surface}$$

• Defining $H = \nabla\Psi$, we obtain the identity $\nabla \times \nabla\Psi = 0$ in (1). The term $\Psi$ is called the “scalar potential.”

- The scalar potential has no physical correlate.
- Often written with a negative sign ($-\nabla\Psi$) for convenience.
- $H = -\nabla\Psi \;\rightarrow\; B = -\mu_0\nabla\Psi$ … used interchangeably

• Substituting the scalar potential into (3), we obtain the Laplacian:

$$\nabla \cdot \nabla\Psi = \nabla^2\Psi = 0$$

Taulu et al., 2005

Page 49: Meg preprocessing

Signal Space Separation

• Substituting the scalar potential into (3), we obtain the Laplacian:

$$\nabla \cdot B = 0 \;\Rightarrow\; \nabla \cdot \nabla\Psi = \nabla^2\Psi = 0$$

• We can express the scalar potential using spherical coordinates ( $\Psi(r, \theta, \phi)$ ), separate the variables ( $\Psi(r, \theta, \phi) = R(r)\,\Theta(\theta)\,\Phi(\phi)$ ), and solve the harmonic equation

$$\frac{1}{r^2 \sin\theta}\left[\sin\theta\,\frac{\partial}{\partial r}\!\left(r^2 \frac{\partial}{\partial r}\right) + \frac{\partial}{\partial \theta}\!\left(\sin\theta\,\frac{\partial}{\partial \theta}\right) + \frac{1}{\sin\theta}\frac{\partial^2}{\partial \phi^2}\right]\Psi = 0$$

to obtain

$$B(r) = -\mu_0 \nabla \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \alpha_{lm}\,\frac{Y_{lm}(\theta,\phi)}{r^{l+1}} \;-\; \mu_0 \nabla \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \beta_{lm}\, r^l\, Y_{lm}(\theta,\phi) \;\equiv\; B_\alpha(r) + B_\beta(r)$$

where $B_\alpha(r)$ is the internal signal and $B_\beta(r)$ is the external signal.

Page 52: Meg preprocessing

Signal Space Separation


Page 53: Meg preprocessing

Temporally-extended Signal Space Separation (tSSS)

Page 54: Meg preprocessing

Temporally-extended Signal Space Separation

Conceptually very simple:

• Recall that the SSS algorithm ends with two signal components – $B_\alpha(r)$ and $B_\beta(r)$, or $B_{in}(r)$ and $B_{out}(r)$ – and we discard the $B_{out}(r)$ component

- Rationale: signals originating outside the MEG sensor helmet cannot be brain signal

• tSSS looks for correlations between $B_{out}(r)$ and $B_{in}(r)$ and projects those correlations out of $B_{in}(r)$

- Rationale: any internal signal correlated with the external noise component must represent noise that leaked into the $B_{in}(r)$ component
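The correlation-removal idea can be sketched as a toy: find the temporal subspace spanned by B_out and project it out of B_in. This is only the projection step, not the full tSSS algorithm, and the interference waveforms and dimensions are invented:

```python
import numpy as np

rng = np.random.default_rng(2)
n_ch, n_t = 20, 1000

# Two hypothetical external interference waveforms (e.g., line noise, traffic)
ext = np.vstack([np.sin(np.linspace(0, 50, n_t)),
                 np.sin(np.linspace(0, 37, n_t))])
brain = rng.normal(size=(n_ch, n_t))                  # uncorrelated "brain" activity

B_out = rng.normal(size=(5, 2)) @ ext                 # external SSS component
B_in = brain + rng.normal(size=(n_ch, 2)) @ ext * 3   # internal, with leakage

# Orthonormal basis for the temporal subspace of the external signal
U, sv, Vt = np.linalg.svd(B_out, full_matrices=False)
rank = int((sv > 1e-10 * sv[0]).sum())
Q = Vt[:rank].T                                       # (n_t, rank)

# Project the externally-correlated part out of the internal component
B_in_clean = B_in - (B_in @ Q) @ Q.T

corr_before = abs(np.corrcoef(B_in[0], ext[0])[0, 1])
corr_after = abs(np.corrcoef(B_in_clean[0], ext[0])[0, 1])
```

Real tSSS additionally intersects the temporal subspaces of the two reconstructions and works in short time windows; this sketch just shows why the leaked interference can be identified temporally.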

Page 57: Meg preprocessing

Temporally-extended Signal Space Separation

• From the original article:

Page 58: Meg preprocessing

Temporally-extended Signal Space Separation

• From the original article:


Page 59: Meg preprocessing

Temporally-extended Signal Space Separation

• Without tSSS:


Page 60: Meg preprocessing

Temporally-extended Signal Space Separation

• With tSSS:


Page 61: Meg preprocessing

Source Separation Algorithms


Page 62: Meg preprocessing

Principal Component Analysis (PCA)

Page 63: Meg preprocessing

• Ordinary Least Squares (OLS) regression of X to Y

Following five plots from http://stats.stackexchange.com/a/2700/2019


Page 64: Meg preprocessing

• Ordinary Least Squares (OLS) regression of Y to X


Page 65: Meg preprocessing

• Regression lines are different!


Page 66: Meg preprocessing

• PCA minimizes error orthogonal to the model line

(Yes, this is a different dataset)


Page 67: Meg preprocessing

Principal Component Analysis

• “Most accurate” regression line for the data

(Yes, this is another different dataset)

Page 68: Meg preprocessing

PCA – Formal Definition


Page 69: Meg preprocessing

PCA – Formal Definition

http://stat.ethz.ch/~maathuis/teaching/fall08/Notes3.pdf


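A minimal PCA-via-SVD sketch on synthetic 2-D data (the slope-2 line and noise level are invented), showing that the components are orthogonal and the first one captures the dominant variance:

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic 2-D data scattered around the line y = 2x
x = rng.normal(size=500)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.3, size=500)])

# PCA via SVD of the mean-centered data matrix
Xc = X - X.mean(axis=0)
U, sv, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt                        # rows = principal directions (orthonormal)
explained = sv**2 / (sv**2).sum()      # fraction of variance per component

pc1 = components[0]                    # the "most accurate" line from the plots
slope = pc1[1] / pc1[0]                # close to 2 (ratio is sign-invariant)
```

The orthogonality of the rows of `Vt` is exactly the PCA limitation noted on the next slide: every subsequent component is forced to be orthogonal to the ones already found.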

Page 71: Meg preprocessing

PCA shortcomings

• Will only detect orthogonal signals

• Cannot detect polymodal distributions

Appl. Environ. Microbiol. May 2007 vol. 73 no. 9 2878-2890

“A Tutorial on Principal Component Analysis”, Jonathon Shlens, April 2009

Page 72: Meg preprocessing

Independent Component Analysis (ICA)


Page 73: Meg preprocessing

Independent Component Analysis

• Assumptions: Each signal is…

1. Statistically independent

2. Non-gaussian

• Recall the Central Limit Theorem:

“Given independent random variables x + y = z, z is more gaussian than x or y.”

• Theory: We can find S by iteratively identifying and extracting the most independent and non-gaussian components of X

Page 74: Meg preprocessing

ICA in FieldTrip package


Page 75: Meg preprocessing

ICA – Mixing matrix

• Two sources $s_1$, $s_2$ are mixed into two sensor signals $x_1$, $x_2$:

$$x_1 = a_{11}s_1 + a_{12}s_2$$
$$x_2 = a_{21}s_1 + a_{22}s_2$$

or, equivalently, $x = As$.

• Goal: Separate $s_1$ and $s_2$ using information from $x_1$ and $x_2$

Page 80: Meg preprocessing

Independent Component Analysis

• Consider the general mixing equation (sources $s$, sensors $x$, mixing matrix $A$):

$$\left.\begin{aligned} x_1 &= a_{11}s_1 + \ldots + a_{1n}s_n \\ &\ \ \vdots \\ x_n &= a_{n1}s_1 + \ldots + a_{nn}s_n \end{aligned}\right\} \equiv x = As$$

• If we could find one of the rows of $A^{-1}$ (let’s call that vector $w$), we could reconstruct a row of $s$. Mathematically:

$$w^T x = \sum_i w_i x_i = y$$

where $y$ is one of the ICs (independent components) that make up $S$.

Page 85: Meg preprocessing

Independent Component Analysis

• Working through the math… recall $x = As$ (mixing matrix $A$) and $w^T x = \sum_i w_i x_i = y$ ($w$ is some row of $A^{-1}$), and let

$$z = A^T w$$

• So,

$$y = w^T x = w^T A s = z^T s$$

• $y$ (an IC) is a linear combination of $s$, with weights $z^T$.

• Recall the Central Limit Theorem:

“Given independent random variables x + y = z, z is more gaussian than x or y.”

So $z^T s$ is more gaussian than any of the $s_i$, and is least gaussian when equal to one of the $s_i$. We therefore want to take $w^T$ as the vector that maximizes the nongaussianity of $w^T x$, ensuring that $w^T x = z^T s$ equals one of the ICs.

Page 95: Meg preprocessing

Independent Component Analysis

• How can we find wT so as to maximize the nongaussianity of wTx?

• Numerous methods:

- Kurtosis

- Negentropy

- Approximations of Negentropy

• Once found, proceed similarly to PCA: find $w^T$, remove that component, find the next best $w^T$, remove, and repeat until no more sensors are available.
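A toy FastICA-style sketch of that loop’s core step, using kurtosis as the nongaussianity measure; the two sub-gaussian sources and the mixing matrix are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20000

# Two independent, non-gaussian sources: uniform noise and a square wave
s = np.vstack([rng.uniform(-1, 1, n),
               np.sign(np.sin(np.linspace(0, 400, n)))])
A = np.array([[1.0, 0.6], [0.4, 1.0]])   # hypothetical mixing matrix
x = A @ s                                 # observed "sensor" signals

# Whiten the mixtures (zero mean, identity covariance)
x = x - x.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(x))
xw = (E / np.sqrt(d)) @ E.T @ x

# Fixed-point iteration maximizing |kurtosis| of w^T xw (FastICA-style)
w = rng.normal(size=2)
w /= np.linalg.norm(w)
for _ in range(100):
    y = w @ xw
    w = (xw * y**3).mean(axis=1) - 3 * w  # gradient of the kurtosis contrast
    w /= np.linalg.norm(w)

y = w @ xw  # one recovered independent component (up to sign and scale)
corrs = [abs(np.corrcoef(y, s[i])[0, 1]) for i in range(2)]
```

Production ICA implementations (e.g., FastICA in scikit-learn, or `runica`/`fastica` in FieldTrip) use more robust contrast functions than raw kurtosis, but the extract-deflate loop is the same idea.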

Page 96: Meg preprocessing

ICA in FieldTrip (2)

Page 97: Meg preprocessing

Mantini, Franciotti, Romani, & Pizzella (2007)


Page 98: Meg preprocessing

Mantini, Franciotti, Romani, & Pizzella (2007)


Page 99: Meg preprocessing

Mantini, Franciotti, Romani, & Pizzella (2007)


Page 100: Meg preprocessing

ICA – Method Comparison

Zavala-Fernández, Sander, Burghoff, Orglmeister, & Trahms (2006)


Page 101: Meg preprocessing

Summary

• Examine your data in as many ways as possible

• Use SSS & tSSS to best clean the data

• Use ICA to find specific artifacts

• Always check your data!


Page 102: Meg preprocessing

Questions?