A (very) brief introduction to multivoxel analysis “stuff”

Jo Etzel, Social Brain Lab

[email protected]

Mass univariate (spm)

Test every voxel separately

Fit a linear model to each voxel

Look for brain structures where the blobs occur

Obtain a p-value for each voxel

Use parametric (or permutation) statistics to evaluate significance

Multivariate

Test groups of voxels (ROIs) at once

Use machine learning algorithms

Structures first (no blobs)

Obtain a classification accuracy for each ROI

Evaluate significance of accuracy by permutation testing and perhaps subsetting (parametric statistics also possible)

Classification

[Figure: decision tree. Beard? yes: man; no: deep voice? yes: man; no: flat chest? yes: man; no: woman.]

from www.cac.science.ru.nl/people/ustun/index.html
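The decision tree on this slide can be written as a short chain of questions. This is only a sketch: the branch order (each "yes" ending at "man", the final "no" ending at "woman") is an assumption read off the slide's labels, and the function name `classify_person` is hypothetical.

```python
def classify_person(beard: bool, deep_voice: bool, flat_chest: bool) -> str:
    """Walk the tree top-down; the first 'yes' answer ends at the 'man' leaf."""
    if beard:
        return "man"
    if deep_voice:
        return "man"
    if flat_chest:
        return "man"
    return "woman"  # reached only when every question was answered 'no'

print(classify_person(beard=False, deep_voice=True, flat_chest=False))
print(classify_person(beard=False, deep_voice=False, flat_chest=False))
```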

What sort of flower is this?

Is this example type 1 or type 2?

from http://spect.yale.edu/displaying_results.html

listening to hand action sounds listening to mouth action sounds

Can the activity in the right premotor cortex classify the volumes into hand and mouth action sounds?

- or - Is the brain activity in this ROI different while listening to hand and mouth action sounds?

subject  stimulus  example  voxel 1  voxel 2
1        mouth     1         0.8464  -0.8932
1        mouth     2         0.5776   0.0494
1        hand      1         1.4955   0.2002
1        hand      2        -0.3844  -0.3322
2        mouth     1         0.5432  -0.9855

www.cs.sunysb.edu/~mueller/research/brainMiner/

0.3543 -0.1734 2.3444 0.3832 -0.9844 0.9864 0.0033

for each ROI in each volume (or summary volume)

temporal compression; extract GLM parameter estimates

to classifiers

subject  stimulus  example  voxel 1  voxel 2
1        mouth     1         0.8464  -0.8932
1        mouth     2         0.5776   0.0494
1        hand      1         1.4955   0.2002
1        hand      2        -0.3844  -0.3322
1        hand      3         0.5432  -0.9855

subject  stimulus  example  voxel 1  voxel 2
1        mouth     1         0.8464  -0.8932
1        mouth     2         0.5776   0.0494
1        hand      1         1.4955   0.2002
1        hand      2        -0.332    2.1134

subject  stimulus  example  voxel 1  voxel 2
1        hand      5         1.4955   0.2002
1        hand      6        -0.3844  -0.3322
1        mouth     3         0.5432  -0.9855
1        hand      4         2.3145   1.3342

1st, train the classifier on the training set: mouth vs. hand?

2nd, test the classifier on the testing set: mouth vs. hand?

accuracy (from test set)
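The train-then-test procedure can be sketched in a few lines. The classifier here is a nearest-mean-pattern rule chosen only because it is short; it is a stand-in, not the classifier used for the results in these slides. The function names and all voxel values are invented for illustration.

```python
from math import dist

def train_centroids(examples):
    """examples: list of (label, [voxel values]). Returns {label: mean pattern}."""
    sums, counts = {}, {}
    for label, voxels in examples:
        acc = sums.setdefault(label, [0.0] * len(voxels))
        for i, v in enumerate(voxels):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, voxels):
    """Assign the label whose mean training pattern is closest (Euclidean)."""
    return min(centroids, key=lambda label: dist(centroids[label], voxels))

def accuracy(centroids, examples):
    """Proportion of examples whose predicted label matches the true label."""
    hits = sum(predict(centroids, voxels) == label for label, voxels in examples)
    return hits / len(examples)

# Invented toy values (two voxels per example), not the slide's data.
training_set = [("mouth", [0.9, -0.8]), ("mouth", [0.6, -0.5]),
                ("hand", [-0.4, 0.7]), ("hand", [-0.7, 0.3])]
testing_set = [("mouth", [0.7, -0.9]), ("hand", [-0.5, 0.5]),
               ("hand", [0.8, -0.6])]  # this last example resembles "mouth"

model = train_centroids(training_set)   # 1st: train
print(accuracy(model, testing_set))     # 2nd: test; accuracy comes from the test set only
```

The testing examples are never shown to `train_centroids`, which is what makes the accuracy an honest estimate.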

Classify each subject separately, average accuracy across subjects: subjects can have their own patterns.

Classify subjects together, test left-out subject: subjects need the same patterns.
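The difference between the two schemes shows up clearly on toy data in which each subject has a consistent pattern but the subjects' patterns disagree. The sketch below assumes a nearest-mean-pattern classifier as a stand-in and leave-one-example-out testing within each subject; both choices, the function names, and the data are illustrative assumptions.

```python
from math import dist
from statistics import mean

def fit(rows):
    """rows: list of (label, voxels). Returns the mean pattern per label."""
    by_label = {}
    for label, voxels in rows:
        by_label.setdefault(label, []).append(voxels)
    return {lab: [mean(col) for col in zip(*pats)] for lab, pats in by_label.items()}

def score(model, rows):
    """Proportion of rows whose nearest mean pattern has the correct label."""
    hits = sum(min(model, key=lambda lab: dist(model[lab], v)) == label
               for label, v in rows)
    return hits / len(rows)

def within_subject(data):
    """Leave one example out within each subject, then average across subjects."""
    per_subject = []
    for rows in data.values():
        accs = [score(fit(rows[:i] + rows[i + 1:]), [rows[i]])
                for i in range(len(rows))]
        per_subject.append(mean(accs))
    return mean(per_subject)

def leave_one_subject_out(data):
    """Train on all other subjects pooled together, test on the left-out subject."""
    accs = []
    for held_out, test_rows in data.items():
        train_rows = [r for s, rows in data.items() if s != held_out for r in rows]
        accs.append(score(fit(train_rows), test_rows))
    return mean(accs)

# Invented toy data: each subject has a clear pattern, but the two subjects'
# patterns point in opposite directions.
data = {
    1: [("mouth", [1.0, 0.0]), ("mouth", [0.9, 0.1]),
        ("hand", [-1.0, 0.0]), ("hand", [-0.9, -0.1])],
    2: [("mouth", [-1.0, 0.0]), ("mouth", [-0.9, 0.1]),
        ("hand", [1.0, 0.0]), ("hand", [0.9, -0.1])],
}
print(within_subject(data), leave_one_subject_out(data))
```

On this deliberately extreme data the within-subject scheme classifies perfectly while the left-out-subject scheme fails, because the second scheme only works when subjects share the same patterns.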

[Figure: bar chart, all data. Mean accuracy (y-axis, 0.45 to 0.65) for each ROI (x-axis): preM L, preM R, M1 L, M1 R, S1 L, S1 R, S2 L, S2 R, aud L, aud R, other L, other R. Asterisks mark significant ROIs.]

* p < 0.0042, permutation test (Bonferroni correction of 0.05 for 12 ROIs).

So, these ROIs allowed significant classification accuracy for separating mouth and hand action sounds: preML, preMR, M1L, S2R, audL, audR.
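The 0.0042 cutoff is the Bonferroni arithmetic from the caption, 0.05 divided by the 12 ROIs tested. As a quick check (variable names are illustrative):

```python
alpha = 0.05      # overall significance level stated on the slide
n_rois = 12       # number of ROIs tested
threshold = alpha / n_rois   # per-ROI cutoff after Bonferroni correction
print(round(threshold, 4))   # 0.0042
```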

Permutation Test

What?

The null hypothesis is arbitrary labeling.

If labeling is arbitrary (random), then there is no relationship between the labels and the activity, so classification should not be possible.

Real data classified about the same as arbitrary data: no evidence for a relationship between the labels and the activity: not significant.

Real data classified better than the arbitrary data: it is “unusual”: significant.

How?

Classify the real data.

Make many randomized-label data sets (arbitrary or “fake” data).

Classify each fake data set in the same way as the real data.

The proportion of fake data sets classified better than the real data is the p-value.

1. Analyze the real data (get accuracy). M1 L = 0.6000

2. Make lots of permuted-label data sets (all if possible, at least 1,000).

0.4944 0.510 0.499 0.5211 0.480 0.5002 0.498 0.519 0.5720 0.4789 …

3. Analyze the fake data sets (get accuracies).

4. Count how many of the fake data sets were classified more accurately than the real data.

Of 1000 fake data sets, none had accuracy higher than 0.60.

5. Divide and get the p-value.

1/1001 = 0.000999, so M1 L p = 0.001

(50/1001 = 0.0499)

Done!
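The five steps can be sketched as two small functions. The "+1" in the numerator and denominator is an assumption: it is one reading of the slide's 1/1001 (the real labeling counted as one of the labelings). The function names and the seed are illustrative.

```python
import random

def permutation_p_value(real_accuracy, fake_accuracies):
    """(fake sets classified better than the real data + 1) / (fake sets + 1)."""
    better = sum(acc > real_accuracy for acc in fake_accuracies)
    return (better + 1) / (len(fake_accuracies) + 1)

def permuted_label_sets(labels, n_permutations, seed=0):
    """Yield shuffled copies of the labels; the voxel values are left untouched."""
    rng = random.Random(seed)
    for _ in range(n_permutations):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        yield shuffled

# Slide example: none of 1000 fake data sets beat the real accuracy of 0.60.
# The 0.50 values are placeholders standing in for the fake-data accuracies.
print(round(permutation_p_value(0.60, [0.50] * 1000), 6))  # 0.000999
```

In a full analysis, each label set from `permuted_label_sets` would be run through exactly the same classification procedure as the real labels, and the resulting accuracies passed to `permutation_p_value`.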

Permutation test results, showing true results with uncorrected p-value cutoff lines. True mean accuracy lines that fall above the range of the permuted lines are highly significant (p = 0.002).

[Figure: three panels (mouth vs. hand, env vs. hand, Smouth vs. Shand) plotting mean accuracy (y-axis, 0.40 to 0.70) for each ROI (x-axis: audL, audR, S2L, S2R, M1L, M1R, S1L, S1R, preML, preMR). Legend: all voxels in ROI (mean); permutation (mean); permutation, uncorr p=0.002 (max, min); permutation, uncorr p=0.005 (0.5%); permutation, uncorr p=0.05 (5%).]



Black box?

Random Forests (RFs)

First fMRI use, I think.

The algorithm makes 10,000 CARTs (classification and regression trees); all try to classify, and the majority vote gives the final class.

The algorithm makes random variable selections for the “forest” of “trees”.

Support vector machines (SVMs)

Previously used with fMRI data.

The algorithm converts the data to a higher dimension to try to find a linear separating line (hyperplane) between the classes.
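The "higher dimension" idea can be illustrated without fitting an actual support vector machine. In the sketch below, one-dimensional values that no single cutoff can separate become separable by a straight line once a squared feature is added. The data, the feature map `lift`, and the threshold are all invented for illustration; a real SVM chooses its boundary by optimization rather than by hand.

```python
def lift(x):
    """Map a 1-D value to 2-D by adding its square as a second feature."""
    return (x, x * x)

# Invented toy data: class "A" sits in the middle and class "B" on both sides,
# so no single cutoff on x separates the classes.
points = [(-2.0, "B"), (-1.5, "B"), (-0.5, "A"), (0.0, "A"),
          (0.4, "A"), (1.6, "B"), (2.1, "B")]

def classify(x, threshold=1.0):
    """In the lifted space, the horizontal line x^2 = threshold is a linear boundary."""
    _, x_squared = lift(x)
    return "B" if x_squared > threshold else "A"

print(all(classify(x) == label for x, label in points))  # True
```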

[Figure: the classifier shown as three copies of the decision tree (Beard? / deep voice? / flat chest?, each ending in man or woman).]
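The "many trees, majority vote" idea behind Random Forests can be sketched with very small trees. This is a simplification under stated assumptions: each "tree" is a single split on one randomly selected voxel (real CARTs are deeper), only 101 trees are grown rather than 10,000, and the data and function names are invented.

```python
import random
from collections import Counter
from statistics import mean

def grow_stump(rows, rng):
    """One very small 'tree': a single split on one randomly selected voxel."""
    sample = [rng.choice(rows) for _ in rows]        # bootstrap sample of the examples
    voxel = rng.randrange(len(rows[0][1]))           # random variable selection
    means = {}
    for label in sorted({lab for lab, _ in sample}):
        means[label] = mean(v[voxel] for lab, v in sample if lab == label)
    return voxel, means

def stump_vote(stump, voxels):
    """Vote for the label whose mean on the stump's voxel is closest."""
    voxel, means = stump
    return min(means, key=lambda lab: abs(means[lab] - voxels[voxel]))

def forest_predict(forest, voxels):
    """Every tree votes; the majority vote is the final class."""
    votes = Counter(stump_vote(stump, voxels) for stump in forest)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
# Invented toy data: voxel 0 separates the classes, voxel 1 carries almost nothing.
rows = [("mouth", [1.0, 0.2]), ("mouth", [0.8, -0.1]), ("mouth", [1.2, 0.0]),
        ("hand", [-1.0, 0.1]), ("hand", [-0.7, -0.2]), ("hand", [-1.1, 0.0])]
forest = [grow_stump(rows, rng) for _ in range(101)]
print(forest_predict(forest, [0.9, 0.0]))
```

Individual stumps that happen to pick the uninformative voxel vote close to randomly, but the vote across the whole forest is dominated by the stumps that picked the informative one.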

Other classification-type algorithms

k-nearest neighbor

Linear discriminant analysis

Gaussian Naïve Bayes

Hidden Markov Models

Partial Least Squares

For more info/review: Mitchell, Machine Learning, 2004, 145-175; O’Toole, Journal of Cognitive Neuroscience, 2007, 1735-1753.

classifier

Neural Networks

A generalization of linear regression functions; many variations.

Each node calculates a weighted sum and passes a binary output on as input to the next layer, so simple networks can be expressed as (long) equations.

Training the network for each person involves setting the weights on each voxel.
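To make the "(long) equation" point concrete, here is a sketch of a tiny network written both as nodes and as a single expression. The weights are set by hand for illustration (no training is shown), the two inputs stand in for two voxel values, and the function names are hypothetical.

```python
def step(z):
    """Binary output of a node: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0

def node(inputs, weights, bias):
    """One node: a weighted sum of its inputs followed by the binary step."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def tiny_network(v1, v2):
    """Two hidden nodes feeding one output node (hand-set weights, not trained)."""
    h1 = node([v1, v2], [1.0, 1.0], -0.5)     # fires if at least one input is on
    h2 = node([v1, v2], [1.0, 1.0], -1.5)     # fires only if both inputs are on
    return node([h1, h2], [1.0, -2.0], -0.5)  # fires if h1 is on and h2 is off

def tiny_network_as_equation(v1, v2):
    """The same network written out as one (long) equation."""
    return step(1.0 * step(v1 + v2 - 0.5) - 2.0 * step(v1 + v2 - 1.5) - 0.5)

for v1, v2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(v1, v2, tiny_network(v1, v2), tiny_network_as_equation(v1, v2))
```

Training would replace the hand-set weights and biases with values estimated from the training volumes.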

Idea: brain activity pattern during remembering an item should be similar to the pattern when learning the item.

Subjects learned 3 lists of 10 items each; items were labeled pictures of famous people, famous locations, and everyday objects.

After learning, subjects tried to remember all of the items, saying them aloud as they remembered them.

1: training: volumes when learning items in each category

classifier (neural network)

2: testing: volumes when recalling

3: accuracy (sort of)

“For each brain scan, the classifier produced an estimate of the match between the current testing pattern and each of the three study contexts.”

“However, follow-up analyses indicated that voxels outside of peak category-selective areas are also important for establishing this result.”

Take-Home Message

Classification (multivariate methods) can answer questions and find patterns not available with spm-type methods. spm is still useful, though!

Method is (for me) ROI-focused: start with hypotheses about which ROIs can classify (or not) which stimuli.

Differences in activation to stimuli can be restated as classification predictions.

That’s It!

Permutation vs. t-test p-values

[Figure: three scatterplots of permutation p (y-axis, 0.0 to 1.0) against t-test p (x-axis, 0.0 to 1.0). Panel titles: mouth_hand, p=0, r2=0.9933; env_hand, p=1e-05, r2=0.873; Smouth_Shand, p=0, r2=0.9739.]

The permutation and t-test p-values are very highly correlated, though not always identical.

Calculated correlation line in solid, perfect in dotted.

