Estimating Power for fMRI & Classification Directions in fMRI Thomas Nichols Clinical Imaging Centre...

Estimating Power for fMRI &

Classification Directions in fMRI

Thomas Nichols

Clinical Imaging Centre

GlaxoSmithKline

Overview

• Power Exploration
  – ROIs (small/big, lots/few)?

GD Mitsis, GD Iannetti, TS Smart, I Tracey & R Wise. Regions of interest analysis in pharmacological fMRI: How do the definition criteria influence the inferred result? NeuroImage (Epub)

• Power Prediction

• Classification

Power Review: 1 Test

• Power: The probability of rejecting H0 when HA is true

• Specify your null distribution
  – Mean = 0, variance = σ²

• Specify the effect size (Δ), which leads to alternative distribution

• Specify the false positive rate, α

[Figure: null distribution and alternative distribution, separated by Δ/σ, with α and power marked]
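The single-test setup above can be sketched numerically. This is a minimal illustration, assuming a one-sided z-test with known variance so that the statistic is standard normal under H0 and shifted by Δ/σ under HA; the function name `power_one_test` is hypothetical and not part of the slides.

```python
from statistics import NormalDist


def power_one_test(effect_over_sd, alpha=0.05):
    """Power of a one-sided z-test.

    Assumes the test statistic is N(0, 1) under H0 and
    N(effect_over_sd, 1) under HA (known variance).
    """
    z = NormalDist()
    # Threshold that gives a false positive rate of alpha under H0.
    critical = z.inv_cdf(1.0 - alpha)
    # Probability that the shifted (alternative) statistic exceeds it.
    return 1.0 - z.cdf(critical - effect_over_sd)


for d in (0.0, 1.0, 2.0, 3.0):
    print(f"delta/sigma = {d:.1f}: power = {power_one_test(d):.3f}")
```

With a zero effect the power equals α, which is a quick sanity check on the threshold.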

Power: 100,000 Tests?

• Avoid Multiple Testing Problem if possible
  – Typically study will use well-characterized paradigm
  – Expected region of response should be known

• But…
  – Variation in functional and structural anatomy
  – “Perfect” region never known

• Should we use focal ROI?
• Voxel-wise search in neighborhood?
• Over whole brain anyway?

Qualitative Power Exploration

• Simplified power setting
  – Not voxel-wise; instead largish (>1000 voxel) VOIs
  – Large VOIs: Assuming σwithin << σbetween
    • Hence different-sized VOIs will have similar variance
  – Large VOIs: Assuming independence between VOIs

• Consider impact of many vs. fewer VOIs
  – Many VOIs
    • Better follows anatomy, possible shape of signal
    • Worse multiple testing correction
  – Fewer VOIs
    • Will dilute localized signal
    • Fewer tests to correct for

AAL & Derived ROI Atlases

• Atlas 0 (AAL): k = 116 regions, αFWE = 0.00043 (surrogate for correlated voxel-wise search)
• Atlas 1 (AAL symmetric): k = 58 regions, αFWE = 0.00086
• Atlas 2: k = 28 regions, αFWE = 0.00179
• Atlas 3: k = 17 regions, αFWE = 0.00294
• Atlas 4 (Lobar AAL): k = 6 regions, αFWE = 0.00833
• Atlas 5 (whole GM): k = 1 region, αFWE = 0.05000
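The αFWE values listed for these atlases are consistent with a Bonferroni correction of a 0.05 family-wise level across k regions. A short sketch that recomputes them; the function and dictionary names are illustrative only.

```python
def bonferroni_alpha(k, alpha_fwe=0.05):
    """Per-region threshold that keeps the family-wise rate at alpha_fwe."""
    return alpha_fwe / k


# Region counts as listed for each atlas on the slide.
REGION_COUNTS = {
    "Atlas 0 (AAL)": 116,
    "Atlas 1 (AAL symmetric)": 58,
    "Atlas 2": 28,
    "Atlas 3": 17,
    "Atlas 4 (Lobar AAL)": 6,
    "Atlas 5 (whole GM)": 1,
}

for name, k in REGION_COUNTS.items():
    print(f"{name}: k = {k:3d}, per-region alpha = {bonferroni_alpha(k):.5f}")
```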

L Amygdala

• Atlas 0 (AAL): k = 116 regions, αFWE = 0.00043; signal # VOIs = 1, strength = 100%
• Atlas 1 (AAL symmetric): k = 58 regions, αFWE = 0.00086; signal # VOIs = 1, strength = 47%
• Atlas 2: k = 28 regions, αFWE = 0.00179; signal # VOIs = 1, strength = 47%
• Atlas 3: k = 17 regions, αFWE = 0.00294; signal # VOIs = 1, strength = 4.9%
• Atlas 4 (Lobar AAL): k = 6 regions, αFWE = 0.00833; signal # VOIs = 1, strength = 0.6%
• Atlas 5 (whole GM): k = 1 region, αFWE = 0.05000; signal # VOIs = 1, strength = 0.1%

Power: L Amygdala, True ROI

• True ROI best (of course)

• Rich ROI atlas (k=116) beats coarser atlases
  – Dilution more punishing than greater multiple testing

Power: L Amygdala, Shifted ROI

• True ROI best

• Wrong (unshifted) ROI next

• Rich ROI atlas still beats coarser atlases

Power: ½ of Mid-Cingulate

• Whole Mid-Cing ROI best

• Again, huge (k=116) atlas next best

• But we’ve assumed RFX
  – No precision gain for large ROIs, as shrinking σwithin is no help

Power: ½ of Mid-Cingulate: FFX

• Whole Mid-Cing ROI best

• Now Symmetric AAL atlas (k=58) best!
  – If σbetween small, precision increase with large ROIs has impact

Power Exploration Conclusions

• Compared Range of Scales
  – Whole Brain, Lobar (k=6), …, AAL (k=116)

• Focal structures
  – Focal ROIs best

• More extended signals, with heterogeneity
  – Rich atlas best
    • Dilution of signal worse than Bonferroni

• But whole-brain always less powerful than reduced volume
  – Suggests voxel-wise / “Multiple Endpoint” result preferred, constrained coarsely
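The trade-off between dilution and Bonferroni correction can be illustrated with the region counts and signal strengths listed for the L Amygdala example. This is only a sketch under stated assumptions: a one-sided z-test, variance that does not depend on ROI size (the RFX-like setting described earlier), and a made-up effect size of Δ/σ = 4 in the true ROI. It does not reproduce the power curves shown in the talk.

```python
from statistics import NormalDist

# (name, regions k, fraction of signal strength retained), from the slide.
ATLASES = [
    ("Atlas 0 (AAL)", 116, 1.00),
    ("Atlas 1 (AAL symmetric)", 58, 0.47),
    ("Atlas 2", 28, 0.47),
    ("Atlas 3", 17, 0.049),
    ("Atlas 4 (Lobar AAL)", 6, 0.006),
    ("Atlas 5 (whole GM)", 1, 0.001),
]

ASSUMED_EFFECT = 4.0  # hypothetical delta/sigma in the true ROI


def diluted_power(effect_over_sd, k, strength, alpha_fwe=0.05):
    """One-sided z-test power with a Bonferroni threshold and a diluted effect."""
    z = NormalDist()
    critical = z.inv_cdf(1.0 - alpha_fwe / k)  # stricter as k grows
    return 1.0 - z.cdf(critical - strength * effect_over_sd)


powers = {name: diluted_power(ASSUMED_EFFECT, k, s) for name, k, s in ATLASES}
# A single, correctly placed ROI: no dilution and no correction.
true_roi_power = diluted_power(ASSUMED_EFFECT, k=1, strength=1.0)

print(f"True ROI: power = {true_roi_power:.3f}")
for name, p in powers.items():
    print(f"{name}: power = {p:.3f}")
```

Under these assumptions the true ROI comes first and the 116-region atlas beats every coarser atlas, because losing half or more of the signal costs more than the stricter threshold.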

Why Doesn’t Bonf. Hurt More?

Truth (unobserved) in columns, test result (observed) in rows:

                 H0 True                  H0 False
Reject H0        Type I Error (α): 50     Power: 50
Accept H0        Correct: 950             Type II Error: 50
Total            1000                     100

• Example
  – 1100 total voxels
  – 100 voxels have β=Δ
    • A test with 50% power on average will detect 50 of these voxels with true activation
  – 1000 voxels have β=0
    • α=5% implies on average 50 null voxels will have false positives

• 1 Signal ROI
  – 1 opportunity for a positive

• 100 Signal Voxels
  – 100 opportunities for a positive
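The counting argument in this example can be checked with a few lines of arithmetic. The sketch below reproduces the expected counts and then shows why many opportunities help: assuming independent tests (an idealization, since neighbouring voxels are correlated) and a made-up per-voxel power of 5%, the chance of at least one detection among 100 signal voxels is still high. Function names are illustrative.

```python
def expected_counts(n_signal=100, n_null=1000, power=0.5, alpha=0.05):
    """Expected true detections and expected false positives."""
    return n_signal * power, n_null * alpha


def prob_at_least_one(n_opportunities, per_test_power):
    """P(at least one rejection), assuming independent tests."""
    return 1.0 - (1.0 - per_test_power) ** n_opportunities


detections, false_positives = expected_counts()
print(f"Expected detections: {detections:.0f}, false positives: {false_positives:.0f}")

# Hypothetical low per-test power of 5%, e.g. after a strict correction.
print(f"1 signal ROI:      {prob_at_least_one(1, 0.05):.3f}")
print(f"100 signal voxels: {prob_at_least_one(100, 0.05):.3f}")
```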

Formal Power analysis

• N: Number of Subjects
  – Adjusted to achieve sufficient power

• α: The size of the test you’d like to use
  – Commonly set to 0.05 (5% false positive rate)

• Δ: The size of the effect you’re interested in detecting
  – Based on intuition or similar studies

• σ²: The variance of Δ
  – Has a complicated structure with very little intuition
  – Depends on many things …

Power for Group fMRI

[Figure: time series for Subject 1, Subject 2, …, Subject N. Within each subject: temporal autocorrelation, Cov(Y) = σ²W V. Across subjects: between subject variability, σ²B.]

J. Mumford & TE. Nichols. NeuroImage 39:261–268, 2008 http://www.fmripower.org

Level 1

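• $Y_k$: $T_k$-vector time series for subject $k$

• $X_k$: $T_k \times p$ design matrix

• $\beta_k$: $p$-vector of parameters

• $\varepsilon_k$: $T_k$-vector error term, $\mathrm{Cov}(\varepsilon_k) = \sigma_k^2 V_k$

$$Y_k = X_k \beta_k + \varepsilon_k$$

(illustrated with $p = 4$: $\beta_k = (\beta_{k0}, \beta_{k1}, \beta_{k2}, \beta_{k3})^T$)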

Level 2

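• $\hat\beta_{cont}$: $N$-vector of first-level contrast estimates $c\hat\beta_k$
• $X_g$: $N \times p_g$ design matrix
• $\beta_g$: $p_g$-vector of parameters
• $\varepsilon_g$: $N$-vector error term
  – $\mathrm{Cov}(\varepsilon_g) = V_g = \mathrm{diag}\{ c (X_k^T V_k^{-1} X_k)^{-1} \sigma_k^2 c^T \} + \sigma_B^2 I_N$
  – First term: within subject variability; second term: between subject variability

$$\hat\beta_{cont} = X_g \beta_g + \varepsilon_g$$

(illustrated with $p_g = 2$: $\beta_g = (\beta_{g1}, \beta_{g2})^T$)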

Alternative distribution

• For a specific $H_A$: $c_g \beta_g = \Delta$

• $t$ is distributed $T_{N - p_g,\, ncp}$ (noncentral $t$)
  – $ncp = \Delta / \sqrt{c_g (X_g^T V_g^{-1} X_g)^{-1} c_g^T}$

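Inputs to the power calculation (each either known or guessed):

• N: # subjects
• α: FPR
• Δ: effect magnitude
• σ²: effect SD
  – Within subject SD: σ²W V
  – Between subject SD: σ²B
• cg, Xg: 2nd level model
• c, Xk: 1st level model
• σ²k: noise magnitude
• Vk(σWN, σAR, ρ): noise autocorrelation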
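To make the noncentrality parameter concrete, here is a minimal sketch for the simplest group model: a one-sample design, where Xg is a column of ones and cg = 1, so that Vg is diagonal and the generalized least squares variance reduces to the inverse of the summed precisions. The power step uses a normal approximation rather than the noncentral t distribution, purely to stay within the standard library; it will be optimistic for small N. All names and numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def group_ncp(delta, within_var, sigma2_b):
    """ncp = delta / sqrt(c_g (X_g' V_g^-1 X_g)^-1 c_g') for a one-sample model.

    within_var[k] stands for subject k's within-subject contrast variance,
    c (X_k' V_k^-1 X_k)^-1 sigma_k^2 c'; sigma2_b is the between-subject variance.
    """
    precisions = [1.0 / (w + sigma2_b) for w in within_var]
    contrast_var = 1.0 / sum(precisions)
    return delta / sqrt(contrast_var)


def approx_power(ncp, alpha=0.05):
    """Normal approximation to one-sided power (not the exact noncentral t)."""
    z = NormalDist()
    return 1.0 - z.cdf(z.inv_cdf(1.0 - alpha) - ncp)


# Hypothetical inputs: 16 subjects, equal within- and between-subject variance.
ncp = group_ncp(delta=1.0, within_var=[0.5] * 16, sigma2_b=0.5)
print(f"ncp = {ncp:.2f}, approximate power = {approx_power(ncp):.3f}")
```

For an exact calculation one would replace `approx_power` with a noncentral t tail probability on N − pg degrees of freedom.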

Model

• Block design: 15 s on, 15 s off

• TR = 3 s

• HRF: Gamma, sd = 3

• Parameters estimated from Block study
  – FIAC single subject data
  – Read 3 little pigs
    • Same/different speaker, same/different sentence
    • Looked at blocks with same sentence, same speaker

Power as a function of run length and sample size

• Assumes fixed maximal scanner time

• 21 subjects optimal

• Between 18 and 23 subjects sufficient
  – 17 subjects cannot obtain sufficient power

More importantly… cost!

• Cost to achieve 80% power

• Cost = $300 per subject + $10 per each extra minute
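The cost rule can be written as a small function for comparing candidate designs. The slide does not say whether the per-minute charge applies per subject or what the baseline run length is; the sketch below assumes it applies per subject, on top of a run length already covered by the $300. The candidate designs are made up and are not claimed to reach 80% power.

```python
def study_cost(n_subjects, extra_minutes, per_subject=300.0, per_extra_minute=10.0):
    """Total cost, assuming each subject is charged for their own extra minutes."""
    return n_subjects * (per_subject + per_extra_minute * extra_minutes)


# Hypothetical designs: (subjects, extra minutes per subject).
for n, minutes in [(23, 0), (21, 5), (18, 10)]:
    print(f"N = {n}, extra minutes = {minutes}: cost = ${study_cost(n, minutes):,.0f}")
```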

Power, Accounting for searching over space?

S Hayasaka, AM Peiffer, CE Hugenschmidt, PJ Laurienti. Power and sample size calculation for neuroimaging studies by non-central random field theory. NeuroImage 37 (2007) 721–730

Univariate vs. Multivariate

• Mass Univariate Modelling
  – Model each voxel independently (account for dependence at inference stage)
  – Great for localization
  – Doesn’t acknowledge spatial structure

• Multivariate Modelling
  – Model entire volume simultaneously
  – Explicitly uses spatial structure
  – Not as good for localization

Multivariate Classification: Classification of Subjects

• ICA components appear to distinguish controls (NC) vs. schizophrenia (SZ) vs. bipolar disorder (BP)
  – fMRI Experiment: Auditory oddball task

• But no one voxel responsible

VD Calhoun, PK Maciejewski, GD Pearlson, KA Kiehl. Temporal Lobe and “Default” Hemodynamic Brain Modes Discriminate Between Schizophrenia and Bipolar Disorder. Human Brain Mapping, Epub 2007 Sep 25

Multivariate Classification

• Even a very simple method can give very good performance
  – Define average ICgrp for each group
  – Label subject k with the group that has minimum Euclidean distance (between ICk & ICgrp)
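The rule described here is a nearest-centroid classifier. A minimal sketch follows, using short made-up 2-D vectors in place of real ICA component maps; the function names are hypothetical, and in practice the group averages would be computed without the subject being classified.

```python
from math import dist


def group_means(components, labels):
    """Average component vector (ICgrp) for each group label."""
    sums, counts = {}, {}
    for vec, lab in zip(components, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def classify(vec, means):
    """Label with the group whose mean is closest in Euclidean distance."""
    return min(means, key=lambda lab: dist(vec, means[lab]))


# Toy training data (made-up numbers, two "voxels" per subject).
train = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
labels = ["NC", "NC", "SZ", "SZ"]
means = group_means(train, labels)

print(classify([0.9, 0.1], means))
print(classify([0.1, 0.9], means))
```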

Multivariate Classification: Prediction Time Series

Inferring Experience Based Cognition from Virtual Reality fMRI

Greg Siegle, Walter Schneider, Maureen McHugo, Melissa Thomas, Lori Koerbel, Lena Gemmer, Kate Fissell, Sudhir Pathak, Dan Jones, Kevin Jarbo

University of Pittsburgh

Pittsburgh Brain Activity Interpretation Competition

• Virtual Reality fMRI Paradigm
  – Subjects explore neighborhood, looking for fruit, guns, dogs
  – 11 features rated continuously
    • e.g. arousal, valence, movement, dog, cell phone, etc.
  – 3 sessions of fMRI data
    • Features only given for first 2 sessions

Inferring Cognition

[Figure: 17 minutes of data; R² = .79]

• Very different methods gave similar scores (based on pre- and post-processing)

• Similar methods (e.g., support vector machines) gave very different results.

[Figure: correlation for the 1st place entry by feature, axis from -.2 to 1. Features: Arousal, Valence, Hits, SearchPeople, SearchWeapons, SearchFruit, Instructions, Dog, Faces, FruitsVegetables, WeaponsTools, InteriorExterior, Velocity]

Surprisingly accurate results
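The results here are reported as correlations and R². As a rough illustration, assuming the score for a feature is the Pearson correlation between the predicted and the rated time series (the exact scoring rule is not given in the slides), it could be computed as below. The data are made-up toy numbers.

```python
from math import sqrt


def pearson_r(predicted, rated):
    """Pearson correlation between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(rated) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, rated))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in rated)
    return cov / sqrt(var_p * var_r)


# Toy example: a prediction that tracks the rating imperfectly.
rated = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]
predicted = [0.2, 0.8, 2.5, 2.6, 1.7, 1.4]
r = pearson_r(predicted, rated)
print(f"r = {r:.2f}, R^2 = {r * r:.2f}")
```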

Lessons from Contest

• Pre-processing mattered
  – Detrending details had big impact

• Multivariate, but not un-informed
  – Winners used masks
    • Weighting salient voxels, ignoring uninformative ones
  – Post-processing clean up

• In general, extensive tuning per feature to be predicted

[Figure: Subject 14, visual cortex: use for “Interior Exterior”]

[Figure: Subject 13, auditory cortex: use for “Dog”]

Conclusions

• Power for fMRI
  – Focused ROIs, but not too focused
  – Exact power predictions possible
    • As always, based on guesses

• Classification
  – Uses entire brain to predict subject identity or cognitive state
  – New direction, methods still evolving
    • e.g. Support Vector Machines work well, but never without appreciable feature selection/tuning
