Estimating Power for fMRI & Classification Directions in fMRI Thomas Nichols Clinical Imaging Centre...
Estimating Power for fMRI & Classification Directions in fMRI
Thomas Nichols
Clinical Imaging Centre
GlaxoSmithKline
Overview
• Power Exploration
– ROIs (small/big, lots/few)?
GD Mitsis, GD Iannetti, TS Smart, I Tracey & R Wise. Regions of interest analysis in pharmacological fMRI: How do the definition criteria influence the inferred result? NeuroImage, Epub
• Power Prediction
• Classification
Power Review: 1 Test
• Power: The probability of rejecting H0 when HA is true
• Specify your null distribution
– Mean=0, variance=σ²
• Specify the effect size (Δ), which leads to alternative distribution
• Specify the false positive rate, α
[Figure: null distribution and alternative distribution, separated by Δ/σ, with the α and power regions marked]
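The three ingredients above (null distribution, effect size, α) are enough to compute power for a single test. A minimal sketch, assuming a one-sided test on a unit-variance normal statistic whose mean under HA is Δ/σ; the function name `power_one_sided` is illustrative and not from the slides:

```python
from statistics import NormalDist


def power_one_sided(delta_over_sigma: float, alpha: float = 0.05) -> float:
    """Power of a one-sided z-test with standardized effect size delta/sigma."""
    std_normal = NormalDist()
    # Critical value: reject H0 when the statistic exceeds this threshold.
    u_alpha = std_normal.inv_cdf(1.0 - alpha)
    # Under HA the statistic is N(delta/sigma, 1) (assumption of this sketch),
    # so power is the probability mass above the critical value.
    return 1.0 - std_normal.cdf(u_alpha - delta_over_sigma)


if __name__ == "__main__":
    for d in (0.0, 1.0, 2.0, 3.0):
        print(f"delta/sigma = {d:.1f}  power = {power_one_sided(d):.3f}")
```

With Δ/σ = 0 the power equals α, which is a quick sanity check on the calculation.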
Power: 100,000 Tests?
• Avoid Multiple Testing Problem if possible
– Typically study will use well-characterized paradigm
– Expected region of response should be known
• But…
– Variation in functional and structural anatomy
– “Perfect” region never known
• Should we use focal ROI?
• Voxel-wise search in neighborhood?
• Over whole brain anyway?
Qualitative Power Exploration
• Simplified power setting
– Not voxel-wise; instead largish (>1000 voxel) VOIs
– Large VOIs: Assuming σwithin << σbetween
• Hence different sized VOIs will have similar variance
– Large VOIs: Assuming independence between VOIs
• Consider impact of many vs. fewer VOIs
– Many VOIs
• Better follows anatomy, possible shape of signal
• Worse multiple testing correction
– Fewer VOIs
• Will dilute localized signal
• Fewer tests to correct for
AAL & Derived ROI Atlases
• Atlas 0 (AAL): k = 116 regions, αFWE = 0.00043 (surrogate for correlated voxel-wise search)
• Atlas 1 (AAL symmetric): k = 58 regions, αFWE = 0.00086
• Atlas 2: k = 28 regions, αFWE = 0.00179
• Atlas 3: k = 17 regions, αFWE = 0.00294
• Atlas 4 (Lobar AAL): k = 6 regions, αFWE = 0.00833
• Atlas 5 (whole GM): k = 1 region, αFWE = 0.05000
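The αFWE values listed for the atlases agree with a Bonferroni split of α = 0.05 over k regions. A short sketch that recomputes them; the dictionary and function names are illustrative:

```python
# Number of regions (k) per atlas, as listed on the slide.
ATLAS_REGIONS = {
    "Atlas 0 (AAL)": 116,
    "Atlas 1 (AAL symmetric)": 58,
    "Atlas 2": 28,
    "Atlas 3": 17,
    "Atlas 4 (Lobar AAL)": 6,
    "Atlas 5 (whole GM)": 1,
}


def bonferroni_alpha(k: int, alpha: float = 0.05) -> float:
    """Per-region threshold that controls the familywise error rate at alpha."""
    return alpha / k


if __name__ == "__main__":
    for name, k in ATLAS_REGIONS.items():
        print(f"{name}: k = {k}, alpha_FWE = {bonferroni_alpha(k):.5f}")
```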
Signal: L Amygdala
• Atlas 0 (AAL): k = 116 regions, αFWE = 0.00043; signal # VOIs = 1, strength = 100%
• Atlas 1 (AAL symmetric): k = 58 regions, αFWE = 0.00086; signal # VOIs = 1, strength = 47%
• Atlas 2: k = 28 regions, αFWE = 0.00179; signal # VOIs = 1, strength = 47%
• Atlas 3: k = 17 regions, αFWE = 0.00294; signal # VOIs = 1, strength = 4.9%
• Atlas 4 (Lobar AAL): k = 6 regions, αFWE = 0.00833; signal # VOIs = 1, strength = 0.6%
• Atlas 5 (whole GM): k = 1 region, αFWE = 0.05000; signal # VOIs = 1, strength = 0.1%
Power: L Amygdala, True ROI
• True ROI best (of course)
• Rich ROI atlas (k=116) beats coarser atlases
– Dilution more punishing than greater multiple testing
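The trade-off between dilution and multiple testing can be sketched numerically. This is an illustration under stated assumptions, not the calculation behind the slides: a one-sided z-test per VOI, equal variance across VOIs, the effect scaled by the listed signal strength, and a hypothetical undiluted standardized effect of 4.0.

```python
from statistics import NormalDist

# (k regions, signal strength as a fraction) per atlas, from the slides.
ATLASES = {
    "Atlas 0 (AAL)": (116, 1.00),
    "Atlas 1 (AAL symmetric)": (58, 0.47),
    "Atlas 2": (28, 0.47),
    "Atlas 3": (17, 0.049),
    "Atlas 4 (Lobar AAL)": (6, 0.006),
    "Atlas 5 (whole GM)": (1, 0.001),
}


def diluted_power(k: int, strength: float, full_effect: float,
                  alpha: float = 0.05) -> float:
    """Power for the one signal VOI at a Bonferroni threshold of alpha / k."""
    z = NormalDist()
    # More regions means a stricter threshold ...
    u = z.inv_cdf(1.0 - alpha / k)
    # ... but a coarser atlas dilutes the effect seen by its signal VOI.
    return 1.0 - z.cdf(u - strength * full_effect)


if __name__ == "__main__":
    FULL_EFFECT = 4.0  # hypothetical undiluted delta/sigma, chosen for illustration
    for name, (k, strength) in ATLASES.items():
        print(f"{name}: power = {diluted_power(k, strength, FULL_EFFECT):.3f}")
```

Under these assumptions the k = 116 atlas keeps the most power, because the stricter threshold costs far less than losing most of the signal.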
Power: L Amygdala, Shifted ROI
• True ROI best
• Wrong (unshifted) ROI next
• Rich ROI atlas still beats coarser atlases
Power: ½ of Mid-Cingulate
• Whole Mid-Cing ROI best
• Again, huge (k=116) atlas next best
• But we’ve assumed RFX
– No precision gain for large ROIs, as shrinking σWIN is no help
Power: ½ of Mid-Cingulate: FFX
• Whole Mid-Cing ROI best
• Now Symmetric AAL atlas (k=58) best!
– If σBTW small, precision increase with large ROIs has impact
Power Exploration Conclusions
• Compared Range of Scales
– Whole Brain, Lobar (k=6), …, AAL (k=116)
• Focal structures
– Focal ROIs best
• More extended signals, with heterogeneity
– Rich atlas best
• Dilution of signal worse than Bonferroni
• But whole-brain always less powerful than reduced volume
– Suggests voxel-wise / “Multiple Endpoint” result preferred, constrained coarsely
Why Doesn’t Bonf. Hurt More?
                            Truth (unobserved)
Test Result (observed)      H0 True (1000)          H0 False (100)
Reject H0                   Type I Error, α (50)    Power (50)
Accept H0                   Correct (950)           Type II Error (50)
• Example
– 1100 total voxels
– 100 voxels have β=Δ
• A test with 50% power on average will detect 50 of these voxels with true activation
– 1000 voxels have β=0
• α=5% implies on average 50 null voxels will have false positives
• 1 Signal ROI
– 1 opportunity for a positive
• 100 Signal Voxels
– 100 opportunities for a positive
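The counts in the example, and the point about "opportunities for a positive", can be checked with a few lines. A sketch assuming independent tests (which real voxels are not); the function names and the per-test power of 0.10 in the usage are hypothetical:

```python
def expected_rejections(n_null: int, n_signal: int,
                        alpha: float, power: float) -> tuple[float, float]:
    """Expected (false positives, true positives) across all tests."""
    return n_null * alpha, n_signal * power


def prob_any_detection(n_signal_tests: int, per_test_power: float) -> float:
    """Chance that at least one signal test rejects, assuming independence."""
    return 1.0 - (1.0 - per_test_power) ** n_signal_tests


if __name__ == "__main__":
    print(expected_rejections(n_null=1000, n_signal=100, alpha=0.05, power=0.5))
    # Even a low per-test power adds up over many signal voxels.
    print(prob_any_detection(1, 0.10), prob_any_detection(100, 0.10))
```

A strict per-test threshold lowers the power of each test, but 100 signal voxels still give 100 chances to find the effect, where a single ROI gives only one.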
Formal Power analysis
• N: Number of Subjects
– Adjusted to achieve sufficient power
• α: The size of the test you’d like to use
– Commonly set to 0.05 (5% false positive rate)
• Δ: The size of the effect you’re interested in detecting
– Based on intuition or similar studies
• σ²: The variance of Δ
– Has a complicated structure with very little intuition
– Depends on many things …
Power for Group fMRI
[Figure: time series for Subject 1, Subject 2, …, Subject N. Within each subject: temporal autocorrelation, Cov(Y) = σ²W V. Across subjects: between subject variability, σ²B]
J. Mumford & TE. Nichols. NeuroImage 39:261–268, 2008 http://www.fmripower.org
Level 1
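• Yk : Tk-vector timeseries for subject k
• Xk : Tk × p design matrix
• βk : p-vector of parameters
• εk : Tk-vector error term, Cov(εk) = σ²k Vk
$$Y_k = X_k \beta_k + \varepsilon_k, \qquad \beta_k = (\beta_{k0}, \beta_{k1}, \beta_{k2}, \beta_{k3})^T \text{ in the illustration}$$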
Level 2
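• β̂cont : N-vector of first-level contrast estimates, with k-th element cβ̂k
• Xg : N × pg design matrix
• βg : pg-vector of parameters
• εg : N-vector error term
– First term of Cov(εg): within subject variability; second term: between subject variability
$$\hat\beta_{cont} = X_g \beta_g + \varepsilon_g, \qquad \beta_g = (\beta_{g1}, \beta_{g2})^T \text{ in the illustration}$$
$$\mathrm{Cov}(\varepsilon_g) = V_g = \mathrm{diag}\{ c (X_k^T V_k^{-1} X_k)^{-1} \sigma_k^2 c^T \} + \sigma_B^2 I_N$$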
Alternative distribution
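• For a specific HA: cgβg = Δ
• t is distributed as noncentral T with N − pg degrees of freedom and noncentrality parameter ncp
$$\mathrm{ncp} = \frac{\Delta}{\sqrt{c_g (X_g^T V_g^{-1} X_g)^{-1} c_g^T}}$$
[Diagram: inputs to the power calculation, each marked as known or guessed]
• N : # subjects
• α : FPR
• Δ : effect magnitude
• σ² : effect SD, made up of within subject SD (σ²W V) and between subject SD (σ²B)
• cg, Xg : 2nd level model
• c, Xk : 1st level model
• σ²k : noise magnitude
• Vk(σWN, σAR, ρ) : noise autocorrelation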
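For the simplest group model the noncentrality parameter has a closed form. A minimal sketch, assuming a one-sample design (Xg a column of ones, cg = 1) so that Vg is diagonal with entries equal to each subject's within-subject contrast variance plus σ²B. The input values are hypothetical, and the power step uses a normal approximation in place of the noncentral t to stay within the standard library:

```python
from math import sqrt
from statistics import NormalDist


def group_ncp(delta: float, within_vars: list[float],
              sigma2_between: float) -> float:
    """ncp for a one-sample group mean (Xg = column of ones, cg = 1)."""
    # Diagonal of Vg: subject k's first-level contrast variance plus sigma_B^2.
    v_diag = [w + sigma2_between for w in within_vars]
    # With Xg = 1_N, Xg' Vg^{-1} Xg reduces to the sum of inverse variances.
    precision = sum(1.0 / v for v in v_diag)
    # ncp = delta / sqrt(cg (Xg' Vg^{-1} Xg)^{-1} cg')
    return delta / sqrt(1.0 / precision)


def approx_power(ncp: float, alpha: float = 0.05) -> float:
    """One-sided power via a normal approximation to the noncentral t."""
    z = NormalDist()
    return 1.0 - z.cdf(z.inv_cdf(1.0 - alpha) - ncp)


if __name__ == "__main__":
    # Hypothetical inputs: equal within-subject variance for every subject.
    for n_subjects in (10, 15, 20, 25):
        ncp = group_ncp(delta=0.5, within_vars=[0.3] * n_subjects,
                        sigma2_between=0.4)
        print(f"N = {n_subjects}: ncp = {ncp:.2f}, power ~ {approx_power(ncp):.2f}")
```

The normal approximation is optimistic for small N; an exact calculation would use the noncentral t distribution with N − pg degrees of freedom.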
Model
• Block design 15s on 15s off
• TR=3s
• Hrf: Gamma, sd=3
• Parameters estimated from Block study
– FIAC single subject data
– Read 3 little pigs
• Same/different speaker, same/different sentence
• Looked at blocks with same sentence same speaker
Power as a function of run length and sample size
• Assumes fixed maximal scanner time
• 21 subjects optimal
• Between 18 and 23 subjects sufficient
– 17 subjects cannot obtain sufficient power
More importantly… cost!
• Cost to achieve 80% power
• Cost = $300 per subject + $10 per each extra minute
Power, Accounting for searching over space?
S Hayasaka, AM Peiffer, CE Hugenschmidt, PJ Laurienti. Power and sample size calculation for neuroimaging studies by non-central random field theory. NeuroImage 37 (2007) 721–730
Univariate vs. Multivariate
• Mass Univariate Modelling
– Model each voxel independently (account for dependence at inference stage)
– Great for localization
– Doesn’t acknowledge spatial structure
• Multivariate Modelling
– Model entire volume simultaneously
– Explicitly uses spatial structure
– Not as good for localization
Multivariate Classification: Classification of Subjects
• ICA Components appear to distinguish NC vs. SZ vs. BP
– fMRI Experiment: Auditory oddball task
• But no one voxel responsible
VD Calhoun, PK Maciejewski, GD Pearlson, KA Kiehl. Temporal Lobe and ‘‘Default’’ Hemodynamic Brain Modes Discriminate Between Schizophrenia and Bipolar Disorder. Human Brain Mapping, Epub 2007 Sep 25
Multivariate Classification
• Even very simple method can give very good performance
– Define average ICgrp for each group
– Label subj k with group that has minimum Euclidean distance (btw ICk & ICgrp)
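The two steps above amount to a nearest-group-mean classifier. A minimal sketch, assuming each subject's IC map has already been flattened to a list of numbers; the toy vectors and the function names are hypothetical and stand in for real ICA output:

```python
from math import dist


def group_means(training: dict[str, list[list[float]]]) -> dict[str, list[float]]:
    """Average the feature vectors (e.g. flattened IC maps) within each group."""
    means = {}
    for label, vectors in training.items():
        n = len(vectors)
        # zip(*vectors) walks the vectors element by element (voxel by voxel).
        means[label] = [sum(column) / n for column in zip(*vectors)]
    return means


def classify(vector: list[float], means: dict[str, list[float]]) -> str:
    """Label a subject with the group whose mean is closest in Euclidean distance."""
    return min(means, key=lambda label: dist(vector, means[label]))


if __name__ == "__main__":
    # Toy two-"voxel" vectors, invented purely for illustration.
    training = {
        "NC": [[0.0, 0.0], [0.0, 2.0]],
        "SZ": [[10.0, 10.0], [10.0, 12.0]],
    }
    means = group_means(training)
    print(classify([1.0, 1.0], means), classify([9.0, 9.0], means))
```

For an honest accuracy estimate the subject being labeled should be left out when the group averages are computed.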
Multivariate Classification: Prediction Time Series
Inferring Experience Based Cognition from Virtual Reality fMRI
Greg Siegle, Walter Schneider, Maureen McHugo, Melissa Thomas, Lori Koerbel, Lena Gemmer, Kate Fissell, Sudhir Pathak, Dan Jones, Kevin Jarbo
University of Pittsburgh
Pittsburgh Brain Activity Interpretation Competition
• Virtual Reality fMRI Paradigm
– Subjects explore neighborhood, looking for fruit, guns, dogs
– 11 features rated continuously
• e.g. arousal, valence, movement, dog, cell phone, etc.
– 3 Sessions of fMRI data
• Features only given for 1st 2 sessions
Inferring Cognition
[Figure annotations: R² = .79; 17 minutes]
• Very different methods gave similar scores (based on pre- and post-processing)
• Similar methods (e.g., support vector machines) gave very different results.
[Figure: 1st place entry, correlation per feature (axis from -.2 to 1) for Arousal, Valence, Hits, SearchPeople, SearchWeapons, SearchFruit, Instructions, Dog, Faces, FruitsVegetables, WeaponsTools, InteriorExterior, Velocity]
Surprisingly accurate results
Lessons from Contest
• Pre-processing mattered
– Detrending details had big impact
• Multivariate, but not un-informed
– Winners used masks
• Weighting salient voxels, ignoring uninformative ones
– Post-processing clean up
• In general, extensive tuning per feature to be predicted
[Figure: Subject 14 visual cortex, use for “Interior Exterior”; Subject 13 auditory cortex, use for “Dog”]
Conclusions
• Power for fMRI
– Focused ROIs, but not too focused
– Exact power predictions possible
• As always, based on guesses
• Classification
– Uses entire brain to predict subject identity or cognitive state
– New direction, methods still evolving
• e.g. Support Vector Machines work well, but never without appreciable feature selection/tuning