All Hands Meeting 2005 FBIRN AHM 2006 Statistics Working Group Update Greg Brown, UCSD Hal Stern,...
All Hands Meeting 2005 FBIRN AHM 2006
Statistics Working Group Update
Greg Brown, UCSD
Hal Stern, UCI
Statistics Update Discussion Points
Aims of Statistics Workgroup
Activities last 6 months (highlights only)
Future Plans
Aims
Aim 1. Refine tools to assess the quality and reliability of fMRI data, and apply these tools to guide the collection and analysis of multi-site imaging data
Aim 2. Develop statistical methods to analyze multi-site fMRI data, while accounting for between-site variation
Aim 3. Develop statistical and machine learning tools to identify homogeneous subgroups
Statistics Workgroup Structure
FBIRN
Statistical and Programming Integration
Greg Brown - Leader
Reliability and Calibration
Hal Stern - Leader
Data Processing
Anthony Gamst - Leader
Algorithm Development
Padhraic Smyth - Leader
Statistical WG
Hal Stern - Co-Chair
Greg Brown - Co-Chair
UCSD (Greg Brown), UNM (Lee Friedman)
Stanford (Gary Glover), Yale (Dan Mathalon)
Duke (Greg McCarthy), BWH (Kelly Zou)
UCSD (Greg Brown), UCI (Hyo Jong Lee)
Duke (Greg McCarthy), UCSD (Burak Ozyurt), UCSD (Randy Yumel)
UCSD (Data Technician TBD)
UCSD (Anders Dale), BWH (Ola Friman)
UCSD (Anthony Gamst), BWH (Steve Pieper)
UCI (Hal Stern), MGH (Mark Vangel)
BWH (Simon Warfield), BWH/MIT (Sandy Wells)
UCSD (Anders Dale), Duke (Syam Gadde)
UCSD (Anthony Gamst), MGH (Doug Greve), UCI (Hyo Jong Lee)
UCSD (Randy Notestine), UCSD (Burak Ozyurt), BWH (Steve Pieper), UCSD (Nik Schork)
Data Processing Statistics WG
Developed download scripts at several sites
Continual download script running at San Diego site
• Field maps for GE sites need special file structure to upload
• Download time varies by download site and by download software options
Data Processing Statistics WG
Preprocessing Scripts (stand-alone modules and integration with FIPS are available for most scripts)
• All scripts run on Analyze 7.5 format
• Some scripts also run on AFNI BRIK format
• Scripts available for
  Slice time correction
  Motion correction
  B0 inhomogeneity warping
  Spatial smoothing to a target smoothness (several approaches are available)
• Scripts are in place for Siemens sites, and scripts for GE sites are being integrated into the fBIRN stream
• These scripts have been run on all auditory oddball images from Minnesota, MGH, and New Mexico
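As a rough illustration of what smoothing to a target smoothness involves, the sketch below computes the extra Gaussian kernel needed to bring an image from its intrinsic smoothness up to a target value. It assumes both the intrinsic smoothness and the applied kernel are Gaussian, so that their FWHMs combine in quadrature. This is only one simple possibility and is not claimed to be any of the approaches the fBIRN scripts implement; the function name and the example values are hypothetical.

```python
import math

def kernel_fwhm_for_target(intrinsic_fwhm_mm, target_fwhm_mm):
    """Return the Gaussian kernel FWHM (mm) needed to reach the target smoothness.

    Assumption: intrinsic smoothness and the applied kernel are both Gaussian,
    so FWHMs add in quadrature: target^2 = intrinsic^2 + kernel^2.
    """
    if target_fwhm_mm < intrinsic_fwhm_mm:
        raise ValueError("image is already smoother than the target")
    return math.sqrt(target_fwhm_mm ** 2 - intrinsic_fwhm_mm ** 2)

# Hypothetical values: an image with 6 mm intrinsic smoothness, 10 mm target
extra_kernel = kernel_fwhm_for_target(6.0, 10.0)
print(extra_kernel)
```

Under this assumption, sites whose scanners produce smoother raw images receive a smaller added kernel, so the final smoothness is comparable across sites.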
Data Processing Statistics WG
Several QA tools have been tested: Duke tools, GabLab tools, AIRT, AFNI tools
• Goal is to develop automated or semi-automated QA tools usable with large image datasets
• Validation of these tools will require visual inspection
Migrating Functional Image Processing System (FIPS) throughout the fBIRN consortium.
• Five fBIRN sites are currently using FIPS to test the processing of images at their site
• FIPS Power Users are being trained at several sites. These power users are meant to be a regional fBIRN resource as well as a local resource. They will relieve the FIPS developers of day-to-day consultations about FIPS
Data Processing Statistics WG
Processing strategy to test between-group hypotheses involving auditory oddball and Sternberg Memory Scanning paradigms.
• One site will be the lead site for this analysis so that the fBIRN community presents a uniform report of results to the general imaging community
• UCSD has volunteered to be the lead site for between group hypotheses
• Other sites have volunteered to analyze images from their site: UCI, University of Minnesota
• The lead site will re-analyze a subset of images from volunteer sites to ensure uniformity of analysis results.
Reliability and Calibration WG
What is the outer limit of reliability of robustly activating paradigms in multi-site fMRI studies?
How reliably did the Phase I traveling subjects study measure site variation?
How much unwanted variance can be reduced in multisite sensorimotor imaging data by breath hold calibration once between site differences in image intensity are controlled?
Outer limit of reliability: Sensorimotor Task and Breath Hold Tasks
FIPS = Signed Magnitude, Top 10%; AFNI = % Signal Change from Average Image Value, Mean Across ROI
Task                                 FIPS Generalizability  FIPS Dependability  AFNI Generalizability  AFNI Dependability
                                     Coefficient            Coefficient         Coefficient            Coefficient
Sensorimotor (Visual ROI)            .92                    .79                 .93                    .80
Breath Hold (Average across ROIs)    .92                    .86                 .94                    .88
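For readers less familiar with generalizability theory, the two coefficients in the table are conventionally defined as below. The symbols are assumptions introduced here, not notation from the slides: \sigma^2_p is the variance of the object of measurement (here, person), \sigma^2_\delta is the relative error variance, and \sigma^2_\Delta is the absolute error variance.

```latex
E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_\delta},
\qquad
\Phi = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_\Delta},
\qquad
\sigma^2_\Delta \ge \sigma^2_\delta
```

The relative error contains only the interactions of person with the study facets (site, day, run), each divided by the number of conditions averaged over. The absolute error additionally contains the facet main effects and their interactions. That is why the dependability coefficient, an absolute agreement measure, can never exceed the generalizability coefficient, a consistency measure.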
Outer limit of reliability: Conclusions
For simple sensorimotor and breath hold tasks, the reliability of intensity corrected measures of BOLD response for a region of interest can be very good to excellent.
One month test-retest correlation coefficients for subtests of the Wechsler Adult Intelligence Scale III for adults 30 to 54 range from .70 to .93.
Consistency measures of fMRI reliability can be as good as or better than those of well-constructed psychological test scores.
Sensorimotor: Variance Components Analysis
Percent Variance Accounted For in Visual ROI (GENOVA)
Variance Source        FIPS Signed Magnitude Top 10%, Uncorrected   AFNI, Uncorrected
Person                 18.24                                        16.67
Day                    0.13                                         Neg
Run                    0.02                                         Neg
Site                   32.43                                        29.01
Person by Day          Neg                                          Neg
Person by Run          0.65                                         Neg
Person by Site         7.61                                         5.28
Person by Hemisphere   2.81                                         Neg
Person 3-ways          34.02                                        41.78
Residual (4-way +)     2.89                                         5.67
Breath Hold Task: Variance Components Analysis
Percent Variance Accounted For (GENOVA)
Variance Source        FIPS Signed Magnitude Top 10%, 6 ROIs   AFNI, 10 ROIs
Person                 37.02                                   37.33
Day                    0.27                                    0.03
Run                    0.27                                    0.30
Site                   22.44                                   23.87
Person by Day          0.76                                    1.77
Person by Run          1.65                                    0.63
Person by Site         4.90                                    7.58
Person by Hemisphere   -                                       -
Person 3-ways          13.46                                   25.93
Residual (4-way +)     16.22                                   0.90
Sensorimotor – Breath Hold Task Comparison
The sensorimotor task is more sensitive to site effects than to subject effects.
The breath hold task is more sensitive to subject than to site effects.
Reliability of Site Differences
Treat Site rather than Subject as Measurement Object
The same variance components tables presented previously can be used to estimate how reliably site differences were measured across the study factors of run, day, and person.
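Changing the object of measurement only changes which variance component is treated as signal and which components count as error. A minimal sketch of that bookkeeping for a fully crossed random design follows. The function name, the two-facet design, and all numbers are hypothetical, and setting negative estimates to zero is an assumption rather than a documented fBIRN choice.

```python
from math import prod

def g_coefficients(components, n_levels, obj):
    """Return (generalizability, dependability) for a fully crossed random design.

    components: dict mapping a frozenset of facet names to a variance component.
    n_levels:   dict mapping each non-object facet to the number of conditions
                averaged over in the decision study.
    obj:        name of the object of measurement, e.g. "person" or "site".
    """
    universe = max(components.get(frozenset([obj]), 0.0), 0.0)
    relative = 0.0  # error that changes rank order (consistency)
    absolute = 0.0  # error that changes absolute level (dependability)
    for facets, var in components.items():
        if facets == frozenset([obj]):
            continue
        var = max(var, 0.0)  # assumption: negative estimates are set to zero
        others = facets - {obj}
        contribution = var / prod(n_levels[f] for f in others)
        absolute += contribution
        if obj in facets:
            relative += contribution
    return universe / (universe + relative), universe / (universe + absolute)

# Hypothetical variance components for a person x site design
vc = {
    frozenset(["person"]): 4.0,
    frozenset(["site"]): 2.0,
    frozenset(["person", "site"]): 2.0,  # interaction confounded with residual
}
g_person = g_coefficients(vc, {"site": 4}, "person")  # person as object
g_site = g_coefficients(vc, {"person": 5}, "site")    # site as object
print(g_person, g_site)
```

With site as the object of measurement, the person component moves from the numerator into the absolute error term, so the same table of variance components yields a different pair of coefficients.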
Reliability of Site Differences: AFNI Analysis
Task                         Consistency   Dependability
Sensorimotor (visual ROI)    .97           .92
Measurement of Site Variance
Measurements of site variability provided by the Phase I traveling subject study were very reliable, at least in the visual region of interest.
How much unwanted variance can be reduced in multisite sensorimotor imaging data by breath hold calibration once between site differences in image intensity are controlled?
Breath Hold Correction of Sensorimotor Data
Our previous work showed that breath hold calibration improved the dependability of native regression weights that were not intensity corrected.
Site specific calibration:
\[ \frac{\text{sensorimotor value} \times \text{grand breath hold mean}}{\text{breath hold mean at a specific site}} \]
Subject tailored calibration:
\[ \frac{\text{sensorimotor value} \times \text{grand breath hold mean}}{\text{breath hold value for comparable subject, site, day, and run}} \]
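The two calibrations can be sketched in a few lines. This is an illustration only: it assumes the grand breath hold mean is taken over all breath hold values pooled across sites (it could equally be a mean of site means), and the function names, site labels, and numbers are hypothetical.

```python
from statistics import mean

def site_specific_calibration(sm_value, site, bh_by_site):
    """Scale a sensorimotor value by grand breath hold mean / site breath hold mean."""
    # Assumption: the grand mean pools every breath hold value across sites
    grand_mean = mean(v for values in bh_by_site.values() for v in values)
    return sm_value * grand_mean / mean(bh_by_site[site])

def subject_tailored_calibration(sm_value, matched_bh_value, grand_mean):
    """Scale by the breath hold value from the same subject, site, day, and run."""
    return sm_value * grand_mean / matched_bh_value

# Hypothetical breath hold values per site
bh_by_site = {"site_A": [1.0, 3.0], "site_B": [3.0, 5.0]}

site_a = site_specific_calibration(2.0, "site_A", bh_by_site)
site_b = site_specific_calibration(2.0, "site_B", bh_by_site)
tailored = subject_tailored_calibration(2.0, matched_bh_value=1.0, grand_mean=3.0)
print(site_a, site_b, tailored)
```

A site with a below-average breath hold response has its sensorimotor values scaled up, and a site with an above-average response has them scaled down. The subject tailored version applies the same idea one measurement at a time, which makes it more exposed to noise in the individual breath hold value.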
Reliability of Breath Hold Calibrated Intensity Corrected Values: Visual ROI
FIPS = Sign Mag, Top 10%; AFNI = %Change, Entire ROI
Condition          FIPS Generalizability  FIPS Dependability  AFNI Generalizability  AFNI Dependability
Uncorrected        .92                    .79                 .93                    .80
Site Specific      .93                    .83                 .94                    .86
Subject Tailored: analyses done but not double-checked. Subject Tailored calibration appears to be no better than Site Specific calibration.
Reliability of Breath Hold Calibrated Intensity Corrected Values: Hand ROI
FIPS = Sign Mag, Top 10%; AFNI = %Change, Entire ROI
Condition          FIPS Generalizability  FIPS Dependability  AFNI Generalizability  AFNI Dependability
Uncorrected        .92                    .81                 .88                    .84
Site Specific      .92                    .83                 .87                    .85
Subject Tailored: analyses done but not double-checked. Subject Tailored calibration appears to be no better than Site Specific calibration.
Visual ROI Task: Site Specific Correction
Percent Variance Accounted For in Visual ROI (GENOVA)
Columns: (1) FIPS Signed Magnitude Top 10%, Uncorrected; (2) FIPS Signed Magnitude Top 10%, Site Specific Breath Hold Corrected; (3) AFNI %Change Entire ROI, Uncorrected; (4) AFNI %Change Entire ROI, Site Specific Breath Hold Corrected
Variance Source        (1)     (2)     (3)     (4)
Person                 18.24   20.08   16.67   20.87
Day                    0.13    0.19    Neg     Neg
Run                    0.02    0.06    Neg     0.03
Site                   32.43   27.04   29.01   21.47
Person by Day          Neg     Neg     Neg     Neg
Person by Run          0.65    0.67    Neg     1.62
Person by Site         7.61    7.29    5.28    6.95
Person by Hemisphere   2.81    2.26    Neg     Neg
Person 3-ways          34.02   37.47   41.78   41.85
Residual (4-way +)     2.89    3.55    5.67    5.89
Hand Area ROI: Site Specific Correction
Percent Variance Accounted For in Hand ROI (GENOVA)
Columns: (1) FIPS Signed Magnitude Top 10%, Uncorrected; (2) FIPS Signed Magnitude Top 10%, Site Specific Breath Hold Corrected; (3) AFNI %Change Entire ROI, Uncorrected; (4) AFNI %Change Entire ROI, Site Specific Breath Hold Corrected
Variance Source        (1)      (2)     (3)     (4)
Person                 15.154   15.69   23.32   23.68
Day                    Neg      Neg     Neg     Neg
Run                    0.46     0.46    0.0     0.0
Site                   21.80    17.03   11.28   7.81
Person by Day          0.41     0.70    Neg     Neg
Person by Run          0.14     0.22    Neg     Neg
Person by Site         6.68     6.60    28.92   31.26
Person by Hemisphere   0.58     0.25    0.38    0.29
Person 3-ways          38.61    41.53   27.04   27.66
Residual (4-way +)     11.51    12.31   6.47    7.60
% of Site Variance Reduced by Site-Specific Breath Hold Calibration:
ROI      FIPS     AFNI
Visual   16.62%   27.88%
Hand     21.88%   30.76%
Computed as:
\[ \frac{\%\text{var uncorrected site factor} - \%\text{var corrected site factor}}{\%\text{var uncorrected site factor}} \]
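The ratio above can be checked against the Site rows of the two site-specific correction tables. The short sketch below does that arithmetic; the function and variable names are hypothetical, and the inputs are the Site percentages as transcribed from those tables. Three of the four results match the reported values after rounding. For the AFNI visual ROI the transcribed inputs (29.01 and 21.47) give roughly 26% rather than the 27.88% reported, and the transcript alone does not show which figure is off.

```python
def percent_site_variance_reduced(uncorrected, corrected):
    """Percent of site variance removed by breath hold calibration."""
    return 100.0 * (uncorrected - corrected) / uncorrected

# (uncorrected, corrected) percent variance for the Site factor,
# as transcribed from the Visual ROI and Hand ROI tables
site_variance = {
    ("Visual", "FIPS"): (32.43, 27.04),
    ("Visual", "AFNI"): (29.01, 21.47),
    ("Hand", "FIPS"): (21.80, 17.03),
    ("Hand", "AFNI"): (11.28, 7.81),
}

reductions = {
    key: round(percent_site_variance_reduced(*pair), 2)
    for key, pair in site_variance.items()
}
for key, value in reductions.items():
    print(key, value)
```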
Conclusions on Breath Hold Correction (for intensity normalized MR images)
Site specific breath hold calibration does not improve consistency measures of reliability, at least for the highly consistent fBIRN sensorimotor task.
Site specific breath hold calibration produces modest increases in absolute agreement measures of reliability for the fBIRN sensorimotor task.
Site specific breath hold calibration reduces the unwanted variance associated with site by 16% to 31%, depending on ROI and processing choices.
Site specific breath hold calibration did not reduce unwanted person by site variance for intensity normalized MR images.
Statistical and Programming Integration WG
Much of this work has migrated to other Statistics WGs, such as Data Processing
The WG is working with the BIRN CC to implement Condor, the program workflow scheduling system, at the San Diego site.
Future Plans: Reliability and Calibration
Complete Phase I variance components analysis of breath hold calibration
• Confirm subject specific analysis
• Complete analysis of auditory ROI
• Compare results from the completely crossed design with those from a design with run and day nested under site
• Compare the traditional method of moments (mean squares) approach to estimating variance components with a Bayesian method
Perform a generalizability and variance components analysis of smooth-to correction
Test newly developed calibration methods on Phase I data
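To make the method of moments comparison concrete, the sketch below estimates variance components from mean squares for the simplest possible case: a person by site table with one value per cell. It is an illustration under that simplifying assumption, not the GENOVA procedure or the full fBIRN design, and the function name and data are hypothetical. The second example shows how this estimator can return negative values, which is one motivation for considering a Bayesian alternative and may be what the "Neg" entries in the variance components tables denote (that reading is an assumption).

```python
def method_of_moments(data):
    """Estimate variance components for a person x site table (one value per cell).

    data: list of rows, one row per person, one column per site.
    Returns estimates for person, site, and person-by-site (confounded with residual).
    """
    n_p, n_s = len(data), len(data[0])
    grand = sum(map(sum, data)) / (n_p * n_s)
    person_means = [sum(row) / n_s for row in data]
    site_means = [sum(row[j] for row in data) / n_p for j in range(n_s)]

    # Sums of squares for each source
    ss_person = n_s * sum((m - grand) ** 2 for m in person_means)
    ss_site = n_p * sum((m - grand) ** 2 for m in site_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)

    # Mean squares
    ms_person = ss_person / (n_p - 1)
    ms_site = ss_site / (n_s - 1)
    ms_resid = (ss_total - ss_person - ss_site) / ((n_p - 1) * (n_s - 1))

    # Equate observed mean squares to their expectations and solve
    return {
        "person": (ms_person - ms_resid) / n_s,
        "site": (ms_site - ms_resid) / n_p,
        "person_by_site": ms_resid,
    }

# Hypothetical, perfectly additive data: 3 persons x 2 sites
additive = method_of_moments([[1.0, 3.0], [2.0, 4.0], [3.0, 5.0]])

# Hypothetical data with pure interaction: estimates for person and site go negative
crossed = method_of_moments([[1.0, 3.0], [3.0, 1.0]])
print(additive, crossed)
```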
Future Plans – Data Processing: Preprocessing
Preprocess all image sets
Complete artifact detection
Upload preprocessed images and artifact detection log into the Federated Database
Artifact correct images and upload corrected images
Data Processing
Train FIPS Power Users
Complete all subject level analyses and upload them into the database
Extend FIPS to level II (several extensions might be required)
Compare between-group analysis plans
• Conventional fixed effects design with site and group
• Conventional fixed effects design with covariates
  Site specific covariate adjustment
  Pooled covariate adjustment
• Meta-analytic methods: site-specific error-weighting
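One common reading of site-specific error-weighting is fixed-effect inverse-variance pooling, in which each site's group difference is weighted by the inverse of its squared standard error. The sketch below shows that calculation. Treating this as the intended meta-analytic method is an assumption, and the function name and the per-site numbers are hypothetical.

```python
import math

def inverse_variance_pool(estimates, standard_errors):
    """Fixed-effect pooled estimate and its standard error across sites."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical per-site group differences and their standard errors
site_effects = [0.5, 0.3, 0.4]
site_ses = [0.1, 0.2, 0.2]

pooled, pooled_se = inverse_variance_pool(site_effects, site_ses)
print(pooled, pooled_se)
```

Sites with noisier estimates contribute less to the pooled result, which differs from a conventional fixed effects design that models site as a factor and assumes a common error variance.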
Future Plans - Statistical and Programming Integration
Make the integration of FreeSurfer into FIPS pipeline more generally available.
Complete an implementation of Condor at San Diego site.
Algorithm Development
Extend work on independent components analysis done at Yale to multi-group and multi-task applications and incorporate into FIPS (perhaps through FSL Melodic).
Extend work done at BWH/MIT on Multivariate Autoregressive (MAR) Model for effective connectivity analyses to multi-group context.
Extend work with the expectation-maximization STAPLE method of analyzing inter-site differences to the voxel level.
Further develop the UCI parametric response surface model and integrate into the analysis pipeline.
Continue work on group classifiers.
Future Plans: Revise Aim 3
Current Aim 3: Develop statistical and machine learning tools to identify homogeneous subgroups
Proposed New Aim: Develop novel statistical and machine learning tools to analyze multisite imaging data.
(e.g., STAPLE, Independent Components Analysis for Multisite-Multitask image data, Parametric Response Surface Modeling, MAR, etc.)
New aim would include the search for homogeneous subgroups to the extent that it is feasible, but acknowledge other novel methods in development.
New Aim elevates the creative work being done in the Statistical Working Group to a formal project goal.
Revise Statistics Workgroup Structure
Data Processing
Greg Brown
Image Pipeline Forum
Doug Greve, Lee Friedman
Level II Statistical Modeling of MultiSite-MultiGroup Imaging Data
Timeline
Train FIPS Power Users
Vince’s work
Sandy’s work
Condor
Activities last 6 months
• Data download
• Using FIPS
• Extending FIPS to second level analysis
• Phase II image analysis plan
• Variance Components analysis of Phase I images
Future Plans
Testing the Testbed Hypothesis
Testbed Hypothesis:
Before a federated imaging database can be released to the medical and scientific
community, it must be tested by performing a large-scale study involving patients.
Alternative Hypothesis:
(Field of Dreams Hypothesis): If you build it they will come.
Confirming the Testbed Hypothesis
The Testbed Hypothesis is being confirmed (with a vengeance)
Revisions of the Testbed need to be programmed into our resource planning (especially personnel)
What are the implications of the Testbed Thesis for the use of distributed imaging databases outside of arenas where they have been tested (e.g., longitudinal studies, drug trials)?
Need for advocacy and exchange with medical scientists outside of BIRN
Statistics Update Discussion Points
Aims
Activities last 6 months
• Data download and Testing the Database
• Creation of Preprocessing Scripts
• Using FIPS
• Extending FIPS to second level analysis
• Phase II image analysis plan
• Variance Components analysis of Phase I images
• Pipeline Forum
Future Plans
Subject Tailored Correction
Percent Variance Accounted For in Visual ROI (GENOVA)
Columns: (1) FIPS Signed Magnitude Top 10%, Uncorrected; (2) FIPS Signed Magnitude Top 10%, Site Mean Breath Hold Corrected; (3) AFNI, Uncorrected; (4) AFNI, Site Mean Breath Hold Corrected
Variance Source        (1)     (2)     (3)     (4)
Person                 18.24   37.02   16.67   37.33
Day                    0.13    0.27    Neg     0.03
Run                    0.02    Neg     Neg     0.30
Site                   32.43   22.44   29.01   23.87
Person by Day          Neg     0.76    Neg     1.77
Person by Run          0.65    1.74    Neg     0.63
Person by Site         7.61    4.90    5.28    7.58
Person by Hemisphere   2.81    -       Neg     -
Person 3-ways          34.02   13.46   41.78   25.93
Residual (4-way +)     2.89    16.22   5.67    0.90