A New, Nonparametric Information-Splitting Image Analysis Technique
Mark Inlow
Jing Wan, Sungeun Kim, Kwansik Nho,
Shannon Risacher, Andrew Saykin, Li Shen
Life as a Statistics Professor…
Image Analysis Setup
• Data: an image value at each location for each subject.
• Question: Does the image mean depend on the predictor at any location?
• Methods:
1. Parametric: Random Field Theory
– Con: Assumptions
2. Nonparametric: Permutation
– Con: Slow
3. New Approach: weaker assumptions, faster?
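To make the "Con: Slow" of method 2 concrete, here is a minimal sketch of a standard max-statistic permutation test for the question above. It is a generic illustration, not the new approach of these slides; the function name, the array layout (subjects by locations), and the use of the correlation t-statistic are all assumptions.

```python
import numpy as np

def max_t_permutation_pvalue(images, predictor, n_perm=1000, seed=0):
    """Global p-value for: does the image mean depend on the predictor anywhere?

    images:    array of shape (n_subjects, n_locations)  (assumed layout)
    predictor: array of shape (n_subjects,)
    """
    rng = np.random.default_rng(seed)
    images = np.asarray(images, dtype=float)
    predictor = np.asarray(predictor, dtype=float)
    n = images.shape[0]
    yc = images - images.mean(axis=0)            # center each location once
    y_ss = (yc ** 2).sum(axis=0)

    def max_abs_t(x):
        # Correlation t-statistic at every location, then the maximum |t|.
        xc = x - x.mean()
        r = (xc @ yc) / np.sqrt((xc @ xc) * y_ss)
        t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
        return np.abs(t).max()

    observed = max_abs_t(predictor)
    # Each permutation recomputes the full statistic map, hence the cost.
    null = np.array([max_abs_t(rng.permutation(predictor)) for _ in range(n_perm)])
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (n_perm + 1)
```

The cost grows as (number of permutations) × (number of locations), which is what makes the approach slow for large images.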
Theoretical Basis
• One-sample case: test vs. for at least one location .
• Theorem 1 (New Result):
– Let be the t-test statistic for location .
– Let
– If is then and are independent under .
• Note: is an increasing function of .
Information Splitting
Suppose we have a continuous predictor:
1. Partition the sample into subsamples
2. Let be t-stat for , subsample .
3. Define ,
4. If large, .
5. Compute and ; apply Theorem 1
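Steps 1 and 2 can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how the partition is formed, so the sketch takes arbitrary subsample labels as input, and it uses the one-sample t-statistic at each location. The function name and array layout are hypothetical. Steps 3 to 5 are not sketched because their definitions did not survive in the transcript.

```python
import numpy as np

def subsample_t_stats(images, labels):
    """One-sample t-statistic at every location within every subsample.

    images: array of shape (n_subjects, n_locations)  (assumed layout)
    labels: array of shape (n_subjects,) giving each subject's subsample
    Returns (unique_labels, t) with t of shape (n_subsamples, n_locations).
    """
    images = np.asarray(images, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    t = np.empty((groups.size, images.shape[1]))
    for row, g in enumerate(groups):
        sub = images[labels == g]                 # subjects in this subsample
        m = sub.shape[0]
        # mean / standard error, computed for all locations at once
        t[row] = sub.mean(axis=0) / (sub.std(axis=0, ddof=1) / np.sqrt(m))
    return groups, t
```

Each subsample needs at least two subjects so that the sample standard deviation is defined.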
One Monotonic Recipe
1. else
2. Let = average of for smallest 1% of ;
Let = average of for next smallest 1% of ;
…
Let = average of for largest 1% of .
3. Fit model .
4. Test using permutation.
5. If normal, use permutation t-test.
Hippocampus Surface Normal Data
• = value of normal at left hippocampus at location for subject j
• = value of normal for right hippocampus
• n = 582 subjects; k = 6611 locations
• Let (assume bilateral symmetry)
• Is there a relationship between (or ) and a given SNP at one or more locations?
SA vs. P for (LR Hippo Sum)
Panels: APOE, BIN1
New Approach vs. RFT Results
Hippo Data | SNP | New Approach | RFT Peak Amplitude
Left | APOE | |
Left | BIN1 | |
LR Sum | APOE | |
LR Sum | BIN1 | |
Permutation Distribution Normality
Panels: APOE, BIN1
• 10
SurfStat APOE T-Map for LR Sum
SurfStat BIN1 T-map for LR Sum
Comments
• Information splitting: info at location shared by and which are independent under .
• Performance/properties: seem favorable compared to RFT and permutation methods
• Going forward:
– Incorporate spatial information!
– Apply to larger images
– Do formal simulation studies
Acknowledgements
1. Andrew Saykin, Li Shen, and the Department of Radiology and Imaging Sciences, IU School of Medicine, who supported and financed my 2010-2011 sabbatical.
2. My main coauthor: Jing Wan, who did the SurfStat statistical analyses and data management.
3. My other coauthors/colleagues: Sungeun Kim, Kwansik Nho, and Shannon Risacher.
Hippocampus Surface Data
• FreeSurfer and Large Deformation Diffeomorphic Metric Mapping (FS+LDDMM) were used to segment hippocampal surfaces from MRI scans
• To remove the size effect, total intracranial volume (ICV) was adjusted to a constant and each hippocampus was scaled accordingly.
• Rigid body transformation was applied to register each hippocampus to a template.
• 6611 surface signals were extracted as the deformation along the surface normal direction of the template and were adjusted for baseline age, gender, education and handedness.
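The slides do not state how the covariate adjustment was carried out. One common way, shown here purely as an assumed illustration, is to regress each surface signal on the covariates by ordinary least squares and keep the residuals. The function name and array layout are hypothetical.

```python
import numpy as np

def adjust_for_covariates(signals, covariates):
    """Remove linear covariate effects from every surface location.

    signals:    array of shape (n_subjects, n_locations)
    covariates: array of shape (n_subjects, n_covariates), e.g. columns for
                age, gender, education, handedness (numeric coding assumed)
    Returns the OLS residuals, same shape as signals.
    """
    signals = np.asarray(signals, dtype=float)
    covariates = np.asarray(covariates, dtype=float)
    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(signals.shape[0]), covariates])
    # One least-squares fit handles all locations simultaneously.
    beta, *_ = np.linalg.lstsq(X, signals, rcond=None)
    return signals - X @ beta
```

The residuals are uncorrelated with every covariate and have mean zero at each location, so any remaining association with a SNP cannot be explained by a linear effect of these covariates.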
Genetic (SNP) Data
• Single Nucleotide Polymorphism (SNP) – DNA sequence location possessing nucleotide variants of length one, i.e., T vs. C or A vs. G.
• The SNP data were genotyped using the Human 610-Quad BeadChip.
• The top 23 SNPs from the AlzGene database and a SNP from the TOMM40 gene were considered.
• After quality controls, 20 SNPs remained.
Random Field Theory
• Suppose we want to test the global composite null Ho: for all for a given SNP.
• By the Bonferroni inequality:
• Gaussian Random Field Theory (RFT) provides much less conservative estimate:
where the sum is over the number of dimensions of the image (K. Worsley)
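The two displayed formulas did not survive in this transcript. For orientation, the standard textbook forms of the Bonferroni bound and the RFT approximation are written out below. The notation is an assumption, not the slides' own: $T_i$ is the statistic at location $i$, $k$ the number of locations, $t$ the threshold, $D$ the image dimension, $R_d$ the $d$-dimensional resel count, and $\rho_d$ the $d$-dimensional Euler characteristic density.

```latex
% Bonferroni bound on the p-value of the maximum statistic
P\Big(\max_{1 \le i \le k} T_i > t\Big) \;\le\; \sum_{i=1}^{k} P(T_i > t)

% RFT approximation (Worsley's unified formula), sum over dimensions d = 0, ..., D
P\Big(\max_{1 \le i \le k} T_i > t\Big) \;\approx\; \sum_{d=0}^{D} R_d \, \rho_d(t)
```

The Bonferroni bound ignores spatial correlation, while the resel counts $R_d$ shrink as the image gets smoother, which is why the RFT estimate is less conservative.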
Random Field Theory, Cont.:
• RFT p-value for maximum statistic
• is the number of -dimensional resels (resolution elements); it depends on smoothness (correlation) of image, e.g.
• is the -dimensional Euler characteristic density. For large values of , the Euler characteristic is 0 or 1 depending on whether for any
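As a rough numerical illustration of how resel counts and Euler characteristic densities combine into a p-value, the sketch below uses the well-known EC densities of a unit-variance Gaussian field (dimensions 0 to 3, in resel units). This is an assumption for illustration only: the analyses in these slides used SurfStat T-maps, for which the t-field densities differ, and the function names are hypothetical.

```python
import math

def gaussian_ec_density(d, t):
    """EC density of a unit-variance Gaussian random field, d = 0..3.

    Densities are expressed per resel (FWHM units).
    """
    c = 4.0 * math.log(2.0)
    e = math.exp(-0.5 * t * t)
    if d == 0:
        # Upper-tail standard normal probability.
        return 0.5 * math.erfc(t / math.sqrt(2.0))
    if d == 1:
        return math.sqrt(c) / (2.0 * math.pi) * e
    if d == 2:
        return c / (2.0 * math.pi) ** 1.5 * t * e
    if d == 3:
        return c ** 1.5 / (2.0 * math.pi) ** 2 * (t * t - 1.0) * e
    raise ValueError("d must be 0, 1, 2 or 3")

def rft_pvalue(resels, t):
    """Approximate P(max statistic > t) as the sum over d of resels[d] * density.

    resels: sequence [R_0, R_1, ...] of resel counts by dimension.
    """
    p = sum(r * gaussian_ec_density(d, t) for d, r in enumerate(resels))
    return min(1.0, p)   # the approximation can exceed 1 at low thresholds
```

For a surface such as the hippocampus, `resels` would have three entries (dimensions 0, 1, and 2). The approximation is intended for high thresholds.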
Random Field Theory Varieties
• Maximum Test Statistic: P-value =
• Spatial Extent of Suprathreshold ’s: P-value =
where is the number of connected suprathreshold ’s; is observed number exceeding threshold .
• Cluster Maximum and Spatial Extent
Left Spherical Distribution Theory
Theorem: Let be a matrix of -dimensional observations which is multivariate normal. Let be a -dimensional vector of weights determined uniquely by .
• Let .
• Let .
• Let .
• Then has a distribution.
Comparison of Maps
Information-Splitting: Statistical Parametric Map:
Materials
• 582 non-Hispanic Caucasian participants: 166 healthy controls (HCs), 287 with mild cognitive impairment (MCI), and 129 with AD
• Magnetic resonance imaging (MRI) data
• 20 SNPs were selected from the AlzGene database and the TOMM40 gene and coded to test the additive genetic effect (i.e., the dose-dependent effect of the minor allele).
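Additive coding can be illustrated with a short sketch: each genotype is replaced by its count of minor alleles (0, 1, or 2), and the surface signal at each location is regressed on that count. The genotype string format, function names, and the simple-regression t-statistic are assumptions for illustration, not the pipeline used in the study.

```python
import numpy as np

def additive_dosage(genotypes, minor_allele):
    """Count minor alleles per subject, e.g. "CT" with minor allele "T" -> 1."""
    return np.array([g.count(minor_allele) for g in genotypes], dtype=float)

def slope_t_map(signals, dosage):
    """t-statistic for the regression slope of signal on dosage, per location.

    signals: array of shape (n_subjects, n_locations)
    dosage:  array of shape (n_subjects,), values 0, 1, or 2
    """
    signals = np.asarray(signals, dtype=float)
    dosage = np.asarray(dosage, dtype=float)
    n = dosage.size
    xc = dosage - dosage.mean()
    yc = signals - signals.mean(axis=0)
    sxx = xc @ xc
    slope = (xc @ yc) / sxx                        # least-squares slope per location
    resid = yc - np.outer(xc, slope)
    sigma2 = (resid ** 2).sum(axis=0) / (n - 2)    # residual variance per location
    return slope / np.sqrt(sigma2 / sxx)
```

Under this coding, a positive slope means the signal increases with each additional copy of the minor allele.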