Post on 24-Jul-2020
06/11/09 1JB Poline
Databasing, neuroimaging and genetics
Jean-Baptiste Poline
Thanks: A. Barbot, B. Thyreau, Y. Schwartz, A. Moreno, B. Thirion, V. Frouin, E. Duchesnay, P. Pinel and many others
Outline
• Motivation
• Databasing and neuroimaging: a quick review and taxonomy
• Genetic databases: a very brief word
• Neuroimaging and genetics: new needs
• Two imaging genetics examples:
  – Saguenay study
  – Imagen study
• Conclusion and perspective
Motivation
• Imaging genetic studies can be divided into
  – Small groups selected for a specific polymorphism
  – Large studies involving hundreds / thousands of subjects for sensitivity / exploratory approaches (cf. GWAS)
• Sharing:
  – several partners are involved, or the NIH requires it
• Data protection
• Updating / data versioning
  – Increasing the number of subjects, or information
• Queries made simpler / quicker / possible through the web or on disk
• Cost: databasing reduces cost
  – Acquisition / maintenance
You cannot handle large heterogeneous data without serious tools
• Excel files will kill you
Databasing and neuroimaging: a quick review and taxonomy
See also…
• The bibliography database
  – BrainMap
• Networks / databases
  – BIRN, ADNI, Brainscape, fMRIDC…
• Knowledge based
  – Ontology projects / Xceed
• Processing…
  – LONI pipelines, NAMIC, Brainvisa, Neurogrid, Fistwidget, …
• Each large project has its DB
  – Imagen, Saguenay, many others…
BrainMap http://brainmap.org/
Three BrainMap applications:
1. Database searches and Talairach coordinate plotting (Sleuth)
2. Meta-analyses via the activation likelihood estimation (ALE) method (GingerALE)
3. Entry of published functional neuroimaging papers with coordinate results (Scribe)
Not a resource for raw data, but may contain contrast maps
The BIRN
BIRN Data Repository
• Sharing data through the BDR to capture, curate, store, query, view, and download imaging and related data.
• Enable the sharing of existing, published data
• BDR as a mechanism to facilitate collaborations
• Appropriate timeline for public release
  – versioning
• A rich curatorial environment, built on the BIRN portal foundation, data submission process and subsequent sharing.
• XNAT remains a possibility
NAMIC
• creating a medical image computing platform
• research on novel image analysis algorithms
• deploying these capabilities

• ADNI methods available for non-ADNI studies.
  – imaging protocol,
  – image corrections,
  – ADNI phantom and analysis software.
…
• http://www.adni-info.org
• Access through request
ADNI uses LONI Image Data Archive
LONI
The LONI Image Data Archive: an environment for
• safely archiving,
• querying,
• visualizing
• sharing.
The archive facilitates
• de-identification and pooling of data from multiple institutions
• protection from unauthorized access
• the ability to share data among collaborative investigators
Extensible Neuroimaging Archive Toolkit (XNAT)

Neurogrid - http://www.neurogrid.ac.uk/
• A Grid-based network of neuroimaging centres and a neuroimaging tool-kit. Sharing data and expertise to facilitate the archiving, curation, retrieval and analysis of imaging data
• Enable large-scale multi-site clinical studies
• Practicalities:
  – Set up a secured account
  – Upload your brain image (T1, DTI)
  – Download results
Outstanding questions
• Databases are still mostly about large projects, but local organisation is needed
  – How to reconcile local needs with a real DB?
• Most of the tools from large projects require IT support (system manager + knowledge of neuroimaging), often even if they claim otherwise…
• Results are too rarely entered in the DB after analyses: ontology issues
• Large project publications: are those projects the most efficient with respect to the current success criteria?
  – BIRN: about 80 publications in 5 years
  – ADNI: about 15? (PubMed)
Some thoughts on neuroimaging and databasing
• Sharing data is not yet common but should be in the future
  – NIH trend, cost, recruitment of specific populations
• Remote computing is getting more common (cloud computing) but tools are still too difficult for the average lab
• Reproducibility / provenance tracking of results may eventually impose databasing solutions
• Could be a cost-effective solution…
• Gene Database
  A new database of genes and associated information is available for searching in Entrez.
• RefSeq
  Reference sequences of chromosomes, genomic contigs, mRNAs, and proteins for human and major model organisms.
• OMIM
  A guide to human genes and inherited disorders maintained by Johns Hopkins University and collaborators.
• dbSNP
  A database of single nucleotide polymorphisms (SNPs) and other nucleotide variations.
NCBI (National Center for Biotechnology Information) Genome Resource guides http://www.ncbi.nlm.nih.gov/genome/guide/
See also the …. … resources
dbSNP:
• SNP rs2396753: Variations can be used for gene mapping, definition of population structure, and performance of functional studies.
  – dbSNP
Mapview
Hapmap / Haploview
Summarizing the needs
• Data protection / Backup / Archiving
• Data (pseudo) anonymisation – deidentification
  – The story of the pseudocode 2 and how it can be broken
• Data entering and download
  – User login/password based access
  – User specific view of the data
• Data versioning
• Quality check – Data curation
• Querying the data (Gene/Img/Behav): Interface + scripting
  – Different levels: x,y,z? Whole image / run?
• Sharing the results (results re-entered)
• Visualization
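The pseudo-anonymisation need in the list above can be illustrated with a short sketch. It is not the scheme used in either study described here: the function name, the key handling and the 12-character code length are all assumptions made for illustration. The idea is to derive subject codes with a keyed hash (HMAC) rather than a plain hash, so that someone who knows the format of the original identifiers cannot simply recompute the codes.

```python
import hashlib
import hmac


def pseudonymise(subject_id: str, secret_key: bytes, length: int = 12) -> str:
    """Return a stable pseudonym for subject_id (hypothetical helper).

    The same (subject_id, secret_key) pair always gives the same code,
    so records from different sessions can still be linked, but the
    code cannot be recomputed without the key.
    """
    digest = hmac.new(secret_key, subject_id.encode("utf-8"), hashlib.sha256)
    # Truncation keeps codes short; it is a readability choice, not a standard.
    return digest.hexdigest()[:length]


# Example use with a made-up identifier and a made-up key.
code = pseudonymise("centre01-subject0042", b"keep-this-key-offline")
```

The key has to be stored separately from the database: whoever holds it can re-link codes to identities, which is what makes this pseudonymisation rather than anonymisation.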
Example 1: Saguenay Youth Study
A genetic study of long-term effects of prenatal exposure to maternal cigarette smoking
On:
* Brain Structure
* Brain Function
* Cardiovascular Function
* Body Fat/Metabolism
In:
* Human Subjects (500 sibpairs)
* Recombinant Inbred Strains of Rats
Funded by CIHR (PIs: T. Paus and Z. Pausova)
Saguenay-Lac-Saint-Jean region
Saguenay Youth Study
• Genome-wide scan with sib-pair linkage analysis
• Fine mapping with family-based association analyses
• 500 sib-pairs (+ parental DNA)
• Age: 12-18 years
• French-Canadian origin
• 250 non-exposed, 250 exposed
Matched by:
• Maternal education
• School attended
Pausova et al. Human Brain Mapping 28:502-518, 2007
Saguenay Youth Study: Data Collection
III. Telephone Interview
• Life habits of mother during pregnancy and now
• Medical history of children, mother and father
Duration: 30 min
IV. Home Visit
• School performance, activities at school, feelings at school, life at home (ECOBES, students)
• Your children and school, your education, your family life (ECOBES, parents)
• Screen for psychiatric disorders (DISC Predictive Scale for adolescents)
• Puberty development; risky behaviors (cigarettes, drugs, alcohol); hyperactivity, conduct disorder, aggression, anxiety, and depression; delinquency (GRIP, adolescents)
• Cigarettes, drugs, and alcohol abuse; anxiety, depression, and anti-social behavior (GRIP, parents)
• Drawing a blood sample (parents)
Duration: 2 h
V. Laboratory
Neuro-psychological Assessment
• IQ assessment (WISC-III)
• Academic achievement (Woodcock-Johnson)
• Memory (Children’s Memory Scale)
• Motor skills (pegboard, tapping, bi-manual coordination)
• Executive functions (interference, word fluency, working memory)
• Emotion/Motivation (faces, voices, gambling, RFT)
• Language (FM threshold, phonological awareness, DAF, phonetic learning)
Duration: 6 h
MRI scan
• Brain
• Abdomen: fat and kidneys
Diet and Physical Activity
• Twenty-four-hour food recall
• Food frequency questionnaire
• Physical activity questionnaire
VI. Hospital Session
Body composition
• Anthropometry
• Bioimpedance
• MRI (fat)
Blood pressure, cardiovascular reactivity, and salivary cortisol (Finometer: beat-to-beat, respiration)
• Resting
• In response to postural change
• In response to mental stress
Duration: 4 h
VII. School Session
Fasting Blood Sample
Glucose and lipid metabolism
Low-grade inflammation, endothelial and fibrinolytic dysfunctions, HPA activity
Sexual maturation
Smoking habits
Nutrition
Duration: 1 h
VIII. Genotyping: Candidate Genes and Total Genome Scan
Structural Magnetic Resonance Imaging:
[Figure: T1-weighted, T2-weighted, Proton Density and Magnetization Transfer Ratio images; durations shown: 15 min, 15 min, 15 min]
MR Pipeline: Quality Control
One result
• White Matter volume
Magnetization Transfer ratio
Testosterone influenced WM volume to a greater extent in males with the more “efficient” AR (short AR gene), compared with those with a less efficient AR (long AR gene)
Lessons from the Saguenay study
• Home-made database (PHP, 1py)
• Contains all variables (phone interview, etc.) but not the imaging data
• No mechanism to share data
• Home-made web pages designed for specific datasets (~versioning)
• Semi-automatic analysis pipeline, results re-entered in the DB
• The use of a specific population
• Very large amount of behavioural or biological data
• No tool that is easy to re-use
Example 2: Imagen project and database: a brief review
• Genetically influenced individual differences in brain responses to reward, punishment and emotional cues in adolescents mediate risk for mental disorders
• Neuroimaging: measurement of specific brain functions implicated in the etiology of mental disorders and link them to genetic and behavioural variations
• The goal of the present study is to identify the neurobiological and genetic basis of these traits and to assess their relevance for mental disorder. Means: a multicentre functional and structural genetic-neuroimaging study of a cohort of 2000+ 14 year old adolescents. Intermediate phenotypes of risk for adolescent mental illness will be explored.
European partners
1. Berlin: A. Heinz
2. Cambridge: replaced by Dresden, M. Smolka
3. Dublin: H. Garavan
4. Hamburg: C. Buechel
5. London: G. Schumann, L. Reed
6. Mannheim: H. Flor
7. Nottingham: T. Paus
8. Orsay: JL Martinot
Also: T. Robbins, Cam. P. Conrod, IOP, …
[Diagram: Imagen work packages and timeline]
• WP 01: Behavioural analysis of animal models; (3 years); Implementation Year 2-3
• WP 02: Behavioural tasks in humans; (4 years); Implementation Year 2-4
• WP 03: Gene identification; Month 19-year 4 (2.5 years)
• WP 04: Recruitment and characterisation; Year 2-4 (3 years)
• WP 05: Neuroimaging standardisation; Year 1 (1 year)
• WP 06: Neuroimaging; Year 2-4 (3 years)
• WP 07: Bioinformatics and Biostatistics; Year 1-5 (5 years)
• WP 08: DNA bank, SNP detection and genotyping; Year 2-5 (4 years)
• WP 09: Ethics IMAGEN; Year 1-5 (5 years)
• WP 10: Training and dissemination; Year 1-5 (5 years)
• WP 11: Project Management; Year 1-5 (5 years)
Preparation phases shown in the diagram: Preparation (Year 1), Preparation (Year 1), Preparation (6 months)
Step 1: One-site collection and transfer (Scito, NNL)
Step 2: data anonymisation and package handling
Step 3: including data
Work-Package 07 – Central Database
XNAT: a database tool [Marcus et al. 2007] (also used in BIRN)
• XML schemas define database structure (easy database modification)
• Auto-generated tools:
  – Web portal
  – Command line
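To make the "schemas define database structure" idea concrete, here is a toy sketch. It does not use XNAT's real schema format or any XNAT API: the XML layout, the `datatype` and `field` element names and the type mapping are invented for this illustration. It only shows why a declarative description makes database modification easy: adding a field to the description is enough to change the generated table.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified description of one data type (not an XNAT schema).
SCHEMA_XML = """
<datatype name="behaviour_score">
  <field name="subject" type="string"/>
  <field name="instrument" type="string"/>
  <field name="score" type="float"/>
</datatype>
"""

# Assumed mapping from the toy schema types to SQL column types.
SQL_TYPES = {"string": "TEXT", "float": "REAL", "integer": "INTEGER"}


def create_table_sql(schema_xml: str) -> str:
    """Generate a CREATE TABLE statement from the toy XML description."""
    root = ET.fromstring(schema_xml)
    columns = [
        f'{field.get("name")} {SQL_TYPES[field.get("type")]}'
        for field in root.findall("field")
    ]
    return f'CREATE TABLE {root.get("name")} ({", ".join(columns)})'


statement = create_table_sql(SCHEMA_XML)
```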
Data included, use XML schema for DB Ontology
Web-based quality check
(Pre-)Processing
• T1
  – SPM8 new segment
  – Brainvisa pipeline
  – Dartel / FreeSurfer have been tried out
• T2*
  – SPM8 preprocessing of all available EPI data
  – Strategy: movement correction; reslicing, fMRI -> MPRAGE long, MPRAGE long -> MNI template for each session
  – Homogenizing the log files to get fMRI protocols (dealing with varying numbers of runs, …)
  – Fitting the model intra-subject (SPM)
  – Inter-subject: in house (mixed effect + permutation)
• DTI
  – FSL
  – In house
Queries
• Give me the T1 images normalized in MNI space for subjects with score X above 5
• Give me behavioural scores of instruments X and Y for subjects with T2* image quality above Z
• Give me the genotypes of subjects with both behavioural score X and DTI images of good quality
• Download results
• API for scripts
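A query of the third kind above (genotypes of subjects with a given behavioural score and good-quality DTI) can be sketched with Python's built-in `sqlite3` module. This is only an illustration of a cross-modality join: the table layout, column names, quality scale and all data are invented, and the Imagen database itself is built on XNAT, not on this schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with three tiny, synthetic tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE behaviour (subject TEXT, instrument TEXT, score REAL);
        CREATE TABLE image (subject TEXT, modality TEXT, quality REAL);
        CREATE TABLE genotype (subject TEXT, snp TEXT, alleles TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO behaviour VALUES (?, ?, ?)",
        [("s01", "X", 7.0), ("s02", "X", 3.0), ("s03", "X", 6.0)],
    )
    conn.executemany(
        "INSERT INTO image VALUES (?, ?, ?)",
        [("s01", "DTI", 0.9), ("s02", "DTI", 0.95), ("s03", "DTI", 0.4)],
    )
    conn.executemany(
        "INSERT INTO genotype VALUES (?, ?, ?)",
        [("s01", "snp_1", "G/T"), ("s02", "snp_1", "G/G"), ("s03", "snp_1", "T/T")],
    )
    return conn


def genotypes_for(conn, instrument, min_score, modality, min_quality):
    """Genotypes of subjects passing both a behavioural and an image-quality filter."""
    query = """
        SELECT g.subject, g.snp, g.alleles
        FROM genotype AS g
        JOIN behaviour AS b ON b.subject = g.subject
        JOIN image AS i ON i.subject = g.subject
        WHERE b.instrument = ? AND b.score > ?
          AND i.modality = ? AND i.quality >= ?
        ORDER BY g.subject, g.snp
    """
    return conn.execute(query, (instrument, min_score, modality, min_quality)).fetchall()


conn = build_demo_db()
rows = genotypes_for(conn, "X", 5.0, "DTI", 0.8)
```

With the synthetic rows above, only subject `s01` passes both filters. The same function doubles as a minimal "API for scripts": an analysis script calls it instead of going through a web form.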
Automatic Quality check
Neuroimaging scores for QC
fMRI Movement estimated
T1 mask and template overlap
Intra volume variance variation
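The three scores named above could be computed along the following lines. This is a sketch under stated assumptions, not the Imagen QC code: the slide does not define the scores, so the choice of maximum displacement for movement, the Dice coefficient for mask/template overlap, and a max/min ratio of per-volume variances are interpretations, and all function names are hypothetical. Images are represented as flat Python lists to keep the example dependency-free.

```python
import math
from statistics import pvariance


def max_displacement(translations):
    """Largest Euclidean distance of any volume from the first volume.

    translations: list of (x, y, z) translation estimates, one per volume,
    in the unit produced by the realignment step (typically mm).
    """
    reference = translations[0]
    return max(math.dist(t, reference) for t in translations)


def dice_overlap(mask_a, mask_b):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Two empty masks are treated as a perfect (trivial) overlap.
    return 2.0 * intersection / total if total else 1.0


def variance_ratio(volumes):
    """Ratio of the largest to the smallest per-volume intensity variance."""
    variances = [pvariance(v) for v in volumes]
    smallest = min(variances)
    return max(variances) / smallest if smallest > 0 else math.inf
```

Each function returns a single number per dataset, which is the form a database needs in order to support queries such as "image quality above Z".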
A few words on data analysis
Neuroimaging and WGA
[Figure: SNPs (genotype pairs such as G G, T G, T T), images (aMRI, dMRI, fMRI) and clinical / behaviour data]
Find statistical links, or predict
Finding good analysis strategies
[Figure: data types and their dimensions, for a set of subjects]
• SNP: 1M + CNV
• Transcriptome: 50k
• Images: 200k-50k
• Behaviour: <200
Issues:
• Data dimension reduction
• Multiple comparison problem
• Inhomogeneous data
Candidate SNPs vs. all image
[Figure: f( ) = voxel, applied to the subjects' genotypes (G G, C G, C C, C G, G G) for each voxel, giving a Stat. Map]
Methods:
- VBM, group fMRI, etc...
Complexity / multiple comparison issue:
- ~10^6 tests or estimated parameters
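The "one candidate SNP against every voxel" setting and its multiple comparison cost can be sketched as follows. This is not the VBM or SPM procedure: as simplifying assumptions it uses a Pearson correlation between allele count (0, 1, 2) and voxel value, a normal approximation for the p-value (Fisher z-transform), and a plain Bonferroni correction, which the slide does not prescribe. All names are hypothetical. With about 10^6 tests and alpha = 0.05, Bonferroni requires p < 5e-8 at each voxel.

```python
import math
from statistics import mean


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r: float, n: int) -> float:
    """Two-sided p-value for r = 0 via the Fisher z-transform (needs n > 3)."""
    if n <= 3:
        raise ValueError("the Fisher approximation needs more than 3 subjects")
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc keeps precision for very small p-values.
    return math.erfc(abs(z) / math.sqrt(2.0))


def voxelwise_scan(allele_counts, voxel_values, alpha=0.05):
    """Test one SNP against every voxel.

    voxel_values[v] holds the per-subject values at voxel v.
    Returns (p_values, indices of voxels significant after Bonferroni).
    """
    n = len(allele_counts)
    threshold = bonferroni_threshold(alpha, len(voxel_values))
    p_values = [approx_p_value(pearson_r(allele_counts, v), n) for v in voxel_values]
    hits = [i for i, p in enumerate(p_values) if p < threshold]
    return p_values, hits


# Synthetic example: 12 subjects, 2 voxels; voxel 0 tracks the allele count.
alleles = [0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 1, 2]
voxels = [
    [0.1, -0.1, 0.0, 1.1, 0.9, 1.0, 2.1, 1.9, 2.0, 0.0, 1.0, 2.0],
    [1, 2, 3, 1, 2, 3, 1, 2, 3, 2, 2, 2],
]
p_values, hits = voxelwise_scan(alleles, voxels)
```

Bonferroni ignores the spatial correlation between neighbouring voxels, so it is conservative for images; it is used here only because it makes the cost of ~10^6 tests easy to see.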
One image region vs. all SNPs
[Figure: f( ) = SNP, for one voxel]
Method known as WGAS
Multiple comparison: ~10^6 tests
Plink?
Feature selection approach
[Figure: Selection on each data type, then Gene-Image analysis on reduced data]
Or multivariate approaches
• Consider LD / spatial covariance / behaviour tests covariance
Sentence reading circuit – checkerboards
Correlation: lateralisation / pseudoword reading speed
Score = pseudoword reading speed / whole-brain study, p=0.01, 40 voxels. Excluding outliers at least two standard deviations from the mean
Sentence listening circuit
Correlation: lateralisation / pseudoword reading speed
Score = pseudoword reading speed / whole-brain study, p=0.01, 40 voxels. Excluding outliers at least two standard deviations from the mean
Sentence reading circuit – checkerboards
Lateralisation difference – gene KIAA / SNP: rs155089, 6>8
Sentence reading circuit – checkerboards
Lateralisation difference – gene KIAA / SNP: rs7761100, 7>6
Type             G/G   G/T   T/T
nb. sbj           32    28     7
age               23    23    22
men               39    40    50
educ (y)         3.4   3.1   3.5
Dysl. (%)         10     7     0
Substr. (%)       70    79    81
Pseudow (ms/w)   930   895   850
[Plots: % dyslexic; pseudoword reading speed; subtraction score]
Type             C/C   C/T   T/T
nb. sbj            2    21    34
age                -    25    24
men                -    44    36
educ (y)           -   3.1   3.4
Dysl. (%)          -     6    10
Substr. (%)        -    80    73
Pseudow (ms/w)     -   855   924
Conclusion:
• A lot to be done: combining two complex and powerful types of data for
  – Better understanding of brain mechanisms
  – Better understanding of the impact of genetic variations
  – Better risk factor prediction…
• Visualisation and interaction: see Abstract
• Several analysis strategies are needed to face the huge multiple comparison problem
L Shen, S Kim, J D West, A J Saykin