Multivariate and network tools for biological data analysis
Transcript of Multivariate and network tools for biological data analysis
![Page 1: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/1.jpg)
Dmitry Grapov and Oliver Fiehn
University of California, Davis
Multivariate Analysis and Visualization Tools for Metabolomic Data
![Page 2: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/2.jpg)
State-of-the-art facility producing massive amounts of biological data…
>20-30K samples/yr
>200 studies
![Page 3: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/3.jpg)
Data Analysis and Visualization
Quality Assessment
• use replicated measurements and/or internal standards to estimate analytical variance
Statistical and Multivariate
• use the experimental design to test hypotheses and/or identify trends in analytes
Functional
• use statistical and multivariate results to identify impacted biochemical domains
Network
• integrate statistical and multivariate results with the experimental design and analyte metadata
Data matrix axes: Sample, Variable
experimental design: organism, sex, age, etc.
analyte description and metadata: biochemical class, mass spectra, etc.
![Page 4: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/4.jpg)
Data Analysis and Visualization
• Quality Assessment
• Statistical and Multivariate
• Functional
• Network
Network Mapping
![Page 5: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/5.jpg)
Data Quality Assessment
Principal Component Analysis (PCA) of all analytes, showing QC sample scores
Drift in >400 replicated measurements across >100 analytical batches for a single analyte (plot of abundance vs. acquisition batch)
QCs embedded among >5,5000 samples (1:10) collected over 1.5 yrs
If the biological effect size is smaller than the analytical variance, the experiment will incorrectly yield non-significant results
![Page 6: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/6.jpg)
Data Quality Assessment
Analyte-specific data quality overview
Sample-specific normalization can be used to estimate and remove analytical variance
Normalizations need to be numerically and visually validated
Figure panels: Raw Data, Normalized Data; axes: %RSD vs. log mean (low precision to high precision); legend: Samples, QCs
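The %RSD shown on the plot axis can be computed per analyte from the replicated QC measurements. The following is a minimal sketch, not the authors' implementation: the analyte names and abundance values are hypothetical, and %RSD is taken as 100 × sample standard deviation / mean.

```python
from statistics import mean, stdev

def percent_rsd(values):
    """Relative standard deviation (%RSD) = 100 * sample SD / mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; %RSD is undefined")
    return 100.0 * stdev(values) / m

# Hypothetical QC abundances for two analytes (illustrative numbers only).
qc_abundances = {
    "analyte_A": [100.0, 102.0, 98.0, 101.0, 99.0],  # tight replicates
    "analyte_B": [50.0, 80.0, 30.0, 65.0, 25.0],     # scattered replicates
}

# Lower %RSD across QC replicates means higher analytical precision.
rsd_by_analyte = {name: percent_rsd(vals) for name, vals in qc_abundances.items()}
for name, rsd in rsd_by_analyte.items():
    print(f"{name}: %RSD = {rsd:.1f}")
```

Running the same calculation on raw and on normalized QC data gives one numerical check of whether a normalization reduced analytical variance.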
![Page 7: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/7.jpg)
Statistical and Multivariate Analyses
What analytes are different between the two groups of samples (Group 1 vs. Group 2)?
Statistical (t-Test): significant differences lacking rank and context
Multivariate (O-PLS-DA): ranked differences lacking significance and context
Network Mapping: ranked statistically significant differences within a biochemical context
Statistics + Multivariate + Context = Network Mapping
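The univariate side of this comparison can be sketched as a t statistic computed per analyte. This is a minimal sketch under stated assumptions: the analyte names and abundances are hypothetical, Welch's unequal-variance form is chosen for illustration (the slides only say "t-Test"), and the p-value step is omitted because it needs a t-distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group1, group2):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = sqrt(variance(group1) / len(group1) + variance(group2) / len(group2))
    return (mean(group1) - mean(group2)) / se

# Hypothetical abundances: (Group 1, Group 2) per analyte, illustrative only.
data = {
    "analyte_A": ([10.0, 11.0, 9.0, 10.0], [15.0, 16.0, 14.0, 15.0]),
    "analyte_B": ([5.0, 7.0, 6.0, 6.0], [6.0, 5.0, 7.0, 6.0]),
}

# One test statistic per analyte; each analyte is tested independently.
t_stats = {name: welch_t(g1, g2) for name, (g1, g2) in data.items()}
for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}")
```

Because each analyte is tested on its own, the result says nothing about how analytes relate to each other, which is the missing context the slide refers to.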
![Page 8: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/8.jpg)
Statistical and Multivariate Analyses
What analytes are different between the two groups of samples (Group 1 vs. Group 2)?
Statistical: t-Test
Multivariate: O-PLS-DA
Statistics + Multivariate + Context = Network Mapping
To see the big picture, it is necessary to view the data from multiple angles
![Page 9: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/9.jpg)
DeviumWeb
https://github.com/dgrapov/DeviumWeb
• visualization
• statistics
• clustering
• PCA
• O-PLS
![Page 10: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/10.jpg)
![Page 11: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/11.jpg)
Functional Analysis
Identify changes or enrichment in biochemical domains
• decrease
• increase
Nucl. Acids Res. (2008) 36 (suppl 2): W423-W426. doi: 10.1093/nar/gkn282
![Page 12: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/12.jpg)
Functional Analysis: opportunity for ‘Omic integration
Use domain knowledge databases to integrate genomic, proteomic and metabolomic data
Current approaches can be limited to pathway level analyses
![Page 13: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/13.jpg)
Networks
Biochemical
• reaction
• domain
Structural
• molecular fingerprints
• mass spectra
Empirical
• correlation
• partial correlation
BMC Bioinformatics 2012, 13:99 doi:10.1186/1471-2105-13-99
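The two empirical edge types listed above differ in what they control for. The sketch below, with hypothetical data, computes a Pearson correlation and a first-order partial correlation for three variables; it is an illustration of the general idea, not the method of the cited paper, and real data sets with many analytes would typically use a matrix-based estimate instead.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical abundances: x and y both track a shared driver z.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5]
y = [0.5, 2.5, 3.5, 3.5, 4.5, 6.5]

r_xy = pearson(x, y)            # strong marginal correlation
r_xy_z = partial_corr(x, y, z)  # much weaker once z is accounted for
print(f"correlation: {r_xy:.2f}, partial correlation: {r_xy_z:.2f}")
```

In a network, a correlation edge between x and y would be drawn here, while a partial correlation edge would likely be dropped because the association is largely explained by z.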
![Page 14: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/14.jpg)
Mapped Network: displaying metabolic differences in control vs. malignant lung tissue
Biochemical Relationships
http://www.genome.jp/dbget-bin/www_bget?rn:R00975
![Page 15: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/15.jpg)
Structural Similarity
http://pubchem.ncbi.nlm.nih.gov//score_matrix/score_matrix.cgi
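A common way to score structural similarity between molecular fingerprints is the Tanimoto coefficient (shared "on" bits divided by total "on" bits in either fingerprint). The sketch below is illustrative only: the compound names, fingerprint bit positions, and the 0.5 edge threshold are all hypothetical, and the fingerprints are not real PubChem fingerprints.

```python
from itertools import combinations

def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical fingerprints: each set holds the positions of the 'on' bits.
fingerprints = {
    "compound_1": {1, 4, 7, 9},
    "compound_2": {1, 4, 7, 12},
    "compound_3": {2, 3, 20},
}

THRESHOLD = 0.5  # hypothetical cutoff for drawing a structural-similarity edge

# Keep only compound pairs whose similarity reaches the threshold.
edges = [
    (a, b, tanimoto(fingerprints[a], fingerprints[b]))
    for a, b in combinations(sorted(fingerprints), 2)
    if tanimoto(fingerprints[a], fingerprints[b]) >= THRESHOLD
]
print(edges)
```

The resulting edge list can be loaded into a network viewer, with the threshold controlling how dense the structural network becomes.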
![Page 16: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/16.jpg)
Empirical Networks
Use experiment-specific or data-driven relationships to gain novel insight into biochemical relationships
Network regions labeled: urea cycle, nucleotide synthesis, protein glycosylation
![Page 17: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/17.jpg)
Mass Spectral Networks
Use mass spectra as a proxy for structure to help make sense of unknown compounds’ biochemical identities
Watrous J et al. PNAS 2012;109:E1743-E1752
Unknown compounds are likely phytosterol esters
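Using spectra as a structural proxy comes down to scoring how similar two spectra are. The sketch below uses a plain cosine score on spectra binned to nominal m/z; this is a simplification for illustration and not necessarily the scoring used in the cited work. All m/z values and intensities are hypothetical.

```python
from math import sqrt

def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z bin: intensity}."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = sqrt(sum(v * v for v in spec_a.values()))
    norm_b = sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical binned spectra (illustrative values only).
known = {73: 100.0, 147: 50.0, 217: 30.0}
unknown = {73: 90.0, 147: 60.0, 204: 20.0}
unrelated = {41: 100.0, 55: 80.0}

score_similar = cosine_similarity(known, unknown)      # shares major peaks
score_unrelated = cosine_similarity(known, unrelated)  # no shared peaks
print(f"known vs unknown: {score_similar:.2f}")
print(f"known vs unrelated: {score_unrelated:.2f}")
```

An unknown that scores highly against annotated compounds is connected to them in the network, which suggests a candidate chemical class that still needs experimental confirmation.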
![Page 18: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/18.jpg)
Mass Spectral Networks
Use mass spectra and empirical relationships to narrow down the biochemical roles for unknown compounds
Rigorous chemical experiments identified the unknown compounds as partial derivatization products of glucose
![Page 19: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/19.jpg)
MetaMapR
https://github.com/dgrapov/MetaMapR
![Page 20: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/20.jpg)
![Page 21: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/21.jpg)
Analysis at the Metabolomic Scale and Beyond
Pathway-independent integration of metabolomic (known and unknown), proteomic and genomic data
Diagram labels: pyruvate, lactate, enzyme, gene A, gene B
![Page 22: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/22.jpg)
Software and Resources
• DeviumWeb: dynamic multivariate data analysis and visualization platform
  url: https://github.com/dgrapov/DeviumWeb
• imDEV: Microsoft Excel add-in for multivariate analysis
  url: http://sourceforge.net/projects/imdev/
• MetaMapR: network analysis tools for metabolomics
  url: https://github.com/dgrapov/MetaMapR
• TeachingDemos: tutorials and demonstrations
  url: http://sourceforge.net/projects/teachingdemos/?source=directory
  url: https://github.com/dgrapov/TeachingDemos
• Data analysis case studies and examples
  url: http://imdevsoftware.wordpress.com/
![Page 23: Multivarite and network tools for biological data analysis](https://reader033.fdocuments.net/reader033/viewer/2022050613/53effcc48d7f72874b8b6973/html5/thumbnails/23.jpg)
[email protected] metabolomics.ucdavis.edu
This research was supported in part by NIH 1 U24 DK097154