Date posted: 20-Jan-2016
A Novel Bayesian Approach for Uncovering
Potential Spectroscopic Counterparts for
Clinical Variables in 1H NMR Metabonomic Applications
Aki Vehtari1*, Ville-Petteri Mäkinen1,2, Pasi Soininen3, Petri Ingman4,
Sanna Mäkelä5, Markku Savolainen5, Minna Hannuksela5,
Kimmo Kaski1, and Mika Ala-Korpela1*
1Laboratory of Computational Engineering, Systems Biology and Bioinformation Technology, Helsinki University of Technology, P.O. Box 9203, FI-02015 HUT, Finland;
2Folkhälsan Research Center, University of Helsinki, Finland;
3Department of Chemistry, University of Kuopio, Finland;
4Department of Chemistry, University of Turku, Finland;
5Department of Internal Medicine, University of Oulu, Finland.
{*Aki.Vehtari, *Mika.Ala-Korpela}@hut.fi
Protein lipid aggregates
The ‘omics’ Revolution and Systems Biology
Lipoproteins – the lipid transporters in human circulation
Metabo*omics
ATHEROSCLEROSIS
Underlies the clinical conditions leading to death of
approximately half of the people in Western countries.
A systemic disease characterised by the local build-up of
lipid-rich plaques within the walls of large arteries.
The Trade-Off between Metabolic Coverage and the Quality of Metabolic Analysis
Fernie, Trethewey, Krotzky and Willmitzer, Nat Rev Molec Cell Biol 5, 1 (2004).
Systems Biology & the ‘omics’ Revolution
Feasible, done
To be explored… … metabo*omics…
Lipoprotein subclasses are a key issue in atherothrombosis
[Figure: lipoprotein classes by particle diameter (nm; axis ticks from 5 to 1000) and density (g/ml; axis ticks from 0.95 to 1.20). Labelled classes: HDL2, HDL3, LDL, Lp(a), IDL, VLDL, chylomicron remnants, chylomicrons.]
http://www.liposcience.com/
> million NMR LipoProfile® tests
J. D. Otvos et al., LipoScience Inc.
-CH3
15 subclasses
1 spectrum at a time
Quantification of Biomedical NMR
Data using Artificial Neural Network
Analysis: Lipoprotein Lipid Profiles
from 1H NMR Data of Human Plasma
Ala-Korpela, Hiltunen and Bell. NMR in Biomedicine 8, 235 (1995)
1H NMR biochemistry versus clinical biochemistry
Metabolic information by 1H NMR spectroscopy of serum
Principal Component Analysis
Lipoprotein Subclass Profiles via 1H NMR Spectra
Self-Organising Maps
SOM – rather easy and rather fast
The SOM clearly organised according to the lipoprotein subclass profiles, i.e., according to the spectral information in the lipoplasma spectra.
Lipoprotein Subclass Profiling by 1H NMR
METABOLIC SYNDROME
METABOLIC
PATHWAY
NORMAL
-N(CH3)3 region / SOM U-matrix
Suna, et al., NMR in Biomedicine, submitted.
Metabolic and other individual characteristics
Metabolite profiles of pre-dose biofluids
Inter-subject variation in effects of drugs
Influence
Influence
Predictable…?!
Individual Risk Assessment and Diagnostics (of Atherothrombosis)
• THERE IS A CALL FOR METABONOMIC APPROACHES …
… particularly since:
The 1H NMR Profile of Serum –in principle– Contains ALL
the Relevant Information for the CHD Risk Assessment
= Lipoprotein Subclasses + many other metabolites…
1H NMR Spectra of Human Serum at 500 MHz
Molecular windows
Objectives and requirements
● Quantitative target: estimating the value of a clinical variable from a 1H NMR spectrum.
- Accuracy must be maximized.
● Explanatory target: which spectral features best explain the clinical variable?
- Results must be easy to interpret.
● The two requirements can conflict.
Dataset
● 100 serum samples from an ongoing clinical study of the effects of alcoholism (Dept. of Internal Medicine, Univ. of Oulu).
● Two 1H NMR molecular windows (LIPO & LMWM) were measured from each sample.
● A 500 MHz NMR spectrometer with a double-tube system that enables absolute metabolite quantification.
● Automatic sample changer (24 samples in 16 h).
VLDL / LDL / HDL
Overlapping resonances
Phospholipid choline headgroup
Lipoprotein spectra
Triglycerides
HDL particles
Lipid signals
Lactate doublet
Cholesterol
Low-molecular weight metabolites
Glucose peaks
Creatinine
Acetate
Lactate
Alanine
Valine
Creatinine
Correlation analysis
HDL particles
Triglycerides
Abnormal triglycerides and HDL cholesterol are associated with cardiovascular diseases and are components of the metabolic syndrome, a clinically established condition with increased risk for atherosclerosis.
Bayesian inference*
Regression model
- robust against outliers
- linearity preferred
Feature extraction
- biologically motivated
- easy to interpret
- relevant with respect to the target
Feature selection
Feature weights
Spectral parameterisation
*This is only a schematic view and not a precise description of the posterior density.
Regression model
●Heteroscedastic linear regression with scale-mixture Gaussian noise model (asymptotically Student-t).
●μ is the α²U-scaled predictor, effectively reducing the effect of outliers.
●The purpose of α is to improve convergence of Gibbs sampling (Gelman et al. 2004).
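The noise model on this slide can be illustrated with a minimal Python sketch. It assumes the parameter-expanded form described by Gelman et al. (2004), in which each residual is Gaussian with variance α²U_i and U_i follows a scaled inverse-chi-square distribution; the slide itself does not spell this out. The names `sample_scale_mixture_noise`, `nu`, `tau` and `alpha` are illustrative and are not taken from MCMCStuff.

```python
import numpy as np

def sample_scale_mixture_noise(n, nu, tau, alpha=1.0, seed=0):
    """Draw n residuals e_i ~ N(0, alpha^2 * U_i), U_i ~ Inv-chi2(nu, tau^2)."""
    rng = np.random.default_rng(seed)
    # Scaled inverse-chi-square draw: nu * tau^2 / chi2(nu).
    u = nu * tau**2 / rng.chisquare(nu, size=n)
    # Each residual gets its own standard deviation alpha * sqrt(U_i).
    return rng.normal(0.0, alpha * np.sqrt(u))

noise = sample_scale_mixture_noise(n=200_000, nu=10.0, tau=1.0)
# Marginally the residuals are Student-t with nu degrees of freedom and scale
# alpha * tau, so their variance is (alpha * tau)^2 * nu / (nu - 2).
expected_var = 10.0 / (10.0 - 2.0)
```

Samples with a large U_i have a large variance, so an outlying observation pulls less on the fitted line than it would under a single shared Gaussian variance.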
Kernel-based feature extraction
width
location
3σ
Spectral parameterisation
●Gaussian kernels truncated at 3σ.
●Each kernel is represented by width and location.
●Posterior widths and locations were obtained by slice sampling from [0.008, 0.8] ppm.
●The number of kernels was obtained by reversible jump MCMC (max 20 kernels).
●MCMCStuff software for Matlab was written by Aki Vehtari et al.
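The kernel parameterisation above can be sketched in Python as follows. This is a hedged illustration, not the MCMCStuff implementation: it assumes that "width" means the Gaussian σ and that each kernel contributes one feature equal to the kernel-weighted sum of spectral intensities. The function names and the synthetic spectrum are hypothetical.

```python
import numpy as np

def truncated_gaussian_kernel(ppm, location, width):
    """Gaussian kernel centred at `location` with sigma = `width`, zero beyond 3 sigma."""
    z = (ppm - location) / width
    weights = np.exp(-0.5 * z**2)
    weights[np.abs(z) > 3.0] = 0.0  # truncation at 3 sigma
    return weights

def kernel_features(ppm, spectrum, locations, widths):
    """One feature per kernel: the kernel-weighted sum of spectral intensities."""
    return np.array([
        np.sum(truncated_gaussian_kernel(ppm, loc, w) * spectrum)
        for loc, w in zip(locations, widths)
    ])

ppm = np.linspace(0.0, 4.0, 401)                     # synthetic axis, 0.01 ppm steps
spectrum = np.exp(-0.5 * ((ppm - 1.0) / 0.05) ** 2)  # one synthetic peak at 1.0 ppm
features = kernel_features(ppm, spectrum, locations=[1.0, 3.0], widths=[0.05, 0.05])
```

In this toy case the kernel placed on the peak yields a large feature and the kernel placed on an empty region yields a feature close to zero. In the approach on the slide, the locations, widths and number of kernels are sampled rather than fixed by hand.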
Additional details
● The degrees of freedom for the residual model were obtained by slice sampling within [2, 40].
● Prior for the number of kernels was a
decaying exponential to eliminate the “tail
saturation” effect.
● Ten independent chains of 10 000 samples.
● Predictive replicates were found to closely match 10-fold cross-validation results.
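The decaying exponential prior on the number of kernels can be sketched as below. The slides give neither the decay rate nor the smallest allowed number of kernels, so `rate=0.5` and a range starting at one kernel are assumptions made purely for illustration; only the maximum of 20 kernels comes from the slides.

```python
import numpy as np

def kernel_count_prior(max_kernels=20, rate=0.5):
    """Normalised prior p(k) proportional to exp(-rate * k) for k = 1..max_kernels."""
    k = np.arange(1, max_kernels + 1)
    unnormalised = np.exp(-rate * k)
    return k, unnormalised / unnormalised.sum()

k, prior = kernel_count_prior()
```

Because the prior mass falls with every added kernel, a reversible jump sampler needs support from the data to keep a large number of kernels. This is one way to read the remark about eliminating the "tail saturation" effect.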
Quantification results
[Figure: 1H NMR estimate plotted against the non-spectroscopic measurement. Panels: VLDL-TG (R² = 0.97) and HDL-C (R² = 0.87). Sample sizes shown: n = 75, n = 67.]
Relevant spectral features for VLDL-TG
Frequency [ppm]
The best observable VLDL-TG signal is located at the biochemically expected frequencies.
-CH3
(-CH2-)n
-N(CH3)3
Relevant spectral features for HDL-C
Frequency [ppm]
The best observable HDL cholesterol signals are in the phospholipid choline headgroup region.
-CH3
(-CH2-)n
-N(CH3)3
Conclusions
●Kernel-based parameterisation is able to describe the data effectively.
●A Bayesian treatment with linear regression gives both accuracy in quantification and relative ease of interpretation.
●The results are biochemically fully coherent.
Future work
●Continued assessment of the usefulness of different approaches within a clinical research environment (interpretation).
●Systematic testing of kernel-based regression with Bayesian and other approaches (accuracy).
●Analysis across the full frequency range in both molecular windows.
●Application of Bayesian kernel parameterisation to classification problems.
Good life is antiCHD…!
It is most probable that the Bayesian methodology will have a crucial role in paving the way for metabonomics in the clinical arena.
Life is about probabilities (and lipoproteins)…
THANK YOU