Applications of Tensor Methods in Life Sciences
(slides: mmds.imm.dtu.dk/presentations/bro.pdf)
Rasmus Bro
University of Copenhagen
Faculty of Life Sciences
PARAFAC
• A very nice model
Some examples
• How to store a cheese?
• A model of wine
Some issues
• Variable selection
• Nonnegativity
• Dealing with missing data
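PARallel FACtor analysis
• PCA - bilinear model,
$$x_{ij} = \sum_{f=1}^{F} a_{if} b_{jf} + e_{ij}$$
[Figure: X drawn as a sum of rank-one terms, a1 b1^T + a2 b2^T, plus residual E; in matrix form A B^T + E.]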
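• PCA - bilinear model,
$$x_{ij} = \sum_{f=1}^{F} a_{if} b_{jf} + e_{ij}$$
• PARAFAC - trilinear model,
$$x_{ijk} = \sum_{f=1}^{F} a_{if} b_{jf} c_{kf} + e_{ijk}$$
[Figure: the three-way array X drawn as a sum of rank-one terms (a1, b1, c1) + (a2, b2, c2) plus residual E; equivalently, loading matrices A, B and C plus E.]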
PARAFAC was invented in 1970 by Harshman and, independently, by Carroll & Chang under the name CANDECOMP. It is based on the principle of parallel proportional profiles suggested in 1944 by Cattell.
• R. A. Harshman. UCLA working papers in phonetics 16:1-84, 1970.
• J. D. Carroll and J. Chang. Psychometrika 35:283-319, 1970.
• R. B. Cattell. Psychometrika 9:267-283, 1944.
PARallel FACtor analysis
PCA: $X = A B^T$
PARAFAC: $X_k = A D_k B^T$, with $D_k = \mathrm{diag}(c(k,:))$
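To make the two notations concrete, here is a minimal NumPy sketch (not from the slides) that checks that the slab-wise form $X_k = A D_k B^T$ reproduces the elementwise trilinear model in the noise-free case. The array sizes and the random loadings are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, F = 4, 5, 3, 2  # hypothetical sizes: rows, columns, slabs, components

# Random loading matrices (illustrative only)
A = rng.standard_normal((I, F))
B = rng.standard_normal((J, F))
C = rng.standard_normal((K, F))

# Elementwise trilinear model without noise: x_ijk = sum_f a_if * b_jf * c_kf
X = np.einsum("if,jf,kf->ijk", A, B, C)

# Slab-wise form: X_k = A D_k B^T with D_k = diag(c(k,:))
slabs = np.stack([A @ np.diag(C[k]) @ B.T for k in range(K)], axis=2)

print(np.allclose(X, slabs))
```

Each frontal slab uses the same A and B; only the diagonal scaling D_k changes from slab to slab, which is the "parallel proportional profiles" idea.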
Fluorescence spectroscopy
[Figure: instrument schematic with lamp (UV-vis), excitation monochromator, sample, emission monochromator and detector; intensity shown as a function of excitation and emission.]
Excitation-emission matrix: a chemical fingerprint
Fluorescence excitation-emission
Very high sensitivity and selectivity towards important compounds:
• Amino acids
• Chlorophyll
• Porphyrin
• ATP
• NADH
• Vitamins (A, B2, B6, E)
J. Christensen - Foodfluor database at
www.models.life.ku.dk/research/foodfluor
PARAFAC - uniqueness
• Uniqueness - conditions
A PARAFAC model is unique when
$$k_A + k_B + k_C \geq 2F + 2$$
F is the number of components and $k_A$ is the k-rank of the loading matrix A: the largest number k such that every set of k columns of A is linearly independent (at most F).
J. B. Kruskal. Linear Algebra and its Applications 18:95-138, 1977.
N. D. Sidiropoulos and R. Bro. Journal of Chemometrics 14 (3):229-239, 2000.
• No rotational freedom
Unlike the bilinear ‘PCA’ model, there is only one solution (up to scaling and permutation of the components)
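The condition above can be checked numerically for small loading matrices. The following is a brute-force sketch, not the speaker's code: the helper names `k_rank` and `kruskal_unique` are hypothetical, the tolerance is an arbitrary assumption, and the combinatorial search is only practical for a handful of components. Note that the condition is sufficient; failing it does not by itself prove non-uniqueness.

```python
from itertools import combinations

import numpy as np


def k_rank(M, tol=1e-10):
    """Largest k such that every set of k columns of M is linearly independent."""
    n_cols = M.shape[1]
    k = 0
    for size in range(1, n_cols + 1):
        all_independent = all(
            np.linalg.matrix_rank(M[:, list(cols)], tol=tol) == size
            for cols in combinations(range(n_cols), size)
        )
        if not all_independent:
            break
        k = size
    return k


def kruskal_unique(A, B, C):
    """Check the sufficient condition k_A + k_B + k_C >= 2F + 2."""
    F = A.shape[1]
    return k_rank(A) + k_rank(B) + k_rank(C) >= 2 * F + 2


rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2))
B = rng.standard_normal((5, 2))
C = rng.standard_normal((3, 2))

# Two proportional columns in the third mode drop its k-rank to 1
C_bad = np.column_stack([C[:, 0], 2.0 * C[:, 0]])

print(kruskal_unique(A, B, C), kruskal_unique(A, B, C_bad))
```

With F = 2 the threshold is 6, so three generic loading matrices (k-rank 2 each) meet it, while the version with proportional columns in C reaches only 5.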
PARAFAC is mathematical chromatography
aka
• Blind source separation
• Solving the cocktail party effect
• Unmixing
• Curve resolution
• ...
Mathematical chromatography eliminates major problems in multivariate analysis:
• Indirect correlations stemming from rotational freedom
• It also eliminates outliers
• It determines underlying sources
• Simpler because it provides a chemical model
• It is way more noise insensitive
How to store a cheese?
Oxidation
• Oxidation from light causes rancid taste of cheese, butter etc.
• Important for packaging of food and shelf-storage
• Believed to be caused by riboflavin acting as photosensitizer
• Riboflavin does not absorb much red light, hence red material should protect
Experiment
• Different light
• With / without oxygen
• Different storage time
Samples measured by
• Sensory analysis (quality)
• Fluorescence EEM
Spectra from PARAFAC of EEMs
[Figure: excitation loadings (350-455 nm) and emission loadings (600-705 nm) of the PARAFAC components, labelled Riboflavin, HematoP, ProtoP, Cloro B, Cloro A and Chlorin X.]
From JPWo/Matforsk
Relation between sensory data and PARAFAC estimated concentrations
[Figure: rancid taste versus PARAFAC estimated concentrations.]
From JPWo/Matforsk
Importance of different compounds
[Figure: importance of Protoporphyrin, Chlorophyll B and X (Chlorin?).]
From JPWo/Matforsk
New result
• Apart from riboflavin, there are at least five other light-sensitizers
• The ‘new’ ones seem to be more important than riboflavin
• Fluorescence and PARAFAC provide a ‘simple’ approach for exploring these.
[Figure: absorbance spectra, 300-700 nm, of chlorophyll a and riboflavin, with the UV region marked.]
From JPWo/Matforsk
A wine model
PARAFAC cannot handle shifts and shape changes
PARAFAC(1): $X_k = A D_k B^T$
PARAFAC2
PARAFAC(1): $X_k = A D_k B^T$
PARAFAC2: $X_k = A D_k B_k^T$ subject to $B_k^T B_k$ constant
*Actually it is more general than shifts, but it’s a feasible approximation to what PARAFAC2 can handle
R. A. Harshman. UCLA working papers in phonetics 22:30-47, 1972.
H. A. L. Kiers, J. M. F. ten Berge, R. Bro. J. Chemom. 13:275-294, 1999.
R. Bro, C. A. Andersson, H. A. L. Kiers. J. Chemom. 13:295-309, 1999.
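One common way to satisfy the PARAFAC2 constraint is to write each $B_k = P_k H$, where $P_k$ has orthonormal columns and H is a small matrix shared by all slabs, so that $B_k^T B_k = H^T H$ for every k. The sketch below only illustrates that the constraint then holds while the individual $B_k$ differ; the sizes, random data and variable names are assumptions, and it is not a fitting algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
J, F, K = 6, 2, 3  # hypothetical sizes: variables, components, slabs

# Small F x F matrix shared by all slabs
H = rng.standard_normal((F, F))

B_list = []
for _ in range(K):
    # P_k with orthonormal columns (P_k^T P_k = I), taken from a reduced QR decomposition
    P_k, _ = np.linalg.qr(rng.standard_normal((J, F)))
    B_list.append(P_k @ H)

# The profiles B_k differ from slab to slab ...
profiles_differ = not np.allclose(B_list[0], B_list[1])

# ... but the cross-product B_k^T B_k is the same for all k
cross_products = [B_k.T @ B_k for B_k in B_list]
constant = all(np.allclose(cp, H.T @ H) for cp in cross_products)

print(profiles_differ, constant)
```

This is why PARAFAC2 can absorb slab-specific profile changes, such as retention time shifts, that the stricter PARAFAC(1) model with a single B cannot.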
60 wine samples measured by GC-MS
[Figure: GC-MS data; axes: elution and samples.]
Weird shifts
Overlap and shifts
Low intensity and baseline
fronting
PARAFAC2 results
Other applications of tensor methods
Scientific field
• Environmental monitoring
• Sensory analysis
• Process monitoring
• Fermentation
• Cell phone audio quality
• Wireless communication
• Metabolomics
• Proteomics
• Cancer diagnostics
• Anthropometry
• ...
[Figure: plot with 20 labelled regions: 1, N. America; 2, L. America (Indian); 3, L. America; 4, N. Europe; 5, C. Europe; 6, E. Europe; 7, S.E. Europe; 8, France; 9, Iberian Peninsula; 10, N. Africa; 11, W. Africa; 12, S.E. Africa; 13, Near East; 14, N. India; 15, S. India; 16, N. Asia; 17, S. China; 18, S.E. Asia; 19, Australia; 20, Korea/Japan.]
Variable selection
VIS/NIR spectra of 61 beers
[Figure: absorbance versus wavelength, 400 nm to 2300 nm.]
[Figure: raw data, 400-2200, with regions annotated "Not relevant but highly systematic", "The good part" and "Just crap! Random noise"; second panel: actual modelled data - centered.]
[Figure: regression vector PLS and predictions test set PLS.]
‘Classical’ regression, in this case partial least squares (PLS). Good!
And can be optimized by chemical interpretation
But this is not always the case
[Figure: regression vectors and test set predictions for LASSO and PLS.]
Lasso. Weird stuff! Important area represented by two variables.
Little support:
• Fragile
• Non-robust
• Poor outlier ability
• Interpretation low
[Figure: regression vectors and test set predictions for LASSO, PLS and SR.]
‘Chemometric’ variable selection: very nice!
Variable selection by selectivity ratios (SR), but others would do the job as well
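The slides do not spell out how the selectivity ratio is computed. One common formulation, assumed here, uses target projection: project the data onto the normalized regression vector, then take, for each variable, the ratio of variance explained by that single component to residual variance. In this sketch the function name `selectivity_ratio` is hypothetical, the data are simulated, and a ridge regression vector stands in for the PLS regression vector to keep the example self-contained.

```python
import numpy as np


def selectivity_ratio(X, b):
    """Per-variable explained/residual variance ratio after target projection on b."""
    Xc = X - X.mean(axis=0)
    w = b / np.linalg.norm(b)      # normalized regression vector
    t = Xc @ w                     # target-projected scores
    p = Xc.T @ t / (t @ t)         # target-projected loadings
    X_tp = np.outer(t, p)          # part of X explained by the target component
    explained = np.sum(X_tp ** 2, axis=0)
    residual = np.sum((Xc - X_tp) ** 2, axis=0)
    return explained / residual


# Simulated data (assumption): only variables 10..14 carry the signal related to y
rng = np.random.default_rng(2)
n, m = 40, 30
signal = rng.standard_normal(n)
X = 0.1 * rng.standard_normal((n, m))
X[:, 10:15] += signal[:, None]
y = signal + 0.05 * rng.standard_normal(n)

# Stand-in regression vector: ridge regression instead of PLS (assumption)
Xc = X - X.mean(axis=0)
yc = y - y.mean()
b = np.linalg.solve(Xc.T @ Xc + 1.0 * np.eye(m), Xc.T @ yc)

sr = selectivity_ratio(X, b)
selected = np.argsort(sr)[-5:]     # indices of the five highest selectivity ratios
print(sorted(selected.tolist()))
```

Unlike the LASSO solution, which may keep only one or two variables from a relevant region, ranking by a ratio like this tends to keep the whole informative band, which is easier to interpret chemically.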
Nonnegativity
Classical papers related to NMF
Lawton & Sylvestre. Self modeling curve resolution. Technometrics 13:617-633, 1971.
Lawson & Hanson. Solving least squares problems, Englewood Cliffs: Prentice-Hall, Inc, 1974.
In 1999 Lee and Seung wrote a Nature paper on NMF (non-negative matrix factorization).
However, NMF has existed for much more than 30 years under the name multivariate curve resolution.
Some facts worth noting
• NMF is not generally unique: rotational freedom
• Conditions for uniqueness exist
• When unique, the solution is often shaky
• NMF is (very) sensitive to starting values
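The rotational freedom of NMF can be shown with a tiny hand-made example (the numbers are arbitrary assumptions): two different pairs of nonnegative factors reproduce exactly the same data matrix, so the data alone cannot tell them apart.

```python
import numpy as np

# A nonnegative factorization X = W H (values chosen by hand for illustration)
W = np.array([[1.0, 0.0],
              [0.5, 1.0],
              [0.0, 2.0]])
H = np.array([[2.0, 3.0, 4.0],
              [1.0, 1.0, 1.0]])
X = W @ H

# An invertible, non-trivial transformation T (not just scaling or permutation)
T = np.array([[1.0, 1.0],
              [0.0, 1.0]])
W2 = W @ T
H2 = np.linalg.inv(T) @ H

same_fit = np.allclose(W2 @ H2, X)
still_nonnegative = bool((W2 >= 0).all() and (H2 >= 0).all())
different_factors = not np.allclose(W, W2)

print(same_fit, still_nonnegative, different_factors)
```

Here the alternative factors remain nonnegative because the first row of H dominates the second elementwise; when no such transformation keeps both factors nonnegative, the solution is unique.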
Dealing with missing data
No missing
Ex.: standard PCA loss function
$$\|X - TP'\|^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \left( x_{ij} - \sum_{f=1}^{F} t_{if} p_{jf} \right)^2$$
I.e., a summation of squared errors over all elements of X
If missing
Only fit the model to the data that exist
I.e., fit to the loss function
$$\sum_{i=1}^{I} \sum_{j=1}^{J} \left( w_{ij} \left( x_{ij} - \sum_{f=1}^{F} t_{if} p_{jf} \right) \right)^2$$
where $w_{ij}$ is zero if $x_{ij}$ is missing and one otherwise
How can that loss function be optimized?
Method 1: use weighted least squares regression
Method 2: use imputation (expectation maximization)
1. Put numbers in missing elements
2. Fit model to these ‘wrong’ data (Ex: M = TP’ in PCA)
3. Replace missing elements with model guess (Ex: xij = Mij in PCA)
4. Go to step 2 until convergence
Both methods give the same result. Method 2 is easy to implement; Method 1 is sometimes faster, but more memory-demanding
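The four steps of Method 2 can be sketched for PCA as follows. This is a minimal illustration, not the speaker's m-files: the function name `pca_impute`, the choice of column means as starting values, the uncentered truncated SVD used as the "PCA" fit, and the stopping rule are all assumptions.

```python
import numpy as np


def pca_impute(X, n_components, n_iter=500, tol=1e-10):
    """Fit M = T P' to data containing NaNs by iterative imputation."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Step 1: put numbers in the missing elements (assumption: column means)
    X_filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        # Step 2: fit the model to the filled-in data (truncated SVD, no centering)
        U, s, Vt = np.linalg.svd(X_filled, full_matrices=False)
        T = U[:, :n_components] * s[:n_components]
        P = Vt[:n_components].T
        M = T @ P.T
        # Step 3: replace only the missing elements with the model guess
        X_new = np.where(missing, M, X)
        change = np.linalg.norm(X_new - X_filled)
        X_filled = X_new
        # Step 4: repeat from step 2 until the imputed values stop changing
        if change <= tol * np.linalg.norm(X_filled):
            break
    return T, P, X_filled


# Simulated noise-free rank-2 data with ten missing entries (illustrative only)
rng = np.random.default_rng(3)
X_true = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 8))
X_obs = X_true.copy()
rows = np.arange(0, 20, 2)
X_obs[rows, rows % 8] = np.nan
mask = np.isnan(X_obs)

T, P, X_hat = pca_impute(X_obs, n_components=2)
print(np.abs(X_hat[mask] - X_true[mask]).max())
```

Observed elements are never overwritten, so at convergence the model has effectively been fitted only to the data that exist, which is what the weighted loss function expresses.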
Some concluding remarks
Tensor models provide
• Mathematical chromatography (real blind source separation)
• Huge noise reduction
• Intuitive models (chemically)
• Better handling of correlations
• Robustness
• …
But you need to know your data well, or be lucky
Much needed
• Better algorithms
• Better statistical diagnostics
• Better software
Papers, m-files, courses, database of references, data sets, spectral libraries etc.
www.models.life.ku.dk
Rasmus Bro