Model-based Classification in Food Authenticity Studies D. Toher 1,2, G. Downey 1 and T.B. Murphy 2...
Date posted: 22-Dec-2015
Model-based Classification in Food Authenticity Studies
D. Toher 1,2, G. Downey 1 and T.B. Murphy 2
Presented by: Deirdre Toher
1 Ashtown Food Research Centre, Teagasc (formerly The National Food Centre), Dublin 15
2 Dept of Statistics, School of Computer Science and Statistics, Trinity College Dublin, Dublin 2
Outline
• Food authenticity
• Spectroscopic data
• Current mathematical methods
• Proposed alternative
  – Dimension reduction
  – Model-based clustering
  – Updating
• Example near-infrared data with results
Food Authenticity – what and why?
• Detecting when foods are not what they are claimed to be
• Tampering/adulteration, mislabelling
• Economic fraud worth millions of US dollars globally
• Promote quality products
• Build consumer trust
Food Authenticity – how?
• Near infrared spectroscopy
  – Non-invasive
  – Relatively inexpensive
• Multivariate Mathematics
  – Partial Least Squares Regression
  – Factorial Discriminant Analysis
  – Model-based Clustering
• Other methods available (sp..)
Spectroscopic Data
• Near infrared transflectance spectroscopy
  – High dimensional data
  – Range 1100-2498 nm, reading every 2 nm
  – 700 values for each sample
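The count of 700 values follows directly from the stated range and step. A minimal sketch of that arithmetic (the variable names are illustrative, not from the slides):

```python
# Wavelength grid implied by the slide: 1100 nm to 2498 nm, one reading every 2 nm.
START_NM, STOP_NM, STEP_NM = 1100, 2498, 2

# range() excludes its end point, so step one increment past the last wavelength.
wavelengths = list(range(START_NM, STOP_NM + STEP_NM, STEP_NM))

n_values = len(wavelengths)
print(n_values, wavelengths[0], wavelengths[-1])
```

Each sample is therefore a vector of 700 readings, which is the dimensionality the later slides set out to reduce.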
Current Mathematical Methods
• Discriminant Partial Least Squares Regression
• Factorial Discriminant Analysis
Problem?
  – Limited to “two-group” classification problems
  – No quantification of certainty
Proposed Alternative
Model-based clustering
  – Expansion of discriminant analysis
  – Allows clusters to vary in shape and size
  – Gives probability of a sample being in each cluster/group
  – Can classify situations with more than two groupings
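As a rough illustration of how group-membership probabilities arise, the sketch below applies Bayes' rule to three one-dimensional Gaussian groups with different spreads. The group names and all parameter values are hypothetical, and the one-dimensional setting is a simplification of the multivariate models used in practice.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_probabilities(x, groups):
    """Bayes' rule: groups maps name -> (mixing proportion, mean, sd)."""
    weighted = {name: p * normal_pdf(x, m, s) for name, (p, m, s) in groups.items()}
    total = sum(weighted.values())
    return {name: w / total for name, w in weighted.items()}

# Hypothetical parameters for illustration only; each group has its own spread.
groups = {
    "A": (0.4, 0.0, 1.0),
    "B": (0.4, 3.0, 0.5),
    "C": (0.2, 6.0, 2.0),
}

probs = posterior_probabilities(1.0, groups)
best = max(probs, key=probs.get)
print(best, probs)
```

The output is a probability for every group rather than a bare label, and nothing in the rule restricts it to two groups.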
Possible Cluster Shapes
The Dimensionality Problem
• Model-based clustering requires dimension reduction
  – for efficient computation
  – to prevent singular covariance matrices
• Use wavelet analysis with thresholding
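The slides do not say which wavelet or thresholding rule was used. As a minimal sketch, assuming a Haar wavelet with hard thresholding and a signal whose length is a power of two (a 700-point spectrum would need padding or a different transform), the idea looks like this:

```python
import math

def haar_dwt(signal):
    """Full orthonormal Haar decomposition; length must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        # Coarser detail coefficients are placed in front of finer ones.
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details

def hard_threshold(coeffs, threshold):
    """Zero every coefficient whose magnitude does not exceed the threshold."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]

# Hypothetical toy "spectrum": two plateaus with a small ripple on top.
spectrum = [2.0, 2.1, 2.0, 2.1, 5.0, 5.1, 5.0, 5.1]
coeffs = haar_dwt(spectrum)
kept = hard_threshold(coeffs, threshold=0.2)
n_kept = sum(1 for c in kept if c != 0.0)
print(n_kept, "of", len(coeffs), "coefficients retained")
```

Only the coefficients that survive thresholding are passed on as variables, so the covariance matrices are estimated in far fewer dimensions than the raw spectrum.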
EM Algorithm & Updating
• EM algorithm
  – E-step: computes the expected value of the log-likelihood function
  – M-step: maximises that expected value
  – commonly used in statistics for estimating missing values
• Updating
  – uses previous estimates of labels as a starting point for iteration
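The slides do not spell out the updating procedure. One common reading, sketched here under that assumption, treats the unknown test labels as missing values: labelled samples keep their labels, the initial fit on the labelled data gives first label estimates for the unlabelled samples, and EM then iterates from that starting point. The one-dimensional data, two groups and function names are all hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def m_step(xs, resp, n_groups):
    """Weighted estimates of (proportion, mean, sd) for each group."""
    params = []
    for g in range(n_groups):
        w = [r[g] for r in resp]
        total = sum(w)
        mean = sum(wi * x for wi, x in zip(w, xs)) / total
        var = sum(wi * (x - mean) ** 2 for wi, x in zip(w, xs)) / total
        params.append((total / len(xs), mean, math.sqrt(max(var, 1e-6))))
    return params

def e_step(xs, params):
    """Posterior group probabilities for each observation."""
    resp = []
    for x in xs:
        dens = [p * normal_pdf(x, m, s) for p, m, s in params]
        total = sum(dens)
        resp.append([d / total for d in dens])
    return resp

def em_with_updating(x_lab, y_lab, x_unlab, n_groups=2, n_iter=50):
    # Labelled samples have fixed 0/1 memberships throughout.
    hard = [[1.0 if y == g else 0.0 for g in range(n_groups)] for y in y_lab]
    params = m_step(x_lab, hard, n_groups)    # fit on labelled data only
    resp_unlab = e_step(x_unlab, params)      # initial label estimates
    for _ in range(n_iter):
        # Update parameters using labelled and unlabelled samples together.
        params = m_step(x_lab + x_unlab, hard + resp_unlab, n_groups)
        resp_unlab = e_step(x_unlab, params)
    return params, resp_unlab

# Hypothetical toy data: group 0 near 0, group 1 near 5.
x_lab, y_lab = [0.0, 0.4, -0.3, 5.0, 5.5, 4.6], [0, 0, 0, 1, 1, 1]
x_unlab = [0.2, -0.1, 0.5, 4.8, 5.2, 5.1, 0.1]
params, resp_unlab = em_with_updating(x_lab, y_lab, x_unlab)
predicted = [max(range(2), key=r.__getitem__) for r in resp_unlab]
print(predicted)
```

Because the unlabelled samples contribute to the parameter estimates, this kind of scheme can help most when the labelled training set is small.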
Example: Honey Adulteration
• Irish honey extended with
  – fructose:glucose mixtures
  – fully inverted beet syrup
  – high fructose corn syrup
• Total of 478 spectra:
  – 157 pure and 321 adulterated
    • 225 with fructose:glucose mixtures
    • 56 with fully inverted beet syrup
    • 40 with high fructose corn syrup
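A quick consistency check on the sample counts, with the class proportions they imply (the dictionary keys simply restate the slide's categories):

```python
# Counts as given on the slide.
adulterated_counts = {
    "fructose:glucose mixtures": 225,
    "fully inverted beet syrup": 56,
    "high fructose corn syrup": 40,
}
n_pure = 157

n_adulterated = sum(adulterated_counts.values())
n_total = n_pure + n_adulterated

# Share of each class in the full data set.
proportions = {"pure": n_pure / n_total}
proportions.update({k: v / n_total for k, v in adulterated_counts.items()})

print(n_adulterated, n_total)
for name, share in proportions.items():
    print(f"{name}: {share:.1%}")
```

The classes are unbalanced, which is why the proportions used in the training set matter in the results that follow.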
Classification Achieved
Classification rates achieved on test set data, with correct proportions of each type of adulterant in the training set, for the “pure or adulterated” question.

| Training / Test | EM | EM & Updating |
|---|---|---|
| 50% / 50% | 94.72% (1.12) | 94.43% (1.10) |
| 25% / 75% | 93.22% (1.08) | 93.05% (1.03) |
| 10% / 90% | 90.82% (1.76) | 92.22% (1.11) |
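Keeping the correct proportions of each type of adulterant in the training set amounts to a stratified split. The slides do not describe how the splits were drawn, so the following is only a sketch: the function name, the seed and the integer rounding rule are assumptions.

```python
import random

# Class sizes taken from the honey example.
counts = {"pure": 157, "fructose:glucose": 225, "beet syrup": 56, "corn syrup": 40}

def stratified_split(counts, train_percent, seed=0):
    """Draw train_percent of each class (rounded down) into the training set."""
    rng = random.Random(seed)
    train, test = [], []
    start = 0
    for group, n in counts.items():
        indices = list(range(start, start + n))
        rng.shuffle(indices)
        n_train = n * train_percent // 100  # integer arithmetic avoids float rounding
        train += [(i, group) for i in indices[:n_train]]
        test += [(i, group) for i in indices[n_train:]]
        start += n
    return train, test

train, test = stratified_split(counts, train_percent=10)
print(len(train), len(test))
```

Repeating such a split with different seeds would give a spread of classification rates across random splits.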
Classification Achieved
Classification rates achieved on test set data, with correct proportions of pure / adulterated in the training set, for the “pure or adulterated” question.

| Training / Test | EM | EM & Updating |
|---|---|---|
| 50% / 50% | 94.38% (1.16) | 94.11% (0.89) |
| 25% / 75% | 93.50% (1.08) | 93.03% (1.02) |
| 10% / 90% | 90.54% (1.80) | 92.05% (1.09) |
Classification Achieved
Classification rates achieved on test set data using 50% training, 50% test data, with correct proportion of pure / adulterated in the training data set, for the “type of adulteration” question.

| Question | EM | EM & Updating |
|---|---|---|
| Pure or adulterated? | 91.09% (1.40) | 90.64% (1.36) |
| Type of adulteration | 86.23% (1.20) | 84.12% (1.67) |
Classification Achieved
Classification rates achieved on test set data using 50% training, 50% test data, with correct proportions of each type of adulterant in the training set, for the “type of adulteration” question.

| Question | EM | EM & Updating |
|---|---|---|
| Pure or adulterated? | 89.41% (1.76) | 88.61% (1.82) |
| Type of adulteration | 85.70% (1.96) | 83.57% (2.23) |
Probability v Accurate Classification
Probability of group membership - by colour (black being pure, red being adulterated)
Conclusions
• EM algorithm gives a method of predicting group membership
• Updating procedures effective with small training sets
• Quantifying certainty
• Allows cost of misclassification to be easily incorporated into modelling
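One way the last point could work in practice: once each sample has group-membership probabilities, the decision can minimise expected cost instead of picking the most probable group. The cost values and probabilities below are hypothetical.

```python
def min_expected_cost_decision(posterior, cost):
    """posterior: true group -> probability; cost[decision][true group] -> cost."""
    expected = {
        decision: sum(posterior[group] * c for group, c in row.items())
        for decision, row in cost.items()
    }
    return min(expected, key=expected.get), expected

# Hypothetical costs: passing adulterated honey as pure is five times
# as costly as flagging pure honey as adulterated.
cost = {
    "pure": {"pure": 0.0, "adulterated": 5.0},
    "adulterated": {"pure": 1.0, "adulterated": 0.0},
}

# Hypothetical posterior probabilities for one sample.
posterior = {"pure": 0.7, "adulterated": 0.3}

decision, expected = min_expected_cost_decision(posterior, cost)
print(decision, expected)
```

Here the sample is more likely pure than not, yet the asymmetric costs tip the decision towards flagging it as adulterated.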
Questions?
Funded by:
• Teagasc under the Walsh Fellowship Scheme
• Irish Department of Agriculture & Food (FIRM programme)
• Science Foundation Ireland Basic Research Grant scheme (Grant 04/BR/M0057)