Computing and Statistical Data Analysis Stat 5 ...cowan/stat/2013/stat_5.pdf · G. Cowan Computing...
G. Cowan Computing and Statistical Data Analysis / Stat 5 1
Computing and Statistical Data Analysis Stat 5: Multivariate Methods
London Postgraduate Lectures on Particle Physics;
University of London MSci course PH4515
Glen Cowan
Physics Department
Royal Holloway, University of London
www.pp.rhul.ac.uk/~cowan
Course web page: www.pp.rhul.ac.uk/~cowan/stat_course.html
Finding an optimal decision boundary
In particle physics one usually starts by making simple “cuts”:
$x_i < c_i$, $x_j < c_j$
Maybe later try some other type of decision boundary.
[Figures: for each type of decision boundary, the regions of the plane assigned to H0 and H1]
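As a rough illustration of the cut-based selection, the sketch below applies two rectangular cuts to toy data and reports the fraction of events from each hypothesis that pass. The Gaussian samples, the cut values and the names `passes_cuts`, `h0_events` and `h1_events` are assumptions made for this example and do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Toy data (assumption): events under each hypothesis are 2D Gaussian blobs
# in the variables (x_i, x_j).
h0_events = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(10_000, 2))
h1_events = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(10_000, 2))

def passes_cuts(x, c_i, c_j):
    """Boolean mask that is True where x_i < c_i and x_j < c_j."""
    return (x[:, 0] < c_i) & (x[:, 1] < c_j)

c_i, c_j = 1.0, 1.0  # hypothetical cut values

# Fraction of events of each type that fall inside the rectangular region.
frac_h0 = passes_cuts(h0_events, c_i, c_j).mean()
frac_h1 = passes_cuts(h1_events, c_i, c_j).mean()

print(f"fraction of H0 events passing cuts: {frac_h0:.3f}")
print(f"fraction of H1 events passing cuts: {frac_h1:.3f}")
```

Scanning `c_i` and `c_j` and comparing the two fractions is one simple way to see how much separation cuts alone can give before trying a more general boundary.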
Multivariate methods
Many new (and some old) methods:
Fisher discriminant
Neural networks
Kernel density methods
Support Vector Machines
Decision trees
Boosting
Bagging
New software for HEP, e.g.,
TMVA, Höcker, Stelzer, Tegenfeldt, Voss, Voss, physics/0703039
StatPatternRecognition, I. Narsky, physics/0507143
Overtraining
[Figures: decision boundary shown with the training sample and with an independent validation sample]
If the decision boundary is too flexible it will conform too closely to the training points → overtraining.
Monitor by applying the classifier to an independent validation sample.
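The effect can be seen in a small toy study. The sketch below uses a k-nearest-neighbour classifier as a stand-in for a flexible decision boundary (it is not the classifier used in the lecture); the Gaussian toy data and the helper names `make_sample`, `knn_predict` and `error_rate` are assumptions for illustration. With k = 1 the boundary follows every training point, so the training error is zero while the error on the independent validation sample is not.

```python
import numpy as np

rng = np.random.default_rng(seed=2)

def make_sample(n):
    """Toy two-class sample: overlapping 2D Gaussians (illustrative assumption)."""
    x0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n, 2))  # class 0
    x1 = rng.normal(loc=[1.5, 1.5], scale=1.0, size=(n, 2))  # class 1
    x = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])
    return x, y

def knn_predict(x_train, y_train, x, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    d2 = ((x[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest].mean(axis=1)
    return (votes > 0.5).astype(int)

def error_rate(x_train, y_train, x, y, k):
    """Fraction of events in (x, y) that the classifier gets wrong."""
    return float(np.mean(knn_predict(x_train, y_train, x, k) != y))

x_train, y_train = make_sample(200)
x_valid, y_valid = make_sample(200)  # independent validation sample

errors = {}
for k in (1, 25):  # k = 1: very flexible boundary; k = 25: smoother boundary
    err_train = error_rate(x_train, y_train, x_train, y_train, k)
    err_valid = error_rate(x_train, y_train, x_valid, y_valid, k)
    errors[k] = (err_train, err_valid)
    print(f"k={k:2d}  training error={err_train:.3f}  validation error={err_valid:.3f}")
```

A large gap between the two error rates for a given setting is the signature of overtraining; a smoother boundary typically narrows it.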
Choose classifier that minimizes error function for validation sample.
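A minimal sketch of this selection step, under stated assumptions: the family of classifiers is a set of polynomials of increasing degree fitted by least squares to targets 0 and 1, and the error function is taken to be the mean squared difference between output and target. The toy data and the names `make_sample` and `error_function` are hypothetical; the lecture does not specify this setup.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(seed=3)

def make_sample(n):
    """Toy 1D sample with targets t = 0 or 1 (illustrative assumption)."""
    x = np.concatenate([rng.normal(-1.0, 1.0, n), rng.normal(1.0, 1.0, n)])
    t = np.concatenate([np.zeros(n), np.ones(n)])
    return x, t

def error_function(model, x, t):
    """Mean squared difference between classifier output and target."""
    return float(np.mean((model(x) - t) ** 2))

x_train, t_train = make_sample(100)
x_valid, t_valid = make_sample(100)  # independent validation sample

degrees = list(range(1, 13))
train_errors, valid_errors = [], []
for deg in degrees:
    # Least-squares fit to the training sample only.
    model = Polynomial.fit(x_train, t_train, deg)
    train_errors.append(error_function(model, x_train, t_train))
    valid_errors.append(error_function(model, x_valid, t_valid))

# Choose the degree whose error function is smallest on the validation sample.
best_degree = degrees[int(np.argmin(valid_errors))]
print("degree chosen from validation sample:", best_degree)
```

The training error can only decrease as the degree grows, which is why it cannot be used for the choice; the validation error eventually rises again once the fit starts following fluctuations in the training points.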
Neural network example from LEP II
Signal: $e^+e^- \to W^+W^-$ (often 4 well separated hadron jets)
Background: $e^+e^- \to q\bar{q}gg$ (4 less well separated hadron jets)
[Figure: distributions of the input variables] Input variables based on jet structure, event shape, ...; none by itself gives much separation.
[Figure: neural network output]
(Garrido, Juste and Martinez, ALEPH 96-144)
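Kernel-based PDE (KDE, Parzen window)
Consider $d$ dimensions, $N$ training events, $\vec{x}_1, \ldots, \vec{x}_N$; estimate $f(\vec{x})$ with
$$\hat{f}(\vec{x}) = \frac{1}{N h^d} \sum_{i=1}^{N} K\!\left(\frac{\vec{x} - \vec{x}_i}{h}\right)$$
Use e.g. Gaussian kernel:
$$K(\vec{x}) = \frac{1}{(2\pi)^{d/2}} \, e^{-|\vec{x}|^2/2}$$
where $h$ is the kernel bandwidth (smoothing parameter).
Need to sum $N$ terms to evaluate the function (slow); faster algorithms only count events in the vicinity of $\vec{x}$ (k-nearest neighbor, range search).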
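The brute-force evaluation can be sketched in a few lines. This assumes the standard kernel estimate, f_hat(x) = 1/(N h^d) Σ_i K((x - x_i)/h), with a Gaussian kernel K(u) = (2π)^(-d/2) exp(-|u|²/2); the function name `gaussian_kde_estimate`, the toy events and the bandwidth value are assumptions made for the example.

```python
import numpy as np

def gaussian_kde_estimate(x, events, h):
    """Evaluate f_hat(x) = 1/(N h^d) * sum_i K((x - x_i)/h), Gaussian kernel K.

    x      : array of shape (d,), point where the density is estimated
    events : array of shape (N, d), training events x_1, ..., x_N
    h      : kernel bandwidth (smoothing parameter)
    """
    events = np.asarray(events, dtype=float)
    n, d = events.shape
    u = (np.asarray(x, dtype=float) - events) / h          # shape (N, d)
    kernel = np.exp(-0.5 * np.sum(u**2, axis=1)) / (2.0 * np.pi) ** (d / 2.0)
    return float(kernel.sum() / (n * h**d))                # sum over all N terms

# Toy usage (assumption): 5000 events from a 2D standard Gaussian.
rng = np.random.default_rng(seed=4)
events = rng.normal(size=(5000, 2))
f_hat_origin = gaussian_kde_estimate([0.0, 0.0], events, h=0.3)
print(f"estimated density at the origin: {f_hat_origin:.3f}")
```

Every call loops over all N events, which is the cost noted above; a larger `h` gives a smoother but more biased estimate, a smaller `h` a noisier one.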