Performance of Statistical Learning Methods

Jens Zimmermann, MPI für Physik München, ACAT 2005 Zeuthen

Jens Zimmermann, zimmerm@mppmu.mpg.de

Max-Planck-Institut für Physik, München

Forschungszentrum Jülich GmbH

• Performance Examples from Astrophysics
• Performance vs. Control
• H1 Neural Network Trigger
• Controlling Statistical Learning Methods: Overtraining, Efficiencies, Uncertainties
• Comparison of Learning Methods
• Artificial Intelligence
• Higgs Parity Measurement at the ILC

Performance of Statistical Learning Methods: MAGIC

Significance and number of excess events scale the uncertainties in the flux calculation.

Performance of Statistical Learning Methods: XEUS

Pileup vs. Single photon

classical algorithm „XMM“

Pileups not recognised by XMM but recognised by NN

Control of Statistical Learning Methods

There may be many different successful applications of statistical learning methods.

There may be great performance improvements compared to classical methods.

This does not impress people who fear that statistical learning methods are not well under control.

First talk: Understanding and Interpretation
Now: Control and correct Evaluation

The Neural Network Trigger in the H1 Experiment

L1 2.3 µs

L2 20 µs

L4 100 ms

10 MHz

500 Hz

Trigger Scheme

H1 at HERA ep Collider, DESY

„L2NN“

Each neural network on L2 verifies a specific L1 sub-trigger.

Triggering Deeply Virtual Compton Scattering

L1 sub-trigger 41 triggers DVCS by requiring
• Significant energy deposition in SpaCal
• Within time window

L2 neural network additional information
• Liquid argon energies
• SpaCal centre energies
• z-vertex information

Triggering with 4 Hz

Must be reduced to 0.8 Hz

Theory

Signal (DVCS)

Background (upstream beam-gas interaction)

Determine the correct efficiency

50% training set, 25% test set, 25% selection set

signal should peak at 1

background should peak at 0

Tune training parameters to
• avoid overtraining
• optimise performance
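The 50/25/25 split above can be sketched as follows. This is a minimal illustration, not the code used for the H1 trigger: the function name `split_dataset` and the fixed random seed are assumptions, and it assumes the selection set is the part used for tuning the training parameters, so that the test set stays untouched until the final efficiency is determined.

```python
import random

def split_dataset(events, seed=0):
    """Shuffle events and split them into 50% training, 25% selection, 25% test."""
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)  # fixed seed only for reproducibility
    n = len(shuffled)
    n_train = n // 2
    n_select = n // 4
    training = shuffled[:n_train]
    selection = shuffled[n_train:n_train + n_select]
    test = shuffled[n_train + n_select:]  # remainder, never used for tuning
    return training, selection, test

training, selection, test = split_dataset(range(1000))
print(len(training), len(selection), len(test))  # 500 250 250
```

Keeping the three parts disjoint is the point of the exercise: an efficiency quoted on events that influenced the training or the tuning would be biased.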

Determine the Correct Efficiency

training set

test set

Check Statistical Uncertainties

propagation of uncertainties

efficiency

statistical uncertainty of the efficiency

e.g. 80% ± 4% for 80 of 100
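The formula itself did not survive in this transcript. The quoted example is reproduced by the usual binomial uncertainty, σ = sqrt(ε(1 − ε)/n) with ε = k/n, so the sketch below assumes that is what was shown. The function name is hypothetical.

```python
import math

def efficiency_with_uncertainty(n_pass, n_total):
    """Return the efficiency k/n and its binomial statistical uncertainty."""
    eff = n_pass / n_total
    sigma = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, sigma

eff, sigma = efficiency_with_uncertainty(80, 100)
print(f"{eff:.0%} ± {sigma:.0%}")  # 80% ± 4%
```

Note that this simple expression goes to zero for efficiencies of exactly 0% or 100%, where it should not be taken at face value.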

Check Systematic Uncertainties

There is only a propagation of systematic uncertainties of the inputs

Assuming
$x_1$ with absolute error $\sigma_1$
$x_2$ with relative error $\sigma_2 = 5\%$
$x_3$ with relative error $\sigma_3 = 10\%$
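One common way to propagate input uncertainties through a trained classifier is to shift each input up and down by its uncertainty, re-evaluate the efficiency, and combine the shifts in quadrature. The sketch below illustrates that idea under several assumptions that are not taken from the slides: `toy_classifier` is a hypothetical stand-in for a trained network, the signal sample is randomly generated, the value 0.1 for the absolute error on x1 is a placeholder (the slides give no number), and adding in quadrature presumes uncorrelated inputs.

```python
import math
import random

def toy_classifier(x1, x2, x3):
    """Hypothetical stand-in for a trained network with output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(0.8 * x1 + 0.5 * x2 + 0.3 * x3 - 1.0)))

def efficiency(events, cut=0.5):
    """Fraction of events whose classifier output exceeds the cut."""
    return sum(1 for e in events if toy_classifier(*e) > cut) / len(events)

def shifted(events, index, sign, abs_err=None, rel_err=None):
    """Copy of events with one input moved by its absolute or relative error."""
    out = []
    for e in events:
        e = list(e)
        delta = abs_err if abs_err is not None else rel_err * abs(e[index])
        e[index] += sign * delta
        out.append(tuple(e))
    return out

rng = random.Random(1)
signal = [(rng.gauss(1, 0.5), rng.gauss(2, 0.5), rng.gauss(1, 0.5))
          for _ in range(2000)]  # toy events, not the DVCS dataset

errors = [
    {"abs_err": 0.1},   # x1: placeholder value, not given on the slides
    {"rel_err": 0.05},  # x2: 5% relative error
    {"rel_err": 0.10},  # x3: 10% relative error
]

nominal = efficiency(signal)
total_sq = 0.0
for i, err in enumerate(errors):
    up = efficiency(shifted(signal, i, +1, **err)) - nominal
    down = efficiency(shifted(signal, i, -1, **err)) - nominal
    total_sq += max(abs(up), abs(down)) ** 2  # take the larger shift per input
syst = math.sqrt(total_sq)
print(f"efficiency = {nominal:.3f}, systematic uncertainty = {syst:.3f}")
```

The classifier itself contributes no extra systematic term here, which is the statement on the slide: only the uncertainties of the inputs are propagated.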

Check Systematic Uncertainties

example: DVCS dataset

Comparison of Hypotheses

efficiencies for fixed rejection of 80%

NN: 96.5% vs. SVM: 95.7%
Statistically significant?

Build 95% confidence interval!

σ is the variation over different parts of the test set
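The exact construction of the interval is not recoverable from the transcript. One possible reading is sketched below: split the test set into k parts, compute the efficiency difference in each part, and build an approximate 95% interval from the spread of those differences. The function name, k = 10, the normal-approximation factor 1.96 (a Student-t factor would be somewhat larger for small k), and the toy pass/fail flags are all assumptions.

```python
import math
import random
import statistics

def efficiency_difference_interval(pass_a, pass_b, k=10, z=1.96):
    """Mean efficiency difference over k test-set parts with an approximate CI."""
    size = len(pass_a) // k
    diffs = []
    for i in range(k):
        part = slice(i * size, (i + 1) * size)
        eff_a = sum(pass_a[part]) / size
        eff_b = sum(pass_b[part]) / size
        diffs.append(eff_a - eff_b)
    mean = statistics.mean(diffs)
    sigma = statistics.stdev(diffs)          # variation over the parts
    half_width = z * sigma / math.sqrt(k)    # uncertainty of the mean difference
    return mean, (mean - half_width, mean + half_width)

# Toy per-event pass flags drawn at the quoted efficiencies. On real data both
# classifiers are evaluated on the same events, so the flags are correlated.
rng = random.Random(2)
pass_nn = [rng.random() < 0.965 for _ in range(5000)]
pass_svm = [rng.random() < 0.957 for _ in range(5000)]

mean_diff, (low, high) = efficiency_difference_interval(pass_nn, pass_svm)
print(f"difference = {mean_diff:.4f}, 95% CI = [{low:.4f}, {high:.4f}]")
```

If the interval excludes zero, the difference would be called statistically significant at that level.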

Comparison of Learning Methods

Cross-Validation: Divide dataset into k parts, train k classifiers by using each part once as test set.

σ is the variation over the different trainings

Compare performances over different training sets!
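The cross-validation procedure can be sketched in a few lines. The example is illustrative only: the "classifier" is a hypothetical one-dimensional threshold placed halfway between the class means, the data are randomly generated, and the score is plain accuracy rather than the efficiency at fixed rejection used in the talk.

```python
import random
import statistics

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs; each part is the test set exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, folds[i]

def train_threshold(xs, ys):
    """Toy 'training': place the cut halfway between the two class means."""
    mean_sig = statistics.mean(x for x, y in zip(xs, ys) if y == 1)
    mean_bkg = statistics.mean(x for x, y in zip(xs, ys) if y == 0)
    return 0.5 * (mean_sig + mean_bkg)

def cross_validate(xs, ys, k=5):
    """Return mean and spread of the test accuracy over the k trainings."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        cut = train_threshold([xs[j] for j in train_idx],
                              [ys[j] for j in train_idx])
        correct = sum((xs[j] > cut) == (ys[j] == 1) for j in test_idx)
        scores.append(correct / len(test_idx))
    return statistics.mean(scores), statistics.stdev(scores)

rng = random.Random(3)
ys = [rng.randint(0, 1) for _ in range(1000)]            # 1 = signal, 0 = background
xs = [rng.gauss(1.0 if y else -1.0, 1.0) for y in ys]    # toy discriminating variable

mean_score, spread = cross_validate(xs, ys, k=5)
print(f"accuracy = {mean_score:.3f} ± {spread:.3f} (spread over 5 trainings)")
```

Running the same folds for two learning methods gives two sets of k scores whose spread reflects the variation over different training sets, not only over different test events.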

efficiencies for fixed rejection of 60%

two events with low NN-output

Artificial Intelligence

overlay cosmic

cosmic

H1-L2NN: Triggering Charged Current

Artificial Intelligence

background found in J/ψ selection

H1-L2NN: Triggering J/ψ

Higgs Parity Measurement at the ILC

Parity induces favourite -configuration:
• anti-parallel for H
• parallel for A

H/A + -

= 5.09

Significance is amplitude divided by its uncertainty

Significance measured for 500 events and averaged over 600 pseudo-experiments

Classical approach: fit angular distribution

Higgs Parity Measurement at the ILC

Statistical learning approach: direct discrimination

trained towards 0 trained towards 1

= 6.26

Significance is difference of measured means divided by its uncertainty

Significance measured for 500 events and averaged over 600 pseudo-experiments
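The significance definition above can be illustrated with toy pseudo-experiments. The sketch assumes the two means are the average network outputs under the H and A hypotheses and that their uncertainty is the quadrature sum of the two standard errors. The Gaussian output distributions and their parameters are invented for illustration and will not reproduce the quoted 6.26.

```python
import math
import random

def mean_and_sem(values):
    """Return the sample mean and the standard error of the mean."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)

def significance(sample_a, sample_h):
    """Difference of the measured means divided by its uncertainty."""
    mean_a, sem_a = mean_and_sem(sample_a)
    mean_h, sem_h = mean_and_sem(sample_h)
    return (mean_a - mean_h) / math.sqrt(sem_a ** 2 + sem_h ** 2)

def average_significance(n_events=500, n_experiments=600, seed=0):
    """Average the significance over many pseudo-experiments."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_experiments):
        # Hypothetical network outputs: H trained towards 0, A towards 1.
        out_h = [rng.gauss(0.45, 0.25) for _ in range(n_events)]
        out_a = [rng.gauss(0.55, 0.25) for _ in range(n_events)]
        values.append(significance(out_a, out_h))
    return sum(values) / len(values)

print(f"average significance = {average_significance():.2f}")
```

Averaging over pseudo-experiments removes the fluctuation of a single 500-event sample, so the classical fit and the learning method are compared on their expected significance.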

Conclusion

Statistical learning methods are successful in many applications in high energy physics and astrophysics.

Significant performance improvements compared to classical algorithms.

Statistical learning methods are well under control:
- efficiencies can be determined
- uncertainties can be calculated.

Comparison of learning methods reveals statistically significant differences.

Statistical learning methods sometimes show more artificial intelligence than expected.
