Using Machine Learning to Extract Properties of Systems of Particles
Ellen D. Gulian†,‡, Michael Kordell II‡, Rainer J. Fries‡,§
†Department of Physics, University of Maryland - Baltimore County
‡Cyclotron Institute, Texas A&M University
§Department of Physics and Astronomy, Texas A&M University
Introduction
In high energy and nuclear physics, one is often presented with complex final-state systems of particles that are products of many dynamical processes. With traditional analysis methods, it is hard to determine the specific processes that created these systems. Here, we explore the application of machine learning in analyzing such systems. We first discuss our Python simulations, which create systems of particles. Then, we apply various machine learning algorithms to pseudo-data generated by our model in order to predict features of the systems of particles.
Model
Our Python code creates systems of particles (events) that resemble several key features of real-world experimental data:
• Thermal sampling of particle momentum based on a relativistic Maxwell-Boltzmann distribution given (in natural units) by:
f(p) = exp(−E/T), where E = √(p² + M²) (1)
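The thermal sampling step can be sketched with a simple rejection sampler: draw trial momentum magnitudes uniformly and accept each with probability proportional to the weight p² exp(−E/T), where the p² factor accounts for the phase-space volume of the momentum shell. The function name, parameter defaults, and cutoff below are illustrative, not the actual simulation code.

```python
import numpy as np

def sample_momentum_magnitudes(n, M=0.775, T=0.15, p_max=4.0, rng=None):
    """Rejection-sample n momentum magnitudes p from the relativistic
    Maxwell-Boltzmann weight p^2 * exp(-E/T), E = sqrt(p^2 + M^2).
    All quantities are in GeV (natural units)."""
    rng = rng if rng is not None else np.random.default_rng()
    # Find the weight's maximum on a grid to scale the uniform envelope.
    p_grid = np.linspace(1e-6, p_max, 2000)
    w_max = np.max(p_grid**2 * np.exp(-np.sqrt(p_grid**2 + M**2) / T))
    samples = []
    while len(samples) < n:
        p = rng.uniform(0.0, p_max, size=n)          # trial momenta
        u = rng.uniform(0.0, w_max, size=n)          # uniform envelope
        accepted = p[u < p**2 * np.exp(-np.sqrt(p**2 + M**2) / T)]
        samples.extend(accepted.tolist())
    return np.array(samples[:n])
```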
• Collective flow velocity of the particles in the form of a cylindrical blast wave; the transverse velocity of a particle at transverse distance r from the axis is given by
vT = α0 r/R (2)
where α0 is the boundary velocity and R is the radius of the cylinder.
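Reading the blast-wave profile as linear in the transverse distance r from the axis, so that the velocity reaches the boundary value α0 at r = R, a minimal sketch is (function name and defaults are our own, illustrative choices):

```python
import numpy as np

def transverse_velocity(r, alpha0=0.7, R=1.0):
    """Linear blast-wave profile: v_T(r) = alpha0 * r / R, so the
    transverse velocity grows from 0 on the axis to the boundary
    velocity alpha0 at the cylinder surface r = R (natural units)."""
    r = np.asarray(r, dtype=float)
    return alpha0 * r / R
```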
• Two-body decay of unstable particles (mass M) into daughter particles (massesm1 and m2); momentum of daughters in rest frame of mother is given by
k =√
1/(4M)[(M 2 −m21 −m2
2)2 − 4m2
1m22] (3)
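Equation (3) translates directly into a small helper; the function name and the kinematic check are our own additions:

```python
import math

def decay_momentum(M, m1, m2):
    """Daughter momentum magnitude k in the rest frame of a mother of
    mass M decaying into daughters of masses m1 and m2 (Eq. 3), GeV."""
    arg = (M**2 - m1**2 - m2**2)**2 - 4.0 * m1**2 * m2**2
    if arg < 0:
        raise ValueError("decay kinematically forbidden: M < m1 + m2")
    return math.sqrt(arg) / (2.0 * M)
```

With the default parameters of the example below (M = 0.775 GeV, m1 = m2 = 0.14 GeV), this gives k ≈ 0.36 GeV, the familiar rho-to-two-pion decay momentum.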
Example: 10,000 events with 10 particles each. Default parameters: T = 0.15 GeV, M = 0.775 GeV, m1 = m2 = 0.14 GeV, and α0 = 0.7.
[Figure: Mother Particle Momentum Distribution (with flow vs. without flow); x-axis: mother momentum magnitude (GeV).]
[Figure: Daughter Particle Momentum Distribution; x-axis: daughter momentum magnitude (GeV).]
[Figure: Invariant Mass Spectrum; x-axis: mother mass (GeV).]
Note: By analyzing the candidate mass for all pairs of daughter particles in an event, one can reconstruct the mother mass on top of a background, as seen by the peak in the invariant mass spectrum.
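The reconstruction described in the note can be sketched as follows: treat each daughter pair as coming from a common mother and compute the candidate invariant mass from energy-momentum conservation. The function name and default daughter masses are illustrative.

```python
import numpy as np

def pair_invariant_mass(p1, p2, m1=0.14, m2=0.14):
    """Candidate mother mass (GeV) from two daughter three-momenta:
    M^2 = (E1 + E2)^2 - |p1 + p2|^2, with E_i = sqrt(|p_i|^2 + m_i^2)."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    E1 = np.sqrt(p1 @ p1 + m1**2)
    E2 = np.sqrt(p2 @ p2 + m2**2)
    ptot = p1 + p2
    return float(np.sqrt((E1 + E2)**2 - ptot @ ptot))
```

For a true pair emitted back to back in the mother's rest frame with the decay momentum of Eq. (3), this recovers the mother mass exactly; uncorrelated pairs populate the combinatorial background.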
Results
We analyze the daughter momenta in an event to extract parameters M, T, and α0. Using Python's Scikit-learn library [1], we trained several machine learning algorithms on the data generated by our model. Each algorithm was trained with 14,000 events and tested with 6,000 events. Each event consisted of 10 particles. We focused on four classifiers to obtain binary classifications:
1. random forest (RF): an ensemble classification algorithm with many individual decision trees; each tree in the forest yields a class prediction, and the class with the most votes is the winning prediction.
2. adaptive boost (ADA): an ensemble classification algorithm that fits weak learners to weighted data in order to create a strong learner; very sensitive to noise and outliers.
3. gradient boost (GB): an ensemble classification algorithm that builds a prediction model using weak estimators (typically decision trees); allows for extensive optimization.
4. multilayer perceptron (MLP): a neural network classifier trained with back-propagation.
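A minimal Scikit-learn sketch of this setup, using synthetic stand-in features rather than the actual simulated events (the feature construction, split, and hyperparameters here are illustrative only):

```python
import numpy as np
from sklearn.ensemble import (RandomForestClassifier, AdaBoostClassifier,
                              GradientBoostingClassifier)
from sklearn.neural_network import MLPClassifier

# Stand-in for the events: each row holds the sorted daughter momentum
# magnitudes of one event; the binary label encodes which of two
# parameter choices generated it. Real features come from the simulation.
rng = np.random.default_rng(0)
n_events, n_particles = 1000, 10
y = rng.integers(0, 2, n_events)                     # binary target
X = rng.exponential(0.3 + 0.2 * y[:, None], (n_events, n_particles))
X.sort(axis=1)

classifiers = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "ADA": AdaBoostClassifier(n_estimators=100, random_state=0),
    "GB": GradientBoostingClassifier(n_estimators=100, random_state=0),
    "MLP": MLPClassifier(max_iter=500, random_state=0),
}
split = int(0.7 * n_events)                          # train/test split
for name, clf in classifiers.items():
    clf.fit(X[:split], y[:split])
    acc = clf.score(X[split:], y[split:])
    print(f"{name}: {100 * acc:.1f}% accuracy")
```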
[Figure: Predicting Mother Mass (RF, ADA, GB); x-axis: M (GeV), 0.1–0.5; y-axis: accuracy (%), 50–100.]
[Figure: Predicting Temperature (RF, ADA, GB); x-axis: T (GeV), 0.05–0.30; y-axis: accuracy (%), 50–100.]
[Figure: Predicting Flow Velocity (RF, ADA, GB); x-axis: α0, 0.1–1.0; y-axis: accuracy (%), 50–100.]
Panel 1. Performance of the ensemble classifiers (100 estimators each) as a function of the closeness in value of the two binary target values.
[Figure: Predicting Mother Mass (RF, ADA, GB); x-axis: number of estimators, 20–140; y-axis: accuracy (%), 50–100.]
[Figure: Predicting Temperature (RF, ADA, GB); x-axis: number of estimators, 20–140; y-axis: accuracy (%), 50–100.]
[Figure: Predicting Flow Velocity (RF, ADA, GB); x-axis: number of estimators, 20–140; y-axis: accuracy (%), 50–100.]
Panel 2. Performance of ensemble classifiers in predicting targets as a function of the number of estimators.
[Figure: Predicting Mother Mass with MLP Classifier; x-axis: M (GeV), 0.1–0.5; y-axis: accuracy (%), 50–100.]
[Figure: Predicting Temperature with MLP Classifier; x-axis: T (GeV), 0.05–0.30; y-axis: accuracy (%), 50–100.]
[Figure: Predicting Flow Velocity with MLP Classifier; x-axis: α0, 0.1–1.0; y-axis: accuracy (%), 50–100.]
Panel 3. Performance of the MLP classifier as a function of the closeness in value of the two binary target values.
Discussion and Analysis
1. As expected, the performance of all classifiers improves with increasing contrast between parameter choices (Panel 1).
2. The algorithms can easily infer the masses of decayed particles. However, they have slightly more trouble predicting temperature and are least efficient in predicting the collective flow velocities.
3. Increasing the number of estimators, in general, results in higher accuracy. However, the impact of additional estimators appears to decrease at some point, and one can imagine that eventually there is a trade-off between accuracy and efficiency; adding more estimators to the classifier can increase runtime significantly. Other options, such as increasing the size of the training data set, can also improve accuracy.
4. The multilayer perceptron classifier in Panel 3 is shown only for comparison. With default parameters, it appears to perform worse than the ensemble classifiers. While neural networks are very powerful, they are also very complex and require a significant amount of parameter tuning. Given the performance of the ensemble classifiers, we chose not to study the parameter optimization of the multilayer perceptron classifier here.
Remarks
While our work is a first step in training machine learning algorithms to understand systems of particles, further work is needed to develop more sophisticated simulation code. Ongoing work involves adjusting the model to account for particles that undergo N-body decays (N > 2). In addition, further research is needed into optimizing the parameters of the classifiers. In the future, after exploring the ability of machine learning algorithms in our simplified models, we plan to use established simulation codes like JETSCAPE [2] and PYTHIA [3], which create more complex and realistic systems. If proven feasible, one possible application of our work is to increase our understanding of the hadronization process and the phenomenon of confinement by analyzing experimental data with machine learning.
Acknowledgements
This research was supported by NSF grants PHY-1659847, PHY-1812431, and PHY-1550221.
References
[1] F. Pedregosa et al. "Scikit-learn: Machine Learning in Python". In: Journal of Machine Learning Research 12 (2011), pp. 2825–2830.
[2] J. H. Putschke et al. "The JETSCAPE framework". In: (2019). arXiv: 1903.07706 [nucl-th].
[3] Torbjorn Sjostrand et al. "An Introduction to PYTHIA 8.2". In: Comput. Phys. Commun. 191 (2015), pp. 159–177. doi: 10.1016/j.cpc.2015.01.024. arXiv: 1410.3012 [hep-ph].