Machine Learning for the Prediction of Solar Flares
Caroline Mather
Mentors: Laura Sandoval (LASP), Stéphane Béland (LASP)
McIntosh Classification
● General form: ‘Zpc’
● ‘Z’ - Modified Zurich Class
  ○ Unipolar or bipolar
  ○ Longitudinal extent of penumbra
● ‘p’ - describes penumbra of principal spot
  ○ Exists or not
  ○ Complete or not
  ○ Size
  ○ Symmetry
● ‘c’ - describes the distribution of spots
  ○ Number of intermediate spots between ‘leader’ and ‘follower’
https://www.spaceweatherlive.com/en/help/the-classification-of-sunspots-after-malde
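As a small illustration of the ‘Zpc’ form, the sketch below splits a three-letter McIntosh code into its components. The function name `parse_mcintosh` and the `McIntosh` tuple are hypothetical names chosen for this example; the letter sets follow the standard McIntosh scheme rather than anything stated on the slides.

```python
from typing import NamedTuple

# Allowed letters for each position of a McIntosh code (standard scheme).
ZURICH = set("ABCDEFH")       # 'Z': modified Zurich class
PENUMBRA = set("xrsahk")      # 'p': penumbra of the principal spot
DISTRIBUTION = set("xoic")    # 'c': distribution of spots in the group


class McIntosh(NamedTuple):
    zurich: str
    penumbra: str
    distribution: str


def parse_mcintosh(code: str) -> McIntosh:
    """Split a code such as 'Dkc' into its Z, p, and c components."""
    code = code.strip()
    if len(code) != 3:
        raise ValueError(f"expected a 3-letter code, got {code!r}")
    z, p, c = code[0].upper(), code[1].lower(), code[2].lower()
    if z not in ZURICH or p not in PENUMBRA or c not in DISTRIBUTION:
        raise ValueError(f"not a valid McIntosh code: {code!r}")
    return McIntosh(z, p, c)
```

For example, `parse_mcintosh("Dkc").penumbra` gives the ‘p’ letter that the classifiers later in the talk try to predict.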
Solar Synoptic Map vs Solar Region Summary
Connection to Predicting Solar Flares
● Solar flares and coronal mass ejections (CMEs) originate from active regions of the Sun
● Improving our classification of sunspots can help predict when a solar flare will occur and what kind it will be
NASA/SDO/Goddard
A New Method of Classification?
● Goal: extract information that will be used to classify the region
● Machine Learning on features to automate classification
● How can we train a computer to replicate, and possibly replace, the manual classification process?
Umbra
Penumbra
Image Processing
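The slides do not describe the image-processing step in detail. As a rough sketch of one way to separate umbra from penumbra, the code below labels pixels with two intensity thresholds and derives simple area features. The thresholds, the normalized 0 to 1 intensity scale, and the names `segment_spot` and `spot_features` are all assumptions made for illustration, not the project's actual pipeline.

```python
def segment_spot(image, umbra_max=0.4, penumbra_max=0.8):
    """Label each pixel: 2 = umbra, 1 = penumbra, 0 = quiet photosphere.

    `image` is a list of rows of intensities normalized to [0, 1].
    The two thresholds are hypothetical values, not tuned ones.
    """
    return [
        [2 if v < umbra_max else 1 if v < penumbra_max else 0 for v in row]
        for row in image
    ]


def spot_features(labels):
    """Count umbra and penumbra pixels and the umbra share of the spot."""
    umbra = sum(row.count(2) for row in labels)
    penumbra = sum(row.count(1) for row in labels)
    total = umbra + penumbra
    return {
        "umbra_area": umbra,
        "penumbra_area": penumbra,
        "umbra_fraction": umbra / total if total else 0.0,
    }


# Toy 3x3 "sunspot": a dark core surrounded by a partial grey ring.
toy_image = [
    [1.0, 0.7, 1.0],
    [0.7, 0.2, 0.7],
    [1.0, 0.7, 1.0],
]
toy_features = spot_features(segment_spot(toy_image))
```

Features of this kind (areas, ratios) are the sort of input a classifier could be trained on.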
Machine Learning: KNN vs SVM
http://stackabuse.com/k-nearest-neighbors-algorithm-in-python-and-scikit-learn/
http://www.eric-kim.net/eric-kim-net/posts/1/kernel_trick.html
Preliminary Results
Correct classification rates:
● KNN: 0.46227
● SVM: 0.4910
● Random Forest: 0.4712
● AdaBoost: 0.4455
Compare to Colak & Qahwaji (2008): 47% correct classification rate for the ‘p’ value (penumbra of the largest spot)
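A comparison like the one above could be set up with scikit-learn roughly as follows. This is a sketch only: the synthetic data from `make_classification` stands in for the sunspot features (which the slides do not list), the six classes mimic the six ‘p’ letters, and all hyperparameters are library defaults or arbitrary choices, so the scores it produces say nothing about the results reported here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_classifiers(X, y, seed=0):
    """Fit four classifiers on one train/test split; return accuracy per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        # KNN and SVM are distance-based, so features are standardized first.
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "AdaBoost": AdaBoostClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Placeholder data: 600 samples, 8 features, 6 classes (assumed, not real sunspots).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)
scores = compare_classifiers(X, y)
```

Using a single stratified split keeps the example short; cross-validation would give a steadier estimate.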
Detailed Results SVM vs KNN
● SVM: 49.1% correctly classified
  ○ Failed to classify any spots as type ‘r’
  ○ Misclassified ‘r’ spots as either ‘x’ or ‘a’ most often
● KNN: 46.2% correctly classified
  ○ Predicted each class of largest spot, although it was less accurate overall
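Observations such as "never predicted ‘r’" or "‘r’ most often mistaken for ‘x’" come from reading a confusion matrix. The sketch below shows one way to pull them out of true and predicted labels; the helper names and the toy labels are hypothetical and are not the project's data.

```python
from collections import Counter, defaultdict


def confusion_counts(y_true, y_pred):
    """Map each true label to a Counter of the labels predicted for it."""
    table = defaultdict(Counter)
    for t, p in zip(y_true, y_pred):
        table[t][p] += 1
    return table


def never_predicted(y_true, y_pred):
    """Classes that occur in the truth but were never output by the model."""
    return sorted(set(y_true) - set(y_pred))


def top_confusions(table, true_label, n=2):
    """The n wrong labels most often assigned to `true_label`."""
    wrong = Counter(
        {p: c for p, c in table[true_label].items() if p != true_label}
    )
    return [label for label, _ in wrong.most_common(n)]


# Toy labels only, chosen to mimic the pattern described above.
toy_true = ["r", "r", "r", "x", "a", "s"]
toy_pred = ["x", "x", "a", "x", "a", "s"]
toy_table = confusion_counts(toy_true, toy_pred)
```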
Detailed Results Ensemble Methods
● Random Forest Classifier: 47.1% correctly classified
● AdaBoost Classifier: 44.5% correctly classified
  ○ Least accurate
  ○ Failed to classify any spots as types ‘r’ or ‘h’
  ○ Misclassified ‘r’ as ‘x’ most often
  ○ Misclassified ‘h’ as either ‘k’ or ‘s’ most often
Colak, Qahwaji (2008) Results
Colak, T. & Qahwaji, R. Sol Phys (2008) 248: 277. https://doi.org/10.1007/s11207-007-9094-3
Next Steps
● Improve image processing
● More features
  ○ Colak and Qahwaji (2008) suggest trying a Hough transform - an imaging algorithm to detect the geometry of the largest spot
● Try unsupervised learning algorithm
  ○ See how well our classification system does in comparison with McIntosh
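To give a feel for how a Hough transform recovers geometry, here is a minimal circular variant: every edge pixel votes for all centers that would place it on a circle of a given radius, and the most-voted cell wins. The slides do not say which variant would be used; the known-radius circle, the function name `hough_circle_center`, and the synthetic edge points are assumptions for this sketch.

```python
import math
from collections import Counter


def hough_circle_center(edge_points, radius, n_angles=72):
    """Estimate the center of a circle of known radius from edge pixels."""
    votes = Counter()
    for x, y in set(edge_points):
        cells = set()
        for k in range(n_angles):
            theta = 2 * math.pi * k / n_angles
            # A center at (a, b) would put (x, y) on the circle at angle theta.
            a = round(x - radius * math.cos(theta))
            b = round(y - radius * math.sin(theta))
            cells.add((a, b))
        # Each edge pixel votes at most once per candidate cell.
        votes.update(cells)
    center, _ = votes.most_common(1)[0]
    return center


# Synthetic edge pixels on a circle of radius 5 (illustrative data only).
edge = [
    (round(10 + 5 * math.cos(math.radians(t))),
     round(10 + 5 * math.sin(math.radians(t))))
    for t in range(0, 360, 10)
]
estimated_center = hough_circle_center(edge, radius=5)
```

In practice the radius is usually unknown too, which adds a third dimension to the vote accumulator.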
Acknowledgements
This research was supported by the National Science Foundation REU program, Award #1659878 and the NASA SORCE grant
● Laura Sandoval, Stéphane Béland, and Megan Smith
● The rest of the Solar Flare Forecast Project group
  ○ Andrew Jones, Kim Kokkonen, Wendy Carande, Tracey Morland, Maxine Hartnett, and Justin Cai
● Laboratory for Atmospheric and Space Physics
Questions?