Machine Learning for the Prediction of Solar Flares

Caroline Mather

Mentors: Laura Sandoval (LASP), Stéphane Béland (LASP)

McIntosh Classification

● General form: ‘Zpc’

● ‘Z’ - Modified Zurich Class

○ Unipolar or bipolar

○ Longitudinal extent of penumbra

● ‘p’ - describes penumbra of principal spot

○ Exists or not

○ Complete or not

○ Size

○ Symmetry

● ‘c’ - describes the distribution of spots

○ Number of intermediate spots between ‘leader’ and ‘follower’
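
As a rough illustration of the ‘Zpc’ form, here is a minimal Python sketch that splits a McIntosh class string into its three components and checks each letter against the standard McIntosh letter sets. The names `McIntosh` and `parse_mcintosh` are hypothetical, introduced only for this example; they are not from the project code.

```python
from typing import NamedTuple

# Standard McIntosh letters; the short descriptions are summaries, not full definitions.
ZURICH = {"A": "unipolar, no penumbra", "B": "bipolar, no penumbra",
          "C": "bipolar, penumbra on one end", "D": "bipolar, penumbra on both ends (short)",
          "E": "bipolar, penumbra on both ends (medium)",
          "F": "bipolar, penumbra on both ends (long)", "H": "unipolar with penumbra"}
PENUMBRA = {"x": "no penumbra", "r": "rudimentary", "s": "small symmetric",
            "a": "small asymmetric", "h": "large symmetric", "k": "large asymmetric"}
DISTRIBUTION = {"x": "undefined (unipolar)", "o": "open", "i": "intermediate", "c": "compact"}


class McIntosh(NamedTuple):
    zurich: str        # 'Z' component
    penumbra: str      # 'p' component
    distribution: str  # 'c' component


def parse_mcintosh(code: str) -> McIntosh:
    """Split a three-letter class such as 'Dkc' into its Z, p and c parts."""
    if len(code) != 3:
        raise ValueError(f"expected three letters, got {code!r}")
    z, p, c = code[0], code[1], code[2]
    if z not in ZURICH or p not in PENUMBRA or c not in DISTRIBUTION:
        raise ValueError(f"not a valid McIntosh class: {code!r}")
    return McIntosh(z, p, c)


region = parse_mcintosh("Dkc")
print(region.zurich, "-", ZURICH[region.zurich])
print(region.penumbra, "-", PENUMBRA[region.penumbra])
print(region.distribution, "-", DISTRIBUTION[region.distribution])
```

The ‘p’ letters (x, r, s, a, h, k) are the classes that the results later in the talk refer to.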

https://www.spaceweatherlive.com/en/help/the-classification-of-sunspots-after-malde

Solar Synoptic Map vs Solar Region Summary

Connection to Predicting Solar Flares

● Solar flares and coronal mass ejections (CMEs) originate from active regions of the Sun

● Improving our classification of sunspots can help predict when a solar flare will occur and what kind it will be

NASA/SDO/Goddard

A New Method of Classification?

● Goal: extract information that will be used to classify the region

● Machine Learning on features to automate classification

● How can we train a computer to replicate, and possibly replace, the manual classification process?

Umbra

Penumbra

Image Processing

Machine Learning: KNN vs SVM

http://stackabuse.com/k-nearest-neighbors-algorithm-in-python-and-scikit-learn/

http://www.eric-kim.net/eric-kim-net/posts/1/kernel_trick.html

Preliminary Results

Correct classification rates:

● KNN: 0.46227

● SVM: 0.4910

● Random Forest: 0.4712

● AdaBoost: 0.4455

Compare to Colak and Qahwaji (2008): 47% correct classification rate for the ‘p’ value
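
A comparison like the one above could be set up along the following lines with scikit-learn. This is only a sketch under stated assumptions: the data are synthetic (six classes standing in for the six ‘p’ values), and the features, train/test split, and hyperparameters are placeholders, not the ones used in this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for sunspot features (assumption: 8 features, 6 classes).
X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           n_redundant=0, n_classes=6, n_clusters_per_class=1,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# KNN and SVM are distance-based, so their inputs are standardized first.
classifiers = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
}

rates = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    rates[name] = clf.score(X_test, y_test)  # fraction of test samples classified correctly
    print(f"{name}: {rates[name]:.4f}")
```

The numbers this prints describe the synthetic data only and are not comparable to the rates reported on the slide.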

Detailed Results SVM vs KNN

● SVM: 49.1% correctly classified

○ Failed to classify any spots as type ‘r’

○ Misclassified ‘r’ spots as either ‘x’ or ‘a’ most often

● KNN: 46.2% correctly classified

○ Predicted every class of largest spot, although it was less accurate overall

Detailed Results Ensemble Methods

● Random Forest Classifier: 47.1% correctly classified

● AdaBoost Classifier: 44.5% correctly classified

○ Least accurate

○ Failed to classify any spots as types ‘r’ or ‘h’

○ Misclassified ‘r’ as ‘x’ most often

○ Misclassified ‘h’ as either ‘k’ or ‘s’ most often
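
Findings such as "never predicted type ‘r’" or "misclassified ‘r’ as ‘x’ most often" can be read off a tally of true versus predicted labels. The sketch below shows one way to compute them in plain Python. The labels are made up for illustration and the function names are hypothetical; none of this is project data or project code.

```python
from collections import Counter


def correct_rate(true_labels, predicted_labels):
    """Fraction of spots whose predicted class matches the true class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def never_predicted(true_labels, predicted_labels):
    """Classes present in the truth that the classifier never output."""
    return sorted(set(true_labels) - set(predicted_labels))


def most_common_errors(true_labels, predicted_labels, true_class):
    """Wrong predictions for one true class, most frequent first."""
    errors = Counter(p for t, p in zip(true_labels, predicted_labels)
                     if t == true_class and p != t)
    return errors.most_common()


# Made-up 'p' labels, for illustration only.
y_true = ["x", "x", "r", "r", "r", "s", "a", "a", "h", "k"]
y_pred = ["x", "x", "x", "a", "x", "s", "a", "a", "k", "k"]

print(correct_rate(y_true, y_pred))             # 0.6
print(never_predicted(y_true, y_pred))          # ['h', 'r']
print(most_common_errors(y_true, y_pred, "r"))  # [('x', 2), ('a', 1)]
```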

Colak, Qahwaji (2008) Results

Colak, T. & Qahwaji, R. Sol Phys (2008) 248: 277. https://doi.org/10.1007/s11207-007-9094-3

Next Steps

● Improve image processing

● More features

○ Colak and Qahwaji (2008) suggest trying a Hough transform - an imaging algorithm to detect the geometry of the largest spot

● Try unsupervised learning algorithm

○ See how well our classification system does in comparison with McIntosh
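
To give a sense of what a Hough transform does, here is a minimal pure-Python sketch of the circle variant for a known radius: every edge pixel votes for all centers that could have produced it, and the cell with the most votes is taken as the circle center. The synthetic circle, the fixed radius, and the function names are assumptions for illustration only. Real sunspots are not perfect circles, and libraries such as scikit-image and OpenCV provide optimized Hough transforms that would be the practical choice.

```python
import math
from collections import Counter

ANGLES = [math.radians(d) for d in range(360)]


def circle_points(cx, cy, radius):
    """Integer pixel coordinates on a circle outline (synthetic edge pixels)."""
    return {(round(cx + radius * math.cos(a)), round(cy + radius * math.sin(a)))
            for a in ANGLES}


def hough_circle_center(edge_pixels, radius):
    """Return the most-voted center for circles of the given radius."""
    votes = Counter()
    for x, y in edge_pixels:
        # Each edge pixel votes once for every cell lying `radius` away from it.
        candidates = {(round(x - radius * math.cos(a)), round(y - radius * math.sin(a)))
                      for a in ANGLES}
        votes.update(candidates)
    center, count = votes.most_common(1)[0]
    return center, count


edges = circle_points(20, 15, 6)
center, count = hough_circle_center(edges, radius=6)
print(center, count, len(edges))
```

When the radius is unknown, the same voting is repeated over a range of radii and the peak is searched in (x, y, radius) space.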

Acknowledgements

This research was supported by the National Science Foundation REU program, Award #1659878, and the NASA SORCE grant.

● Laura Sandoval, Stéphane Béland, and Megan Smith

● The rest of the Solar Flare Forecast Project group

○ Andrew Jones, Kim Kokkonen, Wendy Carande, Tracey Morland, Maxine Hartnett, and Justin Cai

● Laboratory for Atmospheric and Space Physics

Questions?
