UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...

Ujjaini Alam, Shabbir Bawaji (ThoughtWorks), Surajit Mondal, Divya Oberoi (NCRA-TIFR) ([email protected]) e4r™

Transcript of UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...

Page 1: UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...

Ujjaini Alam, Shabbir Bawaji (ThoughtWorks), Surajit Mondal, Divya Oberoi (NCRA-TIFR)([email protected])


Page 2: UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...

➢ How the solar corona maintains itself at a temperature of about a million K, while the visible disc of the Sun is only at 5800 K, is a long-standing problem in solar physics. Parker's promising "nanoflare" hypothesis explains this phenomenon using tiny flares with energies of ~10^24 ergs. However, in spite of tremendous efforts, they have remained undetected.

➢ A key consequence of this hypothesis is the ubiquitous presence of very weak impulsive emissions in the metrewave radio band. The very first detection of these emissions has recently been reported by Mondal et al. (2020, ApJ, 895, L39). We refer to these emissions as Weak Impulsive Narrowband Quiet Sun Emissions (WINQSEs).

➢ This work attempts to characterise the morphology of WINQSEs. Based on theoretical considerations, they are expected to be compact, and they are detected with sufficient SNR in these data to allow a robust characterisation.

➢ We create a pipeline based on unsupervised Machine Learning techniques to detect and characterize WINQSEs in quiet radio solar data at several frequencies.


Page 3: UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...

Fit gaussian to the peaks

Isolated peaks with SNR > 5

Residuals to the fits

Radio sun

Isolated peaks with > 5 SNR: 20,900
Peaks successfully fitted with Gaussian: 13,143

Morphology of peaks: Compact, low intensity

Raw Image Solar Boundary (Sobel Edge Detection)

Contours around Peaks

Optical sun

Radio sun

Clustered peaks

Optimum fitting window (DBSCAN + tSNE clustering of peaks)


Page 4: UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...

[Figure: two stacks of ~8000 images each, 100 px × 100 px per image, spanning ~70 minutes]

132 MHz

More than 8000 images at 132 MHz, made every 0.5 s with the Murchison Widefield Array, were used for this study. A total of about 33500 WINQSEs were detected in these data. The prominent bright spot on the Sun comes from an active region and makes it hard to detect WINQSEs in its vicinity.

120.5 MHz, 128.2 MHz, 135.9 MHz, 143.6 MHz

For each of these frequencies, more than 8000 images, made every 0.5 s with the Murchison Widefield Array, were used for this study. No active region was present on the Sun at this time. This made it easier to identify WINQSEs occurring all over the Sun.


Page 5: UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...


➢ Create an elevation map using the Sobel gradient of the image.
➢ Apply a watershed transform to fill regions of the elevation map, using threshold markers that separate the Sun from noise.
➢ Fit a Hough ellipse to the boundary thus detected.
➢ Use the ellipse centre to align all images.
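The steps above can be sketched roughly as follows, assuming scikit-image and SciPy are available. The function names, the `low`/`high` marker thresholds and the synthetic test image are all illustrative assumptions, not the authors' code. To keep the sketch short, the Hough ellipse fit is replaced by the centroid of the segmented region; `skimage.transform.hough_ellipse` would be the closer match to the step named on the slide.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.filters import sobel
from skimage.segmentation import watershed


def find_disc_centre(image, low, high):
    """Segment a bright disc from the background; return its (row, col) centre."""
    elevation = sobel(image)                   # gradient magnitude as elevation map
    markers = np.zeros(image.shape, dtype=np.int32)
    markers[image < low] = 1                   # confident background (noise)
    markers[image > high] = 2                  # confident disc (Sun)
    labels = watershed(elevation, markers)     # flood unmarked pixels from the markers
    rows, cols = np.nonzero(labels == 2)
    # Stand-in for the Hough ellipse fit (assumption): centroid of the region.
    return rows.mean(), cols.mean()


def align(image, centre):
    """Shift by whole pixels so that `centre` lands on the middle of the array."""
    target = (np.array(image.shape) - 1) / 2.0
    shift = tuple(int(s) for s in np.rint(target - np.array(centre)))
    return np.roll(image, shift, axis=(0, 1))


# Synthetic stand-in for a radio image: a blurred, off-centre elliptical disc.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:100, 0:100]
disc = (((yy - 45) / 20.0) ** 2 + ((xx - 58) / 25.0) ** 2) <= 1.0
image = gaussian_filter(disc.astype(float), sigma=2)
image += 0.02 * rng.standard_normal(image.shape)

centre = find_disc_centre(image, low=0.3, high=0.7)
aligned = align(image, centre)
```

Applying `find_disc_centre` and `align` to every frame would bring all images onto a common centre, so that peak positions can be compared across the time series.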

Region-based edge-detection using watershed transform

Optical sun

Radio sun

Page 6: UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...

Examples of types of peaks detected:

1: Isolated, quasi-Gaussian contours around the peak.

2: Partially isolated, but the effect of nearby peaks distorts the Gaussian contours.

3: Clustered, two peaks close together forming a bimodal distribution, plus distortion by another nearby peak.
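Candidate peaks of this kind can be located as local maxima above an SNR threshold. The sketch below is one possible way to do so and is not taken from the slides: the function name `find_peaks_snr`, the neighbourhood `size`, and the noise estimate based on the median absolute deviation are assumptions made for illustration. Only the SNR > 5 cut comes from the text.

```python
import numpy as np
from scipy.ndimage import maximum_filter


def find_peaks_snr(image, snr_min=5.0, size=5):
    """Return (row, col) positions of local maxima with SNR above `snr_min`."""
    # Robust noise estimate (assumption): 1.4826 * median absolute deviation.
    median = np.median(image)
    sigma = 1.4826 * np.median(np.abs(image - median))
    snr = (image - median) / sigma
    # A pixel is a local maximum if it equals the maximum of its neighbourhood.
    is_local_max = image == maximum_filter(image, size=size)
    return np.argwhere(is_local_max & (snr > snr_min))


# Synthetic example: two compact Gaussian sources on a noisy background.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
image = 0.5 * rng.standard_normal((64, 64))
true_positions = [(20, 20), (40, 45)]
for r, c in true_positions:
    image += 10.0 * np.exp(-((yy - r) ** 2 + (xx - c) ** 2) / (2 * 2.0 ** 2))

peaks = find_peaks_snr(image)
```

Two peaks closer together than `size` pixels would be merged into one detection here, which is one reason clustered peaks need separate handling.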


Page 7: UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...

DBSCAN

Density-Based Spatial Clustering of Applications with Noise. For a given set of points, it groups together points with nearby neighbours, and defines isolated points in regions of low density as outliers. Used here to optimize the peak-fitting window size based on peak characteristics, and to identify isolated peaks.
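A minimal illustration of this behaviour, assuming scikit-learn is available. The toy points and the `eps` and `min_samples` values are arbitrary choices for the example and are not the settings used in the pipeline.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy data: two dense groups plus two far-away isolated points.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=(0.0, 0.0), scale=0.2, size=(50, 2))
group_b = rng.normal(loc=(5.0, 5.0), scale=0.2, size=(50, 2))
isolated = np.array([[10.0, -10.0], [-10.0, 10.0]])
points = np.vstack([group_a, group_b, isolated])

# eps: neighbourhood radius; min_samples: neighbours needed for a dense core.
labels = DBSCAN(eps=0.6, min_samples=5).fit_predict(points)

# DBSCAN marks outliers with the label -1; all other labels are cluster ids.
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_outliers = int(np.sum(labels == -1))
```

Unlike k-means, the number of clusters is not specified in advance; it follows from the density parameters.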

tSNE

t-distributed stochastic neighbour embedding is a statistical method for visualizing high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. Used here to visualize the clustering of peaks.
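As a rough sketch of how such an embedding is produced, assuming scikit-learn: the four-column feature table below is synthetic and merely stands in for per-peak characteristics, and the `perplexity` value is an arbitrary choice for this small sample.

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical feature table: 60 peaks, 4 characteristics each, in two groups.
rng = np.random.default_rng(2)
features = np.vstack(
    [rng.normal(centre, 0.3, size=(30, 4)) for centre in (0.0, 3.0)]
)

# Map each 4-dimensional row to a point in 2D for plotting.
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
embedding = tsne.fit_transform(features)
```

The columns of `embedding` can then be scatter-plotted and coloured by cluster label. The embedding is a visual aid only: distances between well-separated groups in a tSNE map are not directly interpretable.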

Clustering using DBSCAN on characteristics of detected peaks:
➢ Is able to separate the isolated/semi-isolated peaks from noise peaks or clustered peaks.
➢ Defines the optimal window for Gaussian fitting of peaks in a group, eliminating the need to tune hyperparameters for individual peaks.
➢ Makes the fit more accurate, and the pipeline more efficient and automated, suitable for large datasets.
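Once a window size has been chosen for a group, each peak can be fitted inside a cut-out of that size. The following is a sketch under stated assumptions rather than the pipeline's code: it uses `scipy.optimize.curve_fit` with a rotated elliptical Gaussian, and the names `gaussian2d` and `fit_peak`, the `half_width` parameter and the initial guesses are all illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian2d(coords, amp, y0, x0, sig_major, sig_minor, theta, offset):
    """Rotated elliptical Gaussian evaluated at (y, x) coordinates."""
    y, x = coords
    ct, st = np.cos(theta), np.sin(theta)
    u = (x - x0) * ct + (y - y0) * st       # coordinate along the first axis
    v = -(x - x0) * st + (y - y0) * ct      # coordinate along the second axis
    return offset + amp * np.exp(-0.5 * ((u / sig_major) ** 2 + (v / sig_minor) ** 2))


def fit_peak(image, peak, half_width):
    """Fit a 2D Gaussian to a square cut-out around `peak`; return params, residual."""
    r, c = peak
    r0, c0 = max(r - half_width, 0), max(c - half_width, 0)
    cut = image[r0:r + half_width + 1, c0:c + half_width + 1]
    yy, xx = np.mgrid[0:cut.shape[0], 0:cut.shape[1]]
    coords = np.vstack([yy.ravel() + r0, xx.ravel() + c0])
    # Initial guess (assumption): unequal widths so the angle is constrained.
    guess = [cut.max() - cut.min(), r, c, 3.0, 2.0, 0.0, cut.min()]
    params, _ = curve_fit(gaussian2d, coords, cut.ravel(), p0=guess)
    residual = cut - gaussian2d(coords, *params).reshape(cut.shape)
    return params, residual


# Synthetic compact source with known parameters plus weak noise.
rng = np.random.default_rng(3)
yy, xx = np.mgrid[0:64, 0:64]
truth = (5.0, 30.3, 27.6, 4.0, 2.0, 0.5, 0.1)
image = gaussian2d((yy, xx), *truth) + 0.02 * rng.standard_normal((64, 64))

params, residual = fit_peak(image, peak=(30, 28), half_width=8)
```

A window that is too small truncates the Gaussian wings, while one that is too large lets neighbouring peaks contaminate the fit, which is why a per-group window is useful. The fit may return the two widths in either order, with the angle offset by 90 degrees accordingly.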

Group   Major axis   Minor axis   Orientation angle   Drop from peak   No.
1       9.2          2.0          105                 0.98             4968
2       7.1          3.0          158                 0.92             2622
3       4.4          2.2          116                 0.90             2246
4       6.4          3.2          38                  0.91             2566
5       11.3         4.1          135                 0.90             2541
6       9.4          4.1          35                  0.91             2478
7       7.3          3.0          56                  0.95             2704

Groups of isolated (green), partially isolated (yellow), and clustered (red) peaks


Page 8: UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...


Distribution of peaks (> 5 SNR) over time on the solar surface. The outer contour marks the boundary of the radio Sun and the color shows the number of peaks found at a given location. For the first dataset, the active region masks low-intensity peaks around it. For the second dataset, there is a more homogeneous distribution of peaks.

Panels: 132 MHz and 120.5 MHz
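A map of this kind can be accumulated by counting, for every pixel, how many images had a peak there. This is a minimal sketch: the function name `peak_count_map` and the tiny input are hypothetical, and peak positions are assumed to be integer (row, col) pairs on aligned images.

```python
import numpy as np


def peak_count_map(peaks_per_image, shape):
    """Count how often each pixel hosts a peak across a sequence of images."""
    counts = np.zeros(shape, dtype=int)
    for peaks in peaks_per_image:
        if len(peaks):
            # np.add.at accumulates correctly even if a position repeats.
            np.add.at(counts, (peaks[:, 0], peaks[:, 1]), 1)
    return counts


# Three hypothetical images: two peaks, one peak, and no peaks.
peaks_per_image = [
    np.array([[2, 3], [5, 5]]),
    np.array([[2, 3]]),
    np.empty((0, 2), dtype=int),
]
counts = peak_count_map(peaks_per_image, shape=(8, 8))
```

Here `counts[2, 3]` is 2 because that position hosts a peak in two of the three images.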


Page 9: UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...

Peaks with > 5 SNR: 42,469
Peaks successfully fitted with Gaussian: 34,457

132 MHz: Sun with one active region
143.6 MHz: Quiet Sun

Peaks with > 5 SNR: 20,900
Peaks successfully fitted with Gaussian: 13,143


Page 10: UjjainiAlam, Shabbir Bawaji(ThoughtWorks), SurajitMondal ...

➢ The algorithm successfully identifies weak isolated and clustered emissions and characterises the shapes of isolated WINQSEs.
➢ Use of unsupervised clustering improves both the accuracy and the efficiency of the pipeline.
➢ WINQSEs are indeed found to take place over large parts of the quiet Sun.
➢ The significant difference between the intensity distributions of the observed WINQSEs in different datasets seems to arise from the presence of the active region in the 132 MHz dataset.
➢ The size and shape of the best-fit Gaussians are consistent with the expectation of WINQSEs being compact features.

Future work:
➢ Run the pipeline on more independent data to examine its robustness.
➢ The algorithm currently identifies a particular class of morphological features; expand it to wider classes of features.
➢ Extend the pipeline to include frequency as a dimension.

Final objective: To explore solar emission structures in the 4D space (x, y, time, frequency).
