Exploratory Analysis of Light Curves

G. Jogesh Babu, Ashish Mahabol, SaeNa Park

•  For centuries, astronomers have created taxonomies for all types of celestial populations, including comets, asteroids, stars, galaxies, active galactic nuclei, and supernovae.

•  Until recently, nearly all classifications were based on heuristic and subjective procedures.

•  Most stellar classifications were based on colors and spectral properties.

•  Supernovae have a complicated classification scheme: Type I with subtypes Ia, Ib, Ic, Ib/c pec and Type II with subtypes IIb, IIL, IIP and IIn.

•  Rarely have statistical or algorithmic procedures been used to define the classes.

Overview

•  What we want to do
   – Classify lightcurves of mostly non-variable sources
   – Look for interesting aspects in the outliers

•  What are the steps we are taking
   – Clustering
   – Exploring feature space
   – Functional Data Analysis
   – Machine Learning

Clustering versus Classification

•  Classification:
   – There are M known populations
   – There is a sample from each population
   – These samples are the training data, so this is supervised learning
   – The training data is used to develop a classifier for objects whose population membership is unknown

•  Clustering:
   – The goal is to partition the data into M groups
   – Groups are not defined a priori
   – No training data, so unsupervised learning
   – There is no “best” M
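The contrast between the two settings can be sketched in a few lines. This is only an illustration, not the method used in this work: the nearest-centroid classifier, the one-dimensional k-means (Lloyd's algorithm), the function names, and the amplitude-like numbers are all assumptions chosen to keep the example small.

```python
from statistics import mean


def train_nearest_centroid(samples):
    """Supervised: samples maps each known population label to training values."""
    return {label: mean(values) for label, values in samples.items()}


def classify(centroids, x):
    """Assign x to the population whose training centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def kmeans_1d(values, m, n_iter=20):
    """Unsupervised: partition values into m groups with Lloyd's algorithm."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres across the sorted data.
    centres = [ordered[(2 * k + 1) * len(ordered) // (2 * m)] for k in range(m)]
    groups = [[] for _ in range(m)]
    for _ in range(n_iter):
        groups = [[] for _ in range(m)]
        for v in values:
            nearest = min(range(m), key=lambda k: abs(v - centres[k]))
            groups[nearest].append(v)
        # Keep the old centre if a group ends up empty.
        centres = [mean(g) if g else c for g, c in zip(groups, centres)]
    return centres, groups


# Made-up amplitude-like values, for illustration only.
training = {"constant": [0.05, 0.08, 0.06], "variable": [0.9, 1.1, 1.0]}
centroids = train_nearest_centroid(training)
label = classify(centroids, 0.7)          # uses the known populations

unlabeled = [0.05, 0.08, 0.06, 0.9, 1.1, 1.0]
centres, groups = kmeans_1d(unlabeled, m=2)  # M must be chosen by the user
```

The classifier needs labels before it can do anything, while the clustering routine only needs a choice of M, which is exactly the quantity for which there is no single best value.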

Most groups are after transients (low-hanging fruit)

•  For every transient, there are 10^6 non-transients
•  But there is variability at all levels
•  We are trying to make the most of the non-transients

Variability Tree

[Figure: the variability tree, a classification diagram of variable objects. It separates extrinsic from intrinsic variability for asteroids, stars, and AGN, with branches for rotation, eclipse, microlensing, eruptive, cataclysmic, pulsation, and secular variability, down to individual classes (for example δ Scuti, RR Lyrae, Miras, dwarf novae, SN Ia, BL Lac). Credit: L. Eyer & N. Mowlavi (03/2009), updated 04/2013.]

Characterize/Classify as much as possible all types of objects

We concentrate here on lightcurves (time series)

Current sample is from the Catalina Real-time Transient Survey (CRTS)

500 M lightcurves available for analysis. We chose a few hundred for the exploratory work.

CRTS lightcurves

•  Regions with a radius of 3′ (arcminutes) were chosen, centered at RA = (100, 200, 300) deg and Dec = (-30, -20, ..., 50) deg

•  File naming: crtslc_200_p10.csv etc. for the region centered on RA = 200 deg, Dec = 10 deg

•  Included fields: MasterID, Mag, Magerr, RA, Dec, MJD (in days), Blend (flag indicating possible confusion).
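A region file with these fields could be read along the following lines. This is a sketch under stated assumptions: that the files are comma-separated with a header row using exactly the field names listed above, and that a negative declination is encoded with an `m` prefix (only the `p` form appears in the example file name). The function names and the sample rows are made up for illustration.

```python
import csv
import io
import re

# Field names as listed above; a header row with these names is assumed.
FIELDS = ["MasterID", "Mag", "Magerr", "RA", "Dec", "MJD", "Blend"]


def parse_region_name(filename):
    """Return (ra_deg, dec_deg) encoded in a name like 'crtslc_200_p10.csv'."""
    match = re.fullmatch(r"crtslc_(\d+)_([pm])(\d+)\.csv", filename)
    if match is None:
        raise ValueError(f"unrecognised region file name: {filename}")
    ra, sign, dec = match.groups()
    # 'p' (plus) appears in the example name; 'm' (minus) is an assumption.
    dec_deg = int(dec) if sign == "p" else -int(dec)
    return int(ra), dec_deg


def read_lightcurves(handle):
    """Group the rows of one region file by MasterID, sorted by MJD."""
    curves = {}
    for row in csv.DictReader(handle):
        point = (float(row["MJD"]), float(row["Mag"]), float(row["Magerr"]))
        curves.setdefault(row["MasterID"], []).append(point)
    return {obj: sorted(points) for obj, points in curves.items()}


# Synthetic rows, for illustration only (not real CRTS measurements).
demo = io.StringIO(
    ",".join(FIELDS) + "\n"
    "1001,18.2,0.1,200.01,10.02,54001.5,0\n"
    "1001,18.1,0.1,200.01,10.02,54000.5,0\n"
    "1002,19.0,0.2,200.03,9.99,54000.5,0\n"
)
curves = read_lightcurves(demo)
region = parse_region_name("crtslc_200_p10.csv")
```

Keeping each light curve as a time-sorted list of (MJD, Mag, Magerr) tuples makes the later per-object statistics straightforward to compute.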

Set of objects around random locations (mostly non-variable)

Look at the random regions

•  Next few slides show how these random regions look
•  Though we use CRTS data, we deliberately chose SDSS cutouts so that we can also highlight how the same region can look very different in different surveys, something that will be crucial when federating varied datasets.

•  The RA/Dec can be seen on the left panels.
•  Circles indicate photometric objects, squares indicate objects with spectra.

•  The information will be useful in characterizing outliers independently.

Multiple epochs from CRTS (8 years of data)

•  The large symbols somewhat exaggerate any possible motions here

•  An advantage of picking sets of objects near each other is that they have roughly the same epochs of observation, thus allowing for registration when needed

Registration of lightcurves is possible. Gaps indicate missing data (upper limits can be assumed)

A zoom-in showing that each column in the previous plot is actually 4 columns just 10 minutes apart (x-axis: MJD, in days)
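Because observations arrive in tight groups of exposures a few minutes apart, it can be convenient to split a light curve into visits before registering or averaging. The sketch below assumes a half-day gap threshold, which is an illustrative choice and not a value taken from the survey; the function name and the times are likewise made up.

```python
def group_visits(mjds, max_gap_days=0.5):
    """Split observation times into visits separated by more than max_gap_days."""
    visits = []
    for t in sorted(mjds):
        if visits and t - visits[-1][-1] <= max_gap_days:
            visits[-1].append(t)   # same visit as the previous exposure
        else:
            visits.append([t])     # a new visit starts here
    return visits


# Two synthetic visits of 4 exposures each, 10 minutes (10/1440 day) apart.
ten_min = 10.0 / 1440.0
times = [54000.0 + k * ten_min for k in range(4)]
times += [54030.0 + k * ten_min for k in range(4)]
visits = group_visits(times)
```

Objects in the same region share roughly the same visit structure, so the visit boundaries found for one object can serve as a common grid for its neighbors.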

Derived statistics

•  Statistics for each object in each region are available. Note: many calculations were done using fluxes (linear) instead of magnitudes (logarithmic).

•  A few discriminating stats are: amplitude, linear trend, median buffer range, standard deviation, beyond1std.

•  Lightcurves with fewer than 5 points were ignored
•  Stats are based on Richards et al. 2011 and were calculated using the Caltech Time Series Characterization Service: http://nirgun.caltech.edu:8000/scripts/description.html

•  Amplitude: Half the difference between the maximum and minimum magnitudes

•  Beyond 1 std: Percentage of points beyond one standard deviation from the weighted mean

•  Flux percentile ratio (90 - 10 : 95 - 5) [mid80]: Ratio of flux percentiles (90th - 10th) over (95th - 5th)

•  Linear Trend: Slope of a linear fit to the light curve
•  Maximum slope: Maximum absolute flux slope between two consecutive observations

•  Median buffer range percentage: Percentage of fluxes within 10% of the amplitude from the median
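The definitions above can be turned into code roughly as follows. This is a sketch, not the Caltech service's implementation, and several details are assumptions: inverse-variance weights for the weighted mean, the population standard deviation, a linear fit in magnitudes, a flux amplitude defined as half the flux range, linear interpolation for percentiles, and fractions (0 to 1) returned in place of percentages.

```python
from statistics import median, pstdev


def mag_to_flux(mag):
    """Relative flux for a magnitude (the zero point is arbitrary here)."""
    return 10.0 ** (-0.4 * mag)


def percentile(values, q):
    """Linearly interpolated q-th percentile, with q between 0 and 100."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def features(mjd, mag, magerr):
    """Return a dict of the listed statistics, or None for short light curves."""
    n = len(mag)
    if n < 5:
        return None  # light curves with fewer than 5 points are ignored
    flux = [mag_to_flux(m) for m in mag]

    amplitude = 0.5 * (max(mag) - min(mag))

    # Assumption: inverse-variance weights for the weighted mean.
    weights = [1.0 / err ** 2 for err in magerr]
    wmean = sum(w * m for w, m in zip(weights, mag)) / sum(weights)
    sd = pstdev(mag)
    beyond1std = sum(abs(m - wmean) > sd for m in mag) / n

    spread = percentile(flux, 95) - percentile(flux, 5)
    inner = percentile(flux, 90) - percentile(flux, 10)
    fpr_mid80 = inner / spread if spread else 0.0

    # Assumption: least-squares slope of magnitude against time.
    t_bar, m_bar = sum(mjd) / n, sum(mag) / n
    sxx = sum((t - t_bar) ** 2 for t in mjd)
    sxy = sum((t - t_bar) * (m - m_bar) for t, m in zip(mjd, mag))
    linear_trend = sxy / sxx if sxx else 0.0

    # Consecutive pairs in time order; pairs with identical times are skipped.
    pts = sorted(zip(mjd, flux))
    slopes = [abs((f2 - f1) / (t2 - t1))
              for (t1, f1), (t2, f2) in zip(pts, pts[1:]) if t2 > t1]
    max_slope = max(slopes, default=0.0)

    # Assumption: the flux amplitude is half the flux range.
    flux_amp = 0.5 * (max(flux) - min(flux))
    f_med = median(flux)
    median_buffer_range = sum(abs(f - f_med) < 0.1 * flux_amp for f in flux) / n

    return {
        "amplitude": amplitude,
        "beyond1std": beyond1std,
        "fpr_mid80": fpr_mid80,
        "linear_trend": linear_trend,
        "max_slope": max_slope,
        "median_buffer_range": median_buffer_range,
    }


# Synthetic light curve fading by 0.1 mag per day, for illustration only.
stats = features([0, 1, 2, 3, 4], [15.0, 15.1, 15.2, 15.3, 15.4], [0.1] * 5)
```

For this synthetic steadily fading source the amplitude is 0.2 mag and the linear trend is 0.1 mag per day, which gives a quick sanity check on the definitions.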

Lightcurves

•  Mostly constant
•  Some variables
   – Some rapid
   – Some slow
   – Some periodic

•  Errors can vary as a function of time

Examples from CRTS

[Figures: example CRTS light curves, plotted as Mag versus MJD, for objects ID=1021068061186 and ID=1021068061378 (MJD 54000 to 56000) and ID=2020262030178, ID=2020262030217, and ID=2020262030567 (MJD 53500 to 55500). The same object appears in more than one panel with different magnitude ranges.]

Clustering and outliers using 6 most significant variables

Movie using 6 significant variables
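The slides do not say how outliers were identified, so the following is only a stand-in to make the idea concrete: each feature is standardized with a robust z-score (median and median absolute deviation), and an object is flagged when any feature is extreme. The threshold of 5, the function names, and the sample table are assumptions; the same code works for 6 features or any other number.

```python
from statistics import median


def robust_z(column):
    """Robust z-scores for one feature: (value - median) / (1.4826 * MAD)."""
    med = median(column)
    mad = median([abs(v - med) for v in column])
    scale = 1.4826 * mad
    if scale == 0.0:
        scale = 1.0  # constant feature: avoid dividing by zero
    return [(v - med) / scale for v in column]


def flag_outliers(table, threshold=5.0):
    """Return indices of rows whose largest |robust z| exceeds threshold.

    table is a list of rows, one per object, each a list of feature values.
    """
    z_columns = [robust_z(list(col)) for col in zip(*table)]
    scores = [max(abs(z) for z in row) for row in zip(*z_columns)]
    return [i for i, s in enumerate(scores) if s > threshold]


# Synthetic two-feature table: nine ordinary objects and one extreme one.
table = [[0.10 + 0.01 * i, 0.30 + 0.01 * i] for i in range(9)]
table.append([2.5, 0.30])
outliers = flag_outliers(table)
```

A per-feature rule like this ignores correlations between features; a clustering step in the full feature space would be the natural complement.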

Outliers

[Figure: boxplots comparing outliers and non-outliers for four statistics: amplitude, beyond1std, fpr_mid20, and fpr_mid35.]

Distribution (boxplots) for outliers (and non-variables)

Functional Data Analysis: Staicu et al. test

• Suppose we want to test H0: µ(t) ≡ µ, that is, that the mean function is constant

• For light curves, we are testing whether or not a star is variable

• Staicu, Li, Crainiceanu, and Ruppert (2012) develop likelihood ratio tests about the mean of functional data

• The challenge is to take account of the correlation by estimating the correlation function

• The test can be applied to dense or sparsely observed functional data

• More general null hypotheses
   – µ(t) is a polynomial of degree p
   – The means of two samples of functional data are equal
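For orientation, the same null hypothesis of a constant mean can be checked with a much simpler classical statistic. The sketch below is explicitly not the Staicu et al. likelihood ratio test: it is a chi-square statistic about the inverse-variance weighted mean, and it treats the measurements as independent, which is exactly the correlation issue the functional approach is designed to handle. The function name and the sample values are made up.

```python
def constancy_chi2(mag, magerr):
    """Chi-square of magnitudes about their weighted mean.

    Returns (chi2, dof, reduced_chi2). Values of reduced_chi2 well above 1
    suggest variability beyond the quoted errors, assuming independent
    measurements with correct error bars.
    """
    weights = [1.0 / err ** 2 for err in magerr]
    wmean = sum(w * m for w, m in zip(weights, mag)) / sum(weights)
    chi2 = sum(w * (m - wmean) ** 2 for w, m in zip(weights, mag))
    dof = len(mag) - 1
    return chi2, dof, chi2 / dof


# Synthetic examples, for illustration only.
flat = constancy_chi2([15.0, 15.0, 15.0, 15.0], [0.1] * 4)
jump = constancy_chi2([15.0, 15.0, 15.0, 16.0], [0.1] * 4)
```

In the second example a single one-magnitude excursion with 0.1 mag errors gives a reduced chi-square of 25, far from the value near 1 expected for a constant source.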

Conclusions

•  CRTS provides a rich data set of 500 M lightcurves for exploration.
•  So far mostly transients have been looked at
•  We are devising ways to explore all lightcurves with additional information available due to proximity (e.g. ability to register)

•  Existing parameters have redundancies which we plan to eliminate through clustering as well as dimensionality reduction techniques

•  Future plans: connect CRTS to brighter surveys like DASCH (overlap for a small fraction of sources, but large time sample), and with fainter surveys like LSST (starting with simulations to get ready for the actual survey). [DASCH - Digital Access to a Sky Century @ Harvard]
