E-Science talk Edinburgh
Autonomous Pipelines
David Brett Leicester
AUTONOMOUS PIPELINES
David Brett, Leicester University
Talk Map
Project: Why work on an autonomous classification program?
WASP: Wide Angle Search for Planets telescope
Leicester, St Andrews, Cambridge, Queen's University Belfast and the Open University.
Variable Identification: Period searching
Classification System: Artificial Neural Networks
Methods and Results.
Methods and the Future.
Why do any of this?
Tera-scale computing age:
• Volume of collected data
• Repetitive nature of the data reduction
• “Brute force” approach
• Creates one more layer of abstraction
WASP (Wide-Angle Search for Planets)
[Figure: field of view, 9.5° × 9.5°]
Four 2048² CCD chips (funding recently secured for five)
For comparison: the INT “wide-field camera” images an area roughly the size of the full moon
1% photometry down to 13th magnitude and detections down to 17th (30s exposure)
5TB per year (raw)
But what do we do with all those bits?
Source Extraction and Data Reduction
Stages:
• Home grown programs for “cleaning” the raw data.
• Use of conventional packages such as SExtractor for source extraction
• Variability checking programs
• Periodic variability locating programs
• Phased lightcurve recognition software
• Results database
Periodic Variables
Two Main Methods
Phase-folding:
• Fast to execute
• Easy to implement
• Simple to understand
• e.g. χ² or the L-Statistic
Frequency Analysis:
• Slower to execute
• Trickier to code
• More reliable
• e.g. Lomb-Scargle or Schwarzenberg-Czerny
Periodic Variables
Phase-Folding: χ²
• Maximum deviation from a constant line.
• Binned data, uses bin mean.
• Intra-bin deviation not taken into account
• Very quick to implement and compute.
• Remember: we are looking for a maximum, not a minimum.
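The bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual code: the function names, the default of 10 phase bins, and the weighting of each bin by its point count are all assumptions made for the example.

```python
def fold_chi2(times, mags, period, n_bins=10):
    """Deviation of the binned, phase-folded lightcurve from a constant line.

    Only the bin means are used, so scatter inside a bin is ignored.
    """
    mean = sum(mags) / len(mags)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, m in zip(times, mags):
        phase = (t / period) % 1.0                 # phase in [0, 1)
        b = min(int(phase * n_bins), n_bins - 1)   # bin index
        sums[b] += m
        counts[b] += 1
    stat = 0.0
    for s, c in zip(sums, counts):
        if c:                                      # skip empty bins
            stat += c * (s / c - mean) ** 2
    return stat


def best_period(times, mags, trial_periods, n_bins=10):
    """Return the trial period that MAXIMISES the statistic."""
    return max(trial_periods, key=lambda p: fold_chi2(times, mags, p, n_bins))
```

A wrong trial period smears the folded lightcurve, so every bin mean sits close to the overall mean and the statistic stays small. The correct period gives the largest value, which is why the search looks for a maximum.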
Periodic Variables
Phase-Folding: L-Statistic
• Also uses binned data.
• Additionally considers intra-bin deviation from bin-mean.
• Divide the χ² value by the intra-bin dispersion, enhancing trial periods with low intra-bin deviation.
• Quick and accurate with medium to low-noise data.
• Created by S. Davies, 1990.
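The idea of dividing the between-bin deviation by the intra-bin dispersion can be sketched as follows. This is a simplified ratio written from the description above; the normalisation is an assumption and is not claimed to match the published definition of the L-statistic. The function name and the default bin count are hypothetical.

```python
import math


def fold_ratio_stat(times, mags, period, n_bins=10):
    """Between-bin deviation divided by intra-bin dispersion (simplified sketch)."""
    bins = [[] for _ in range(n_bins)]
    for t, m in zip(times, mags):
        phase = (t / period) % 1.0
        bins[min(int(phase * n_bins), n_bins - 1)].append(m)

    mean = sum(mags) / len(mags)
    between = 0.0   # deviation of the bin means from the overall mean
    within = 0.0    # scatter of the points about their own bin mean
    for b in bins:
        if not b:
            continue
        bin_mean = sum(b) / len(b)
        between += len(b) * (bin_mean - mean) ** 2
        within += sum((m - bin_mean) ** 2 for m in b)

    if within == 0.0:
        # No scatter inside any bin: a perfect fold, or constant data.
        return math.inf if between > 0.0 else 0.0
    return between / within
```

At the correct period the points in each bin agree with each other, so the denominator shrinks while the numerator grows, which sharpens the peak compared with using the bin means alone.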
Periodic Variables
Frequency Analysis:
Lomb-Scargle
• Uses the whole unbinned data time series (DTS).
• Created by Lomb 1976, refined by Scargle 1982. Code adapted from Numerical Recipes in C.
[Figure: Lomb-Scargle statistic against period (days)]
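For illustration, the classic normalised Lomb-Scargle power at a single trial period can be evaluated directly on the unbinned time series. The sketch below is a plain-Python version of the standard formula, not the adapted Numerical Recipes routine used in the talk; the function name is hypothetical.

```python
import math


def lomb_scargle_power(times, mags, period):
    """Normalised Lomb-Scargle power of an unevenly sampled series at one period."""
    n = len(times)
    mean = sum(mags) / n
    var = sum((m - mean) ** 2 for m in mags) / (n - 1)   # sample variance
    omega = 2.0 * math.pi / period

    # Time offset tau that makes the sine and cosine terms orthogonal.
    s2 = sum(math.sin(2.0 * omega * t) for t in times)
    c2 = sum(math.cos(2.0 * omega * t) for t in times)
    tau = math.atan2(s2, c2) / (2.0 * omega)

    yc = ys = cc = ss = 0.0
    for t, m in zip(times, mags):
        arg = omega * (t - tau)
        c, s = math.cos(arg), math.sin(arg)
        yc += (m - mean) * c
        ys += (m - mean) * s
        cc += c * c
        ss += s * s

    power = 0.0
    if cc > 0.0:
        power += yc * yc / cc
    if ss > 0.0:
        power += ys * ys / ss
    return power / (2.0 * var)
```

Evaluating this over a grid of trial periods gives a periodogram of the kind plotted on the slide. Every data point contributes to every trial period, which is why the method is slower than phase-folding.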
Periodic Variables
Frequency Analysis:
Schwarzenberg-Czerny
• Uses the whole unbinned data time series (DTS).
• Created by A. Schwarzenberg-Czerny 1996. Code adapted from Schwarzenberg-Czerny's own code.
[Figure: Schwarzenberg-Czerny statistic against period (days)]
Periodic Variables
Choice of Trial Periods:
• Linear difference in period, dP.
• Linear difference in phase, dφ.
• Too small a dP and we may search too fine a parameter space and waste CPU time.
• Too large a dP and we will not search finely enough.
[Figure labels: OK, 7%]
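The two spacing schemes can be compared with a short sketch. It assumes the usual relation that, over a data span T, changing the trial period P by dP shifts the phase of the last point by roughly dφ = T·dP/P², so a constant phase step corresponds to a period step that grows as P². The function names and parameters are hypothetical.

```python
import math


def linear_period_grid(p_min, p_max, dp):
    """Trial periods separated by a constant period step dp."""
    n = int(math.floor((p_max - p_min) / dp + 1e-9))
    return [p_min + k * dp for k in range(n + 1)]


def constant_phase_grid(p_min, p_max, span, dphi):
    """Trial periods separated by a constant phase shift dphi over the data span.

    The period step is dphi * P**2 / span, so the grid is fine at short
    periods and coarse at long ones.
    """
    periods = []
    p = p_min
    while p <= p_max:
        periods.append(p)
        p += dphi * p * p / span
    return periods
```

With a fixed dP, short periods are undersampled and long periods oversampled; the constant-phase grid puts trial periods only where the folded lightcurve would actually change.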
Periodic Variables
Conclusions:
• Phase-folding methods are swiftest
• Frequency Analysis methods are generally more reliable
• Autonomous pipelines require reliability over speed
• Schwarzenberg-Czerny would be the method of choice
• In which case a better period choice method is needed
Autonomous Classification
2 Main Stages:
• Memory Pattern Matching
• Modification of the Artificial Neural Network (ANN)
[Figure panels: INITIAL, FINAL]
Autonomous Classification
Memory Pattern Matching:
Why?
Autonomous Classification
Memory Pattern Matching:
• It allows us to begin grouping similar shapes together
• This grouping encourages self-organisation
• To pattern-match is the underlying goal!
• Finding a sensible position on the network for a pattern allows us to change the network
How?
Autonomous Classification
Memory Pattern Matching:
[Figure: the lightcurve pattern compared with the weight patterns of node 0 and node 1]
Node 0 has the smallest weight-difference vector, so node 0 wins.
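The matching step can be sketched as a search for the node whose weights lie closest to the phased lightcurve. Using the Euclidean length of the weight-difference vector is an assumption here, as are the function names.

```python
import math


def weight_difference(pattern, weights):
    """Length of the difference vector between a lightcurve and a node's weights."""
    return math.sqrt(sum((p - w) ** 2 for p, w in zip(pattern, weights)))


def winning_node(pattern, node_weights):
    """Index of the node with the smallest weight difference."""
    return min(
        range(len(node_weights)),
        key=lambda i: weight_difference(pattern, node_weights[i]),
    )


# Two nodes, each holding a three-point pattern.
lightcurve = [0.1, 0.9, 0.2]
nodes = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
winner = winning_node(lightcurve, nodes)  # node 0 is the closer match
```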
Autonomous Classification
Modification of the ANN:
• Modification affects an area
• Lessens as geometrical distance increases
• Area mixing encourages grouping
• The network can self-organise
• Hotspots occur
Autonomous Classification
Modification of the ANN:
• Adjust the weights on the network nodes so that they better represent the lightcurve:
$dw_i = \alpha P (L_i - w_i)$
• α is the learning parameter. It decreases towards 0 on each learning iteration of the network.
• P is the power (from the neighbour function) of the current node.
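A minimal sketch of this update, reading the rule as dw_i = αP(L_i − w_i), is given below. The Gaussian form of the neighbour function, the `radius` parameter and the function names are assumptions for the example; the talk does not specify them.

```python
import math


def neighbour_power(node_pos, winner_pos, radius):
    """Power P of a node: 1 at the winner, falling off with grid distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(node_pos, winner_pos))
    return math.exp(-d2 / (2.0 * radius * radius))


def update_network(weights, positions, winner, lightcurve, alpha, radius):
    """Apply dw_i = alpha * P * (L_i - w_i) to every node, in place."""
    for k, w in enumerate(weights):
        p = neighbour_power(positions[k], positions[winner], radius)
        for i in range(len(w)):
            w[i] += alpha * p * (lightcurve[i] - w[i])


# Three nodes on a 1-D grid; node 0 is the winner.
weights = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
positions = [(0,), (1,), (5,)]
update_network(weights, positions, winner=0, lightcurve=[1.0, 1.0],
               alpha=0.5, radius=1.0)
```

The winner moves furthest towards the lightcurve and its neighbours follow by smaller amounts, which is what lets similar shapes gather in the same area of the network. In training, `alpha` would be reduced on each iteration.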
The Future
• Enhanced clustering mechanism.
• More precise shape-similarity evaluating methods.
• More dynamically adaptive choice of trial periods for period searching.
• Refinement of these ideas and trying other methods.
• Investigate whether networks with more than two dimensions are worthwhile in the current format.
Questions?