
Experiences From The Fermi Data Archive

Dr. Thomas Stephens Wyle IS/Fermi Science Support Center

GWODWS – Oct 27, 2011 - 2

A Brief Outline

Fermi Mission Architecture
Science Support Center Data Systems
Experiences


What is Fermi

Two instruments on board
  – Large Area Telescope (LAT) (PI – Peter Michelson, Stanford University; managing institution: SLAC)
    • Primary instrument, gamma-ray (pair production) telescope – 20 MeV – >300 GeV
    • Scanning mode – sees 20% of the sky all the time; all parts of sky for ~30 min. every 3 hours. This is the primary mode of operation.
  – GLAST Burst Monitor (GBM) (PI – Charles Meegan, NASA/MSFC)
    • Context instrument – 8 keV – 30 MeV
    • Observes entire unocculted sky all the time searching for gamma-ray bursts
5 year mission (10 year goal) – no consumables on board to be depleted

[Images: Large Area Telescope (LAT); GLAST Burst Monitor (GBM)]


Fermi Mission Data Flow

[Diagram: GLAST mission elements. Boxes: Fermi Spacecraft (Large Area Telescope & GBM), GPS, Delta 7920H, TDRSS SN (S & Ku), GN, White Sands, Mission Operations Center (MOC), LAT Instrument Science Operations Center, GBM Instrument Operations Center, Fermi Science Support Center, HEASARC (GSFC), GRB Coordinates Network. Labels: Telemetry 1 kbps; µsec; Alerts; Data, Command Loads; Schedules.]


Fermi Science Support Center Archive

Data (http://fermi.gsfc.nasa.gov/ssc/data)
  – GBM
    • Daily data (CTIME and CSPEC)
    • Trigger and burst data (Time Tagged Events)
    • All provided through BROWSE
  – LAT
    • Photon (and extended) data
    • Spacecraft position and pointing and live time history
    • Provided through a custom web server
  – Ancillary data
    • Catalogs
    • Background models
    • Others
    • Provided through BROWSE, FTP and web pages

Science Tools


Automated Processing

All data received (from MOC, GIOC, or LISOC) is identified and processed by an automated data ingest pipeline
  – About 30 different data products received
  – Some are science data, others operations data for commanding the spacecraft
Built using OPUS as a framework
Custom Perl scripts and C++ programs do the actual data processing
  – Data validation
  – Data archiving
  – Creation of BROWSE tables
  – Ingest into custom data server
Pages (e-mails) sent out when problems are encountered
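The identify, validate, archive, and page steps listed on this slide can be sketched in a few lines. This is only an illustration in Python, not the FSSC's OPUS/Perl/C++ pipeline: the product names, filename patterns, size check, and `notify` callback are all hypothetical stand-ins.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical filename patterns. The real products are defined by the
# mission's file format documents, which are not reproduced here.
PRODUCT_PATTERNS = {
    "lat_photon": re.compile(r"^lat_photon_\d+\.fits$"),
    "gbm_ctime": re.compile(r"^gbm_ctime_\d+\.fits$"),
}


@dataclass
class IngestLog:
    """Records what was archived and which pages were sent."""
    archived: list = field(default_factory=list)
    pages: list = field(default_factory=list)


def identify(filename: str) -> Optional[str]:
    """Return the product type whose pattern matches, or None."""
    for product, pattern in PRODUCT_PATTERNS.items():
        if pattern.match(filename):
            return product
    return None


def ingest(filename: str, size_bytes: int, log: IngestLog,
           notify: Callable[[str], None]) -> bool:
    """Identify, validate, and archive one file; page someone on failure."""
    product = identify(filename)
    if product is None:
        notify(f"unidentified file: {filename}")
        return False
    if size_bytes <= 0:  # stand-in for real validation checks
        notify(f"validation failed for {product}: {filename}")
        return False
    log.archived.append((product, filename))
    return True
```

Passing the notifier in as a callback keeps the sketch testable; a real system would send the e-mail page there.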


LAT Data Server

[Diagram: components of the LAT data server. Boxes: Queue Manager, Photon Database, Event Database, Pointing and Livetime History Database, GSSC Internal Tools, MySQL Database, Web Interface, Ingest System, HEASARC BROWSE.]


Photon Database Details

[Diagram: photon database layout. Boxes: Search Nodes, Queue Manager, Ingest System, Control Node, Master Data Copy, Local Data Copies, MySQL server, Web Interface. Labels: Files to search, Query parameters; Node status, Query results; New data; Query requests; Query status.]


Data Queries Over Time


Data Served


Fermi Data Experiences

We’ve been providing data to the general community for just over two years now (three years to the LAT Collaboration).
No serious issues with the data servers
Things that went well
  – Data Challenges
  – Training Workshops
Things that have caused headaches
  – Multiple data sources
  – Data Reprocessing
  – Data Monitoring and validation
  – Science Analysis Tools


Data Challenges

Conceived as a method to check the usefulness of the data, data servers, and science analysis software before launch
Staged testing
  – 1 day (1 week) data
  – 55 days data (one precession period of the spacecraft orbit)
  – 1 year of data
Analysis tools got more and more sophisticated as we went
Data server was used to provide data for all three data challenges
Provided experience ingesting and serving the data and gathered feedback on usefulness of the data server.
  – Surprisingly, very few changes were suggested.


User Training

Regularly hold workshops to teach users about the Fermi data and Science Tools
Held around the country, usually in the summer
Gives people hands-on experience with experts there in person to answer questions
  – Getting the data
  – Using the various science tools
Provides valuable feedback to FSSC on a variety of topics
  – Ease of use of server and tools
  – Issues with the various OSs the tools are supported on
  – Documentation


Data From Every Direction

FSSC receives data from multiple sources
Detailed File Format Description document describes the data we should be receiving
Hardly a day goes by without some received data violating this design.
Have to decide if the violation is important or not
  – Does it impact science or operations?
  – Is it more cosmetic?
Probably >90% of the time this is data from the GBM team
  – Underfunded to reliably build automated data processing system
  – A lot of processing still done by hand.


Data Reprocessing/Reingest

Complete reprocessing of the entire mission data set for a given data type.
Has happened three times now for the LAT data, and at least once for the GBM data
Primarily this is a huge bookkeeping exercise
Also have to monitor resource usage
  – Bandwidth
  – Processing power
  – Disk Space
Requires a bit of data validation work
  – Something we need to be better about automating (my current project)


Data Validation

There are so many ways for the data to get messed up
This needs to be as automated and resource light as you can make it
  – Needs to work on lots of data
  – Needs to run as often as possible
Things that have bitten us
  – Missing data (gaps)
  – Duplicate data (double ingests)
  – Corrupted data (never did figure this one out)
Need good tools for correcting errors as quickly and automatically as possible
This is something we are still working on at the FSSC
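Two of the failure modes named on this slide, gaps and double ingests, lend themselves to a cheap automated check. The sketch below is illustrative only and is not the FSSC's validation code: it assumes each ingested file can be summarized as a hypothetical `(start, stop)` time interval in seconds.

```python
from typing import List, Tuple

# Hypothetical summary of one ingested file: (start, stop) in seconds.
Interval = Tuple[float, float]


def find_gaps(intervals: List[Interval], tolerance: float = 0.0) -> List[Interval]:
    """Return (end_of_coverage, next_start) pairs where data is missing."""
    gaps = []
    ordered = sorted(set(intervals))  # duplicates do not affect coverage
    covered_until = ordered[0][1] if ordered else 0.0
    for start, stop in ordered[1:]:
        if start - covered_until > tolerance:
            gaps.append((covered_until, start))
        covered_until = max(covered_until, stop)
    return gaps


def find_duplicates(intervals: List[Interval]) -> List[Interval]:
    """Return each interval that appears more than once (double ingest)."""
    seen, dupes = set(), []
    for interval in intervals:
        if interval in seen and interval not in dupes:
            dupes.append(interval)
        seen.add(interval)
    return dupes
```

Both checks need only file-level metadata, so they can run often and on the whole mission data set without touching the event data itself.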


Science Tools

The Fermi mission provides a set of Science Analysis Software tools for analyzing Fermi data
  – Primarily tools for the LAT data
  – GBM data can be processed using already existing tools such as Xspec
Supposedly a single tool set for the instrument team and the general scientific community
Suffered (still suffers?) from improper division of responsibility
  – Instrument team responsible for tool development
  – FSSC responsible for dealing with user issues

Backup Slides


Large Area Telescope (LAT) Data

Not images → event lists
Three different types of data
  – Photon Data (most used dataset)
    • Events that have the highest probability of truly being photons and which have well defined Instrument Response Functions
    • Table of detected photons and their properties (RA, Dec, Energy, Time, etc.).
  – Event Data
    • Superset of Photon data
    • Includes more events that are probably particles (e-, e+, p+, etc.)
    • More data columns (x10) than the basic photon data
  – Spacecraft Data
    • Position and attitude history
    • Instrument mode
    • Used in calculating livetime and exposure of instrument


Large Area Telescope (LAT) Data

LAT Characteristics
  – Large FOV (~2.4 sr = 1/6th of the entire sky)
  – Large energy range (30 MeV to >300 GeV) with energy-dependent PSF
  – Always-on sky survey operation
    • Entire sky every two orbits (~3 hours)
    • Any given position observed for about 30 minutes during the two orbits
    • Constantly changing instrument response for a given position on the sky
Photon Data Rate
  – Currently coming in at a rate of about 4.4 Hz → 120 million per year
  – Represents 1% of total data rate
  – New data available every 2-3 hours
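The rate-to-yearly-total conversion is simple arithmetic and can be checked directly. The helper names below are illustrative. A rate of 4.4 Hz held for a full calendar year gives somewhat more than the slide's rounded 120 million, so the second function computes the fraction of the year that the quoted total implies, without assuming a cause for the difference.

```python
SECONDS_PER_YEAR = 365.25 * 86400  # one Julian year in seconds


def events_per_year(rate_hz: float, live_fraction: float = 1.0) -> float:
    """Events accumulated in one year at a steady rate."""
    return rate_hz * SECONDS_PER_YEAR * live_fraction


def implied_live_fraction(rate_hz: float, yearly_total: float) -> float:
    """Fraction of the year needed to reach yearly_total at rate_hz."""
    return yearly_total / (rate_hz * SECONDS_PER_YEAR)


full_year = events_per_year(4.4)              # about 1.39e8 events
fraction = implied_live_fraction(4.4, 120e6)  # about 0.86
```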


Issues for Serving LAT Data

Continuous data stream (Define an “observation” please.)
  – We’re looking everywhere all the time
  – The concept of an “observation” has no meaning in a traditional sense
Source Confusion (Is this your photon or mine?)
  – Energy-dependent PSF
    • 3.5° at 100 MeV
    • 0.15° at 1 GeV
  – This means the sources overlap (especially at low energies)
  – Analysis requires accounting for all nearby sources as well as the one you are interested in
  – Usually need data from a 10-15° radius region around your source → ~2% of the entire sky.
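The "~2% of the entire sky" figure follows from the solid angle of a cone, 2π(1 − cos θ), divided by the 4π steradians of the full sky. A short check, with a function name chosen here only for illustration:

```python
import math


def cone_sky_fraction(radius_deg: float) -> float:
    """Fraction of the full sky inside a cone of the given angular radius."""
    return (1.0 - math.cos(math.radians(radius_deg))) / 2.0


small = cone_sky_fraction(10.0)  # about 0.0076, i.e. 0.8% of the sky
large = cone_sky_fraction(15.0)  # about 0.0170, i.e. 1.7% of the sky
```

The 15° end of the range rounds to the slide's ~2%, so a single-source analysis typically pulls on the order of a fiftieth of the photon data for the chosen time range.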


System Design Considerations

Continuously updating data stream
  – Data downlinked from the spacecraft every 3-4 hours
  – Processed data arriving from the LISOC every 2-4 hours
Want fast response times to queries
  – Loosely modeled on HEASARC Browse system
    • NASA interface used for other high energy mission data
    • Only stores metadata and provides immediate query response with lists of available data
  – “Delay tolerance” for web response usually < 10s
  – Need to provide rapid feedback on status
Typical queries will be for some region of the sky over a given time period (possibly the entire mission)
Three different types of coordinated data served
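A region-plus-time query of the kind described here amounts to a cone cut and a time cut over an event list. The following is a minimal in-memory sketch, not the data server's implementation: the `Photon` record layout, its units, and the function names are assumptions made for illustration.

```python
import math
from typing import Iterable, List, NamedTuple


class Photon(NamedTuple):
    """Hypothetical event record; field names and units are assumptions."""
    ra: float      # right ascension, degrees
    dec: float     # declination, degrees
    energy: float  # MeV
    time: float    # seconds


def angular_separation(ra1: float, dec1: float,
                       ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees (haversine form)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def select(photons: Iterable[Photon], ra: float, dec: float,
           radius_deg: float, t_start: float, t_stop: float) -> List[Photon]:
    """Keep photons inside the cone and inside the time window."""
    return [p for p in photons
            if t_start <= p.time <= t_stop
            and angular_separation(p.ra, p.dec, ra, dec) <= radius_deg]
```

The time cut is tested first because it is the cheaper comparison; a server holding the whole mission would also need indexing to avoid scanning every event.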