Experiences From The Fermi Data Archive
GWODWS – Oct 27, 2011
A Brief Outline
– Fermi Mission Architecture
– Science Support Center Data Systems
– Experiences
What is Fermi?
Two instruments on board:
– Large Area Telescope (LAT) (PI: Peter Michelson, Stanford University; managing institution: SLAC)
  • Primary instrument: a gamma-ray (pair-production) telescope, 20 MeV – >300 GeV
  • Scanning mode: sees 20% of the sky at all times, and all parts of the sky for ~30 min. every 3 hours. This is the primary mode of operation.
– GLAST Burst Monitor (GBM) (PI: Charles Meegan, NASA/MSFC)
  • Context instrument: 8 keV – 30 MeV
  • Observes the entire unocculted sky at all times, searching for gamma-ray bursts
5-year mission (10-year goal) – no consumables on board to be depleted
Fermi Mission Data Flow
[Diagram: GLAST mission elements. The Fermi spacecraft (Large Area Telescope, GBM, GPS) downlinks through TDRSS (SN, S & Ku band) and the Ground Network (GN) to White Sands; the Mission Operations Center (MOC) at GSFC exchanges data, command loads, and schedules with the LAT Instrument Science Operations Center, the GBM Instrument Operations Center, and the Fermi Science Support Center; burst alerts go to the GRB Coordinates Network, and archived data to the HEASARC.]
Fermi Science Support Center Archive
Data (http://fermi.gsfc.nasa.gov/ssc/data)
– GBM
  • Daily data (CTIME and CSPEC)
  • Trigger and burst data (Time-Tagged Events)
  • All provided through BROWSE
– LAT
  • Photon (and extended) data
  • Spacecraft position, pointing, and livetime history
  • Provided through a custom web server
– Ancillary data
  • Catalogs
  • Background models
  • Others
  • Provided through BROWSE, FTP, and web pages
Science Tools
Automated Processing
All data received (from the MOC, GIOC, or LISOC) are identified and processed by an automated data ingest pipeline
– About 30 different data products received
– Some are science data; others are operations data for commanding the spacecraft
Built using OPUS as a framework; custom Perl scripts and C++ programs do the actual data processing:
– Data validation
– Data archiving
– Creation of BROWSE tables
– Ingest into custom data server
Pages (e-mails) are sent out when problems are encountered
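The identify → validate → archive → page-on-error flow described above can be sketched in a few lines. This is a minimal illustration, not the FSSC's actual OPUS/Perl/C++ code; the product prefixes and checks are invented.

```python
# Minimal sketch of an automated ingest pipeline: identify each incoming
# product by name, validate it, and collect an alert when something fails.
# Product names and validation rules here are hypothetical.

HANDLERS = {}

def handler(prefix):
    """Register a validation function for a product-name prefix."""
    def register(fn):
        HANDLERS[prefix] = fn
        return fn
    return register

@handler("gbm_ctime")
def check_ctime(payload):
    return "time" in payload                      # stand-in for a real format check

@handler("lat_photon")
def check_photon(payload):
    return {"ra", "dec", "energy"} <= set(payload)

def ingest(filename, payload, alerts):
    """Identify the product from its name, validate, and report status."""
    for prefix, check in HANDLERS.items():
        if filename.startswith(prefix):
            if check(payload):
                return "archived"                 # validation passed
            alerts.append(f"validation failed: {filename}")
            return "rejected"
    alerts.append(f"unknown product: {filename}") # page the operators
    return "unidentified"

alerts = []
print(ingest("lat_photon_w055.fits", {"ra": [], "dec": [], "energy": []}, alerts))
print(ingest("mystery.dat", {}, alerts))
```

A real pipeline would archive files and write BROWSE tables at the "archived" step; the point here is only the dispatch-and-alert structure.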
LAT Data Server
[Diagram components: Web Interface, HEASARC BROWSE, Queue Manager, MySQL Database, Ingest System, Photon Database, Event Database, Pointing and Livetime History Database, GSSC Internal Tools]
Photon Database Details
[Diagram: the web interface sends query requests to the QueueManager, with a MySQL server tracking query status; the control node passes the files to search and the query parameters to the search nodes, which hold local data copies and return node status and query results; the ingest system adds new data to the master data copy.]
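The control-node/search-node split described above is a classic fan-out: partition the file list across nodes, let each scan its local copy, then merge the partial results. A hedged sketch, with invented in-memory "files" standing in for the local data copies:

```python
# Illustrative fan-out query, not the GSSC's actual implementation.
from concurrent.futures import ThreadPoolExecutor

def search_node(files, query):
    """One search node: scan its local data copies for matching events."""
    lo, hi = query["energy_range"]
    return [ev for f in files for ev in f if lo <= ev["energy"] <= hi]

def run_query(all_files, query, n_nodes=3):
    """Control node: partition files across nodes, gather partial results."""
    chunks = [all_files[i::n_nodes] for i in range(n_nodes)]
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        parts = pool.map(search_node, chunks, [query] * n_nodes)
    return [ev for part in parts for ev in part]

files = [[{"energy": e} for e in (50, 500, 5000)] for _ in range(6)]
hits = run_query(files, {"energy_range": (100, 1000)})
print(len(hits))  # one matching event (500 MeV, say) per file -> 6
```

In the real system the "scan" is a C++ search over FITS files and the merge happens through the queue manager, but the data flow is the same.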
Fermi Data Experiences
We’ve been providing data to the general community for just over two years now (three years to the LAT Collaboration), with no serious issues with the data servers.
Things that went well:
– Data Challenges
– Training Workshops
Things that have caused headaches:
– Multiple data sources
– Data reprocessing
– Data monitoring and validation
– Science Analysis Tools
Data Challenges
Conceived as a method to check the usefulness of the data, data servers, and science analysis software before launch.
Staged testing:
– 1 day (1 week) of data
– 55 days of data (one precession period of the spacecraft orbit)
– 1 year of data
Analysis tools grew more and more sophisticated as we went; the data server was used to provide data for all three data challenges.
Provided experience ingesting and serving the data, and gathered feedback on the usefulness of the data server.
– Surprisingly, very few changes were suggested.
User Training
We regularly hold workshops to teach users about the Fermi data and Science Tools.
Held around the country, usually in the summer.
Gives people hands-on experience, with experts there in person to answer questions:
– Getting the data
– Using the various science tools
Provides valuable feedback to the FSSC on a variety of topics:
– Ease of use of the server and tools
– Issues with the various OSes the tools are supported on
– Documentation
Data From Every Direction
The FSSC receives data from multiple sources. A detailed File Format Description document describes the data we should be receiving, yet hardly a day goes by without some received data violating this design.
We have to decide whether the violation is important:
– Does it impact science or operations?
– Or is it merely cosmetic?
Probably >90% of the time this is data from the GBM team:
– Underfunded to reliably build an automated data processing system
– A lot of processing is still done by hand
Data Reprocessing/Reingest
Complete reprocessing of the entire mission data set for a given data type. This has happened three times now for the LAT data, and at least once for the GBM data.
Primarily this is a huge bookkeeping exercise. We also have to monitor resource usage:
– Bandwidth
– Processing power
– Disk space
It also requires a bit of data validation work:
– Something we need to be better about automating (my current project)
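The bookkeeping core of a reprocessing pass is simple to state: every archived file must be replaced exactly once, with nothing missing and nothing unexpected. A minimal sketch of that manifest comparison (the file names are invented):

```python
# Compare the old archive manifest against the reprocessed delivery.
# Not the FSSC's tooling -- just the bookkeeping idea in miniature.

def compare_manifests(old, new):
    """Return (files not yet redelivered, files delivered but never archived)."""
    missing = sorted(set(old) - set(new))
    extra = sorted(set(new) - set(old))
    return missing, extra

old = {"lat_photon_w010.fits": "v1", "lat_photon_w011.fits": "v1"}
new = {"lat_photon_w010.fits": "v2"}
missing, extra = compare_manifests(old, new)
print(missing)  # ['lat_photon_w011.fits'] -- still awaiting a reprocessed copy
print(extra)    # []
```

In practice the manifests would also carry checksums and version tags, so the same comparison catches a file that was redelivered but not actually reprocessed.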
Data Validation
There are so many ways for the data to get messed up. Validation needs to be as automated and resource-light as you can make it:
– Needs to work on lots of data
– Needs to run as often as possible
Things that have bitten us:
– Missing data (gaps)
– Duplicate data (double ingests)
– Corrupted data (never did figure this one out)
We need good tools for correcting errors as quickly and automatically as possible. This is something we are still working on at the FSSC.
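Two of the three failure modes listed above (gaps and double ingests) can be screened cheaply from file time ranges alone, without opening the data. A sketch, assuming each product carries a (start, stop) interval in mission seconds:

```python
# Screen a set of file time intervals for gaps (missing data) and
# overlaps (double ingests). Resource-light: metadata only, no file I/O.

def screen(intervals, max_gap=0.0):
    """Sort by start time; flag gaps and overlaps between consecutive files."""
    gaps, overlaps = [], []
    ordered = sorted(intervals)
    for (s1, e1), (s2, e2) in zip(ordered, ordered[1:]):
        if s2 - e1 > max_gap:
            gaps.append((e1, s2))                   # missing data
        elif s2 < e1:
            overlaps.append((s2, min(e1, e2)))      # double ingest
    return gaps, overlaps

gaps, overlaps = screen([(0, 100), (100, 200), (250, 300), (280, 350)])
print(gaps)      # [(200, 250)]
print(overlaps)  # [(280, 300)]
```

Corruption is the hard case precisely because it passes this kind of check; catching it requires content-level validation (checksums, physical-range checks on the columns).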
Science Tools
The Fermi mission provides a set of Science Analysis Software tools for analyzing Fermi data:
– Primarily tools for the LAT data
– GBM data can be processed using already-existing tools such as Xspec
Nominally a single tool set for both the instrument teams and the general scientific community, it suffered (still suffers?) from an improper division of responsibility:
– Instrument team responsible for tool development
– FSSC responsible for dealing with user issues
Large Area Telescope (LAT) Data
Not images → event lists. Three different types of data:
– Photon data (the most-used dataset)
  • Events that have the highest probability of truly being photons and that have well-defined Instrument Response Functions
  • A table of detected photons and their properties (RA, Dec, energy, time, etc.)
– Event data
  • A superset of the photon data
  • Includes more events that are probably particles (e-, e+, p+, etc.)
  • ~10x more data columns than the basic photon data
– Spacecraft data
  • Position and attitude history
  • Instrument mode
  • Used in calculating the livetime and exposure of the instrument
Large Area Telescope (LAT) Data
LAT characteristics:
– Large FOV (~2.4 sr ≈ 1/6 of the entire sky)
– Large energy range (30 MeV to >300 GeV) with an energy-dependent PSF
– Always-on sky-survey operation
  • Covers the entire sky every two orbits (~3 hours)
  • Any given position is observed for about 30 minutes during the two orbits
  • Constantly changing instrument response for a given position on the sky
Photon data rate:
– Currently coming in at about 4.4 Hz → 120 million photons per year
– Represents 1% of the total data rate
– New data available every 2–3 hours
Issues for Serving LAT Data
Continuous data stream (define an “observation”, please):
– We’re looking everywhere all the time
– The concept of an “observation” has no meaning in the traditional sense
Source confusion (is this your photon or mine?):
– Energy-dependent PSF
  • 3.5° at 100 MeV
  • 0.15° at 1 GeV
– This means that sources overlap (especially at low energies)
– Analysis requires accounting for all nearby sources as well as the one you are interested in
– Usually need a 10–15° radius region around your source → ~2% of the entire sky
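The "~2% of the entire sky" figure follows from the solid angle of a cone: a circular region of half-angle θ subtends 2π(1 − cos θ) steradians, out of 4π for the full sky. A quick check:

```python
# Fraction of the full sky covered by a circular region of a given radius:
# 2*pi*(1 - cos(theta)) / (4*pi) = (1 - cos(theta)) / 2.
from math import cos, radians

def sky_fraction(radius_deg):
    """Fraction of the celestial sphere inside a cone of this half-angle."""
    return (1.0 - cos(radians(radius_deg))) / 2.0

print(f"{sky_fraction(10):.2%}")  # 0.76%
print(f"{sky_fraction(15):.2%}")  # 1.70%
```

So a 15° analysis region is indeed about 2% of the sky, and since sources are everywhere, every query drags in a surprisingly large patch of data.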
System Design Considerations
Continuously updating data stream:
– Data downlinked from the spacecraft every 3–4 hours
– Processed data arriving from the LISOC every 2–4 hours
Want fast response times to queries:
– Loosely modeled on the HEASARC Browse system
  • NASA interface used for other high-energy mission data
  • Stores only metadata and provides an immediate query response with lists of available data
– “Delay tolerance” for a web response is usually <10 s
– Need to provide rapid feedback on status
Typical queries will be for some region of the sky over a given time period (possibly the entire mission). Three different types of coordinated data are served.
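The typical query described above, all events within some radius of a sky position over a time window, reduces to a cone search plus a time cut. A self-contained sketch using the standard spherical-separation formula (the event records are invented):

```python
# Cone search over an event list: angular separation via the spherical
# law of cosines, plus a [t0, t1] time cut. Illustrative only.
from math import acos, cos, sin, radians, degrees

def separation(ra1, dec1, ra2, dec2):
    """Angular separation in degrees between two sky positions."""
    ra1, dec1, ra2, dec2 = map(radians, (ra1, dec1, ra2, dec2))
    c = sin(dec1) * sin(dec2) + cos(dec1) * cos(dec2) * cos(ra1 - ra2)
    return degrees(acos(max(-1.0, min(1.0, c))))  # clamp for rounding safety

def cone_search(events, ra, dec, radius_deg, t0, t1):
    """Select events inside the cone and inside the time window."""
    return [ev for ev in events
            if t0 <= ev["time"] <= t1
            and separation(ev["ra"], ev["dec"], ra, dec) <= radius_deg]

events = [{"ra": 83.6, "dec": 22.0, "time": 100.0},    # near the Crab
          {"ra": 266.4, "dec": -29.0, "time": 150.0}]  # Galactic center
print(len(cone_search(events, 83.63, 22.01, 15.0, 0.0, 200.0)))  # 1
```

The production server does this over billions of rows with the search-node fan-out described earlier; the selection logic per event is essentially this.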