The Case for Provenance - IVOA
The Case for Provenance
Juande Santander-Vela, Arancha Delgado
European Southern Observatory, Archive Department
IVOA InterOp Meeting Garching, 09/11/09
Talk Outline
What is Provenance?
How does Provenance fit in the VO?
The case(s) for Provenance
Conclusions
What is Provenance?
provenance |ˈprävənəns|noun
the place of origin or earliest known history of something : an orange
rug of Iranian provenance.
• the beginning of something's existence; something's origin : they try
to understand the whole universe, its provenance and fate.
See note at origin.
• a record of ownership of a work of art or an antique, used as a
guide to authenticity or quality : the manuscript has a distinguished
provenance.
ORIGIN late 18th cent.: from French, from the verb provenir ‘come or stem from,’ from Latin provenire, from pro- ‘forth’ +
venire ‘come.’
What is Provenance?
The definition points to:
History of astronomical data products
Ownership/Attribution (Observer, Proposal, Telescope/Instrument)…
Quality
What is Provenance?
TYPICAL ASTRONOMICAL WORKFLOW
List of targets
Find existing data per target and band
Filter out data regarding quality parameters
Manipulate retrieved datasets
Data provenance information needed
CHARACTERISATION
What is Provenance?
Photon emission: Astronomical Source
Galactic & Extragalactic extinction
Atmosphere (Seeing & Opacity)
Telescope, Filters, Detectors…
Data generation: raw files F1, F2, … Fn
Data Reduction Software (uses knowledge of Telescope, Filters, Detectors…)
Data products: P1, P2, … Pm
NOT ONLY FITS FILES
ALSO CATALOGUES
What is Provenance?
Observation Provenance
Software
Processing
Calibration
Ambient
Instrumental
…
Project Metadata
ORGANISATIONAL
“ACQUISITIONAL”
INTERNAL PROVENANCE WHITEPAPER BY N. DELMOTTE
How does Provenance fit in the VO?
Photon emission: Astronomical Source
Galactic & Extragalactic extinction
Atmosphere (Seeing & Opacity)
Telescope, Filters, Detectors…
Data generation: raw files F1, F2, … Fn
Data Reduction Software (uses knowledge of Telescope, Filters, Detectors…)
Data products: P1, P2, … Pm
How does Provenance fit in the VO?
Data Model for Observation
Observation, ObsData, Characterization, Provenance, Curation, Policy, Packaging, Target
Data Model for Dataset Characterisation
Accuracy, Sensitivity, Precision, Resolution, Coverage
RADAMS (Ph.D. Thesis)
Extended Provenance
SEE TALK BY FRANÇOIS BONNAREL
The case(s) for Provenance
Use cases for Provenance, centred around the core concepts:
History of a data set (documentation)
Establishing ownership/providing attribution
Quality assessment
History of a data set (ESO)
Period, Program Type, Program ID, Run ID, Night, OB ID, Night Log
Raw File (DP_ID), DPR CATG, Run ID, Night
Reduced Science Data, Master Calib. Data, Ancillary File
Project, Data Release ID, Structured ReadMe, Association IDs, Main File(s) (DP_ID), Associated File(s) (DP_ID)
bibcode, ProgID, RunID, DataID
Levels: Level 0, Level 1, Level 2, Level 3, Level 4
History of a data set (ESO)
COMPILATION BY A. DELGADO
observations: ID, telescope, instrument, technique, mode, ambient
ambient: seeing, air mass, extinction
instruments: comm/decomm, name, telescope / focus, technique, modes, detectors, optical elements, mappings
filters: ID, name, PWL, FWHM (type), comm/decomm, transmission curve, mappings
grisms: ID, name, wavelength range, grating, blaze angle, dispersion, resolution, comm/decomm, transmission curve, mappings
slits: ID, name, slit width, type, comm/decomm, mappings
telescope: comm/decomm, efficiencies, observatory, name, instruments
technique: comm/decomm, modes, mappings
modes: comm/decomm, total transmission curve, name, instrument, optical elements, mappings
detector: comm/decomm, sensitivity
gratings: ID, name, orders, wavelengths..., gap order, dispersion, resolution, comm/decomm, mappings
etc.
History of a data set (ESO)
telescope, instrument, technique, detector, mode
filters, grisms, gratings, slits, etc.
observatory
ambient, objectives
COMPILATION BY A. DELGADO
History of a data set (ESO)
Telescope Information
Site
Dates: First Light, Commissioning…
Instruments & Optical Elements
Instruments
Technologies
Configurations
Optical Elements
Installation dates
Kind (Filter, Grisms, Gratings…)
Names ↔ IDs ↔ Transmission Curves
COMPILATION BY A. DELGADO
TIME CHANGING
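Because installation dates and comm/decomm (commissioning/decommissioning) information change over time, recovering the configuration behind an old observation amounts to a date-range lookup. The following is a minimal sketch of that idea, not ESO's actual schema: the class FilterRecord, the function filters_in_use, and all IDs and dates are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class FilterRecord:
    """Hypothetical record for one optical element (not an ESO schema)."""
    filter_id: str
    name: str
    commissioned: date
    decommissioned: Optional[date]  # None means still installed


def filters_in_use(records: List[FilterRecord], when: date) -> List[FilterRecord]:
    """Return the records whose comm/decomm interval contains `when`.

    The interval is treated as half-open: [commissioned, decommissioned).
    """
    return [
        r for r in records
        if r.commissioned <= when
        and (r.decommissioned is None or when < r.decommissioned)
    ]


# Illustrative data: the same filter name maps to two physical elements over time.
records = [
    FilterRecord("F001", "V", date(2000, 1, 1), date(2005, 6, 1)),
    FilterRecord("F002", "V", date(2005, 6, 1), None),
]
active = filters_in_use(records, date(2003, 3, 15))
```

The same filter name resolves to different IDs (and hence different transmission curves) depending on the observation date, which is why the name-to-ID mapping needs its own history.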
History of a data set
RADAMS: ObservationProvenance, InstrumentConf
AntennaConf: name
Feed: polarisation (L, R, X, Y)
BeamConf: name
Antenna: +AntennaConf.name, mount, majorAxis, minorAxis, effectiveArea
Instrument: name, description, shortName, locationName, URL
Location: +Instrument.locationName, latitude, longitude, height
Beam: +beamConf.name, beamMajor, beamMinor, sensitivity, meanBeamSolidAngle, minorBeamSolidAngle, directivity, gain, resolution
Receiver: +beamConf.Name, skyCentreFreq, ifCentreFreq, bandwidth
Spectrum: +beamConf.name, chanSeparation, freqResol, velRefFrame, refChanNum, refChanFreq, restFreq, molecule, transition
Velocity: +beamConf.name, chanSeparation, velResol, velRefFrame, refChanNum, refChanVel, restFreq, molecule, transition
This set is defined for each instrument band AND receiver configuration
INSPIRED BY LAMB & POWER’S RAW RADIO DM NOTE
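To make the structure of the RADAMS classes more concrete, here is a minimal sketch of two of them, Instrument and Location, as Python dataclasses. It assumes that the "+Instrument.locationName" entry marks a reference from Location back to the Instrument's locationName; that reading, the units in the comments, the helper resolve_location, and all values are assumptions for illustration, not part of RADAMS itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Location:
    location_name: str  # assumed to match Instrument.locationName
    latitude: float     # units assumed: degrees
    longitude: float    # units assumed: degrees
    height: float       # units assumed: metres


@dataclass(frozen=True)
class Instrument:
    name: str
    description: str
    short_name: str
    location_name: str
    url: str


def resolve_location(instrument: Instrument, locations: List[Location]) -> Location:
    """Follow the locationName reference from an Instrument to its Location."""
    by_name = {loc.location_name: loc for loc in locations}
    return by_name[instrument.location_name]


# Hypothetical values, for illustration only.
locations = [Location("Example Site", 37.0, -3.4, 2850.0)]
telescope = Instrument(
    name="Example 30m",
    description="Hypothetical single-dish telescope",
    short_name="EX30",
    location_name="Example Site",
    url="https://example.org/ex30",
)
site = resolve_location(telescope, locations)
```

The other classes (Antenna, Beam, Receiver, Spectrum, Velocity) would follow the same pattern, each keyed on the configuration name it refers to.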
History of a data set
RADAMS: Observation Provenance, InstrumentConf, AmbientConditions
Processing: timeStamp
Calibration: +Processing.timeStamp, parameter.name, parameter.kind, parameter.value, parameter.sigma, parameter.calCoeff[n]
ProcessingStep: +Processing.timeStamp, kind, softwarePackage, parameter.name, parameter.kind, parameter.value
History of a data set
RADAMS: Observation Provenance, InstrumentConf
AmbientConditions: timeStamp, opacity, temperature, humidity, wind
OpacityCurve: +AmbientConditions.timeStamp, opacity, skydipStart, azimuth, elevation.[n], tsky.[n], atmosphericModel
TIME CHANGING
History of a data set
ESO compilation (telescope, instrument, technique, detector, mode, filters, grisms, gratings, slits, observatory, ambient, objectives) shown alongside RADAMS ObservationProvenance (InstrumentConf, AntennaConf, Feed, BeamConf, Antenna, Instrument, Location, Beam, Receiver, Spectrum, Velocity)
UNIFICATION OF PROVENANCE AT THE AMBIENT, TELESCOPE_CONF LEVELS: THE REST IS VERY HETEROGENEOUS
History of a data set
Can only be done with:
Historical archive of:
optical elements
software (possibly with virtualization)
configuration values / fudge factors
ambient information
DIMM seeing, opacity, conductivity…
+ SYSTEMATIC OBSERVING LOGS!
Ownership Tracking
Period, Program Type, Program ID, Run ID, Night, OB ID, Night Log
Raw File (DP_ID), DPR CATG, Run ID, Night
Reduced Science Data, Master Calib. Data, Ancillary File
Project, Data Release ID, Structured ReadMe, Association IDs, Main File(s) (DP_ID), Associated File(s) (DP_ID)
bibcode, ProgID, RunID, DataID
Levels: Level 0, Level 1, Level 2, Level 3, Level 4
Ownership Tracking
Can only rely on unique identifiers being maintained
Other VO manipulations (SAMP messages, cross-matching) can lose associations
Need to provide services for identification of key IDs
Increase of the role of the IVOA Registry?
This is stopping some small publishers!
THIS TRACKING CAN BE USED TO MEASURE PUBLICATION RATIOS
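One way associations get lost in a cross-match is that same-named identifier columns from the two inputs collide or are dropped. The sketch below shows a merge that keeps both sides' identifiers by prefixing them with their origin. It is a deliberately simplified illustration: real cross-matching is positional rather than by exact key, the function cross_match is hypothetical, and the DP_ID and ProgID values are made-up placeholders.

```python
from typing import Dict, List


def cross_match(left: List[Dict], right: List[Dict], key: str) -> List[Dict]:
    """Join two lists of row dicts on `key`.

    Every non-key column is prefixed with its origin, so identifiers such as
    DP_ID or ProgID from both inputs survive in the merged rows.
    """
    right_by_key = {row[key]: row for row in right}
    matched = []
    for row in left:
        other = right_by_key.get(row[key])
        if other is None:
            continue  # no counterpart: row is not part of the match
        merged = {key: row[key]}
        merged.update({f"left_{k}": v for k, v in row.items() if k != key})
        merged.update({f"right_{k}": v for k, v in other.items() if k != key})
        matched.append(merged)
    return matched


# Placeholder identifiers, for illustration only.
optical = [{"target": "NGC 1068", "DP_ID": "EXAMPLE.2009-01-01T00:00:00.000",
            "ProgID": "000.A-0000(A)"}]
radio = [{"target": "NGC 1068", "DP_ID": "EXAMPLE.2009-02-02T00:00:00.000",
          "ProgID": "000.B-0000(A)"}]
rows = cross_match(optical, radio, "target")
```

With both identifier sets preserved, a merged row can still be traced back to the programmes that produced each input.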
Quality Assessment
Linked to:
Acquisition configuration
Actual problems with instruments/telescope
Weather
Intended usage of the dataset(s)!
NO OBJECTIVE, ONE-SIZE-FITS-ALL QUALITY ASSESSMENT METRIC
TIME CHANGING
Queryable Provenance?
Most queries are on Characterisation/Target/Curation
Instrument-specific forms for instrument configurations
Allow UType/UCD/IVOAT key-value pairs?
Instrument-specific a priori quality assessment
A posteriori, usage-specific quality assessment, offline > usage-specific query to data Provenance
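As a rough illustration of what a usage-specific query over key-value provenance could look like, the sketch below filters datasets by per-key predicates. It is only a sketch under stated assumptions: the function query_by_provenance, the key names (which merely imitate the shape of UTypes), and the dataset records are all hypothetical, and no existing VO protocol is implied.

```python
from typing import Callable, Dict, List


def query_by_provenance(
    datasets: List[Dict],
    constraints: Dict[str, Callable[[object], bool]],
) -> List[str]:
    """Return the IDs of datasets whose provenance satisfies every constraint.

    A dataset lacking one of the constrained keys is rejected, since its
    quality for this usage cannot be assessed.
    """
    selected = []
    for dataset in datasets:
        provenance = dataset["provenance"]
        if all(key in provenance and test(provenance[key])
               for key, test in constraints.items()):
            selected.append(dataset["id"])
    return selected


# Hypothetical key-value provenance records.
datasets = [
    {"id": "ds1", "provenance": {"prov.instrument.name": "EXAMPLE-CAM",
                                 "prov.ambient.seeing": 0.8}},
    {"id": "ds2", "provenance": {"prov.instrument.name": "EXAMPLE-CAM",
                                 "prov.ambient.seeing": 1.6}},
    {"id": "ds3", "provenance": {"prov.instrument.name": "OTHER-SPEC"}},
]

# A usage-specific criterion: this science case needs sub-arcsecond seeing.
good_seeing = query_by_provenance(datasets, {
    "prov.instrument.name": lambda name: name == "EXAMPLE-CAM",
    "prov.ambient.seeing": lambda seeing: seeing < 1.0,
})
```

Passing the criteria as predicates keeps the quality judgement with the user, in line with the point that no single metric fits every intended usage.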
Conclusions
Provenance is an integral part of the Observation DM
Provenance comes with discipline, but allows for quality science
Different approach for different kinds of instruments, but all under the same general framework
Provenance should be accessible for any item; specialised TAP version?
Conclusions
Data centres: consistent naming/coding, plus mappings for existing data
History, history, history!
Should we forget about past data, and focus on the future?