The Case for Provenance - IVOA

The Case for Provenance Juande Santander-Vela, Arancha Delgado European Southern Observatory, Archive Department IVOA InterOp Meeting Garching, 09/11/09


Talk Outline

What is Provenance?

How does Provenance fit in the VO?

The case(s) for Provenance

Conclusions

What is Provenance?

provenance |ˈprävənəns| noun

• the place of origin or earliest known history of something: an orange rug of Iranian provenance.

• the beginning of something's existence; something's origin: they try to understand the whole universe, its provenance and fate. See note at origin.

• a record of ownership of a work of art or an antique, used as a guide to authenticity or quality: the manuscript has a distinguished provenance.

ORIGIN late 18th cent.: from French, from the verb provenir ‘come or stem from,’ from Latin provenire, from pro- ‘forth’ + venire ‘come.’

The definition points to:

History of astronomical data products

Ownership/Attribution (Observer, Proposal, Telescope/Instrument)…

Quality

What is Provenance?

Typical astronomical workflow:

List of targets

Find existing data per target and band

Filter out data regarding quality parameters

Manipulate retrieved datasets

Callouts on the workflow: Characterisation; data provenance information needed
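The workflow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Dataset` record, its field names (`target`, `band`, `seeing_arcsec`) and the seeing threshold are invented for the example and do not describe any real archive interface. The point it tries to make is that the quality filter is only possible when acquisition conditions travel with each dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical archive record; all field names are illustrative."""
    dataset_id: str
    target: str
    band: str
    seeing_arcsec: float  # provenance: ambient condition at acquisition time


def find_datasets(archive, targets, band):
    """Find existing data per target and band."""
    wanted = set(targets)
    return [d for d in archive if d.target in wanted and d.band == band]


def filter_by_quality(datasets, max_seeing_arcsec):
    """Keep only data whose recorded seeing is good enough."""
    return [d for d in datasets if d.seeing_arcsec <= max_seeing_arcsec]


# Made-up records standing in for an archive query result.
archive = [
    Dataset("D1", "M31", "R", 0.8),
    Dataset("D2", "M31", "R", 1.9),
    Dataset("D3", "M33", "R", 1.1),
    Dataset("D4", "M31", "B", 0.7),
]

candidates = find_datasets(archive, ["M31", "M33"], "R")
selected = filter_by_quality(candidates, max_seeing_arcsec=1.2)
selected_ids = [d.dataset_id for d in selected]
print(selected_ids)
```

Without the `seeing_arcsec` field the last step could not be written at all, which is the sense in which the workflow needs data provenance information.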

What is Provenance?

[Diagram: how astronomical data are generated]

Photon emission: Astronomical Source → Galactic & Extragalactic extinction → Atmosphere (Seeing & Opacity) → Telescope, Filters, Detectors…

Data generation: raw files (F1, F2, … Fn) → Data Reduction Software (uses knowledge of Telescope, Filters, Detectors…) → data products (P1, P2, … Pm)

Not only FITS files, also catalogues
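One way to read the chain from raw files to data products is as a small derivation graph that can be walked backwards. The sketch below assumes a hypothetical `derivations` mapping; the software names and the particular links between F and P items are invented for illustration and are not taken from the slides.

```python
# Hypothetical provenance records: product -> software used and inputs.
# Inputs may be raw files (F*) or other products (P*).
derivations = {
    "P1": {"software": "hypothetical-reduce 1.0", "inputs": ["F1", "F2"]},
    "P2": {"software": "hypothetical-combine 2.1", "inputs": ["P1", "F3"]},
}


def raw_inputs(item, derivations):
    """Return the sorted raw files that an item ultimately derives from."""
    record = derivations.get(item)
    if record is None:
        # No derivation recorded: treat the item as a raw file.
        return [item]
    found = set()
    for parent in record["inputs"]:
        found.update(raw_inputs(parent, derivations))
    return sorted(found)


lineage_p2 = raw_inputs("P2", derivations)
print(lineage_p2)
```

The same structure would apply to a catalogue row as to a FITS file, as long as each product records what it was made from.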

What is Provenance?

[Diagram: components of Observation Provenance, from the internal Provenance whitepaper by N. Delmotte]

Boxes shown: Observation Provenance, Software, Processing, Calibration, Ambient, Instrumental, Project Metadata

Labels added in later builds: Organisational, “Acquisitional”

How does Provenance fit in the VO?

[Diagram: where Provenance sits among the VO data models]

Observation: ObsData, Characterization, Provenance, Curation, Policy, Packaging, Target

Characterization: Accuracy, Sensitivity, Precision, Resolution, Coverage

Models shown: Data Model for Observation; Data Model for Dataset Characterisation; RADAMS (Ph.D. thesis)

Callouts: Extended Provenance; see talk by François Bonnarel

The case(s) for Provenance


Use cases for Provenance, centred around the core concepts:

History of a data set (documentation)

Establishing ownership/providing attribution

Quality assessment

History of a data set (ESO)

[Diagram: ESO identifiers across data levels, Level 0 to Level 4]

Items shown, in slide order:

Period, Program Type, Program ID, Run ID, Night, OB ID, Night Log

Raw File (DP_ID), DPR CATG, Run ID, Night

Reduced Science Data, Master Calib. Data, Ancillary File

Project, Data Release ID, Structured ReadMe, Association IDs, Main File(s) (DP_ID), Associated File(s) (DP_ID)

Level 4: bibcode, ProgID, RunID, DataID

History of a data set (ESO)

Entities and their attributes (compilation by A. Delgado):

observations: ID, telescope, instrument, technique, mode, ambient

ambient: seeing, air mass, extinction

instruments: comm/decomm, name, telescope / focus, technique, modes, detectors, optical elements, mappings

filters: ID, name, PWL, FWHM (type), comm/decomm, transmission curve, mappings

grisms: ID, name, wavelength range, grating, blaze angle, dispersion, resolution, comm/decomm, transmission curve, mappings

slits: ID, name, slit width, type, comm/decomm, mappings

telescope: comm/decomm, efficiencies, observatory, name, instruments

technique: comm/decomm, modes, mappings

modes: comm/decomm, total transmission curve, name, instrument, optical elements, mappings

detector: comm/decomm, sensitivity

gratings: ID, name, orders, wavelengths..., gap order, dispersion, resolution, comm/decomm, mappings

etc.

[Diagram: the same entities (observatory, telescope, instrument, technique, mode, detector, filters, grisms, gratings, slits, ambient, objectives) and their relations, also overlaid on the data generation chain from Astronomical Source to data products]

History of a data set (ESO)

Telescope Information

Site

Dates: First Light, Commissioning…

Instruments & Optical Elements

Instruments

Technologies

Configurations

Optical Elements

Installation dates

Kind (Filter, Grisms, Gratings…)

Names ↔ IDs ↔ Transmission Curves

Time changing (compilation by A. Delgado)
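Because this information changes over time, any lookup has to be made against the observation date. The following is a minimal sketch of that idea, assuming a hypothetical commissioning/decommissioning history for a single filter slot; the dates and filter IDs are invented and the function name `filter_in_use` is not part of any ESO tool.

```python
from datetime import date

# Hypothetical history of one filter slot. Each entry holds:
# (commissioned, decommissioned or None if still in use, filter ID).
filter_history = [
    (date(2001, 3, 1), date(2005, 6, 30), "R_OLD"),
    (date(2005, 7, 1), None, "R_NEW"),
]


def filter_in_use(history, observed_on):
    """Return the filter ID valid on the observation date, or None."""
    for commissioned, decommissioned, filter_id in history:
        started = commissioned <= observed_on
        not_ended = decommissioned is None or observed_on <= decommissioned
        if started and not_ended:
            return filter_id
    return None


early = filter_in_use(filter_history, date(2003, 5, 17))
late = filter_in_use(filter_history, date(2009, 11, 9))
before = filter_in_use(filter_history, date(1999, 1, 1))
print(early, late, before)
```

The same date-ranged lookup would then lead from the filter ID to the transmission curve that was valid when the data were taken.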

History of a data set

RADAMS: ObservationProvenance, InstrumentConf (inspired by Lamb & Power's Raw Radio DM note)

AntennaConf: name

Feed: polarisation (L, R, X, Y)

BeamConf: name

Antenna: +AntennaConf.name, mount, majorAxis, minorAxis, effectiveArea

Instrument: name, description, shortName, locationName, URL

Location: +Instrument.locationName, latitude, longitude, height

Beam: +beamConf.name, beamMajor, beamMinor, sensitivity, meanBeamSolidAngle, minorBeamSolidAngle, directivity, gain, resolution

Receiver: +beamConf.Name, skyCentreFreq, ifCentreFreq, bandwidth

Spectrum: +beamConf.name, chanSeparation, freqResol, velRefFrame, refChanNum, refChanFreq, restFreq, molecule, transition

Velocity: +beamConf.name, chanSeparation, velResol, velRefFrame, refChanNum, refChanVel, restFreq, molecule, transition

This set is defined for each instrument band AND receiver configuration

History of a data set

RADAMS: Observation Provenance (InstrumentConf, AmbientConditions, Processing)

Processing: timeStamp

Calibration: +Processing.timeStamp, parameter.name, parameter.kind, parameter.value, parameter.sigma, parameter.calCoeff[n]

ProcessingStep: +Processing.timeStamp, kind, softwarePackage, parameter.name, parameter.kind, parameter.value

History of a data set

RADAMS: Observation Provenance (InstrumentConf, AmbientConditions)

AmbientConditions: timeStamp, opacity, temperature, humidity, wind

OpacityCurve: +AmbientConditions.timeStamp, opacity, skydipStart, azimuth, elevation.[n], tsky.[n], atmosphericModel

Time changing
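The Processing part of these classes can be written down as plain Python dataclasses. This is a rough sketch, not the RADAMS definition: the slides link Calibration and ProcessingStep to Processing through its timeStamp, whereas this example nests them inside a `Processing` object for brevity. The snake_case names, the `Parameter` helper class, the ISO 8601 time stamp and all values are assumptions made for the illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Parameter:
    """A named parameter, mirroring the parameter.* attributes listed above."""
    name: str
    kind: str
    value: float
    sigma: Optional[float] = None  # only listed for Calibration
    cal_coeff: List[float] = field(default_factory=list)  # calCoeff[n]


@dataclass
class ProcessingStep:
    kind: str
    software_package: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Processing:
    time_stamp: str  # ISO 8601 text is an assumption of this sketch
    steps: List[ProcessingStep] = field(default_factory=list)
    calibrations: List[Parameter] = field(default_factory=list)

    def software_used(self) -> List[str]:
        """List the software packages in the order the steps were applied."""
        return [step.software_package for step in self.steps]


# Illustrative values only; none of them come from a real reduction.
processing = Processing(
    time_stamp="2009-11-09T10:00:00",
    steps=[
        ProcessingStep("baseline", "hypothetical-reducer 1.0",
                       [Parameter("polyOrder", "integer", 3)]),
        ProcessingStep("gridding", "hypothetical-mapper 0.4"),
    ],
    calibrations=[Parameter("tsys", "temperature", 120.0, sigma=5.0)],
)
print(processing.software_used())
```

Keeping the software package and parameters on every step is what would later allow a reduction to be documented or repeated.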

History of a data set

[Diagram: the ESO entities (observatory, telescope, instrument, technique, mode, detector, filters, grisms, gratings, slits, ambient, objectives) shown next to the RADAMS ObservationProvenance InstrumentConf classes (AntennaConf, Feed, BeamConf, Antenna, Instrument, Location, Beam, Receiver, Spectrum, Velocity)]

Unification of Provenance at the ambient and TELESCOPE_CONF levels: the rest is very heterogeneous

History of a data set

Can only be done with:

Historical archive of:

optical elements

software (possibly with virtualization)

configuration values / fudge factors

ambient information

DIMM seeing, opacity, conductivity…

+ systematic observing logs!

Ownership Tracking

[Diagram: the ESO data levels (Level 0 to Level 4) and their identifiers, from Period, Program Type and Program ID through Raw File (DP_ID) and Data Release ID to bibcode, ProgID, RunID and DataID]

Ownership Tracking

Can only rely on unique identifiers being maintained

Other VO manipulations (SAMP messages, cross-matching) can lose associations

Need to provide services for identification of key IDs

Increase of the role of the IVOA Registry?

This is stopping some small publishers!

This tracking can be used to measure publication ratios
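The loss of associations during a cross-match is easy to reproduce. The sketch below is an assumption-laden toy: the two catalogues, the `dp_id` values and the `keep_ids` switch are invented, and the small-angle separation formula is only adequate for an illustration. It contrasts a match that keeps positions only with one that carries the source identifiers along.

```python
import math

# Hypothetical catalogues; "dp_id" stands in for an archive identifier.
cat_a = [
    {"ra": 10.0000, "dec": -5.0000, "dp_id": "RAW-A-0001"},
    {"ra": 20.0000, "dec": 15.0000, "dp_id": "RAW-A-0002"},
]
cat_b = [
    {"ra": 10.0001, "dec": -5.0001, "dp_id": "RAW-B-0001"},
]


def separation_deg(p, q):
    """Small-angle separation in degrees (adequate for a sketch)."""
    d_ra = (p["ra"] - q["ra"]) * math.cos(math.radians(p["dec"]))
    d_dec = p["dec"] - q["dec"]
    return math.hypot(d_ra, d_dec)


def cross_match(a, b, radius_deg, keep_ids):
    """Pair sources closer than radius_deg, optionally keeping their IDs."""
    matches = []
    for p in a:
        for q in b:
            if separation_deg(p, q) <= radius_deg:
                row = {"ra": p["ra"], "dec": p["dec"]}
                if keep_ids:
                    row["source_ids"] = [p["dp_id"], q["dp_id"]]
                matches.append(row)
    return matches


one_arcsec = 1 / 3600
lossy = cross_match(cat_a, cat_b, one_arcsec, keep_ids=False)
tracked = cross_match(cat_a, cat_b, one_arcsec, keep_ids=True)
print(lossy)
print(tracked)
```

Both calls find the same match, but only the second result can still be traced back to the files, and hence to the programme and owner, it came from.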

Quality Assessment

Linked to:

Acquisition configuration

Actual problems with instruments/telescope

Weather

Intended usage of the dataset(s)!

Callouts: no objective, one-size-fits-all quality assessment metric; time changing

Queriable Provenance?

Most queries, on Characterisation/Target/Curation

Instrument-specific forms for instrument configurations

Allow UType/UCD/IVOAT key-value pairs?

Instrument-specific a priori quality assessment

A posteriori, usage-specific quality assessment, offline > usage-specific query to data Provenance
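A usage-specific query over key-value provenance could look roughly like the following. Everything here is hypothetical: the dotted keys only imitate the style of UTypes and are not real UType, UCD or IVOAT terms, and the records and thresholds are invented. The sketch shows how two different intended usages select different data from the same provenance.

```python
# Hypothetical provenance records as flat key-value pairs.
records = [
    {"id": "obs-1", "prov.ambient.seeing": 0.7, "prov.instrument.mode": "imaging"},
    {"id": "obs-2", "prov.ambient.seeing": 1.6, "prov.instrument.mode": "imaging"},
    {"id": "obs-3", "prov.ambient.seeing": 0.9, "prov.instrument.mode": "spectroscopy"},
]


def query(records, constraints):
    """Return IDs of records whose values satisfy every (key, predicate) pair."""
    selected = []
    for record in records:
        if all(key in record and test(record[key])
               for key, test in constraints.items()):
            selected.append(record["id"])
    return selected


# Two usages, two notions of "good enough" (both invented for the example).
photometry_ok = query(records, {
    "prov.ambient.seeing": lambda v: v <= 1.0,
    "prov.instrument.mode": lambda v: v == "imaging",
})
spectroscopy_ok = query(records, {
    "prov.instrument.mode": lambda v: v == "spectroscopy",
})
print(photometry_ok, spectroscopy_ok)
```

Here the quality judgement lives in the query rather than in the archive, which matches the idea of an a posteriori, usage-specific assessment.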

Conclusions


Provenance is an integral part of the Observation DM

Provenance comes with discipline, but allows for quality science

Different approaches for different kinds of instruments, but all under the same general framework

Provenance should be accessible for any item; specialised TAP version?

Conclusions

Data centres: consistent naming/coding, plus mappings for existing data

History, history, history!

Should we forget about past data, and focus on the future?

Vielen Dank! (Starting with my German classes ;-)