Documenting Data Quality Ted Habermann, NOAA/NESDIS/NGDC Documentation: It’s not just discovery......

Page 1:

Documenting Data Quality Ted Habermann, NOAA/NESDIS/NGDC

Documentation: It’s not just discovery...

50% change in global average

Why?

i checked my 2002 email archives, and here is what i found out:

it appears that the current 3rd generation algorithm was implemented into operations around Oct-Nov 2002 time frame. cannot say more precisely, as all email correspondence i am looking at, talks about this indirectly. (maybe it's what's referred to as the Phase II algorithm.) At the same time, we had implemented quite a few other changes fixing data bugs and formats: view angle problem, increased digitization in all channel's reflectances and AODs, etc.

The jump is deemed due to introducing 3rd generation algorithm, which replaced the 2nd generation. The new numbers (~0.08) look more realistic than the previous ones (~0.05 or so). The changes seen in the data is close to the expected effect of this change. The 3rd gen alg takes into account the exact spectral response, whereas the 2nd gen is generic ("one size fits all").

hopefully this settles the issue..

Page 2:

Data Quality - Documents

Page 3:

Data Quality - Granules

Page 4:

Data Quality - Standards

MI_Metadata

LI_Lineage

DQ_DataQuality
+ scope : DQ_Scope
+ standAloneReport 0..1
+ report 0..*

DQ_StandaloneReportInformation
+ reportReference : CI_Citation
+ abstract : CharacterString

<<DataType>> DQ_Scope
+ level : MD_ScopeCode
+ extent [0..1] : EX_Extent
+ levelDescription [0..*] : MD_ScopeDescription

<<CodeList>> MD_ScopeCode
+ attribute
+ attributeType
+ collectionHardware
+ collectionSession
+ dataset
+ series
+ nonGeographicDataset
+ dimensionGroup
+ feature
+ featureType
+ propertyType
+ fieldSession
+ software
+ service
+ model
+ tile

<<Union>> MD_ScopeDescription
+ attributes : Set<GF_AttributeType>
+ features : Set<GF_FeatureType>
+ featureInstances : Set<GF_FeatureType>
+ attributeInstances : Set<GF_AttributeType>
+ dataset : CharacterString
+ other : CharacterString

<<Abstract>> DQ_Element

DQ_MeasureReference

DQ_Evaluation

<<CodeList>> MD_EvaluationMethodTypeCode
+ directInternal
+ directExternal
+ indirect

DQ_Result
+ resultScope : DQ_Scope [0..1]

DQ_ConformanceResult
+ specification : CI_Citation
+ explanation : CharacterString
+ pass : Boolean

DQ_QuantitativeResult
+ valueType [0..1] : RecordType
+ valueUnit : UnitOfMeasure
+ errorStatistic [0..1] : CharacterString
+ value [1..*] : Record

DQ_CoverageResult

DQ_DescriptiveResult

Page 5:

GOES-R Data Quality - Documents

Level 2+ Volcanic Ash: Detection and Height

L2+ Volcanic Ash Science Description
L2+ Volcanic Ash Algorithm Description
L2+ Volcanic Ash Source Information
L2+ Volcanic Ash Applicable ATBDs
L2+ Volcanic Ash Quality Algorithms
L2+ Volcanic Ash Source Data Inputs
L2+ Volcanic Ash Production Notes
L2+ Volcanic Ash Data Fields (TBR-16)
L2+ Volcanic Ash Metadata Description and Definition
L2+ Volcanic Ash Expected Periodicity

Page 6:

Documents = Standards

• Science Description - MD_DataIdentification/abstract
• Algorithm Description - LE_Algorithm/description
• Source Information - MD_DistributionInformation
• Applicable ATBDs - LE_Algorithm/citation
• Quality Algorithms - DQ_DataQuality/DQ_MeasureReference
• Source Data Inputs - LI_Lineage/source
• Production Notes - processStep/description
• Data Fields - MD_ContentInfo
• Metadata Description and Definition - seems redundant
• Expected Periodicity - resourceMaintenance

Page 7:

Documentation Objects = Standards

NESDIS Documentation Object Mapping

Metadata Document
System Description Document
System Maintenance Manual
Interface Control Document
Algorithm Theoretical Basis Document

Page 8:

Multiple Dialects of the Same Content

Documents

CI_Citation

XSLT Translation

XML Reference

Granules/Catalogs Standards

Page 10:

DQ_DataQuality - 19157

[Diagram repeated from Page 4: UML overview of MI_Metadata, DQ_DataQuality, DQ_Scope, DQ_Element, DQ_MeasureReference, DQ_Evaluation, DQ_Result, DQ_StandaloneReportInformation, and LI_Lineage.]

Page 11:

ISO Lineage Model

Source Source Source Source Source

Step Step Step Product

Processing and Algorithm Descriptions

Page 12:

LI_Lineage

Page 13:

UML.1

Attributes: role [how many] : object type

how many = [minimum..maximum]
minimum = 0: optional
minimum = 1: required
* = any number

how many = blank: required, one
how many = [1..*] : required, any number (at least one)
how many = [1..2] : required, one or two
how many = [0..1] : optional, zero or one
how many = [0..*] : optional, any number

Type: package abbreviation_type
UML package abbreviation = XML namespace = Document section

Role: what this object does for me
contact: CI_ResponsibleParty
description: CharacterString

Operations: generally not used in ISO UML
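The "how many" notation above can be read mechanically. The following is a minimal Python sketch, not part of the original slides; the function names `parse_multiplicity` and `is_required` are hypothetical, and a blank multiplicity is treated as "required, one" as the slide states.

```python
import re


def parse_multiplicity(text):
    """Parse a UML multiplicity such as "[0..*]" into (minimum, maximum).

    maximum is None when the upper bound is "*" (any number).
    A blank multiplicity means exactly one, following the slide.
    """
    text = text.strip()
    if not text:
        return (1, 1)
    match = re.fullmatch(r"\[?\s*(\d+)\s*\.\.\s*(\d+|\*)\s*\]?", text)
    if match is None:
        raise ValueError(f"unrecognized multiplicity: {text!r}")
    minimum = int(match.group(1))
    maximum = None if match.group(2) == "*" else int(match.group(2))
    return (minimum, maximum)


def is_required(text):
    """An attribute is required when its minimum is at least 1."""
    return parse_multiplicity(text)[0] >= 1


for notation in ["", "[1..*]", "[1..2]", "[0..1]", "[0..*]"]:
    label = "required" if is_required(notation) else "optional"
    print(repr(notation), parse_multiplicity(notation), label)
```

Returning `None` for an open upper bound keeps "any number" distinct from a concrete maximum.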

Page 14:

UML.2

LI_Lineage = the LI_Lineage class is in the Lineage (LI) Package

statement [0..1] : CharacterString = LI_Lineage can have up to one statement which is a CharacterString

source [0..*] : LI_Source = LI_Lineage can have any number of sources which are LI_Sources

processStep [0..*] : LE_ProcessStep = LI_Lineage can have any number of processSteps which are LE_ProcessSteps

Page 15:

Volcanic Ash Detection Sources

Page 16:

Volcanic Ash Detection Processing

Page 17:

ISO Lineage

DQ_Lineage (19115-2)

MI_Metadata

LI_Lineage
+ statement [0..1] : CharacterString

Constraint: if (count(source) + count(processStep) = 0) and (DQ_DataQuality.scope.level = 'dataset' or 'series') then statement is mandatory

LE_Source
+ description [0..1] : CharacterString
+ scaleDenominator [0..1] : MD_RepresentativeFraction
+ sourceReferenceSystem [0..1] : MD_ReferenceSystem
+ sourceCitation [0..1] : CI_Citation
+ sourceExtent [0..*] : EX_Extent
+ processedLevel [0..1] : MD_Identifier
+ resolution [0..1] : LE_NominalResolution
+ sourcemetadata [0..*] : MD_Reference

LE_ProcessStep
+ description : CharacterString
+ rationale [0..1] : CharacterString
+ dateTime [0..1] : DateTime
+ processor [0..*] : CI_ResponsibleParty
+ extent [0..*] : EX_Extent
+ reference [0..*] : CI_Citation

LE_Processing
+ identifier : MD_Identifier
+ softwareReference [0..*] : CI_Citation
+ procedureDescription [0..1] : CharacterString
+ documentation [0..*] : CI_Citation
+ runTimeParameters [0..1] : CharacterString

LE_Algorithm
+ citation : CI_Citation
+ description : CharacterString

LE_ProcessStepReport
+ name : CharacterString
+ description [0..1] : CharacterString
+ fileType [0..1] : CharacterString

Association roles shown in the diagram:
+ lineage 0..1
+ source 0..*
+ processStep 0..*
+ output, source 0..*
+ sourceStep 0..*
+ processingInformation 0..*
+ algorithm 0..*
+ report 0..*
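The constraint on the lineage statement in the diagram above can be expressed as a small check. This is a minimal Python sketch under stated assumptions: the `Lineage` dataclass and both function names are hypothetical stand-ins, and sources and process steps are reduced to plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Lineage:
    """Hypothetical, simplified stand-in for LI_Lineage."""
    statement: Optional[str] = None
    sources: List[str] = field(default_factory=list)
    process_steps: List[str] = field(default_factory=list)


def statement_is_mandatory(lineage, scope_level):
    """True when there are no sources or process steps and the scope
    level is 'dataset' or 'series'."""
    no_detail = len(lineage.sources) + len(lineage.process_steps) == 0
    return no_detail and scope_level in ("dataset", "series")


def is_valid(lineage, scope_level):
    """A lineage fails only when a mandatory statement is missing or empty."""
    if statement_is_mandatory(lineage, scope_level):
        return bool(lineage.statement)
    return True


print(is_valid(Lineage(), "dataset"))
print(is_valid(Lineage(sources=["A"]), "dataset"))
```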

Page 18:

Granule Lineage - 1

Page 19:

Granule Lineage - 2

Page 20:

Volcanic Ash Detection Lineage in the Granule

Option 1: one identifier:

<nc:attribute name="history" value="uniqueIdentifier"/>

Option 2: lineage group with filenames as unique identifiers:

<nc:group name="lineage">
  <nc:attribute name="Source:clear_sky_masks" value="filename"/>
  <nc:attribute name="Source:global_emissivity_MODIS" value="filename"/>
  <nc:attribute name="Source:global_emissivity_MODIS" value="filename"/>
  <nc:attribute name="Source:global_land_cover_UMD" value="filename"/>
  <nc:attribute name="Source:cloud_probablilty_LUT" value="filename"/>
  <nc:attribute name="Source:volcanic_ash_coefficient_file" value="filename"/>
  <nc:attribute name="Source:NWP_GFS_current_analysis_and_forecast_data" value="filename"/>
  <nc:attribute name="Algorithm" value="GOESR_ABI_ATBD_Aviation_VolAsh_v2.0.doc"/>
</nc:group>

Option 3: lineage group with uniqueIdentifiers:

<nc:group name="lineageInformation">
  <nc:attribute name="processStep" value="uniqueIdentifier"/>
  (the processStep includes processingInformation / algorithm / output)
  <nc:attribute name="sourceName" value="uniqueIdentifier"/>
  <nc:attribute name="sourceName" value="uniqueIdentifier"/>
  <nc:attribute name="sourceName" value="uniqueIdentifier"/>
</nc:group>
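A lineage group like Option 2 could be read back with a few lines of Python. This is a minimal sketch, not a GOES-R tool: it assumes the `nc` prefix is bound to the NcML 2.2 namespace (the slide does not show a namespace declaration), uses a shortened sample, and the function name `read_lineage` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the nc prefix refers to the NcML 2.2 namespace.
NCML = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"

# Shortened version of the Option 2 group, with the namespace declared.
SAMPLE = """
<nc:group xmlns:nc="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" name="lineage">
  <nc:attribute name="Source:clear_sky_masks" value="filename"/>
  <nc:attribute name="Source:global_land_cover_UMD" value="filename"/>
  <nc:attribute name="Algorithm" value="GOESR_ABI_ATBD_Aviation_VolAsh_v2.0.doc"/>
</nc:group>
"""


def read_lineage(xml_text):
    """Return (sources, algorithm) from a lineage group.

    sources maps each name after the "Source:" prefix to its value.
    """
    group = ET.fromstring(xml_text)
    sources = {}
    algorithm = None
    for attr in group.findall("{%s}attribute" % NCML):
        name, value = attr.get("name"), attr.get("value")
        if name.startswith("Source:"):
            sources[name[len("Source:"):]] = value
        elif name == "Algorithm":
            algorithm = value
    return sources, algorithm


sources, algorithm = read_lineage(SAMPLE)
print(sources)
print(algorithm)
```

Because the sources are keyed by name, a repeated source name (as in the slide's Option 2) would collapse to a single entry in this sketch.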

Page 21:

Database and XML Keys

Citation: ID, Title, Date, Friend_ID, Location_ID

Person: ID, Name, EMail

OnlineResource: ID, Name, URL

XML

<person id="JaneDoe">
  <friend xlink:href="#JohnDoe"/>
</person>
…………
<person id="JohnDoe">
  <friend xlink:href="#JaneDoe"/>
</person>

Page 22:

XML Attributes: Objects and References

ISO XML consists of tags, elements (with or without content), and attributes. An attribute is a name/value pair that exists within a start-tag or empty-element tag. Attributes provide additional information about an element which is not part of the data. Attribute values must be enclosed in either single or double quotes. This example shows a step element with one attribute, number, with a value of "3":

<step number="3">Connect A to B.</step>

Many of the XML attributes used in the ISO Standards fall into two groups: identifiers and references:
Identifiers: id and uuid
References: uuidref and xlink:href

Objects that start with upper case letters have identifiers (id and uuid)
Roles that start with lower case letters have references (uuidref and xlink:href)

object: CI_ResponsibleParty id="JaneDoe"
role: friend xlink:href="#JohnDoe"

object: CI_ResponsibleParty id="JohnDoe"
role: friend xlink:href="#JaneDoe"

Page 23:

ISO Lineage Model - 2

Source A: step ps1
Source B: step ps1
Source C: step ps2
Source D: step ps2
Source E: step ps3

Step ps1: source A, source B, output C
Step ps2: source C, source D, output E
Step ps3: source E

Product

Processing and Algorithm Descriptions
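The source/step/output chain on this slide forms a small graph that can be walked backwards from the product. The following Python sketch is illustrative only: the `STEPS` dictionary and the function name are hypothetical, and it assumes that step ps3 produces the Product, which the slide implies but does not state as an output.

```python
# Steps as listed on the slide; the "Product" output of ps3 is an assumption.
STEPS = {
    "ps1": {"sources": ["A", "B"], "output": "C"},
    "ps2": {"sources": ["C", "D"], "output": "E"},
    "ps3": {"sources": ["E"], "output": "Product"},
}


def upstream_sources(target, steps):
    """Return every source that contributes, directly or indirectly, to target."""
    # Map each output to the sources of the step that produced it.
    producers = {step["output"]: step["sources"] for step in steps.values()}
    found = []
    stack = list(producers.get(target, []))
    while stack:
        name = stack.pop()
        if name not in found:
            found.append(name)
            stack.extend(producers.get(name, []))
    return sorted(found)


print(upstream_sources("Product", STEPS))
print(upstream_sources("C", STEPS))
```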

Page 24:

XML Attributes: Objects and References

<gmd:lineage>
  <gmi:LI_Lineage>
    <gmd:processStep id="ps1">
      <gmd:LI_ProcessStep>
        <source xlink:href="#A"/>
        <source xlink:href="#B"/>
        <output xlink:href="#C"/>
      </gmd:LI_ProcessStep>
    </gmd:processStep>
    <gmd:processStep id="ps2">
      <gmd:LI_ProcessStep>
        <source xlink:href="#C"/>
        <source xlink:href="#D"/>
        <output xlink:href="#E"/>
      </gmd:LI_ProcessStep>
    </gmd:processStep>
    <gmd:source id="A"><gmd:LI_Source><gmd:sourceStep xlink:href="#ps1"/></gmd:LI_Source></gmd:source>
    <gmd:source id="B"><gmd:LI_Source><gmd:sourceStep xlink:href="#ps1"/></gmd:LI_Source></gmd:source>
    <gmd:source id="C"><gmd:LI_Source><gmd:sourceStep xlink:href="#ps2"/></gmd:LI_Source></gmd:source>
    <gmd:source id="D"><gmd:LI_Source><gmd:sourceStep xlink:href="#ps2"/></gmd:LI_Source></gmd:source>
    <gmd:source id="E"><gmd:LI_Source><gmd:sourceStep xlink:href="#ps3"/></gmd:LI_Source></gmd:source>
  </gmi:LI_Lineage>
</gmd:lineage>
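Because references point at identifiers, a record like the one above can be checked for references that resolve to nothing. This is a minimal Python sketch and not an ISO validator: the sample uses simplified, un-namespaced element names chosen for illustration, and the function name `unresolved_references` is hypothetical. Only the xlink namespace URI is the standard one.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Simplified stand-in for the lineage record; element names are illustrative.
SAMPLE = """
<lineage xmlns:xlink="http://www.w3.org/1999/xlink">
  <processStep id="ps1">
    <source xlink:href="#A"/>
    <source xlink:href="#B"/>
    <output xlink:href="#C"/>
  </processStep>
  <processStep id="ps2">
    <source xlink:href="#C"/>
    <source xlink:href="#D"/>
    <output xlink:href="#E"/>
  </processStep>
  <source id="A"><sourceStep xlink:href="#ps1"/></source>
  <source id="B"><sourceStep xlink:href="#ps1"/></source>
  <source id="C"><sourceStep xlink:href="#ps2"/></source>
  <source id="D"><sourceStep xlink:href="#ps2"/></source>
  <source id="E"><sourceStep xlink:href="#ps3"/></source>
</lineage>
"""


def unresolved_references(xml_text):
    """Return the sorted local xlink:href targets that match no id attribute."""
    root = ET.fromstring(xml_text)
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    href_attr = "{%s}href" % XLINK
    targets = set()
    for el in root.iter():
        href = el.get(href_attr)
        if href is not None and href.startswith("#"):
            targets.add(href[1:])
    return sorted(targets - ids)


print(unresolved_references(SAMPLE))
```

In this sample the only target without a matching id is ps3, since that step is referenced by source E but not defined in the record.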

Page 25:

ISO Lineage

[Diagram repeated from Page 17: ISO Lineage UML model, DQ_Lineage (19115-2), with MI_Metadata, LI_Lineage, LE_Source, LE_ProcessStep, LE_Processing, LE_Algorithm, and LE_ProcessStepReport.]

References

Page 26:

DQ_DataQuality - 19157

[Diagram repeated from Page 4: UML overview of MI_Metadata, DQ_DataQuality, DQ_Scope, DQ_Element, DQ_MeasureReference, DQ_Evaluation, DQ_Result, DQ_StandaloneReportInformation, and LI_Lineage.]

Page 27:

DQ_Scope

Page 28:

UML.1

[Slide repeated from Page 13: summary of the UML attribute notation, role [how many] : object type.]

Page 29:

UML.2

<<DataType>> DQ_Scope
+ level : MD_ScopeCode
+ extent [0..1] : EX_Extent
+ levelDescription [0..*] : MD_ScopeDescription

<<DataType>> DQ_Scope = the DQ_Scope is a DataType in the Data Quality (DQ) Package

level : MD_ScopeCode = a DQ_Scope must have one level which is a MD_ScopeCode

extent [0..1] : EX_Extent = a DQ_Scope can have up to 1 extent which is an EX_Extent

levelDescription [0..*] : MD_ScopeDescription = a DQ_Scope can have any number of levelDescriptions which are MD_ScopeDescriptions

Page 30:

DQ_Scope

<<DataType>> DQ_Scope
+ level : MD_ScopeCode
+ extent [0..1] : EX_Extent
+ levelDescription [0..*] : MD_ScopeDescription

<<CodeList>> MD_ScopeCode
+ attribute
+ attributeType
+ collectionHardware
+ collectionSession
+ dataset
+ series
+ nonGeographicDataset
+ dimensionGroup
+ feature
+ featureType
+ propertyType
+ fieldSession
+ software
+ service
+ model
+ tile

<<Union>> MD_ScopeDescription
+ attributes : Set<GF_AttributeType>
+ features : Set<GF_FeatureType>
+ featureInstances : Set<GF_FeatureType>
+ attributeInstances : Set<GF_AttributeType>
+ dataset : CharacterString
+ other : CharacterString

Page 31:

Abstract Dessert

Dessert
Pie, Ice Cream
Apple, Pecan, Vanilla, Chocolate

<<Abstract>> Dessert
<<Abstract>> Pie
<<Abstract>> Ice Cream
Apple, Pecan, Vanilla, Chocolate

= "is a" or "can be a"

Page 32:

EX_Extent

<<DataType>> EX_Extent
+ description [0..1] : CharacterString

Constraint: count(description + geographicElement + temporalElement + verticalElement) > 0

<<Abstract>> EX_GeographicExtent
+ extentTypeCode [0..1] : Boolean = "1"

EX_BoundingPolygon
+ polygon [0..1] : GM_Object

EX_GeographicBoundingBox
+ westBoundingLongitude : Decimal
+ eastBoundingLongitude : Decimal
+ southBoundingLatitude : Decimal
+ northBoundingLatitude : Decimal

EX_GeographicDescription
+ geographicIdentifier : MD_Identifier

EX_VerticalExtent
+ minimumValue : Real
+ maximumValue : Real

EX_TemporalExtent
+ extent : TM_Primitive

EX_SpatialTemporalExtent

Page 33:

DQ_DataQuality - 19157

[Diagram repeated from Page 4: UML overview of MI_Metadata, DQ_DataQuality, DQ_Scope, DQ_Element, DQ_MeasureReference, DQ_Evaluation, DQ_Result, DQ_StandaloneReportInformation, and LI_Lineage.]

Page 34:

DQ_StandAloneReport

Page 35:

StandAloneReport

MI_Metadata

DQ_DataQuality
+ scope : DQ_Scope
+ standAloneReport 0..1

DQ_StandaloneReportInformation
+ reportReference : CI_Citation
+ abstract : CharacterString

Global or Variable Attribute:

<attribute name="references" value="GOES-R Product Definition and Users Guide for Geostationary Operational Environmental Satellite R-Series (GOES-R) Core Ground Segment"/>

Page 36:

DQ_DataQuality - 19157

[Diagram repeated from Page 4: UML overview of MI_Metadata, DQ_DataQuality, DQ_Scope, DQ_Element, DQ_MeasureReference, DQ_Evaluation, DQ_Result, DQ_StandaloneReportInformation, and LI_Lineage.]

Page 37:

DQ_Element

<<Abstract>> DQ_Element

DQ_MeasureReference
+ measureIdentification : MD_Identifier [0..1]
+ nameOfMeasure : CharacterString [0..*]
+ measureDescription : CharacterString [0..1]

DQ_EvaluationMethod
+ dateTime : DateTime [0..*]
+ evaluationMethodDescription : CharacterString [0..1]
+ evaluationProcedure : CI_Citation [0..1]
+ referenceDoc : CI_Citation [0..*]
+ evaluationMethodType : DQ_EvaluationMethodTypeCode [0..1]

DQ_Result
+ dateTime : DateTime [0..*]
+ resultScope : DQ_ScopeCode [0..1]

Page 38:

DQ_MeasureReference

DQ_MeasureReference
+ measureIdentification : MD_Identifier [0..1]
+ nameOfMeasure : CharacterString [0..*]
+ measureDescription : CharacterString [0..1]

Page 39:

Measure Registry / Database

Quality Measure: measure identifier, name, alias, element name, basic measure, definition, description, parameter, value type, value structure, source reference, example

DQ_MeasureReference
+ measureIdentification : MD_Identifier [0..1]
+ nameOfMeasure : CharacterString [0..*]
+ measureDescription : CharacterString [0..1]

Page 40:

DQM_Measure

DQM_Measure
+ measureIdentifier : MD_Identifier
+ name : CharacterString
+ alias : CharacterString [0..*]
+ elementName : TypeName [1..*]
+ definition : CharacterString
+ description : DQM_Description
+ valueType : TypeName
+ valueStructure : DQM_ValueStructure
+ example : DQM_Description [0..*]

DQM_SourceReference
+ citation : CI_Citation

DQM_BasicMeasure
+ name : CharacterString
+ definition : CharacterString
+ example : DQM_Description [0..*]
+ valueType : TypeName

DQM_Description
+ textDescription : CharacterString
+ extendedDescription : MD_BrowseGraphic

DQM_Parameter
+ name : CharacterString
+ definition : CharacterString
+ description : DQM_Description
+ example : DQM_Description [0..*]
+ valueType : TypeName
+ valueStructure : DQM_ValueStructure

<<CodeList>> DQM_ValueStructure
+ bag
+ set
+ sequence
+ table
+ matrix
+ coverage

Page 41:

Variable Data Quality - Scalars

<variable name="ash_cloud_height" type="float" shape="i j">
  <attribute name="coverage_type" value="physicalMeasurement"/>
  <attribute name="description" value="The estimated ash cloud height (km)"/>
  <attribute name="units" value="km"/>
  <attribute name="long_name" value="Ash Cloud Height"/>
  <attribute name="standard_name" value="test_variable_standard_name"/>
  <attribute name="data_quality_authority" value="gov.noaa.goes-r"/>
  <attribute name="dq:count_cloud_temperature_high_quality" value="1024" type="int"/>
  <attribute name="dq:count_cloud_temperature_medium_quality" value="145" type="int"/>
  <attribute name="dq:count_cloud_temperature_low_quality" value="0" type="int"/>
  <attribute name="dq:count_cloud_emissivity_high_quality" value="1024" type="int"/>
  <attribute name="dq:count_cloud_emissivity_medium_quality" value="145" type="int"/>
  <attribute name="dq:count_cloud_emissivity_low_quality" value="0" type="int"/>
  <attribute name="dq:count_cloud_emissivity_high_quality" value="1024" type="int"/>
  <attribute name="dq:count_cloud_emissivity_medium_quality" value="145" type="int"/>
  <attribute name="dq:count_cloud_emissivity_low_quality" value="0" type="int"/>
  <attribute name="dq:count_beta_high_quality" value="1024" type="int"/>
  <attribute name="dq:count_beta_medium_quality" value="145" type="int"/>
  <attribute name="dq:count_beta_quality" value="0" type="int"/>
  <attribute name="dq:count_attempted_ash_retrievals" value="1024" type="int"/>
</variable>

General pattern:

<variable name="someVariableName">
  <attribute name="dq:measure" value="result" type="float"/>
</variable>
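Quality measures stored with the `dq:measure` pattern above can be collected from a variable by filtering on the name prefix. This is a minimal Python sketch using a shortened sample; the function name `quality_measures` is hypothetical, and the type handling (only `int` and `float`, everything else kept as text) is an assumption rather than a rule from the slides.

```python
import xml.etree.ElementTree as ET

# Shortened version of the ash_cloud_height variable shown above.
SAMPLE = """
<variable name="ash_cloud_height" type="float" shape="i j">
  <attribute name="units" value="km"/>
  <attribute name="dq:count_cloud_temperature_high_quality" value="1024" type="int"/>
  <attribute name="dq:count_cloud_temperature_medium_quality" value="145" type="int"/>
  <attribute name="dq:count_cloud_temperature_low_quality" value="0" type="int"/>
</variable>
"""


def quality_measures(xml_text, prefix="dq:"):
    """Return {measure name: value} for attributes whose name starts with prefix."""
    variable = ET.fromstring(xml_text)
    casts = {"int": int, "float": float}  # assumption: other types stay as text
    measures = {}
    for attr in variable.findall("attribute"):
        name = attr.get("name")
        if name.startswith(prefix):
            cast = casts.get(attr.get("type"), str)
            measures[name[len(prefix):]] = cast(attr.get("value"))
    return measures


for name, value in quality_measures(SAMPLE).items():
    print(name, value)
```

Here `dq:` is treated as a plain string prefix inside the attribute name, not as an XML namespace prefix, which matches how it appears in the example.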

Page 42:

Granule Data Quality - Scalers

<group name="Quality:ScalarInformation">
  <attribute name="data_quality_authority" value="gov.noaa.goes-r"/>
  <attribute name="volcanic_ash_mass_loading:Total_mass_of_volcanic_ash" value="1" type="float"/>
  <attribute name="volcanic_ash_mass_loading:Mean_ash_mass_loading_in_scene" value="1" type="float"/>
  <attribute name="volcanic_ash_mass_loading:Minimum_ash_mass_loading_value_in_scene" value="1" type="float"/>
  <attribute name="volcanic_ash_mass_loading:Maximum_ash_mass_loading_value_in_scene" value="1" type="float"/>
  <attribute name="volcanic_ash_mass_loading:Standard_deviation_ash_mass_loading_value_in_scene" value="1" type="float"/>
  <attribute name="volcanic_ash_mass_loading:Mean_ash_cloud_height_in_scene" value="1" type="float"/>
  <attribute name="volcanic_ash_mass_loading:Minimum_ash_cloud_height_in_scene" value="1" type="float"/>
  <attribute name="volcanic_ash_mass_loading:Maximum_ash_cloud_height_in_scene" value="1" type="float"/>
  <attribute name="ash_cloud_height:count_cloud_temperature_high_quality" value="1024" type="int"/>
  <attribute name="ash_cloud_height:count_cloud_temperature_medium_quality" value="145" type="int"/>
  <attribute name="ash_cloud_height:count_cloud_temperature_low_quality" value="0" type="int"/>
  <attribute name="ash_cloud_height:count_cloud_emissivity_high_quality" value="1024" type="int"/>
  <attribute name="ash_cloud_height:count_cloud_emissivity_medium_quality" value="145" type="int"/>
  <attribute name="ash_cloud_height:count_cloud_emissivity_low_quality" value="0" type="int"/>
  <attribute name="ash_cloud_height:count_beta_high_quality" value="1024" type="int"/>
  <attribute name="ash_cloud_height:count_beta_medium_quality" value="145" type="int"/>
  <attribute name="ash_cloud_height:count_beta_low_quality" value="0" type="int"/>
  <attribute name="ash_cloud_height:count_attempted_ash_retrievals" value="1024" type="int"/>
</group>

<group name="Quality:ScalarInformation">
  <attribute name="variable:measure" value="result" type="float"/>
</group>
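
The variable:measure naming pattern above lends itself to simple machine reading. The following is a minimal sketch, assuming the group is available as a standalone NcML fragment without an XML namespace (real NcML documents normally declare one); the function name read_quality_group is hypothetical and not part of any NcML tooling.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the quality group shown above (assumption: no XML namespace).
NCML = """
<group name="Quality:ScalarInformation">
  <attribute name="data_quality_authority" value="gov.noaa.goes-r"/>
  <attribute name="ash_cloud_height:count_cloud_temperature_high_quality" value="1024" type="int"/>
  <attribute name="ash_cloud_height:count_cloud_temperature_medium_quality" value="145" type="int"/>
  <attribute name="volcanic_ash_mass_loading:Mean_ash_cloud_height_in_scene" value="1" type="float"/>
</group>
"""

# Map the NcML type attribute to a Python conversion; anything else stays a string.
CASTS = {"int": int, "float": float}


def read_quality_group(ncml_text):
    """Return {variable: {measure: result}} for attributes named 'variable:measure'."""
    group = ET.fromstring(ncml_text)
    results = {}
    for attr in group.findall("attribute"):
        name = attr.get("name")
        if ":" not in name:
            continue  # group-level attribute such as data_quality_authority
        variable, measure = name.split(":", 1)
        cast = CASTS.get(attr.get("type"), str)
        results.setdefault(variable, {})[measure] = cast(attr.get("value"))
    return results


quality = read_quality_group(NCML)
print(quality["ash_cloud_height"])
```

Splitting on the first colon keeps the variable name and the measure name separate, so the same reader works for any variable in the granule.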


<<DataType>>DQ_Scope

+ level : MD_ScopeCode + extent [0..1] : EX_Extent + levelDescription [0..*] :

MI_Metadata

DQ_DataQuality - 19157

<<Abstract>>DQ_Element

DQ_ConformanceResult

+ specification : CI_Citation + explanation : CharacterString + pass : Boolean

<<CodeList>>DQ_EvaluationMethodTypeCode

+ directInternal + directExternal + indirect

DQ_DataQuality

+ scope : DQ_Scope + standAloneReport 0..1

DQ_QuantitativeResult

+ valueType [0..1] : RecordType + valueUnit : UnitOfMeasure + errorStatistic [0..1] : CharacterString + value [1..*] : Record

<<CodeList>>MD_ScopeCode

+ attribute + feature + attributeType + featureType + collectionHardware + propertyType + collectionSession + fieldSession + dataset + software + series + service + nonGeographicDataset + model + dimensionGroup + tile

DQ_CoverageResult

+ report 0..*

DQ_StandaloneReportInformation

+ reportReference : CI_Citation + abstract: CharacterString

DQ_MeasureReference DQ_Evaluation DQ_Result

+ resultScope: DQ_Scope [0..1]

DQ_DescriptiveResult

<<Union>>MD_ScopeDescription

+ attributes : Set<GF_AttributeType> + features : Set<GF_FeatureType> + featureInstances : Set<GF_FeatureType> + attributeInstances : Set<GF_AttributeType> + dataset : CharacterString + other : CharacterString

LI_Lineage


DQ_Result

DQ_ConformanceResult

+ specification : CI_Citation + explanation : CharacterString + pass : Boolean

DQ_QuantitativeResult

+ valueType [0..1] : RecordType + valueUnit : UnitOfMeasure + errorStatistic [0..1] : CharacterString + value [1..*] : Record

DQ_DescriptiveResult

+ statement: CharacterString

QE_CoverageResult

+ resultFile : MX_DataFile + resultFormat: MD_Format + resultContentDescription: MD_CoverageDescription + resultSpatialRepresentation: MD_SpatialRepresentation + spatialRepresentationType : MD_SpatialRepresentationTypeCode

DQ_Result

+ resultScope: DQ_Scope [0..1]


<variable name="ash_cloud_height" type="float" shape="i j">
  <attribute name="coverage_type" value="physicalMeasurement"/>
  <attribute name="description" value="The estimated ash cloud height (km)"/>
  <attribute name="units" value="km"/>
  <attribute name="long_name" value="Ash Cloud Height"/>
  <attribute name="standard_name" value="test_variable_standard_name"/>
  <attribute name="dq:count_cloud_temperature_high_quality" value="1024" type="int"/>
  <attribute name="dq:count_cloud_temperature_medium_quality" value="145" type="int"/>
  <attribute name="dq:count_cloud_temperature_low_quality" value="0" type="int"/>
  <attribute name="dq:count_cloud_emissivity_high_quality" value="1024" type="int"/>
  <attribute name="dq:count_cloud_emissivity_medium_quality" value="145" type="int"/>
  <attribute name="dq:count_cloud_emissivity_low_quality" value="0" type="int"/>
  <attribute name="dq:count_beta_high_quality" value="1024" type="int"/>
  <attribute name="dq:count_beta_medium_quality" value="145" type="int"/>
  <attribute name="dq:count_beta_low_quality" value="0" type="int"/>
  <attribute name="dq:count_attempted_ash_retrievals" value="1024" type="int"/>
</variable>

Granule Metadata (NcML)

<variable name="Band_1" shape="Time Latitude Longitude" type="float">
  <attribute name="long_name" value="Band 1" />
  <attribute name="valid_min" value="0" />
  <attribute name="valid_max" value="255" />
  <attribute name="units" value="counts" />
  <attribute name="standard_name" value="band_1" />
  <attribute name="scale_factor" value="0.1" />
  <attribute name="number_of_good_calibration_points" value="19880" />
  <attribute name="number_of_good_navigation_points" value="19880" />
  <attribute name="number_of_good_overall_points" value="19880" />
  <attribute name="reflectance_actual_range" type="int" value="-18 323" />
  <attribute name="reflectance_Sample_Size" type="int" value="50760" />
  <attribute name="reflectance_Mean" type="float" value="116.12569" />
  <attribute name="reflectance_Standard_Deviation" type="float" value="114.663765" />
  <attribute name="radiance_actual_range" type="int" value="-18 323" />
  <attribute name="radiance_Sample_Size" type="int" value="50760" />
  <attribute name="radiance_Mean" type="float" value="116.12569" />
  <attribute name="radiance_Standard_Deviation" type="float" value="114.663765" />
  <attribute name="brightness_temperature_actual_range" type="int" value="-18 323" />
  <attribute name="brightness_temperature_Sample_Size" type="int" value="50760" />
  <attribute name="brightness_temperature_Mean" type="float" value="116.12569" />
  <attribute name="brightness_temperature_Standard_Deviation" type="float" value="114.663765" />
</variable>

[data]

getNcML

<!-- XML encoding of Variable object -->
<xsd:element name="variable">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="attribute" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element ref="values" minOccurs="0"/>
      <xsd:element ref="variable" minOccurs="0" maxOccurs="unbounded"/>
      <xsd:element ref="logicalView" minOccurs="0"/>
      <xsd:element ref="remove" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
    <xsd:attribute name="name" type="xsd:token" use="required"/>
    <xsd:attribute name="type" type="DataType"/>
    <xsd:attribute name="shape" type="xsd:token"/>
    <xsd:attribute name="orgName" type="xsd:string"/>
  </xsd:complexType>
</xsd:element>

NcML Schema Variable Type Definition


<gmd:result>
  <gmd:DQ_QuantitativeResult>
    <gmd:valueType>
      <gco:RecordType xlink:href="http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2.xsd#xpointer(//element[@name='variable'])">Definition of netCDF variableType</gco:RecordType>
    </gmd:valueType>
    <gmd:valueUnit nilReason="inapplicable"/>
    <gmd:value>
      <gco:Record xlink:href="http://www.ngdc.noaa.gov/ncmlService/granuleIdentifier#xpointer(/netcdf/variable[@name=variableName])">Attributes for variable = variableName in granule = granuleIdentifier</gco:Record>
    </gmd:value>
  </gmd:DQ_QuantitativeResult>
</gmd:result>

Variable Quality - ISO with XML Reference

RecordType

Record
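
The XML reference pattern can be illustrated with a short sketch that builds the href used in gco:Record and resolves its pointer against a local copy of the granule NcML. This is only an illustration under stated assumptions: the helper names make_href and resolve_pointer are hypothetical, the NcML sample has no XML namespace, and only the simple /root/child[@name='...'] pointer form shown on the slide is handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical local copy of the NcML a service might return for one granule.
GRANULE_NCML = """<netcdf>
  <variable name="ash_cloud_height" type="float" shape="i j">
    <attribute name="units" value="km"/>
    <attribute name="dq:count_attempted_ash_retrievals" value="1024" type="int"/>
  </variable>
  <variable name="ash_detection_quality_flag" type="byte" shape="i j"/>
</netcdf>"""


def make_href(service_url, granule_id, variable_name):
    """Build the xlink:href that points at one variable of one granule."""
    return f"{service_url}/{granule_id}#xpointer(/netcdf/variable[@name='{variable_name}'])"


def resolve_pointer(href, root):
    """Return the element addressed by the xpointer part of href, or None."""
    _, marker, pointer = href.partition("#xpointer(")
    if not marker or not pointer.endswith(")"):
        raise ValueError("href does not contain an xpointer() part")
    xpath = pointer[:-1]  # drop the closing parenthesis of xpointer(...)
    prefix = f"/{root.tag}/"
    if not xpath.startswith(prefix):
        raise ValueError("pointer does not start at the document root")
    # ElementTree only accepts relative paths, so rewrite /netcdf/... as ./...
    return root.find("./" + xpath[len(prefix):])


href = make_href("http://www.ngdc.noaa.gov/ncmlService", "granuleIdentifier", "ash_cloud_height")
record = resolve_pointer(href, ET.fromstring(GRANULE_NCML))
attributes = {a.get("name"): a.get("value") for a in record.findall("attribute")}
print(attributes)
```

The ISO record then stays small, because the quality values live in the granule and are fetched only when a reader follows the reference.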


<gmd:result>
  <gmd:DQ_QuantitativeResult>
    <gmd:valueType>
      <gco:RecordType xlink:href="http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2.xsd#xpointer(//element[@name='group'])">Definition of netCDF groupType</gco:RecordType>
    </gmd:valueType>
    <gmd:valueUnit nilReason="inapplicable"/>
    <gmd:value>
      <gco:Record xlink:href="http://www.ngdc.noaa.gov/ncmlService/granuleIdentifier#xpointer(/netcdf/group[@name='Quality:ScalarInformation'])">Attributes for group = Quality:ScalarInformation in granule = granuleIdentifier</gco:Record>
    </gmd:value>
  </gmd:DQ_QuantitativeResult>
</gmd:result>

Granule Quality - ISO with XML Reference

RecordType

Record


<gmd:DQ_DataQuality>
  <gmd:scope>
    <gmd:DQ_Scope>
      <gmd:level>
        <gmd:MD_ScopeCode codeList="gmxCodelists.xml#gmd:MD_ScopeCode" codeListValue="attribute">attribute</gmd:MD_ScopeCode>
      </gmd:level>
      <gmd:levelDescription>
        <gmd:MD_ScopeDescription>
          <gmd:other>
            <gco:CharacterString>ash_cloud_height</gco:CharacterString>
          </gmd:other>
        </gmd:MD_ScopeDescription>
      </gmd:levelDescription>
    </gmd:DQ_Scope>
  </gmd:scope>
  <gmd:report>
    <gmd:DQ_QuantitativeAttributeAccuracy>
      <gmd:nameOfMeasure>
        <gco:CharacterString>dq:count_cloud_temperature_high_quality</gco:CharacterString>
      </gmd:nameOfMeasure>
      <gmd:result>
        <gmd:DQ_QuantitativeResult>
          <gmd:valueType>
            <gco:RecordType>int</gco:RecordType>
          </gmd:valueType>
          <gmd:valueUnit/>
          <gmd:value>
            <gco:Record>1024</gco:Record>
          </gmd:value>
        </gmd:DQ_QuantitativeResult>
      </gmd:result>
    </gmd:DQ_QuantitativeAttributeAccuracy>
  </gmd:report>
</gmd:DQ_DataQuality>

Standard Data Quality - ISO without reference

Value

Type

Measure

Variable
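
The mapping labelled above (Variable, Measure, Type, Value) can be sketched as a small translation from NcML dq: attributes to ISO report elements. This is a rough sketch, not the translation used by the author: the function names are hypothetical, the NcML input is assumed to have no namespace, and DQ_QuantitativeAttributeAccuracy is used for every measure simply because the slide uses it.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def _child(parent, ns, tag, text=None):
    """Append a namespaced child element and optionally set its text."""
    element = ET.SubElement(parent, f"{{{ns}}}{tag}")
    if text is not None:
        element.text = text
    return element


def dq_report(measure, value_type, value):
    """Build one gmd:report holding a DQ_QuantitativeResult for a single measure."""
    report = ET.Element(f"{{{GMD}}}report")
    accuracy = _child(report, GMD, "DQ_QuantitativeAttributeAccuracy")
    _child(_child(accuracy, GMD, "nameOfMeasure"), GCO, "CharacterString", measure)
    result = _child(_child(accuracy, GMD, "result"), GMD, "DQ_QuantitativeResult")
    _child(_child(result, GMD, "valueType"), GCO, "RecordType", value_type)
    _child(result, GMD, "valueUnit")
    _child(_child(result, GMD, "value"), GCO, "Record", value)
    return report


def reports_for_variable(ncml_variable):
    """Translate every dq: attribute of an NcML variable into a gmd:report."""
    variable = ET.fromstring(ncml_variable)
    return [
        dq_report(a.get("name"), a.get("type", "string"), a.get("value"))
        for a in variable.findall("attribute")
        if a.get("name", "").startswith("dq:")
    ]


VARIABLE = """<variable name="ash_cloud_height" type="float" shape="i j">
  <attribute name="units" value="km"/>
  <attribute name="dq:count_cloud_temperature_high_quality" value="1024" type="int"/>
  <attribute name="dq:count_cloud_temperature_medium_quality" value="145" type="int"/>
</variable>"""

reports = reports_for_variable(VARIABLE)
print(ET.tostring(reports[0], encoding="unicode"))
```

Each dq: attribute becomes its own report, which is what makes this form verbose compared with the XML reference form.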


DQ_Result

DQ_ConformanceResult

+ specification : CI_Citation + explanation : CharacterString + pass : Boolean

DQ_QuantitativeResult

+ valueType [0..1] : RecordType + valueUnit : UnitOfMeasure + errorStatistic [0..1] : CharacterString + value [1..*] : Record

DQ_DescriptiveResult

+ statement: CharacterString

QE_CoverageResult

+ resultFile : MX_DataFile + resultFormat: MD_Format + resultContentDescription: MD_CoverageDescription + resultSpatialRepresentation: MD_SpatialRepresentation + spatialRepresentationType : MD_SpatialRepresentationTypeCode

DQ_Result

+ resultScope: DQ_Scope [0..1]


Variable Data Quality - Coverages

<variable name="ash_detection_quality_flag" type="byte" shape="i j">
  <attribute name="coverage_type" value="qualityInformation"/>
  <attribute name="long_name" value="Ash Detection Quality Flag"/>
  <attribute name="flag_masks" value="1b 1b 2b 2b 4b 4b 56b 56b 56b 192b 192b 1892b 1892b 1892b 1892b 1892b"/>
  <attribute name="flag_values" value="0 1 0 1 0 1 0 1 2 0 0 0 1 2 3 4"/>
  <attribute name="flag_names" value="Overall_QF Overall_QF Invalid_Data_QF Invalid_Data_QF Local_Zenith_Angle_QF Local_Zenith_Angle_QF Ash_Single_Layer_Confidence_QF Ash_Single_Layer_Confidence_QF Ash_Single_Layer_Confidence_QF Spare Spare Ash_Multi_Layer_Confidence_QF Ash_Multi_Layer_Confidence_QF Ash_Multi_Layer_Confidence_QF Ash_Multi_Layer_Confidence_QF Ash_Multi_Layer_Confidence_QF"/>
  <attribute name="flag_meanings" value="High_Quality Low_Quality High_Quality Low_Quality High_Quality Low_Quality High Moderate Low Spare Spare High Moderate Low Very_Low Not_Ash"/>
</variable>
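
To show how the four parallel flag attributes work together, here is a minimal decoding sketch for a single quality byte. It rests on an assumption that is not stated on the slide: each flag_values entry is read as the value of the masked bit field after shifting it down to bit 0 (so mask 56 with value 1 means "Moderate"). Only the first nine entries are used, because the remaining masks on the slide do not all fit in a byte; the function name decode_flags is hypothetical.

```python
# First nine entries of each attribute, copied from the variable above.
FLAG_MASKS = "1b 1b 2b 2b 4b 4b 56b 56b 56b"
FLAG_VALUES = "0 1 0 1 0 1 0 1 2"
FLAG_NAMES = (
    "Overall_QF Overall_QF Invalid_Data_QF Invalid_Data_QF "
    "Local_Zenith_Angle_QF Local_Zenith_Angle_QF "
    "Ash_Single_Layer_Confidence_QF Ash_Single_Layer_Confidence_QF "
    "Ash_Single_Layer_Confidence_QF"
)
FLAG_MEANINGS = (
    "High_Quality Low_Quality High_Quality Low_Quality "
    "High_Quality Low_Quality High Moderate Low"
)


def decode_flags(byte_value):
    """Return {flag_name: flag_meaning} for one quality byte."""
    masks = [int(m.rstrip("b")) for m in FLAG_MASKS.split()]
    values = [int(v) for v in FLAG_VALUES.split()]
    names = FLAG_NAMES.split()
    meanings = FLAG_MEANINGS.split()
    decoded = {}
    for mask, value, name, meaning in zip(masks, values, names, meanings):
        # Position of the lowest set bit in the mask, used to right-align the field.
        shift = (mask & -mask).bit_length() - 1
        if (byte_value & mask) >> shift == value:
            decoded[name] = meaning
    return decoded


# Bit 1 set (invalid data) and bits 3-5 holding the value 1 (moderate confidence).
decoded = decode_flags(0b00001010)
print(decoded)
```

If the producer instead follows the convention of comparing the masked value directly with flag_values, the shift step should be dropped.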


MD_Band

+ peakResponse [0..1] : Real + bitsPerValue [0..1] : Integer + toneGradation [0..1] : Integer

MI_CoverageDescription Revisions

MD_Metadata

+contentInfo 0..*

<<CodeList>>MD_CoverageContentTypeCode

+ image + thematicClassification + physicalMeasurement + referenceInformation + qualityInformation + auxilliaryData + modelResult

MD_CoverageDescription

+ attributeDescription : RecordType + contentType [1..*] : MD_CoverageContentTypeCode + processingLevelCode [0..1] : MD_Identifier

+dimension 0..*

MI_RangeElementDescription

+ name : CharacterString + definition : CharacterString + rangeElement [1..*] : Record

+rangeElementDescription 0..*

MD_SampleDimension

+ minValue [0..1] : Real + maxValue [0..1] : Real + units [0..1] : UnitOfMeasure + scaleFactor [0..1] : Real + offset [0..1] : Real + numberOfValues [0..1] : Integer + meanValue [0..1] : Real + standardDeviation [0..1] : Real + otherAttributeType [0..1] : RecordType + otherAttribute [0..1] : Record

MD_RangeDimension

+ sequenceIdentifier [0..1] : MemberName + name [0..*] : MD_Identifier + description [0..1] : CharacterString

minValue, maxValue and units must have units of length. rangeElement, otherAttributeType, and otherAttribute have cardinality [0..0].

+rangeElementDescription

0..*


ISO Flags

<gmi:rangeElementDescription>
  <gmi:MI_RangeElementDescription>
    <gmi:name>
      <gco:CharacterString>Overall_QF</gco:CharacterString>
    </gmi:name>
    <gmi:definition>
      <gco:CharacterString>High_Quality</gco:CharacterString>
    </gmi:definition>
    <gmi:rangeElement>
      <gco:Record>0</gco:Record>
    </gmi:rangeElement>
  </gmi:MI_RangeElementDescription>
</gmi:rangeElementDescription>
<gmi:rangeElementDescription>
  <gmi:MI_RangeElementDescription>
    <gmi:name>
      <gco:CharacterString>Overall_QF</gco:CharacterString>
    </gmi:name>
    <gmi:definition>
      <gco:CharacterString>Low_Quality</gco:CharacterString>
    </gmi:definition>
    <gmi:rangeElement>
      <gco:Record>1</gco:Record>
    </gmi:rangeElement>
  </gmi:MI_RangeElementDescription>
</gmi:rangeElementDescription>

Callouts: Flag Name (gmi:name), Flag Meaning (gmi:definition), Flag Value (gmi:rangeElement)
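
The ISO flag encoding repeats the same three-part structure for every flag, so it can be generated from the NcML flag attributes. The following is a minimal sketch under stated assumptions: the function name range_element is hypothetical, and only the first four flag entries from the slide are used as input.

```python
import xml.etree.ElementTree as ET

GMI = "http://www.isotc211.org/2005/gmi"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmi", GMI)
ET.register_namespace("gco", GCO)


def range_element(name, meaning, value):
    """Build one gmi:rangeElementDescription from a flag name, meaning and value."""
    outer = ET.Element(f"{{{GMI}}}rangeElementDescription")
    description = ET.SubElement(outer, f"{{{GMI}}}MI_RangeElementDescription")
    parts = (
        ("name", "CharacterString", name),         # flag name
        ("definition", "CharacterString", meaning),  # flag meaning
        ("rangeElement", "Record", str(value)),    # flag value
    )
    for tag, content_tag, text in parts:
        holder = ET.SubElement(description, f"{{{GMI}}}{tag}")
        ET.SubElement(holder, f"{{{GCO}}}{content_tag}").text = text
    return outer


# First four entries of flag_names, flag_meanings and flag_values from the NcML variable.
names = "Overall_QF Overall_QF Invalid_Data_QF Invalid_Data_QF".split()
meanings = "High_Quality Low_Quality High_Quality Low_Quality".split()
values = [int(v) for v in "0 1 0 1".split()]

elements = [range_element(n, m, v) for n, m, v in zip(names, meanings, values)]
print(ET.tostring(elements[1], encoding="unicode"))
```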


DQ_Evaluation

DQ_Evaluation

+ dateTime: DateTime [0..*] + evaluationMethodDescription: CharacterString [0..1] + evaluationProcedure: CI_Citation [0..1] + referenceDoc: CI_Citation [0..*] + evaluationMethodType: DQ_EvaluationMethodTypeCode [0..1]

DQ_DataEvaluation DQ_Aggregation

+ sourceQualityResult: CharacterString [2..*]

DQ_FullInspection DQ_SamplebasedInspection

+ samplingScheme: CharacterString + lotDescription: CharacterString + samplingRatio: CharacterString

DQ_IndirectEvaluation

<<CodeList>>DQ_EvaluationMethodTypeCode

+ directInternal + directExternal + indirect


Multiple Dialects of the Same Content

Documents

CI_Citation

XSLT Translation

XML Reference

Granules/Catalogs Standards


<<DataType>>DQ_Scope

+ level : MD_ScopeCode + extent [0..1] : EX_Extent + levelDescription [0..*] :

MI_Metadata

Data Quality - Standards

<<Abstract>>DQ_Element

DQ_ConformanceResult

+ specification : CI_Citation + explanation : CharacterString + pass : Boolean

<<CodeList>>DQ_EvaluationMethodTypeCode

+ directInternal + directExternal + indirect

DQ_DataQuality

+ scope : DQ_Scope + standAloneReport 0..1

DQ_QuantitativeResult

+ valueType [0..1] : RecordType + valueUnit : UnitOfMeasure + errorStatistic [0..1] : CharacterString + value [1..*] : Record

<<CodeList>>MD_ScopeCode

+ attribute + feature + attributeType + featureType + collectionHardware + propertyType + collectionSession + fieldSession + dataset + software + series + service + nonGeographicDataset + model + dimensionGroup + tile

DQ_CoverageResult

+ report 0..*

DQ_StandaloneReportInformation

+ reportReference : CI_Citation + abstract: CharacterString

DQ_MeasureReference DQ_Evaluation DQ_Result

+ resultScope: DQ_Scope [0..1]

DQ_DescriptiveResult

<<Union>>MD_ScopeDescription

+ attributes : Set<GF_AttributeType> + features : Set<GF_FeatureType> + featureInstances : Set<GF_FeatureType> + attributeInstances : Set<GF_AttributeType> + dataset : CharacterString + other : CharacterString

LI_Lineage


Questions?

Questions / Comments / Suggestions: [email protected]


<<DataType>>DQ_Scope

+ level : MD_ScopeCode + extent [0..1] : EX_Extent + levelDescription [0..*] :

MD_Metadata

DQ_DataQuality - 19115

LI_Lineage

<<Abstract>>DQ_Element

+ nameOfMeasure [0..*] : CharacterString + measureIdentification [0..1] : MD_Identifier + measureDescription [0..1] : CharacterString + evaluationMethodType [0..1] : DQ_EvaluationMethodTypeCode + evaluationMethodDescription [0..1] : CharacterString + evaluationProcedure [0..1] : CI_Citation + dateTime [0..*] : DateTime + result [1..2] : DQ_Result

DQ_ConformanceResult

+ specification : CI_Citation + explanation : CharacterString + pass : Boolean

<<CodeList>>DQ_EvaluationMethodTypeCode

+ directInternal + directExternal + indirect

DQ_DataQuality

+ scope : DQ_Scope + lineage 0..1

DQ_QuantitativeResult

+ valueType [0..1] : RecordType + valueUnit : UnitOfMeasure + errorStatistic [0..1] : CharacterString + value [1..*] : Record

"report" or "lineage" role is mandatory if scope.DQ_Scope.level = 'dataset'

"levelDescription" is mandatory if "level" notEqual 'dataset' or 'series'

<<Abstract>>DQ_Result

<<CodeList>>MD_ScopeCode

+ attribute + feature + attributeType + featureType + collectionHardware + propertyType + collectionSession + fieldSession + dataset + software + series + service + nonGeographicDataset + model + dimensionGroup + tile

DQ_CoverageResult

+ report 0..*


DQ_Element

<<Abstract>>DQ_Element

+ nameOfMeasure [0..*] : CharacterString + measureIdentification [0..1] : MD_Identifier + measureDescription [0..1] : CharacterString + evaluationMethodType [0..1] : DQ_EvaluationMethodTypeCode + evaluationMethodDescription [0..1] : CharacterString + evaluationProcedure [0..1] : CI_Citation + dateTime [0..*] : DateTime + result [1..2] : DQ_Result

<<Abstract>>DQ_Element

+ measure [0..*] : DQ_MeasureReference + evaluation [0..1] : DQ_Evaluation + result [1..2] : DQ_Result

DQ_MeasureReference

+ measureIdentification: MD_Identifier [0..1] + nameOfMeasure: CharacterString [0..*] + measureDescription: CharacterString [0..1]