Quality, Uncertainty and Bias – by way of example(s)
Transcript of Quality, Uncertainty and Bias – by way of example(s)
![Page 1: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/1.jpg)
1
Peter Fox
Data Science – ITEC/CSCI/ERTH-6961
Week 11, November 13, 2012
Quality, Uncertainty and Bias – by way of example(s)
![Page 2: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/2.jpg)
Where are we in respect to the data challenge?
“The user cannot find the data;
if he can find it, he cannot access it;
if he can access it, he doesn't know how good they are;
if he finds them good, he cannot merge them with other data”
The Users View of IT, NAS 1989
2
![Page 3: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/3.jpg)
Definitions – atmospheric science
• Quality – is in the eyes of the beholder – worst-case scenario… or a good challenge
• Uncertainty – has aspects of accuracy (how accurately the real-world situation is assessed; it also includes bias) and precision (down to how many digits)
3
![Page 4: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/4.jpg)
Definitions – atmospheric science
• Bias has two aspects:
– Systematic error resulting in the distortion of measurement data, caused by prejudice or faulty measurement technique
– A vested interest, or strongly held paradigm or condition, that may skew the results of sampling, measuring, or reporting the findings of a quality assessment:
  • Psychological: for example, when data providers audit their own data, they usually have a bias to overstate its quality.
  • Sampling: sampling procedures that result in a sample that is not truly representative of the population sampled.
4
![Page 5: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/5.jpg)
Data quality needs: fitness for purpose
• Measuring climate change:
– Model validation: gridded contiguous data with uncertainties
– Long-term time series: bias assessment is a must, especially for sensor degradation, orbit and spatial sampling changes
• Studying phenomena using multi-sensor data:
– Cross-sensor bias is needed
• Realizing societal benefits through applications:
– Near-real-time for transport/event monitoring – in some cases, coverage and timeliness may be more important than accuracy
– Pollution monitoring (e.g., air quality exceedance levels) – accuracy
• Educational (users are generally not well-versed in the intricacies of quality; taking all the data as usable can impair educational lessons) – only the best products
![Page 6: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/6.jpg)
6
| Producers | Consumers |
| --- | --- |
| Quality Control | Quality Assessment |
| Fitness for Purpose | Fitness for Use |
| Trustee | Trustor |
![Page 7: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/7.jpg)
Quality Control vs. Quality Assessment
• Quality Control (QC) flags in the data (assigned by the algorithm) reflect the “happiness” of the retrieval algorithm – e.g., all the necessary channels indeed had data, there were not too many clouds, the algorithm converged to a solution, etc.
• Quality assessment is done by analyzing the data “after the fact” through validation, intercomparison with other measurements, self-consistency, etc. It is presented as bias and uncertainty. It is rather inconsistent and scattered across papers and validation reports.
![Page 8: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/8.jpg)
20080602 Fox VSTO et al.
8
![Page 9: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/9.jpg)
Level 2 data
9
![Page 10: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/10.jpg)
Level 2 data
• Swath for MISR, orbit 192 (2001)
10
![Page 11: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/11.jpg)
Factors contributing to uncertainty and bias in L2
• Physical: instrument, retrieval algorithm, aerosol spatial and temporal variability…
• Input: ancillary data used by the retrieval algorithm
• Classification: erroneous flagging of the data
• Simulation: the geophysical model used for the retrieval
• Sampling: the averaging within the retrieval footprint
![Page 12: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/12.jpg)
Level 3 data
12
![Page 13: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/13.jpg)
What is Level 3 accuracy? It is not often defined in Earth science…
• If Level 2 errors are known, the corresponding Level 3 error can be computed, in principle, but…
• Processing from L2 → L3 daily → L3 monthly may reduce random noise but can also exacerbate systematic bias and introduce additional sampling bias
• Quality is usually presented in the form of standard deviations (i.e., variability within a grid box); sometimes pixel counts and quality histograms are provided. QC flags are rare (MODIS Land Surface).
• Convolution of natural variability with sensor/retrieval uncertainty and bias – need to understand their relative contributions to differences between datasets
• This does not solve sampling bias
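The L2 → L3 gridding step above can be sketched in a few lines. This is a hypothetical illustration with synthetic data (not the actual MODIS/MISR production code): pixels are binned into 1-degree boxes, and the per-box standard deviation and pixel count are kept as quality evidence.

```python
import numpy as np

# Hypothetical sketch of L2 -> L3 daily gridding with synthetic data:
# average Level-2 pixel values into 1-degree grid boxes and keep the
# per-box standard deviation and pixel count as quality evidence.
rng = np.random.default_rng(0)
lat = rng.uniform(-90, 90, 10_000)                       # L2 pixel latitudes
lon = rng.uniform(-180, 180, 10_000)                     # L2 pixel longitudes
aod = rng.lognormal(mean=-1.5, sigma=0.6, size=10_000)   # synthetic AOD values

rows = np.clip((lat + 90).astype(int), 0, 179)           # 1-deg grid row
cols = np.clip((lon + 180).astype(int), 0, 359)          # 1-deg grid column
flat = rows * 360 + cols                                 # flattened box index

count = np.bincount(flat, minlength=180 * 360)
total = np.bincount(flat, weights=aod, minlength=180 * 360)
sumsq = np.bincount(flat, weights=aod**2, minlength=180 * 360)

with np.errstate(invalid="ignore", divide="ignore"):
    mean = total / count                                 # per-box mean (NaN if empty)
    var = np.maximum(sumsq / count - mean**2, 0.0)       # guard tiny negatives
    std = np.sqrt(var)                                   # per-box variability

mean, std, count = (a.reshape(180, 360) for a in (mean, std, count))
```

Note that `std` here convolves natural variability within the box with retrieval noise – exactly the ambiguity the slide points out.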
![Page 14: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/14.jpg)
MODIS vs. MERIS
Same parameter Same space & time
Different results – why?
MODIS MERIS
A threshold used in MERIS processing effectively excludes high aerosol values. Note: MERIS was designed primarily as an ocean-color instrument, so aerosols are “obstacles”, not signal.
![Page 15: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/15.jpg)
Why is it so difficult?
• Quality is perceived differently by data providers and data recipients.
• There are many different qualitative and quantitative aspects of quality.
• Methodologies for dealing with data quality are just emerging
• Almost nothing exists for remote sensing data quality
![Page 16: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/16.jpg)
ISO Model for Data Quality
![Page 17: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/17.jpg)
But beware the limited nature of the ISO Model
Q: Examples of “Temporal Consistency” quality issues?
ISO: Temporal Consistency=“correctness of the order of events”
Land Surface Temperature anomaly from the Advanced Very High Resolution Radiometer:
– trend artifact from orbital drift
– discontinuity artifact from change in satellites
![Page 18: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/18.jpg)
Going Beyond ISO for Data Quality
• Drilling Down on Completeness
• Expanding Consistency
• Examining Representativeness
• Generalizing Accuracy
![Page 19: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/19.jpg)
Drill Down on Completeness
• Spatial Completeness: coverage of daily product
Due to a wider swath, MODIS AOD covers more area than MISR. The seasonal and zonal patterns are rather similar
![Page 20: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/20.jpg)
MODIS Aqua AOD: Average Daily Spatial Coverage by Region and Season

| Region | DJF | MAM | JJA | SON |
| --- | --- | --- | --- | --- |
| Global | 38% | 42% | 45% | 41% |
| Arctic | 0% | 5% | 19% | 4% |
| Subarctic | 3% | 26% | 49% | 25% |
| N Temperate | 43% | 43% | 51% | 52% |
| Tropics | 46% | 48% | 49% | 44% |
| S Temperate | 45% | 59% | 60% | 49% |
| Subantarctic | 32% | 17% | 10% | 24% |
| Antarctic | 5% | 0% | 0% | 1% |
This table and chart are Quality Evidence for the Spatial Completeness (Quality Property) of the MODIS Aqua dataset
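As a toy illustration of how a spatial-completeness number like those in the table can be computed: the coverage mask below is synthetic and the zone boundaries are illustrative (not the ones used to build the table).

```python
import numpy as np

# Toy sketch: spatial completeness as the fraction of 1-degree grid boxes
# in a latitude band that contain at least one retrieval. The mask is
# synthetic and the zone boundaries are illustrative, not the ones used
# for the table above. (Grid-box area weighting is ignored for brevity.)
rng = np.random.default_rng(1)
has_data = rng.random((180, 360)) < 0.45          # stand-in daily L3 coverage mask

def coverage(mask: np.ndarray, lat_min: int, lat_max: int) -> float:
    band = mask[lat_min + 90 : lat_max + 90, :]   # rows index -90..90 deg
    return float(band.mean())                     # fraction of boxes covered

zones = {"Arctic": (66, 90), "N Temperate": (23, 66), "Tropics": (-23, 23),
         "S Temperate": (-66, -23), "Antarctic": (-90, -66)}
for name, (lo, hi) in zones.items():
    print(f"{name}: {coverage(has_data, lo, hi):.0%}")
```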
![Page 21: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/21.jpg)
Expanding Consistency
• Temporal “Consistency”
– Terra: +0.005 before 2004; −0.005 after 2004, relative to Aeronet
– Aqua: no change over time relative to Aeronet
From Levy, R., L. Remer, R. Kleidman, S. Mattoo, C. Ichoku, R. Kahn, and T. Eck, 2010. Global evaluation of the Collection 5 MODIS dark-target aerosol products over land, Atmos. Chem. Phys., 10, 10399-10420, doi:10.5194/acp-10-10399-2010.
![Page 22: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/22.jpg)
Examining Temporal Representativeness
How well does a Daily Level 3 file represent the AOD for that day?
(Figure: per-zone plots of overpass probability vs. local time, 0–24 h, for Terra and Aqua, from the Arctic down to the Antarctic.)
Chance of an overpass during a given hour of the day
![Page 23: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/23.jpg)
How well does a monthly product represent all the days of the month?
• Completeness: the MODIS dark-target algorithm does not work for deserts
• Representativeness: monthly aggregation is not enough for MISR, and even for MODIS
• Spatial sampling patterns are different for MODIS Aqua and MISR Terra: “pulsating” areas over ocean are oriented differently due to different orbital directions during day-time measurement
MODIS Aqua AOD July 2009 MISR Terra AOD July 2009
![Page 24: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/24.jpg)
Examining Spatial Representativeness
Neither pixel count nor standard deviation alone expresses how representative the grid-cell value is
![Page 25: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/25.jpg)
Generalizing Accuracy
• Aerosol Optical Depth
– Different sources of uncertainty:
  • Low AOD: surface reflectance
  • High AOD: assumed aerosol models
– Also: the distribution is closer to lognormal than normal
• Thus, “normal” accuracy expressions are problematic:
– Slope relative to ground truth (Aeronet)
– Correlation coefficient
– Root-mean-square error
• Instead, common practice with MODIS data is:
– Percent falling within expected error bounds, e.g., ±(0.05 + 0.2·AOD_Aeronet)
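The “percent within expected error” metric can be sketched as follows. The data are synthetic; only the EE envelope ±(0.05 + 0.2·AOD_Aeronet) is taken from the slide.

```python
import numpy as np

# Sketch with synthetic values: fraction of retrievals falling within the
# expected-error envelope +/-(0.05 + 0.2 * AOD_Aeronet) quoted above.
rng = np.random.default_rng(2)
aeronet = rng.lognormal(-1.5, 0.6, 5_000)                # "ground truth" AOD
modis = aeronet + rng.normal(0.0, 0.05 + 0.1 * aeronet)  # retrieval + noise

ee = 0.05 + 0.2 * aeronet                  # expected-error half-width
within = np.abs(modis - aeronet) <= ee
print(f"{within.mean():.0%} of retrievals fall within expected error")
```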
![Page 26: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/26.jpg)
Intercomparison of data from multiple sensors

Data from multiple sources to be used together:
• Current sensors/missions: MODIS, MISR, GOES, OMI.

Harmonization needs:
• It is not sufficient just to have the data from different sensors and their provenances in one place
• Before comparing and fusing data, things need to be harmonized:
  • Metadata: terminology, standard fields, units, scale
  • Data: format, grid, spatial and temporal resolution, wavelength, etc.
  • Provenance: source, assumptions, algorithm, processing steps
  • Quality: bias, uncertainty, fitness-for-purpose, validation

Dangers of easy data access without proper assessment of joint data usage – it is easy to use data incorrectly.
26
![Page 27: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/27.jpg)
Example: South Pacific
27
Anomaly
MODIS Level 3 data-day definition leads to artifact in correlation
![Page 28: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/28.jpg)
…is caused by an Overpass Time Difference
28
![Page 29: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/29.jpg)
Investigation of artifacts in AOD correlation between MODIS and MISR near the Dateline
Standard MODIS Terra and MISR, using the calendar data-day definition for each pixel
Using the local-time-based data-day definition for each pixel
Progressively removing artifacts by applying the appropriate data-day definition in Level 3 daily data generation for both MODIS Terra and MISR
![Page 30: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/30.jpg)
Different kinds of reported data quality
• Pixel-level Quality: algorithmic guess at the usability of a data point
– Granule-level Quality: statistical roll-up of Pixel-level Quality
• Product-level Quality: how closely the data represent the actual geophysical state
• Record-level Quality: how consistent and reliable the data record is across generations of measurements

Different quality types are often erroneously assumed to have the same meaning.
Ensuring Data Quality at these different levels requires different focus and action.
![Page 31: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/31.jpg)
Sensitivity of Aerosol and Chlorophyll Relationship to Data-Day Definition
• The standard Level 3 daily MODIS Aerosol Optical Depth (AOD) at 550 nm generated by the atmosphere group uses a granule UTC-time-based data-day, while the standard Level 3 daily SeaWiFS Chlorophyll (Chl) generated by the ocean group uses a pixel-based Local Solar Time (LST) data-day.
• The correlation coefficients between Chl and AOT differ significantly near the dateline due to the data-day definition.
• This study suggests that the same or similar statistical aggregation methods, using the same or similar data-day definitions, should be used when creating climate time series for different parameters, to reduce the potential emergence of artifacts.
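The two data-day definitions can be sketched like this. It is a simplification (real products assign data-days per pixel during aggregation), and the function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Sketch of the two data-day definitions (function names are illustrative).
# A UTC data-day uses the granule's UTC date; a Local-Solar-Time data-day
# shifts the timestamp by longitude/15 hours first, so observations near
# the dateline are assigned to the locally correct day.
def utc_data_day(t: datetime) -> str:
    return t.astimezone(timezone.utc).date().isoformat()

def lst_data_day(t: datetime, lon_deg: float) -> str:
    local = t.astimezone(timezone.utc) + timedelta(hours=lon_deg / 15.0)
    return local.date().isoformat()

t = datetime(2007, 4, 1, 23, 30, tzinfo=timezone.utc)
print(utc_data_day(t))           # 2007-04-01
print(lst_data_day(t, 170.0))    # about +11.3 h -> 2007-04-02
```

An observation at 23:30 UTC near the dateline lands on different days under the two definitions, which is the source of the correlation artifacts discussed on these slides.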
![Page 32: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/32.jpg)
Sensitivity Study: Effect of the Data Day definition on Ocean Color data correlation with Aerosol data

Starting with aerosols: correlation between MODIS Aqua AOD (Ocean group product) and MODIS Aqua AOD (Atmosphere group product)

Pixel count distribution

Only half of the Data Day artifact is present because the Ocean group uses the better Data Day definition!
![Page 33: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/33.jpg)
Sensitivity Study: Effect of the Data Day definition on Ocean Color data correlation with Aerosol data

Continuing with chlorophyll and aerosols: correlation between MODIS Aqua Chlorophyll and MODIS Aqua AOD 550 nm (Atmosphere group product) for Apr 1 – Jun 4, 2007

Pixel count distribution

The Data Day effect is quite visible!

GEO-CAPE impact: observation time differences with ACE and other sensors may lead to artifacts in comparison statistics.
![Page 34: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/34.jpg)
Sensitivity of Aerosol and Chl Relationship to Data-Day Definition
Correlation Coefficients MODIS AOT at 550nm and SeaWiFS Chl
Difference between correlations A and B:
A: MODIS AOT (LST data-day) and SeaWiFS Chl
B: MODIS AOT (UTC data-day) and SeaWiFS Chl

Artifact: difference between using LST and the calendar UTC-based data-day
![Page 35: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/35.jpg)
Presenting data quality to users
Split quality (viewed here broadly) into two categories:
• Global or product-level quality information, e.g. consistency, completeness, etc., that can be presented in tabular form.
• Regional/seasonal, with various approaches:
– maps with outlined regions, one map per sensor/parameter/season
– scatter plots with error estimates, one per combination of Aeronet station, parameter, and season, with different colors representing different wavelengths, etc.
![Page 36: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/36.jpg)
But really what works is…
36
![Page 37: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/37.jpg)
Quality Labels
![Page 38: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/38.jpg)
Quality Labels
Generated for a request for 20-90 deg N, 0-180 deg E
![Page 39: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/39.jpg)
Advisory Report (Dimension Comparison Detail)
39
![Page 40: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/40.jpg)
Advisory Report (Expert Advisories Detail)
40
![Page 41: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/41.jpg)
Quality Comparison Table for Level-3 AOD (Global example)
| Quality Aspect | MODIS | MISR |
| --- | --- | --- |
| **Completeness** | | |
| Total Time Range | Terra 2/2/2000–present; Aqua 7/2/2002–present | Terra 2/2/2000–present |
| Local Revisit Time | Terra 10:30 AM; Aqua 1:30 PM | Terra 10:30 AM |
| Revisit Time | Global coverage of the entire earth in 1 day; coverage overlap near the poles | Global coverage of the entire earth in 9 days; coverage in 2 days in the polar regions |
| Swath Width | 2330 km | 380 km |
| Spectral AOD | AOD over ocean for 7 wavelengths (466, 553, 660, 860, 1240, 1640, 2120 nm); AOD over land for 4 wavelengths (466, 553, 660, 2120 nm) | AOD over land and ocean for 4 wavelengths (446, 558, 672, and 866 nm) |
| AOD Uncertainty or Expected Error (EE) | ±0.03 ± 5% (over ocean; QAC ≥ 1); ±0.05 ± 20% (over land; QAC = 3) | 63% fall within 0.05 or 20% of Aeronet AOD; 40% are within 0.03 or 10% |
| Successful Retrievals | 15% of the time | 15% of the time (slightly more, because of retrieval over the glint region also) |
![Page 42: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/42.jpg)
Going down to the individual level
42
![Page 43: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/43.jpg)
The quality of data can vary considerably
Version 5 Level 2 Standard Retrieval Statistics

| AIRS Parameter | Best (%) | Good (%) | Do Not Use (%) |
| --- | --- | --- | --- |
| Total Precipitable Water | 38 | 38 | 24 |
| Carbon Monoxide | 64 | 7 | 29 |
| Surface Temperature | 5 | 44 | 51 |
![Page 44: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/44.jpg)
Data Quality
• Validation of aerosol data shows that not all data pixels labeled as “bad” are actually bad when looked at from a bias perspective.
• But many pixels are biased, due to various reasons.
2/18/2011
From Levy et al, 2009
44
![Page 45: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/45.jpg)
Percent of Biased Data in MODIS Aerosols Over Land Increases as Confidence Flag Decreases

*Compliant data are within ±(0.05 + 0.2·AOD_Aeronet)
Statistics from Hyer, E., J. Reid, and J. Zhang, 2010, An over-land aerosol optical depth data set for data assimilation by filtering, correction, and aggregation of MODIS Collection 5 optical depth retrievals, Atmos. Meas. Tech. Discuss., 3, 4091–4167.
![Page 46: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/46.jpg)
The effect of bad-quality data is often not negligible
Total Column Precipitable Water Quality
Quality classes shown: Best, Good, Do Not Use (units: kg/m²)
Hurricane Ike, 9/10/2008
![Page 47: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/47.jpg)
…or they can be more complicated
Hurricane Ike, viewed by the Atmospheric Infrared Sounder (AIRS)
PBest: maximum pressure for which the quality value is “Best” in temperature profiles
Air Temperatureat 300 mbar
![Page 48: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/48.jpg)
Quality flags are also sometimes packed together into bytes
• Cloud Mask Status Flag: 0=Undetermined, 1=Determined
• Cloud Mask Cloudiness Flag: 0=Confident cloudy, 1=Probably cloudy, 2=Probably clear, 3=Confident clear
• Day/Night Flag: 0=Night, 1=Day
• Sunglint Flag: 0=Yes, 1=No
• Snow/Ice Flag: 0=Yes, 1=No
• Surface Type Flag: 0=Ocean, deep lake/river; 1=Coast, shallow lake/river; 2=Desert; 3=Land

Bitfield arrangement for the Cloud_Mask_SDS variable in atmospheric products from the Moderate Resolution Imaging Spectroradiometer (MODIS)
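Reading such a byte means shifting and masking. The sketch below assumes bit positions in the order the fields are listed (status flag in bit 0 through surface type in bits 6–7); verify the layout against the product's file specification before relying on it.

```python
# Sketch: unpacking a Cloud_Mask-style packed quality byte. Bit positions
# follow the field order listed above (status flag in bit 0, surface type
# in bits 6-7); always confirm the layout in the product's file spec.
def unpack_cloud_mask(byte: int) -> dict:
    return {
        "status":     byte        & 0b1,   # 0=Undetermined, 1=Determined
        "cloudiness": (byte >> 1) & 0b11,  # 0=Confident cloudy ... 3=Confident clear
        "day":        (byte >> 3) & 0b1,   # 0=Night, 1=Day
        "sunglint":   (byte >> 4) & 0b1,   # 0=Yes, 1=No
        "snow_ice":   (byte >> 5) & 0b1,   # 0=Yes, 1=No
        "surface":    (byte >> 6) & 0b11,  # 0=Ocean ... 3=Land
    }

flags = unpack_cloud_mask(0b11001101)  # example byte
```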
![Page 49: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/49.jpg)
So, replace bad-quality pixels with fill values!

Mask based on user criteria (Quality level < 2); good-quality data pixels retained.
The output file has the same format and structure as the input file (except for extra mask and original_data fields).
Original data array (Total column precipitable water)
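The masking step can be sketched as follows, with synthetic arrays and an illustrative fill value:

```python
import numpy as np

# Sketch with synthetic arrays: apply the user criterion from the slide
# ("Quality level < 2") and replace failing pixels with a fill value,
# keeping the array's shape and structure unchanged.
FILL = -9999.0                              # illustrative fill value
rng = np.random.default_rng(3)
water = rng.uniform(0.0, 70.0, (4, 4))      # total precipitable water, kg/m^2
quality = rng.integers(0, 4, (4, 4))        # 0=Best ... 3=Do Not Use

mask = quality < 2                          # keep Best and Good only
screened = np.where(mask, water, FILL)      # bad-quality pixels -> fill value
```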
![Page 50: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/50.jpg)
Visualizations help users see the effect of different quality filters
Best quality only
Best + Good quality
Data withinproduct file
![Page 51: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/51.jpg)
Or, let users select their own criteria...

Initial settings are based on Science Team recommendations. (Note: “Good” retains retrievals that are Good or better.)
You can choose settings for all parameters at once...
... or parameter by parameter.
51
![Page 52: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/52.jpg)
Types of Bias Correction

| Type of Correction | Spatial Basis | Temporal Basis | Pros | Cons |
| --- | --- | --- | --- | --- |
| Relative (cross-sensor) linear climatological | Region | Season | Not influenced by data in other regions; good sampling | Difficult to validate |
| Relative (cross-sensor) non-linear climatological | Global | Full data record | Complete sampling | Difficult to validate |
| Anchored parameterized linear | Near Aeronet stations | Full data record | Can be validated | Limited areal sampling |
| Anchored parameterized non-linear | Near Aeronet stations | Full data record | Can be validated | Limited insight into correction |
52
![Page 53: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/53.jpg)
Data Quality Issues
• Validation of aerosol data shows that not all data pixels labeled as “bad” are actually bad when looked at from a bias perspective.
• But many pixels are biased differently, due to various reasons.
From Levy et al, 2009
![Page 54: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/54.jpg)
Quality & Bias assessment using FreeMind
from the Aerosol Parameter Ontology
FreeMind allows capturing the various relations between aspects of aerosol measurements, algorithms, conditions, validation, etc. “Traditional” worksheets do not support the complex, multi-dimensional nature of the task.
![Page 55: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/55.jpg)
Reference: Hyer, E. J., Reid, J. S., and Zhang, J., 2011: An over-land aerosol optical depth data set for data assimilation by filtering, correction, and aggregation of MODIS Collection 5 optical depth retrievals, Atmos. Meas. Tech., 4, 379-408, doi:10.5194/amt-4-379-2011
(General) Statement: Collection 5 MODIS AOD at 550 nm during Aug–Oct over Central South America highly over-estimates at large AOD and, in the non-burning season, under-estimates at small AOD, as compared to Aeronet; good comparisons are found at moderate AOD.
Region & season characteristics: the central region of Brazil is a mix of forest, cerrado, and pasture, and is known to have low AOD most of the year except during the biomass burning season.
(Example): Scatter plot of MODIS AOD at 550 nm vs. Aeronet from ref. (Hyer et al., 2011). (Description/Caption) Shows severe over-estimation of MODIS Col 5 AOD (dark-target algorithm) at large AOD at 550 nm during Aug–Oct 2005–2008 over Brazil. (Constraints) Only the best quality of MODIS data (Quality = 3) used; data with scattering angle > 170 deg excluded. (Symbols) Red lines define the region of Expected Error (EE); green is the fitted slope.
Results: Tolerance = 62% within EE; RMSE = 0.212; r² = 0.81; Slope = 1.00. For low AOD (<0.2), Slope = 0.3. For high AOD (>1.4), Slope = 1.54.
(Dominating factors leading to aerosol estimate bias):
1. A large positive bias in the AOD estimate during the biomass burning season may be due to wrong assignment of aerosol absorbing characteristics. (Specific explanation) A constant single scattering albedo of ~0.91 is assigned for all seasons, while the true value is closer to ~0.92–0.93. [Notes or exceptions: biomass burning regions in Southern Africa do not show as large a positive bias as in this case; it may be due to different optical characteristics or single scattering albedo of smoke particles. Aeronet observations of SSA confirm this.]
2. Low AOD is common in the non-burning season. In low-AOD cases, biases are highly dependent on lower boundary conditions. In general a negative bias is found, due to uncertainty in the surface reflectance characterization, which dominates if the signal from atmospheric aerosol is low.
(Scatter plot: MODIS AOD vs. Aeronet AOD, 0–2, for Central South America; Aeronet stations Mato Grosso, Santa Cruz, Alta Floresta.)
![Page 56: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/56.jpg)
Completeness: Observing Conditions for MODIS AOD at 550 nm Over Ocean
| Region | Ecosystem | % of Retrievals Within Expected Error | Average Aeronet AOD | AOD Estimation Relative to Aeronet |
| --- | --- | --- | --- | --- |
| US Atlantic Ocean | Dominated by fine-mode aerosols (smoke & sulfate) | 72% | 0.15 | Over-estimated (by 7%)* |
| Indian Ocean | Dominated by fine-mode aerosols (smoke & sulfate) | 64% | 0.16 | Over-estimated (by 7%)* |
| Asian Pacific Oceans | Dominated by fine aerosol, not dust | 56% | 0.21 | Over-estimated (by 13%) |
| “Saharan” Ocean | Outflow regions in the Atlantic dominated by dust in spring | 56% | 0.31 | Random bias (1%)* |
| Mediterranean | Dominated by fine aerosol | 57% | 0.23 | Under-estimated (by 6%)* |
*Remer L. A. et al., 2005: The MODIS Aerosol Algorithm, Products and Validation. Journal of the Atmospheric Sciences, Special Section. 62, 947-973.
![Page 57: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/57.jpg)
Completeness: Observing Conditions for MODIS AOD at 550 nm Over Land
| Region | Ecosystem | % of Retrievals Within Expected Error | Correlation w.r.t. Chinese Ground Sun Photometer | AOD Estimation Relative to Ground-Based Sensor |
| --- | --- | --- | --- | --- |
| Yanting, China | Agriculture site (central China) | 45% | slope = 1.04; offset = −0.063; corr² = 0.83 | Slightly over-estimated |
| Fukung, China | Semi-desert site (northwest China) | 7% | slope = 1.65; offset = 0.074; corr² = 0.58 | Over-estimated (more than 100% at large AOD values) |
| Beijing | Urban site, industrial pollution | 35% | slope = 0.38; offset = 0.086; corr² = 0.46 | Severely under-estimated (more than 100% at large AOD values) |
* Li Z. et al, 2007: Validation and understanding of Moderate Resolution Imaging Spectroradiometer aerosol products (C5) using ground-based measurements from the handheld Sun photometer network in China, JGR, VOL. 112, D22S07, doi:10.1029/2007JD008479.
![Page 58: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/58.jpg)
Summary
• Quality is very hard to characterize; different groups will focus on different and inconsistent measures of quality.
– HOW WOULD YOU ADDRESS THIS?
• Products with known quality (whether good or bad) are more valuable than products with unknown quality.
– Known quality helps you correctly assess fitness-for-use.
• Harmonization of data quality is even more difficult than characterizing the quality of a single data product.
58
![Page 59: Quality, Uncertainty and Bias – by way of example(s)](https://reader036.fdocuments.net/reader036/viewer/2022062409/5681487b550346895db583a3/html5/thumbnails/59.jpg)
What is next
• Project discussions…
• A3 – coming back to you this week
• Next week, week 12 – Nov. 20 – Webs of Data and Data on the Web, the Deep Web, Data Discovery, Data Integration
– Project write-ups due.
• Reading for this week – see web site
• Last class is week 13, Nov. 27 – project presentations (and final assignment due)
59