INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.


Transcript of INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Page 1: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS

Page 2: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Course Objectives

At the conclusion of this course, participants will understand:

• The Agency's Quality System and the elements of the DQO Process

• How the DQO process applies to EPA programs

• How to interpret the consequences of potential decision errors.

Page 3: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Systematic Planning

• Agency policy requires the use of a systematic planning process to develop performance criteria

• DQO Process defines performance and acceptance criteria for decision making

• EPA recommends the DQO Process

Page 4: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

What is the DQO Process?

The DQO Process is a systematic planning process for generating environmental data that will be sufficient for their intended use.

Page 5: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

What are DQOs?

DQOs are quantitative and qualitative criteria that:

• Clarify study objectives

• Define appropriate types of data to collect

• Specify the tolerable levels of potential decision errors

Page 6: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DQO Process

• Planning Tool for Managing Decision Errors

• Improves:

– Planning effectiveness

– Design efficiency

– Defensibility of results/decisions

• Generates appropriate data:

– Type

– Quality

– Quantity

Page 7: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DQO Process

Designed to answer:

• What do you need?

• Why do you need it?

• How will you use it?

• What is your tolerance for errors?

Page 8: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DQO Process: Underlying Principles

1. All collected data have error.

2. Nobody can afford absolute certainty.

3. The DQO Process defines tolerable error rates.

4. Absent DQOs, decisions are uninformed.

5. Uninformed decisions tend to be conservative and expensive.

Page 9: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DQOs Strike a Balance

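[Figure: a balance with DQOs at the center, weighing Time/Resources against Uncertainty. Each side is labeled "Decreasing" and "Increasing": as one goes up, the other goes down.]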

Page 10: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DQOs in the Context of the Project Life Cycle

[Figure: the project life cycle]

Planning: Plan for Data Collection Using the DQO Process (EPA QA/G-4)

Implementation: Collect Environmental Data Using Documented Sampling Schemes (EPA QA/G-5)

Assessment: Conduct Data Quality Assessment (EPA QA/G-9)

Make the Decision

Page 11: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The DQO Process

[Figure: the seven steps lead from the Problem (Investigation or Study) to a Resource-Effective Data Collection Design.]

1. State the Problem.

2. Identify the Decision.

3. Identify the Inputs to the Decision.

4. Define the Boundaries of the Study.

5. Develop a Decision Rule.

6. Specify Tolerable Limits on Decision Errors.

7. Optimize the Design.

Page 12: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Repeated Application of the DQO Process

[Figure: the DQO Process applied repeatedly with an increasing level of evaluation effort. Starting from "Start Developing DQOs", three successive passes are shown: the Primary Study Decision, the Intermediate Study Decision, and the Advanced Study Decision. Each pass works through State the Problem, Identify the Decision, Identify Inputs to the Decision, Define the Study Boundaries, Develop a Decision Rule, Specify Limits on Decision Errors, and Optimize the Design, iterating as needed until "Study Planning Completed". One branch is labeled "Decide Not to Use Probabilistic Sampling Approach".]

Page 13: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Data Quality Objectives: Outputs from Each Step of the Process

Problem:

Decision:

Inputs:

Boundaries:

Decision Rule:

Limits on Decision Errors:

DQOs

Page 14: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The DQO Process Promotes Communication

Term        One meaning            Another meaning
Parameter   Mean, percentile       Limits of study
Risk        Carcinogen             Poor decision
Media       Soil, water            Press, TV
Variance    Variability in data    Exception to a rule
Sample      Analytical portion     Collection of items

[Figure: the DQO sits between the Decision Maker and the Data Collector, who may attach different meanings to the terms PARAMETER, RISK, MEDIA, VARIANCE, and SAMPLE.]

Page 15: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

A Quality Planning Model

[Figure: a quality planning model built on effective communication between the DECISION MAKER (Data User) and the DATA COLLECTOR. Labels in the figure: Needs, Understanding, Approval, DQOs (Environmental Data), Performance Specifications.]

Page 16: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The DQO Process Encourages Efficient Planning

• Clearly stated objectives

• A framework for organizing complex issues

• Limits on decision errors specified

• Efficient resource expenditure

Page 17: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DATA QUALITY OBJECTIVES

Page 18: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Seven Steps of the DQO Process

1. State the problem to be resolved.

2. Identify the decision to be made.

3. Identify the inputs to the decision.

4. Define the boundaries of the study.

5. Develop a decision rule.

6. Specify the tolerable limits on decision errors.

7. Optimize the design for obtaining the data.

Page 19: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Stating the Problem

Who should participate on the planning team?

• Risk Assessor

• Scientist/Engineer

• Statistician/Data Analyst

• Data User/Decision Maker

• Lab and Field Personnel

• QA Specialist

What is the problem?

What resources are available?

What time is available?

What important social/political issues have an impact on the decision?

Page 20: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Wood Preserving Site: Background

• U.S. State-led investigation of possible soil contamination problem

• Creosoting of timbers

• Soil contaminated with creosote

• Contains Polyaromatic Hydrocarbons (PAHs)

• Early Sampling Results:

– Soil PAH concentration in low activity area: 0-80 mg/kg

– Soil PAH concentration in high activity area: 80-140 mg/kg

– Off site: Not detected

– Future land use will be residential

Page 21: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Wood Preserving Site: Background

The Team:

• Decision Maker

• Chemist

• Field Sampling Technician

• QA Specialist

• Risk Assessor/Toxicologist

• Environmental Scientist with Statistical Training

Page 22: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Wood Preserving Site: Problem Statement

The Problem: Obvious creosote contamination in the soil may pose a danger to human health or the environment. Information is necessary to determine the extent of danger.

Resources: Measurement Budget = $100,000

Time Limit: Remediate in 1 year

Socio-political: Future land use is residential

Page 23: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Identifying the Decision

• Identify the principal study question. Clarify the main issue to be resolved.

• Specify the alternative actions that would result from each resolution.

Associate a course of action with each possible answer.

• Define the decision statement that must be resolved to address the problem.

Combine the principal study question and the alternative actions into a specific decision statement.

Page 24: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Wood Preserving Site: Identifying the Decision

Study Question:

– Does creosote contamination in the soil pose an unacceptable danger to human health or the environment?

Alternative Actions:

– Remediate the soil

– Do not remediate the soil (no action)

Decision Statement:

– Determine whether the creosote contamination in soil poses a danger that requires remediation.

Page 25: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Identifying Inputs for the Decision

• Focus on what information is needed for the decision.

• Identify the variables/characteristics to be measured.

• Identify the information needed to establish the action level.

Page 26: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Wood Preserving Site: Inputs Needed for Decision

Variable of Interest: PAHs. Some PAHs are carcinogens that are dangerous to human health.

Action Level: Set at 50 ppm by a toxicologist using a relevant site-specific exposure assessment.

Page 27: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Defining the Boundaries

• Define the spatial boundary for the decision

– Define the geographical area within which decisions apply

– Define the media of concern

– Divide each medium into homogeneous strata

• Define the temporal boundary of the decision

– Determine the time frame to which the study results apply

– Determine when to study

• Define a scale of decision making

• Identify practical constraints on data collection

Page 28: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Wood Preserving Site: Spatial Boundaries

• Define the geographical area within which decisions apply: The property boundary (no PAHs detected off site)

• Specify the characteristics that define the population of interest: PAHs in surface soil to 15 cm depth

• Divide each medium into homogeneous strata: The site has been divided into two areas:

1) Area of high activity where the concentration is expected to be high

2) Area of low activity where the concentration is expected to be low

Page 29: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Wood Preserving Site: Temporal Boundaries

• Determine the time frame to which the study results apply:

The results will represent future conditions at the site (future lifetime exposure for residents).

• Determine when data should be collected:

– Sampling begins in 3 months.

– Remediation completed within 1 year.

– Sampling results will not vary depending on weather conditions.

Page 30: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Wood Preserving Site: Defining the Boundaries

Scale of Decision Making:

• Decisions will be made for each residential lot-sized area (based on future land use)

Practical Constraints:

• Existing structures and debris may limit sampling locations

Page 31: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Develop a Decision Rule

Develop an "if/then" statement that incorporates:

• The population parameter of interest

(e.g., mean, maximum, percentile)

• The scale of decision making

(e.g., residential lot size)

• The action-triggering value

• The alternative actions

Page 32: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Wood Preserving Site: Decision Rule

Use average (mean) PAH concentrations to identify lots that pose a health threat.

– If the true mean PAH concentration within a residential lot is greater than 50 mg/kg, then the soil will be remediated.

– If not, then the soil will be left in situ.
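The if/then rule above can be written as a short function. This is only a sketch: the function name and the example readings are hypothetical, and the sample mean stands in for the true mean, which is never known exactly (that gap is what the limits on decision errors address).

```python
ACTION_LEVEL_MG_PER_KG = 50.0  # action level stated on the slides (50 ppm)


def lot_decision(pah_results_mg_per_kg):
    """Apply the slide's if/then rule to the readings from one residential lot.

    The sample mean is used as an estimate of the true mean (an assumption
    of this sketch; the slides state the rule in terms of the true mean).
    """
    if not pah_results_mg_per_kg:
        raise ValueError("at least one measurement is required")
    mean = sum(pah_results_mg_per_kg) / len(pah_results_mg_per_kg)
    return "remediate" if mean > ACTION_LEVEL_MG_PER_KG else "leave in situ"


# Hypothetical readings for one lot, in mg/kg
print(lot_decision([42.0, 61.0, 58.0]))
```

The decision is made per lot because the scale of decision making is the residential lot-sized area.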

Page 33: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Specify Limits on Decision Errors

• Determine the possible range of the parameter of interest

• Determine baseline condition (null hypothesis)

• Determine consequences of each decision error. Consequences may include:

– Health risks

– Ecological risks

– Political risks

– Social risks

– Resource risks

Page 34: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Specifying Limits on Decision Error

• Specify the gray region - a range of possible parameter values where the consequences of decision errors are relatively minor (too close to call)

– Bounded on one side by the action level

– Bounded on the other side by the parameter value where the consequences of making a decision error begin to be significant

• Set quantitative limits on false rejection and false acceptance errors by considering the consequences of these potential decision errors.

Page 35: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Statistical Error Types

• Rejecting the baseline condition when it is true is a False Rejection error, F(r).

Example (baseline condition: hazardous): deciding "not hazardous" when it actually is hazardous

• Accepting the baseline condition when it is false is a False Acceptance error, F(a).

Example (baseline condition: hazardous): deciding "hazardous" when it actually is not hazardous

Page 36: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Decision Errors: Synonyms and Plain English

If the baseline assumption is that the program or site is in compliance, then:

False Rejection Error F(r), Type I Error, False Positive

• Deciding program or site is not in compliance when it is

• An overreaction to a situation

• Wasted resources, unnecessary expenditure

False Acceptance Error F(a), Type II Error, False Negative

• Deciding program or site is in compliance when it is not

• A missed opportunity for correction

• Allowing a hazard to public health or the ecosystem

Page 37: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

False Rejection and False Acceptance

Baseline Condition: True mean level equal to or below standard

Alternative: True mean level above standard

Decision based          In actuality:             In actuality:
on a sample             Below Standard            Above Standard
Below Standard          Correct                   False Acceptance, F(a)
Above Standard          False Rejection, F(r)     Correct

Page 38: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Decision Errors: Synonyms and Plain English

If the baseline assumption is that the program or site is NOT in compliance, then:

False Rejection Error F(r), Type I Error, False Positive

• Deciding program or site is in compliance when it is not

• A missed opportunity for correction

• Allowing a hazard to public health or the ecosystem

False Acceptance Error F(a), Type II Error, False Negative

• Deciding program or site is not in compliance when it is

• An overreaction to a situation

• Wasted resources, unnecessary expenditure
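The two "Synonyms and Plain English" slides differ only in the baseline assumption. A small sketch makes the swap explicit; the function and argument names are hypothetical.

```python
def classify_decision(baseline_in_compliance, truly_in_compliance, decided_in_compliance):
    """Name the outcome of a compliance decision relative to the baseline.

    All three arguments are booleans. A wrong decision that departs from the
    baseline has rejected a true baseline (false rejection); a wrong decision
    that agrees with the baseline has accepted a false one (false acceptance).
    """
    if decided_in_compliance == truly_in_compliance:
        return "correct"
    if decided_in_compliance != baseline_in_compliance:
        return "false rejection"
    return "false acceptance"


# Same mistake (deciding "not in compliance" when the site is in compliance),
# named differently under the two baselines:
print(classify_decision(True, True, False))
print(classify_decision(False, True, False))
```

The same real-world mistake is a false rejection under one baseline and a false acceptance under the other, which is why the baseline must be stated before error limits are set.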

Page 39: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The Probability of Making False Rejection Decision Errors

If the true mean is much greater than the action level, few low readings will occur. So, there is a small chance of reaching a wrong conclusion.

[Figure: distribution of readings with True Mean 100 ppm and Action Level 50 ppm]

If the true mean is close to the action level, many low readings will occur. Erroneous conclusions are much more likely.

[Figure: distribution of readings with True Mean 100 ppm and Action Level 75 ppm]
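The effect of the distance between the true mean and the action level can be illustrated numerically. The sketch below assumes normally distributed, independent readings; the standard deviation (40 ppm) and number of readings (4) are hypothetical values chosen only for illustration, while the true mean and action levels are the ones on the slide.

```python
from statistics import NormalDist


def prob_sample_mean_below(action_level, true_mean, sd, n):
    """P(sample mean < action level) for n independent normal readings."""
    standard_error = sd / n ** 0.5
    return NormalDist(mu=true_mean, sigma=standard_error).cdf(action_level)


# True mean of 100 ppm compared against the two action levels on the slide
for action_level in (50.0, 75.0):
    p = prob_sample_mean_below(action_level, true_mean=100.0, sd=40.0, n=4)
    print(f"action level {action_level:.0f} ppm: P(mean reads low) = {p:.4f}")
```

Under these assumptions the chance of wrongly concluding that the mean is below the action level is much larger for the 75 ppm action level than for the 50 ppm one.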

Page 40: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Specify Limits on Decision Error (Construct a "What If" Table)

Assign probability values to points above and below the action level that reflect the tolerable probabilities for decision errors.

Measured Conc. (ppm)   Decision   True Conc. (ppm)   Error Type         Aversion      Tolerable Probability*
>50                    Cleanup    0-20               False rejection    Severe        10%
>50                    Cleanup    20-35              False rejection    Moderate      20%
>50                    Cleanup    35-50              False rejection    Minor         30%
<50                    Leave      50-100             False acceptance   Minor         Gray Region
<50                    Leave      100-150            False acceptance   Moderate      20%
<50                    Leave      150-200            False acceptance   Severe        10%
<50                    Leave      200-250            False acceptance   Very Severe   5%

* Probabilities are based on planning team discussions.
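The "What If" table can be held as data and queried for any true concentration. This is a sketch with hypothetical names; the slide's ranges share endpoints, so treating each range as including its lower bound and excluding its upper bound is an assumption.

```python
# (low, high, error type, tolerable probability); None marks the gray region
WHAT_IF_TABLE = [
    (0, 20, "false rejection", 0.10),
    (20, 35, "false rejection", 0.20),
    (35, 50, "false rejection", 0.30),
    (50, 100, "false acceptance", None),
    (100, 150, "false acceptance", 0.20),
    (150, 200, "false acceptance", 0.10),
    (200, 250, "false acceptance", 0.05),
]


def tolerable_error(true_conc_ppm):
    """Return (error type, tolerable probability) for a true concentration."""
    for low, high, error_type, probability in WHAT_IF_TABLE:
        if low <= true_conc_ppm < high:  # half-open ranges: an assumption
            return error_type, probability
    raise ValueError("concentration is outside the range covered by the table")


print(tolerable_error(10))   # far below the action level
print(tolerable_error(75))   # inside the gray region
print(tolerable_error(225))  # far above the action level
```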

Page 41: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Decision Performance Goal Diagram

[Figure: Decision Performance Goal Diagram. Horizontal axis: True Concentration of PAH (mg/kg), 0 to 250. Vertical axis: Tolerable Chance of Deciding that the Parameter Exceeds the Action Level, 0 to 1. Marked on the diagram: Action Level, Gray Region (too close to call), Tolerable False Rejection Decision Error Rates, Tolerable False Acceptance Decision Error Rates. Baseline Condition: Mean < 50.]

Page 42: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Optimize the Design

• Develop general data collection design alternatives

– Simple random sampling

– Simple random sampling with compositing

– Stratified random sampling

• For each design, develop a cost formula, select a proposed method of data analysis, and develop a method for estimating the sample size that corresponds to the method of data analysis

• Select the most resource-effective design

– Consider cost, human resources, other constraints

– Consider performance of design

Page 43: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Decision Performance Goal Diagram with Performance Curve

[Figure: Decision Performance Goal Diagram with a performance curve. Horizontal axis: True Concentration of PAH (mg/kg), 0 to 250. Vertical axis: Tolerable Chance of Deciding that the Parameter Exceeds the Action Level, 0 to 1. Marked on the diagram: Action Level, Gray Region (too close to call), Tolerable False Rejection Decision Error Rates, Tolerable False Acceptance Decision Error Rates. Baseline Condition: Mean < 50.]

Page 44: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DQO Process Output

• Qualitative and Quantitative Framework for a study

• Feeds directly into the Quality Assurance Project Plan which is mandatory for EPA environmental data collection activities

Page 45: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DATA QUALITY OBJECTIVES: Cadmium Contaminated Fly Ash Example

Page 46: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Case Study Introduction

Case study - Cadmium contaminated fly ash waste

• Output from a DQO case study

• Shows how the steps of the DQO process aid in developing a sampling design

• Illustrates decisions that could be made within the Resource Conservation and Recovery Act (RCRA) Program

• Not intended to represent the policies of the RCRA Program

Page 47: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Cadmium Contaminated Fly Ash Waste: Background Information

• Municipal incinerator

• Fly ash dumped in municipal landfill

• Company calls ash "Non-hazardous"

Page 48: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Background Information

New waste stream:

• Contains cadmium

− Toxic effects: inhalation and ingestion exposure

− Short term and chronic effects

• The new ash will be tested using Toxicity Characteristic Leaching Procedure (TCLP).

• Waste will be classified as hazardous if the cadmium concentration in the TCLP leachate exceeds 1 mg/liter.

Page 49: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Background Information

• Pilot study - to determine the variability of the cadmium concentration in ash

• Results:

– Relatively constant variability within containers

– Relatively high variability between containers

Page 50: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The DQO Process: State the Problem

• Members of Planning Team

– Plant Manager

– Chemist

– Plant Engineer

– Manager, Quality Assurance

– Statistician/Data Analyst

• The Problem

– To determine which loads of ash should be sent to a RCRA facility and which can be dumped in the municipal landfill

• Available resources

– The difference in cost between municipal and RCRA disposal is $6,750.

• Project constraints

– Cost (budget approximately $3,000 for sampling)

Page 51: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The DQO Process: Identify the Decision

• Define the alternative actions.

– The waste fly ash could be disposed of in a RCRA landfill.

– The waste fly ash could be disposed of in a municipal landfill.

• Form alternatives into a decision statement.

– Determine if the cadmium concentration in the TCLP leachate exceeds RCRA regulatory standards.

Page 52: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The DQO Process: Identify the Inputs to the Decision

• Identify key information.

– Concentration of cadmium in fly ash

– Fly ash samples subjected to the TCLP test and analyzed for cadmium

• Identify information to establish the Action Level.

– RCRA standard (1.0 mg/l using the TCLP method)

• Confirm that appropriate analytical methods exist.

– Cadmium is a metal that has a detection limit well below the RCRA standard.

Page 53: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The DQO Process: Define the Boundaries of the Study

• Identify the spatial boundaries.

– Fly ash in containerized bins; at least 70% capacity

• Identify temporal boundaries.

– The ash does not present an exposure hazard and will not degrade; no sampling time constraints are necessary.

• Define the scale of decision making.

– A decision will be made about each container.

• Identify practical considerations that may interfere with the study.

– Physically obtaining samples from the containers

Page 54: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The DQO Process: Develop a Decision Rule

• The Parameter of Interest

– The average concentration of cadmium

• Specify the Action Level for the study.

– The RCRA standard for cadmium (1.0 mg/l) in TCLP leachate

• Develop a Decision Rule.

– If the average cadmium concentration in a bin is more than 1.0 mg/l, then the ash will be disposed of in a RCRA facility.

– If the average cadmium concentration in a bin is less than 1.0 mg/l, then the ash will be disposed of in a municipal landfill.

Page 55: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The DQO Process: Specify Limits on Decision Errors

• Determine baseline condition

– Null hypothesis = "hazardous" (RCRA requirement): mean > 1.0 mg/l

• Identify decision errors

– False rejection:

Decide mean < 1.0 mg/l when mean > 1.0 mg/l

– False acceptance:

Decide mean > 1.0 mg/l when mean < 1.0 mg/l

• Identify limits on decision errors & gray region

Page 56: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The DQO Process: Tolerable Limits of Decision Error

[Figure: Decision Performance for Cadmium Compliance Testing. Baseline Condition: Ash is hazardous, mean > 1. Horizontal axis: True Mean Concentration of Cadmium (mg/l), 0 to 2.25. Vertical axis: Tolerable Chance of Declaring Noncompliance, 0 to 1. Marked on the diagram: Action Level, Gray Region, Tolerable False Acceptance Decision Error Rates, Tolerable False Rejection Decision Error Rates.]

Page 57: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Optimize the Design

• Develop general data collection design alternatives

– Simple random sampling

– Simple random sampling with compositing

– Sequential random sampling

• For each design, develop a cost formula, select a proposed method of data analysis, and develop a method for estimating the sample size that corresponds to the method of data analysis

• Select the most resource-effective design

– consider cost, human resources, other constraints

– consider performance of design

Page 58: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The DQO Process: Optimize the Design

Elements of the Design:

• Hypothesis Test

• Statistical Model

• Design Description/Option

• Sample Location

• Sample Cost

• Sample Size

• Design Performance

Page 59: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Design Options: Simple Random Sampling

• Simple Random Sample

– Simplest type of probability sampling

– Every point in the sampling medium has an equal chance of being selected.

• Application

– Small variance

– Inexpensive sampling and analysis

Page 60: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Design Options: Composite Sampling

• Physically combining multiple samples then drawing one or more sub-samples for analysis

• Application:

– When an average concentration is sought and there is no need to detect peak concentrations

– Large variance (allows the researchers to sample a larger number of locations)

– Reduces total cost when analytical costs are higher than sample collection costs

Page 61: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Design Options: Sequential Sampling

• Conduct several rounds of sampling and analysis; perform statistical test between each round to make one of three decisions:

– Accept null hypothesis

– Reject null hypothesis

– Collect more samples

• Application

– When sampling and analysis costs are high

– When information about sampling or measurement variability is lacking

– When the waste is stable over time frame of the sampling effort
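The three-way outcome (accept, reject, or collect more) can be illustrated with Wald's sequential probability ratio test for a normal mean with known standard deviation. The slides do not name a specific test, so this choice, the function name, and all numeric inputs are assumptions made for illustration only. Here the test is checked after every reading rather than after each round.

```python
from math import log


def sprt_decision(readings, mu0, mu1, sigma, alpha, beta):
    """Wald SPRT for a normal mean with known sigma.

    H0: mean = mu0 (baseline), H1: mean = mu1 (alternative).
    alpha: tolerable chance of rejecting H0 when it is true.
    beta:  tolerable chance of accepting H0 when H1 is true.
    Returns (decision, number of readings used).
    """
    upper = log((1 - beta) / alpha)   # cross this: reject H0
    lower = log(beta / (1 - alpha))   # cross this: accept H0
    llr = 0.0                         # running log-likelihood ratio
    for count, x in enumerate(readings, start=1):
        llr += (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2)
        if llr >= upper:
            return "reject null hypothesis", count
        if llr <= lower:
            return "accept null hypothesis", count
    return "collect more samples", len(readings)


# Hypothetical cadmium readings (mg/l); baseline "hazardous" taken as mean 1.0,
# alternative taken as mean 0.75, sigma assumed to be 0.5
print(sprt_decision([0.3] * 6, mu0=1.0, mu1=0.75, sigma=0.5, alpha=0.05, beta=0.2))
print(sprt_decision([0.9], mu0=1.0, mu1=0.75, sigma=0.5, alpha=0.05, beta=0.2))
```

A sequential design like this stops early when the evidence is clear, which is why it is attractive when each sample and analysis is expensive.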

Page 62: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Sample/Analysis/Disposal Costs

• Sample collection cost from each container - $10/sample

• TCLP cost - $150/analysis

• 15 tons of ash per container

• $500/ton RCRA landfill ($7,500 per container)

• $50/ton municipal landfill ($750 per container)
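The per-container disposal figures follow directly from the tonnage and the per-ton rates. A short sketch of the arithmetic, with hypothetical constant names:

```python
TONS_PER_CONTAINER = 15
RCRA_COST_PER_TON = 500        # dollars
MUNICIPAL_COST_PER_TON = 50    # dollars

rcra_cost = TONS_PER_CONTAINER * RCRA_COST_PER_TON
municipal_cost = TONS_PER_CONTAINER * MUNICIPAL_COST_PER_TON
extra_cost_of_rcra = rcra_cost - municipal_cost

print(f"RCRA landfill:      ${rcra_cost:,} per container")
print(f"Municipal landfill: ${municipal_cost:,} per container")
print(f"Difference:         ${extra_cost_of_rcra:,} per container")
```

The difference is what a false "hazardous" decision costs for one container, which is the resource figure the planning team weighs against the sampling budget.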

Page 63: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Decision Performance Goal Diagram with Performance Curve: Simple Random Sampling

Number of samples: 37

Cost of Data Collection Design: $5,920

[Figure: Decision Performance Goal Diagram with performance curve. Horizontal axis: True Value of the Parameter (Mean Concentration, mg/l), 0 to 2.25. Vertical axis: Tolerable Chance of Deciding that the Parameter Exceeds the Action Level, 0 to 1. Marked on the diagram: Gray Region, Tolerable False Acceptance Decision Error Rates, Tolerable False Rejection Decision Error Rates, Performance Curve. Baseline Condition: Ash is hazardous, mean > 1.]

Page 64: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Decision Performance Goal Diagram with Performance Curve: Relaxed Decision Error Constraints

Number of samples: 20

Cost of Data Collection Design: $3,200

[Figure: Decision Performance Goal Diagram with performance curve. Horizontal axis: True Value of the Parameter (Mean Concentration, mg/l), 0 to 2.25. Vertical axis: Tolerable Chance of Deciding that the Parameter Exceeds the Action Level, 0 to 1. Marked on the diagram: Gray Region, Tolerable False Acceptance Decision Error Rates, Tolerable False Rejection Decision Error Rates, Performance Curve. Baseline Condition: Ash is hazardous, mean > 1.]

Page 65: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Decision Performance Goal Diagram with Performance Curve: Increased Gray Region Width

Number of samples: 13

Cost of Data Collection Design: $2,080

[Figure: Decision Performance Goal Diagram with performance curve. Horizontal axis: True Value of the Parameter (Mean Concentration, mg/l), 0 to 2.25. Vertical axis: Tolerable Chance of Deciding that the Parameter Exceeds the Action Level, 0 to 1. Marked on the diagram: Gray Region, Tolerable False Acceptance Decision Error Rates, Tolerable False Rejection Decision Error Rates, Performance Curve. Baseline Condition: Ash is hazardous, mean > 1.]

Page 66: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Decision Performance Goal Diagram with Performance Curve: Simple Random Sampling with Compositing

Number of samples: 64

Number of analyses: 16

Cost of Data Collection Design: $3,040

[Figure: Decision Performance Goal Diagram with performance curve. Horizontal axis: True Value of the Parameter (Mean Concentration, mg/l), 0 to 2.25. Vertical axis: Tolerable Chance of Deciding that the Parameter Exceeds the Action Level, 0 to 1. Marked on the diagram: Gray Region, Tolerable False Acceptance Decision Error Rates, Tolerable False Rejection Decision Error Rates, Performance Curve. Baseline Condition: Ash is hazardous, mean > 1.]

Page 67: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Compare Overall Efficiency

Design                                                              Cost
Simple Random Sampling*                                             $5,920
Simple Random Sampling with Relaxed Decision Error Constraints      $3,200
Simple Random Sampling with Increased Gray Region Width             $2,080
Simple Random Sampling with Compositing*                            $3,040

Budget = $3,000

* Used original Decision Error Limits
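The design costs compared here can be reproduced from the per-sample collection cost ($10) and the TCLP analysis cost ($150) given earlier, together with the sample and analysis counts shown on the performance-curve slides. The sketch below uses hypothetical names and assumes the cost is simply collections plus analyses, with no fixed overhead.

```python
SAMPLE_COLLECTION_COST = 10   # dollars per sample
TCLP_ANALYSIS_COST = 150      # dollars per analysis
BUDGET = 3000                 # approximate sampling budget, dollars


def design_cost(n_samples, n_analyses=None):
    """Cost of a design; without compositing every sample is analyzed."""
    if n_analyses is None:
        n_analyses = n_samples
    return n_samples * SAMPLE_COLLECTION_COST + n_analyses * TCLP_ANALYSIS_COST


# (number of samples, number of analyses or None when each sample is analyzed)
DESIGNS = {
    "Simple random sampling": (37, None),
    "Relaxed decision error constraints": (20, None),
    "Increased gray region width": (13, None),
    "Simple random sampling with compositing": (64, 16),
}

COSTS = {name: design_cost(n, a) for name, (n, a) in DESIGNS.items()}
for name, cost in COSTS.items():
    status = "within" if cost <= BUDGET else "over"
    print(f"{name}: ${cost:,} ({status} the ${BUDGET:,} budget)")
```

Compositing cuts cost because only 16 of the expensive analyses are run on 64 inexpensive samples, and it does so without relaxing the original decision error limits.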

Page 68: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Contamination of Tarheel County's Sole Drinking Water Source/System

Page 69: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Drinking Water Problem

Week 1: Quarterly monitoring of drinking water did not detect any contaminants above drinking water standards.

Week 2: Groundwater is the drinking water source for Tarheel County. Atrazine was discovered in surface waters (that are hydraulically connected to groundwater) at levels up to 500 ppb, which is well above the maximum contaminant level (MCL) of 3 ppb.

Week 3: Source of contamination has not been identified.

Week 4 - Present: Citizens are concerned about the threat to public health and demand that State and local officials ensure that water is safe to drink.

Page 70: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Tarheel County Water Supply System

• 6 wells in wellfield

• Water company operates water system

• System capacity: 8.6 million gallons/day (MGD)

• System demand: 3-5 MGD

• System serves 25,000 residents

• Minimal treatment (chlorination only)

• Centralized above-ground storage holds water from all wells

• Capacity is nearly 10 gallons to ensure 4-hour residence time for chlorination

Page 71: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Tarheel County Water Supply System

Assignment:

Decide whether the level of atrazine in drinking water exceeds the MCL and requires corrective action.

Page 72: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Data Quality Objectives Decision Error Feasibility Trials Software (DQO/DEFT)

Page 73: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The Purpose of DEFT

• DEFT determines the feasibility of DQOs based on sample size and cost for several sampling designs

• DQOs are feasible if at least one sampling design can satisfy the DQOs (decision error limits, cost constraints, time limitations, etc.).

Page 74: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Uses of DEFT

• Aids in iterations between steps 6 and 7 of the DQO process

• That is, it provides a smooth transition between the specific DQOs and the development of a data collection design

• As a learning tool, facilitates understanding and communication

Page 75: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

What DEFT Cannot Do

DEFT should not be used to decide on a final data collection design or sample size.

It cannot account for differences between:

• Media

• Contaminants

• Spatial boundaries

• Temporal boundaries

Page 76: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

How DEFT Works

• Utilizes outputs of the DQO process

• Evaluates several basic collection designs

• Estimates the number of samples

• Estimates costs of data collection designs

Page 77: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

What DQO Outputs are Necessary as DEFT Inputs?

• Limits on decision errors

• Action level

• Possible range of parameter (minimum, maximum)

• Cost of sample collection and analysis per sample

• Location and width of gray region

• Estimated standard deviation

• Null hypothesis (H0)
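These inputs are the quantities a sample-size calculation needs. The sketch below uses a widely used normal-approximation formula for a one-sample test of a mean; whether DEFT uses exactly this formula is an assumption, and the standard deviation of 0.6 mg/l is a hypothetical input, not a value given in the slides. The error limits (5% false rejection, 20% false acceptance) and the gray region width (1.00 minus 0.75 = 0.25 mg/l) are read from the DEFT example values in this course.

```python
from math import ceil
from statistics import NormalDist


def sample_size(sd, gray_region_width, false_rejection, false_acceptance):
    """Approximate number of samples for a one-sample test of a mean.

    n = sd^2 * (z_(1-alpha) + z_(1-beta))^2 / delta^2 + 0.5 * z_(1-alpha)^2,
    rounded up, where alpha and beta are the tolerable false rejection and
    false acceptance rates and delta is the width of the gray region.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - false_rejection)
    z_beta = z(1 - false_acceptance)
    n = sd ** 2 * (z_alpha + z_beta) ** 2 / gray_region_width ** 2 + 0.5 * z_alpha ** 2
    return ceil(n)


# sd = 0.6 mg/l is hypothetical; the other inputs are from the cadmium example
print(sample_size(sd=0.6, gray_region_width=0.25, false_rejection=0.05, false_acceptance=0.20))
```

Widening the gray region or relaxing either error limit lowers the result, which is the trade-off explored when iterating between Steps 6 and 7.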

Page 78: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Analysis of DEFT

Allows user to:

• Determine the effect of changing DQOs

• View Decision Performance Goal Diagram

• Change sampling design

– Simple Random Sampling

– Composite Random Sampling

– Stratified Random Sampling

• Set sample size

• Save DQOs, design information, and decision performance goal diagram to a file

Page 79: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DECISION PERFORMANCE GOAL DIAGRAM

Simple Random Sampling
Sample Size - 37
Cost - $5920.00
Action Level - 1.00

conc.   prob.   type
0.25    0.100   F(a)
0.75    0.200   F(a)
1.00    0.050   F(r)
1.50    0.010   F(r)

Press any key to return to input screen.

[Figure: DEFT screen showing the Decision Performance Goal Diagram. Horizontal axis: Concentration, 0 to 2.25. Vertical axis: Probability of Deciding that the Mean Exceeds the Action Level, 0 to 1. Marked on the diagram: F(a), Action Level, F(r), Gray Region.]

Page 80: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DEFT in the Project Life Cycle

[Figure: the project life cycle (Planning, Implementing, Assessing) with DEFT marked at two points.]

Page 81: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Beyond the DQO Process

Page 82: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The Project Life Cycle

[Figure: the project life cycle]

Planning: Plan for Data Collection Using the DQO Process (EPA QA/G-4)

Implementation: Collect Environmental Data Using Documented Sampling Schemes (EPA QA/G-5)

Assessment: Conduct Data Quality Assessment (EPA QA/G-9)

Make the Decision

Page 83: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

What Is A QA Project Plan?

• Mandatory planning document

• Part of mandatory Agency-wide Quality System

• Description of how data will be collected, assessed, and analyzed

• Project Blueprint - who, what, where, when, why

• Living document that is revised to reflect significant changes

Page 84: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

QA Project Plans (QAPPs)

• QAPPs must be approved prior to the start of data collection

• QAPPs are required when environmental data operations occur in:

– Intramural projects

– Contracts, work assignments, delivery orders

– Grants, cooperative agreements

– Interagency agreements (when negotiated)

– State-EPA agreements

– Responses to statutory or regulatory requirements and to consent agreements

Page 85: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

What Does A QA Project Plan Do For You?

When you are asked:

− "What did you do?"

− "How did you do it?"

− "Why did you do it?"

− "Did you do it correctly?"

The QA Project Plan has the answer.

Page 86: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Elements of a QA Project Plan

Group A. Project Management

Group B. Data Generation and Acquisition

Group C. Assessment and Oversight

Group D. Data Validation and Usability

Page 87: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Group A: Project Management Elements

1. Title and Approval Sheet

2. Table of Contents

3. Distribution List

4. Project/Task Organization

5. Problem Definition/Background

6. Project/Task Description

7. Quality Objectives and Criteria

8. Special Training Requirements/Certification

9. Documentation and Records

Page 88: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Group B: Data Generation & Acquisition Elements

1. Sampling Process Design (Experimental Design)

2. Sampling Methods Requirements

3. Sample Handling and Custody Requirements

4. Analytical Methods Requirements

5. Quality Control Requirements

6. Instrument/Equipment Testing, Inspection, and Maintenance Requirements

7. Instrument Calibration and Frequency

8. Inspection/Acceptance Requirements for Supplies and Consumables

9. Data Acquisition Requirements (Non-Direct Measurements)

10. Data Management

Page 89: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Elements in Group C & Group D

Group C: Assessment & Oversight Elements

1. Assessments and Response Actions

2. Reports to Management

Group D: Data Validation & Usability Elements

1. Data Review, Validation, and Verification Requirements

2. Validation and Verification Methods

3. Reconciliation with User Requirements

Page 90: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Data Quality Assessment (DQA)

• A process to determine if data are adequate for their intended use

– scientific and statistical evaluation

– determine if data are of the right type, quality, and quantity

• Sample data are used to make decisions during DQA

• Do the data provide "sufficient evidence" to draw conclusions?

Page 91: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Data Quality

• Data quality is meaningful only when "data quality" relates to intended use of data

• Some data are of adequate quality for some purposes but not for others

• Need to determine if the data are of the right type, quality, and quantity for their intended use

Page 92: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Data Quality Assessment Can

Answer:

– Do the data violate the conceptual site model or test assumptions?

– Did I collect enough data?

– What is my conclusion?

Can Not Answer:

– Did I make a decision error? (good decision -- bad outcome)

– What are the "true" conditions?

– Do I need different types of data?

Page 93: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DQA is a Joint Effort

Decision maker's contribution:

− Inspection of data for scientific anomalies

− Responsibility for transcription errors

− Assessment of effect of QA and QC deviations

− Professional contextual judgment

Page 94: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DQA is a Joint Effort

Statistician's contribution:

− Graphical display of data and trends

− Statistical analysis required by the DQO

− Investigation of assumption violations

− Identification of potential outliers

− Providing direction for data improvement

Page 95: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

The 5 Steps of Data Quality Assessment

1. Review the DQOs and Sampling Design

2. Conduct a Preliminary Data Review

3. Select the Statistical Test

4. Verify the Assumptions of the Statistical Test

5. Draw Conclusions from the Data
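
Steps 2, 3, and 5 can be illustrated with a minimal sketch: summarize the data, apply a one-sample test against an action level, and report the conclusion. The measurements, action level, and error limit below are hypothetical. The test uses a normal (z) approximation as a simplification. For a sample this small, a real assessment would normally use a Student's t quantile and would first verify the test's assumptions (step 4), which this sketch does not do.

```python
from math import sqrt
from statistics import NormalDist, mean, median, stdev


def assess(data, action_level, alpha):
    """Test H0: mean >= action_level against H1: mean < action_level (z approximation)."""
    n = len(data)
    # Step 2: preliminary data review through basic summary statistics
    summary = {
        "n": n,
        "mean": mean(data),
        "median": median(data),
        "std_dev": stdev(data),
    }
    # Step 3: chosen test is a one-sample test of the mean against the action level
    std_err = summary["std_dev"] / sqrt(n)
    z_stat = (summary["mean"] - action_level) / std_err
    z_crit = NormalDist().inv_cdf(alpha)  # negative critical value for a lower-tail test
    # Step 5: reject H0 only if the sample mean is far enough below the action level
    summary["z_stat"] = z_stat
    summary["reject_h0"] = z_stat < z_crit
    return summary


# Hypothetical measurements, in the same units as the action level
measurements = [0.62, 0.71, 0.55, 0.80, 0.67, 0.59, 0.74, 0.66]
result = assess(measurements, action_level=1.00, alpha=0.05)
print(result)
```

Rejecting the null hypothesis here means the data give sufficient evidence that the mean is below the action level. It does not reveal the "true" conditions at the site.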

Page 96: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Guidance for Data Quality Assessment: Practical Methods for Data Analysis (G-9)

• Written for non-statisticians

• Supplements Agency guidance

• Does not replace statistical texts

• Regular supplements

– Current examples

– Shared information

Page 97: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

DataQUEST

• A PC-based software package that performs baseline Data Quality Assessment

• Provides simple tools to a wide audience

• Implements statistical methods described in guidance (G-9)

• Supplements guidance so description of statistical tools is not contained in the User's Guide

Page 98: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

Advantages

• Menu-based System - no special language or commands like statistical packages

• Does not treat data as discrete numbers in graphs like spreadsheets

• More standard statistical graphs than spreadsheets

Page 99: INTRODUCTION TO THE DATA QUALITY OBJECTIVES PROCESS.

QA Guidance

www.epa.gov/quality

Guidance for the Data Quality Objectives Process (G-4)

− Planning process that ties data collection designs to user-defined decision error tolerances

Guidance for QA Project Plans (G-5)

− Utilizes outputs of DQO Process for detailing data collection operations, the "blue-print" of data collection

Guidance for Data Quality Assessment (G-9)

− Assessment of data to establish if they meet user-defined decision error limits