Statistical Methods for Nitrate Vulnerable Zone Review...

Report Reference: UC10943.07

November 2015

Statistical Methods for Nitrate Vulnerable

Zone Review 2017

RESTRICTION: This report has the following limited distribution:

External: Environment Agency

Any enquiries relating to this report should be referred to the Project Manager at the

following address:

WRc plc,

Frankland Road, Blagrove,

Swindon, Wiltshire, SN5 8YF

Telephone: + 44 (0) 1793 865000

Website: www.wrcplc.co.uk

Follow Us:

WRc is an Independent Centre

of Excellence for Innovation and

Growth. We bring a shared

purpose of discovering and

delivering new and exciting

solutions that enable our clients

to meet the challenges of the

future. We operate across the

Water, Environment, Gas, Waste

and Resources sectors.

http://www.wrcplc.co.uk/

http://uk.linkedin.com/company/wrc-plc

https://twitter.com/WRcPlc

http://www.facebook.com/pages/WRc-Plc/288329784573893

© Environment Agency 2015 The contents of this document are subject to copyright and all rights are reserved. No part of this document may be reproduced, stored in a retrieval system or transmitted, in any form or by any means electronic, mechanical, photocopying, recording or otherwise, without the prior written consent of Environment Agency.

This document has been produced by WRc plc.

Statistical Methods for Nitrate Vulnerable Zone

Review 2017

Authors:

Vicki Bewes

Project Statistician

Catchment Management

Date: November 2015

Report Reference: UC10943.07

Andrew Davey

Senior Statistician


Project Manager: Rob Stapleton

Project No.: 16426-0

Alice Goudie

Database Analyst


Client: Environment Agency

Client Manager: Stephanie Cole

Alastair Halliday

Statistician


Document History

Version

number

Purpose Issued by Quality Checks

Approved by

Date

V0.1 Interim report to confirm data processing. Andrew Davey NA 12/06/15

V0.2 Draft report to confirm statistical methodology and QA procedures

Andrew Davey

Rob Stapleton 03/07/15

V0.3 Draft report addressing EA comments on V0.2 and including kriging analysis

Rob Stapleton Rob Stapleton 24/07/15

V0.4 Draft final report addressing EA comments on V0.2 and V0.3 Rob Stapleton Rob Stapleton 04/08/15

V0.5 Final report addressing EA comments on V0.4 Rob Stapleton Rob Stapleton 27/08/15

V0.6 Inclusion of metadata appendix for SQL database and geodatabase

Rob Stapleton Rob Stapleton 24/09/15

V0.7 Amended final report with updated kriging results and Appendix C

Andrew Davey Rob Stapleton 26/11/15

Contents

Glossary ................................................................................................................................... 1

Executive Summary ................................................................................................................. 4

1. Introduction .................................................................................................................. 5

1.1 Background ................................................................................................................. 5

1.2 Objectives .................................................................................................................... 5

1.3 Structure of this report ................................................................................................. 6

2. Surface water methodology ........................................................................................ 7

2.1 Source datasets .......................................................................................................... 7

2.2 Data processing .......................................................................................................... 9

2.3 Data analysis methods .............................................................................................. 14

3. Groundwater methodology ........................................................................................ 23

3.1 Source datasets ........................................................................................................ 23

3.2 Data processing ........................................................................................................ 25

3.3 Data analysis methods .............................................................................................. 29

3.4 Kriging ....................................................................................................................... 45

4. Recommendations .................................................................................................... 54

References ............................................................................................................................. 55

Appendices

Appendix A Multiple Outlier Test................................................................................. 56

Appendix B Meta Data Overview ................................................................................ 59

Appendix C Evaluation of alternative kriging models .................................................. 71

List of Tables

Table 2.1 Purpose codes included in surface water analysis ................................... 7

Table 2.2 Summary of water company surface water dataset .................................. 9

Table 2.3 Sampling point types included in surface water analysis ........................ 10

Table 2.4 Summary of surface water data processing ............................................ 13

Table 2.5 Summary of confidence classes .............................................................. 14

Table 2.6 Minimum data rules for surface water analysis ....................................... 15

Table 2.7 Summary of surface water methodological refinements ......................... 21

Table 3.1 Purpose codes included in groundwater dataset .................................... 23

Table 3.2 Summary of water company surface water dataset ................................ 25

Table 3.3 Sampling point types included in groundwater analysis.......................... 27

Table 3.4 Summary of groundwater data processing ............................................. 29

Table 3.5 Minimum data rules for groundwater analysis ......................................... 31

Table 3.6 Summary of groundwater methodological refinements ........................... 42

Table 3.7 Parameter estimates for current and future concentration variogram models .................................................................................... 49

Table A.1 Critical values (P = 1%) of the tmax statistic ............................................. 57

Table B.1 Metadata containing descriptions for each SQL database table contained within the SW and GW SQL Server Database ................................................................................................. 59

Table B.2 Metadata containing descriptions and a summary/purpose for each Geodatabase Table included in the final File Geodatabase ........................................................................................... 61

Table B.3 Metadata containing descriptions and a summary/purpose for each Geodatabase feature class included in the final File Geodatabase ........................................................................................... 66

Table C.1 Comparative kriging performance of alternative variogram models ..................................................................................................... 72

List of Figures

Figure 2.1 Historical trend and future forecast of 95th percentile nitrate

concentration using quantile regression, and comparison to the 11.29 mg N/l (50 mg NO3/l) standard (red dashed line) .................... 20

Figure 3.1 Historical trend and future forecast of mean nitrate concentration using AntB, and comparison to the 9.71 mg N/l (43 mg NO3/l) standard (horizontal blue dashed line) ........................ 36

Figure 3.2 Historical trend and future forecast of mean nitrate concentration using AntC, and comparison to the 9.71 mg N/l (43 mg NO3/l) standard (horizontal blue dashed line) ........................ 39

Figure 3.3 Historical trend and future forecast of mean nitrate concentration using the ‘Mean Concentration’ method, and

comparison to the 9.71 mg N/l (43 mg NO3/l) standard (horizontal blue dashed line) ................................................................... 41

Figure 3.4 Location of the 4,160 groundwater monitoring points used in the kriging analysis .................................................................................. 48

Figure 3.5 Variogram for current (a) and future (b) 95th percentile TIN concentrations ......................................................................................... 51

Figure 3.6 Kriged predictions of current (2015) 95th percentile TIN

concentration ........................................................................................... 52

Figure 3.7 Kriged predictions of future (2027) 95th percentile TIN

concentration ........................................................................................... 53

Figure C.1 Comparison of observed and predicted nitrate concentrations ......................................................................................... 73

Environment Agency

Report Reference: UC10943.07/16426-0 November 2015

© Environment Agency 2015 1

Glossary

Autocorrelation Autocorrelation describes the tendency for observations in a

sequence to be correlated with preceding observations. The strength

of correlation between successive pairs of measurements is quantified

by a lag-1 autocorrelation coefficient, which takes values between -1

(perfect negative correlation) and +1 (perfect positive correlation).

Positive autocorrelation is common in water quality monitoring data,

where the measured concentration in one sample often gives a good

idea of what the measured concentration will be in the next sample

Bimodal distribution The spread of concentration values in a set of water quality samples

commonly has a single peak (or ‘mode’), with the bulk of samples

having intermediate values and a relatively low number of samples

having low or high values. Some datasets, however, show a spread of

values with two peaks - a cluster of lower concentrations and a

second cluster of higher concentrations; this is termed a bimodal

distribution.

A bimodal distribution can indicate strongly fluctuating water quality, or

indicate that water samples have been collected from two locations

with different levels of pollution.

Coefficient of variation (CoV) A standardised measure of the degree of variability in a dataset,

relative to the mean value of the dataset. The CoV is calculated by

dividing the mean by the standard deviation. Monitoring points with

higher mean concentrations tend also to show higher variability in

concentrations, so the CoV provides a way of comparing variability

among monitoring locations with different levels of pollution.

Environment Agency



Confidence interval / limits A confidence interval quantifies uncertainty in the estimate of a

parameter by giving a range of values that is likely to include the true

(unknown) population parameter. For example, a 90% confidence

interval around a sample mean indicates that one can be 90%

confident that the true population mean lies within that range. In other

words, there is a probability of only 10% or 0.1 that the true mean

value lies outside the stated confidence interval.

The upper and lower bounds of the confidence interval are called

confidence limits.

Kriging Kriging is a statistical interpolation technique. It is used to predict the

value of a function at a given point by computing a weighted average

of the known values of the function in the neighbourhood of the point

(i.e. predict TIN concentrations at unmonitored locations, based on

TIN concentrations at monitored locations). The closer the monitoring

point is to the location for which concentrations are to be predicted,

the greater the weight given to the monitoring point.

(Arithmetic) Mean This is a measure of the central tendency or ‘middle’ value of a

dataset. It is the sum of the data values divided by the number of

observations.

Monte Carlo simulation Monte Carlo methods are a broad class of computational algorithms

that rely on repeated random sampling to simulate a situation and

obtain numerical results; typically a simulation is run many times over

in order to obtain the probability distribution of an unknown parameter.

The name comes from the resemblance of the technique to the act of

playing and recording results in a real gambling casino. Monte Carlo

methods are often used in physical and mathematical problems when

it is difficult or impossible to obtain a closed-form expression, or

unfeasible to apply a deterministic algorithm.

MOT Short for Multiple Outlier Test. By definition, outliers are not

representative of conditions at a monitoring location and can therefore

be removed legitimately from the dataset to avoid giving a false

picture of water quality. Where there appear to be several outliers, the

Multiple Outlier Test considers whether these measurements are likely

to be true outliers, that ought to be excluded, or whether they are high

or low measurements within the normal range of variability, that ought

to be included in the analysis.

Environment Agency



Outlier An outlier is an observation in a set of data that is far removed in

value from the others in the same data set – i.e. one that has an

unusually large or small value compared to others. Outliers are not

representative of water quality at the monitored location. For example,

they may indicate that a pollution event has occurred or that a water

sample has become contaminated after being taken.

Percentile A percentile is a summary statistic that provides information about the

distribution (spread) of values in a defined population. The pth

percentile is the smallest value such that at least p% of the items in

the population are no larger than it. For example, the 95th percentile is

the value that exceeds 95% of the population and is exceeded by 5%

of the population.

Quantile regression Quantile regression is a statistical technique that explores how one or

more independent variables influence a specified percentile value of

the response variable. In contrast to conventional linear regression,

which seeks to explain variation in the mean of the response variable,

quantile regression can explore how the occurrence of high values

(high nitrate concentrations in this case) changes over time.

Regression analysis Regression analysis is a process for estimating relationships among

variables. It includes a wide range of techniques for analysing several

variables, when the focus is on the relationship between a dependent

variable (e.g. nitrate concentration) and one or more independent

variables (e.g. time).

Statistical significance In statistics, statistical significance is achieved when a test produces a

p-value that is less than a pre-determined significance level. The p-

value is the probability of obtaining a result (i.e. a change, difference,

relationship) at least as extreme as that observed if there was

genuinely no change/difference/relationship in reality. By convention,

results are declared statistically significant when the p-value is less

than 0.05; this means that there is a less than 5% (or 1 in 20) chance

of falsely concluding that there is a change/difference/relationship

when in fact there isn’t. In other words, statistical significance testing

provides a safeguard against spurious results that can arise due to

random chance.

Standard deviation A metric used to quantify the amount of variation or dispersion of a set

of data values. A standard deviation close to 0 indicates that the data

points tend to be very close to the mean of the dataset, while a high

standard deviation indicates that the data points are spread out over a

wider range of values.

Environment Agency



Executive Summary

The Nitrates Directive (91/676/EEC) is intended to protect waters against nitrate pollution from

agricultural sources. Member States are required to identify waters which are or could become polluted

by nitrates and to designate these waters and all land draining to them as Nitrate Vulnerable Zones

(NVZs). The Directive sets the following criteria for identifying polluted waters:

Surface freshwaters and groundwaters which contain or could contain, if preventative action is

not taken, more than 50 mg NO3/l nitrate.

NVZ Reviews were carried out in 1993, 1998, 2002, 2008 and 2013. As part of the 2017 NVZ Review,

the Environment Agency (EA) commissioned WRc to undertake a statistical analysis of river and

groundwater monitoring data in order to provide an up-to-date assessment of current and future nitrate

concentrations in England. The Environment Agency intends to combine this information with other

lines of evidence to form statements of case for new and renewed NVZ designations in 2017.

For consistency and comparability, this Review follows closely the methodology used for the previous

(2013) Review of NVZs in England and Wales. However, the present Review provided an opportunity to

learn lessons from the 2013 Review and improve how the method was applied and the results were

presented.

The main steps in the statistical analysis were as follows.

1. Nitrate monitoring data from routine monitoring sites in England between 1990 and 2014 (surface

waters) and 1980 and 2014 (groundwaters) was collated from the Environment Agency and

water companies.

2. The data were processed to: (i) exclude monitoring points that did not represent raw water quality

in rivers and groundwaters; (ii) exclude negative and zero measurements; (iii) adjust “less-than”

measurements; (iv) estimate the concentration of total inorganic nitrogen (TIN) and total oxidised

nitrogen (TON) in each sample; (v) identify replicate samples taken on the same day; and (vi)

exclude samples taken by auto-samplers under high flow conditions.

3. Historical trends in TIN concentration at each monitoring point were characterised and then

extrapolated to estimate current (2015) and future (2020 for rivers and 2027 for groundwaters)

water quality. The statistical methods used to assess current and future status at each monitoring

point depended upon the amount of data available. The results were used to assess whether or

not the 95th percentile TIN concentration exceeds 50 mg NO3/l, now or in the future, and the

degree of confidence in the assessment.

4. Groundwater monitoring points only provide data on nitrate concentrations at specific discrete

locations within aquifers and so a statistical interpolation technique (kriging) was used to

understand spatial patterns in the groundwater quality and hence estimate nitrate concentrations

at unmonitored locations.

Environment Agency



1. Introduction

1.1 Background

The Nitrates Directive (91/676/EEC) is intended to protect waters against nitrate pollution from

agricultural sources. Member States are required to identify waters which are or could

become polluted by nitrates and to designate these waters and all land draining to them as

Nitrate Vulnerable Zones (NVZs). Farmers in designated areas must follow an Action

Programme to reduce pollution from agricultural sources of nitrate. The criteria for identifying

waters as polluted are established in the Directive, which also sets out monitoring

requirements. NVZ designations must be reviewed at least every four years.

The Directive sets the following criteria for identifying polluted waters:

Surface freshwaters which contain or could contain, if preventative action is not taken

(i.e. Action Programme measures), more than 50 mg NO3/l nitrate.1

Groundwater which contains or could contain, if preventative action is not taken, more

than 50 mg NO3/l nitrate.

1.2 Objectives

As part of the 2017 NVZ Review, the Environment Agency (EA) commissioned WRc to

undertake a statistical analysis of surface water (i.e. rivers, streams and land drains) and

groundwater monitoring data in order to provide an up-to-date assessment of current and

future nitrate concentrations in England.

For consistency and comparability, this review follows closely the methodology used for the

previous (2013) Review of NVZs in England and Wales (Environment Agency 2012a, 2012b).

However, the present Review provided an opportunity to learn lessons from the 2013 Review

and improve how the method was applied and the results were presented. Some refinements

were made to the methodology to make better use of the available data and to provide a more

rigorous system of checks and balances. These refinements are highlighted and discussed in

this document.

1 50 mg/l nitrate as NO3 is equivalent to 11.29 mg/l nitrate as N

Environment Agency



1.3 Structure of this report

The remainder of this report is divided into three sections. Section 2 explains and justifies the

method used to analyse trends in surface water quality, Section 3 explains and justifies the

method used to analyse trends in groundwater quality (including kriging), and Section 4

suggests some possible refinements to the methodology that the EA may wish to consider for

future NVZ Reviews. Further technical information is provided in Appendices A to C.

Environment Agency



2. Surface water methodology

2.1 Source datasets

2.1.1 Environment Agency monitoring data

The EA extracted from its WIMS database all water quality data collected at routine surface

water monitoring sites in England between 1990 and 2015. The determinand codes extracted

were:

0111 (Ammoniacal nitrogen as N);

0116 (Total oxidised nitrogen as N);

0117 (Nitrate as N);

0118 (Nitrite as N); and

9880 (Nitrate as NO3).

Data supplied to the EA by external organisations (Purpose Code = XO) was not included,

thereby preventing any double-counting of water company monitoring data. Measurements

with purpose codes relating to reactive sampling of pollution incidents and monitoring of waste

sites were not provided as they were thought to be unrepresentative of normal water quality.

Table 2.1 shows which purpose codes were contained within the data provided by the EA.

Table 2.1 Purpose codes included in surface water analysis

Purpose

Code Purpose Code Description Included?

CA Compliance audit (permit) Yes

CF Compliance formal (permit) No

CI Statutory audit (operator data) Yes

CO Water quality UWWTD monitoring data Yes

CS Water quality operator self-monitoring compliance data Yes

IA IPPC/IPC monitoring (Agency audit - permit) Yes

IF IPPC/IPC monitoring (formal sample) No

II IPPC/IPC monitoring (Agency investigation) Yes

IO IPPC/IPC monitoring (operator self-monitoring data) Yes

Environment Agency



Purpose

Code Purpose Code Description Included?

IT Instrument trial No

MI Statutory failures (follow ups at designated points) No

MN Monitoring (national Agency policy) Yes

MP Environmental monitoring (GQA & RE only) Yes

MS Environmental monitoring statutory (EU directives) Yes

MU Monitoring (UK govt policy - not GQA or RE) Yes

PF Planned formal non-statutory (permit/env mon) No

PI Planned investigation (operational monitoring) No

PN Planned investigation (national Agency policy) Yes

SI Statutory failures (follow ups at non-designated points) No

UF Unplanned reactive monitoring formal (pollution incidents) No

UI Unplanned reactive monitoring (pollution incidents) No

WA Waste monitoring (Agency audit - permit) No

WF Waste monitoring (formal sample) No

WI Waste monitoring (Agency investigation) No

WO Waste monitoring (operator self-monitoring data) No

XO External organisation monitoring No

ZZ Unspecified at time of WIMS load No

The raw dataset contained 19,975 unique monitoring points and 2,264,316 unique samples.

2.1.2 Water company monitoring data

Defra invited water companies to provide surface water quality data collected as part of their

routine monitoring programmes. Six companies provided valid data (including co-ordinates)

for a total of 108,721 samples from 251 unique surface water monitoring sites. Table 2.2

provides a summary for each water company.

Environment Agency



Table 2.2 Summary of water company surface water dataset

Water Company Determinands recorded Sites Samples Date span

Affinity Water Nitrate as N,

Nitrite as N,

Nitrogen, Total Oxidised as N

4 1,770 Jan 2004 –

June 2015

Bristol Water Nitrate as NO3 10 9,091 July 1995 –

March 2015

Sutton and East

Surrey

Nitrate as NO3 1 219 Jan 2009 –

Dec 2014

United Utilities Ammoniacal Nitrogen as N

Nitrate as N

Nitrite as N


179 54,530 Jan 1980 –

April 2015

Wessex Water Nitrate as N 48 35,782 Jan 1980 –

March 2015

Yorkshire Water Nitrate as N 9 7,329 Jan 1980 –

April 2015

Totals 251 108,721

2.2 Data processing

2.2.1 Monitoring sites

Since the EA monitoring network includes some sites with the same ID in different regions, a

unique Site ID was created for each EA monitoring site by combining the Site ID and EA

Region fields. For water company data, a unique Site ID was created by combining the Site ID

and abbreviated water company name.

The co-ordinates of each monitoring site were checked for missing or inconsistent eastings

and northings, but no issues were found for the EA data. Some water company sites had

missing co-ordinates but were retained in case valid grid references could be provided at a

later date. Multiple sites with the same co-ordinates were retained as separate sites.

The following meta-data were tabulated for each monitoring site:

unique Site ID;

site name;

Environment Agency



co-ordinates (eastings and northings);

type of water monitored (river, lake or TraC);

site operator (Environment Agency or Water Company);

EA sampling point type (where applicable); and

EA region (where applicable).

As far as possible, the EA and water company datasets were filtered to exclude monitoring

sites that:

sampled water after it had been treated to remove nitrate (and so is not representative

of raw water quality);

sampled water that had been blended from multiple sources;

sampled water from canals (which may contain water from a number of WFD

catchments and not be representative of a defined geographic area); and

sampled water from lakes and reservoirs (which are subject to a separate methodology

to identify eutrophic waters).

To achieve this, the EA dataset was filtered according to the sampling point types listed in

Table 2.3. The same filter was applied to the Affinity Water and Sutton and East Surrey Water

datasets, which had also been extracted from the EA’s WIMS database. For the remaining

four water company datasets, all surface water monitoring sites were assumed to sample

untreated and unblended river water.

Table 2.3 Sampling point types included in surface water analysis

Sampling

Point Type

Code

Sampling Point Description Included

F1 Freshwater - RQO RE1 Yes





Environment Agency



Sampling

Point Type

Code

Sampling Point Description Included

F6 Freshwater - Non Classified River Points Yes

FA Freshwater - Lakes/Ponds/Reservoirs No

FB Freshwater - River Transfer Yes

FC Freshwater - Comparative Inlet Points Yes

FD Freshwater - River Augmentation Yes

FF Freshwater - Canal - RQO RE2 No

FG Freshwater - Canal - RQO RE3 No

FH Freshwater - Canal - RQO RE4 No

FJ Freshwater - Canals - Non Classified No

FK Freshwater - Land Drains Yes

FL Freshwater - Bathing Water Yes

FZ Freshwater - Unspecified Yes

This removed 1,922 sites from the EA data (of which 400 were canals). No sites were

removed from the water company data as only the relevant monitoring point types were

supplied.

As this NVZ Review only concerns England, 2,115 monitoring sites in Wales (Region ID =

WE) were also excluded.

2.2.2 Samples

For the EA WIMS dataset, a Unique Sample ID was created for each sample by combining

the UniqueSiteID, Sample Reference Number and Date (yyyy/mm/dd) fields. For the water

company dataset a Unique Sample ID was created for each sample by combining the

UniqueSiteID, Date and Time fields (or, for United Utilities, the UniqueSiteID and Date fields

only as no Time field was available).

Samples collected before 1st January 1990 were excluded because they were not regarded as

indicative of recent nitrate trends. Samples collected after 31st December 2014 were also

excluded because a full calendar year of monitoring data was not available for 2015 and,

where water quality fluctuates seasonally, the use of incomplete years could cause current

nitrate concentrations to be over- or under-estimated.

The data were screened to identify how many samples were taken from each site on each

day. In most cases, a single water sample was taken from a given site on a given day, but

multiple samples were sometimes taken. These could be a set of manual samples collected

Environment Agency



as part of a local investigation, or a series of samples collected by an auto-sampler. The

presence of multiple samples potentially causes two problems. First, any analysis of nitrate

concentrations is weighted towards days when more samples are taken. Second, if an auto-

sampler is triggered to take samples during high flow events then the results are not

representative of typical conditions (although they still provide useful information about water

quality during periods of wet weather). As it was not possible to tell where and when flow-

driven auto-samplers had been deployed, we adopted the following rules to minimise their

prevalence in the data.

Where three or more samples had been taken from the same site on the same day, all

samples were excluded.

Where two samples had been taken from the same site on the same day, the sample

with the highest TIN reading was retained and the other sample excluded. If the two

samples had the same TIN reading then the sample with the highest TON was retained

and if the two samples had identical readings for both determinands the one with the

highest sample number was retained.

In the case of United Utilities, the time of each sample was not recorded and so an average

concentration was calculated for each determinand on each day, and these daily average

values were used to calculate TIN and TON concentrations as described in Section 2.2.3

below.

If neither TON nor nitrate was measured (i.e. the sample only had measurements for nitrite

and/or ammonium) then the sample was excluded.

2.2.3 Determinands

Most water samples were measured for multiple determinands. These determinand results

were combined to calculate the concentration of total inorganic nitrogen (TIN) and total

oxidised nitrogen (TON) in each sample, as described below.

To make all determinands comparable, measurements recorded in mg/l as NO3 were

converted to mg/l as N by multiplying by 0.2258. Thus, 50 mg NO3/l corresponds to 11.29 mg

N/l.

Zero readings were removed. These can arise for a variety of reasons, but it cannot be

assumed that they represent readings below the Limit of Detection (LOD). Negative readings

were also removed as these are clearly erroneous.

“Less than” values were treated using the standard EA approach of dividing the recorded

concentration by two. “Greater than” values were not very prevalent in the data and were not

adjusted (i.e. the concentration measurement was used as reported).

Environment Agency



Total Inorganic Nitrogen (TIN) was calculated by the following rules, listed in order of

declining preference:

= TON + ammonium;

= nitrate + nitrite + ammonium;

= TON;

= nitrate + nitrite;

= nitrate + ammonium;

= nitrate.

Total Oxidised Nitrogen (TON) was calculated by the following rules, listed in order of


= TON;


= nitrate.

TIN and TON were not calculated if TON and nitrate were both missing, but were calculated if

nitrite and/or ammonium were missing because these determinands typically represent only a

small proportion of the nitrogen in the water sample.

2.2.4 Summary of cleaned dataset

The surface water data processing is summarised in Table 2.4.

Table 2.4 Summary of surface water data processing

Data processing stage No. of sites No. of samples

Raw data EA dataset 19,975 2,264,316

Water company dataset 251 108,721

Combined dataset 20,226 2,373,037

Data

exclusion

rules

Out of scope SMPT_TYPE 1,922 158,086

Monitoring point in Wales 2,115 206,894

Sample >2015 5 7553

Sample <1990 1 13,307

Zero or negative readings 0 151

Autosamplers 26 11,391

Nitrite and/or ammonium only 713 138,028

Two samples on the same day 0 6504

Two samples, identical values 0 440

Total excluded 4,782 542,218

Combined dataset 15,444 1,830,819 1

107 individual readings deleted; 15 samples only had zero/negative readings and were excluded.

Environment Agency



2.3 Data analysis methods

2.3.1 Overview

Total inorganic nitrogen (TIN) was used to measure the concentration of nitrogenous

compounds in water samples. TIN includes nitrate, nitrite and ammonium, of which nitrate is

usually the dominant fraction. Ammonium derives from both waste water treatment works

(WwTWs) and from agricultural sources and is rapidly oxidised to nitrate under normal riverine

conditions. To assess the contribution of ammonium to observed TIN concentrations, a

parallel analysis was performed using total oxidised nitrogen (TON), which comprises just

nitrate and nitrite.

Each surface water monitoring site with sufficient data was analysed to determine whether or

not:

the 95th percentile TIN concentration currently exceeds 50 mg NO3/l; or,

the 95th percentile TIN concentration is likely to exceed 50 mg NO3/l in the future,

assuming no preventative action is taken.

If either the current or future 95th percentile concentration exceeded 50 mg NO3/l, the

monitoring site was considered to have failed the assessment. The level of confidence in the

result was recorded as one of six classes (Table 2.5).

Table 2.5 Summary of confidence classes

Class Description

1 At least 95% confidence that the 95th percentile concentration is ≤50 mg NO3/l



4 At least 50% confidence that the 95th percentile concentration is >50 mg NO3/l



The statistical methods used to assess current and future status at each monitoring site

depended upon the amount of data available, as set out in Table 2.6. The assessment of

current status was based on analysis of data from the last six calendar years (i.e. 2009-2014)

or, where less data was available, on a statistical extrapolation of the historical (1990-2014)

trend to predict the 95th percentile concentration in mid-2015. The assessment of future status

used the same statistical extrapolation method to predict the 95th percentile concentration in

mid-2020.

Environment Agency



To ensure that the results of the analysis were based on sound monitoring evidence, sites

were not assessed if they:

had fewer than 19 water quality samples in total (the minimum number of samples

required to estimate a 95th percentile value using the Weibull method);

contained data from fewer than 5 calendar years (i.e. too few years to average out year

to year variation in water quality); or

had no water quality samples for the period 2009-2014 (i.e. no recent water quality

data).

Table 2.6 Minimum data rules for surface water analysis

Rule Current status

assessment

Future status

assessment

Number

of sites

At least 19 samples and at

least 5 years of data for period

2009-2014

Weibull estimate of

95th percentile for

2009-2014

Quantile regression

forecast of 95th

percentile in mid-2020

3,021

At least 19 samples and at

least 5 years of data for period

1990-2014 and at least 1

sample for period 2009-2014

Quantile regression

forecast of 95th


Quantile regression

forecast of 95th


3,449

Fewer than 19 samples, fewer

than 5 years of data for period

1990-2014, or no samples for

period 2009-2014

No assessment No assessment 9,067

TOTAL 15,444

All statistical analyses were conducted using R v. 3.2.0 (R Core Team 2015).

2.3.2 Data screening

Prior to analysis, exceptionally high and low concentration measurements at each site were

identified using a Multiple Outlier Test (described in Appendix A) and excluded from

subsequent calculations. Excluding outliers from the analysis makes the results more robust

and less sensitive to isolated, atypical measurements. A relatively high exclusion threshold

was used, however, to minimise the risk of discarding samples that were genuinely

representative of water quality at that monitoring point. Further checks were carried out to

assess the sensitivity of the results to any high and low concentration measurements

remaining in the dataset (see Section 2.3.6 for details).

Environment Agency



Next, the number of samples per calendar year was counted and the longest gap in the

monitored period was identified. This information enables appropriate input criteria to be set

for characterising the historical trend.

Finally, each site was also screened for evidence of step-changes in water quality by counting

instances where there was more than a 2-fold difference in mean concentration between

consecutive pairs of calendar years.

2.3.3 Weibull method

The Weibull method was used to estimate the current 95th percentile concentration at

monitoring site that had at least 19 water quality samples and at least five years of data

between 2009 and 2014.

The Weibull protocol is a robust technique because it doesn’t make any prior assumption

about the underlying distribution of the data (e.g. it doesn’t require the data to follow a normal

or log-normal distribution). The Weibull protocol is also relatively insensitive to outliers and

provides an assessment of conditions over a six year period, which makes the results less

sensitive to random, short-term fluctuations in water quality.

The Weibull method uses the rth ranked value within the observation dataset to provide an

estimate of the 95th percentile, where r = 0.95(n + 1) and n is the number of samples. When r

is not an integer, r is rounded down and up to the nearest whole number, and the

corresponding concentration values for these ranks are interpolated to estimate the 95th

percentile. Conservative 90% and 50% confidence intervals were calculated using binomial

distribution theory, as described in the EA Codes of Practice for Data Handling (Ellis et al.

1993).2 If the lower 90% confidence limit exceeded 50 mg NO3/l, the monitoring point was

deemed to have failed the test with high (95%) confidence; if the lower 50% confidence limit

exceeded 50 mg NO3/l, the monitoring point was deemed to have failed the test with medium

(75%) confidence; if the 95th percentile estimate exceeded 50 mg NO3/l, the monitoring point

was deemed to have failed the test with low (50%) confidence.

The results were visualised as shown in Figure 2.1, where:

the dark green solid horizontal line 2009-2014 marks the 95th percentile estimate;

the green shaded band indicates the 50% confidence interval around the 95th percentile

estimate; and

2 A minimum of 28 and 59 samples are required to calculate the upper 50% and 90% confidence

limits, respectively, so for some sites it was possible only to demonstrate with medium or low

confidence that the 95th

percentile was below the threshold.

Environment Agency



the wider, light green shaded band indicates the 90% confidence interval around the

95th percentile estimate.

2.3.4 Quantile regression method

For all sites that met the minimum data requirements (at least 19 samples and at least 5 years

of data for period 1990-2014 and at least 1 sample for period 2009-2014), quantile regression

was used to characterise trends in water quality between 1990 and 2014. The historical trend

was then extrapolated using Monte Carlo simulation to estimate the current (mid-2015) 95th

percentile concentration at monitoring sites that had too few samples for the Weibull method,

and to forecast the 95th percentile concentration in mid-2020.

The approach was based closely on the ‘AntB’ method used to analyse trends at groundwater

monitoring points (see Section 3.3.4). AntB uses multiple regression to characterise between-

year and seasonal variation in mean concentration and then applies a conversion factor to

estimate the 95th percentile concentration. One drawback with this approach is that it

assumes that the trend in the 95th percentile mirrors the trend in the mean concentration. This

will not always be the case; for example, if the occurrence of occasional high nitrate readings

reduces or increases over time then the 95th percentile concentration may decrease (or

increase) with little change in mean concentration.

For this reason, a quantile regression was used in place of a classical linear regression to

characterise directly any changes over time in the 95th percentile concentration. Quantile

regression (Koenker and Hallock 2001) is a statistical technique intended to estimate, and

conduct inference about, conditional quantile functions. Just as classical linear regression

methods based on minimizing sums of squared residuals enables the estimation of means,

quantile regression methods offer a mechanism for estimating the median, and the full range

of other quantile functions, by minimising weighted sums of absolute residuals.

Quantile regression is a robust technique that makes no assumptions about the underlying

probability distribution of the data. It is relatively insensitive to outliers. Outlying responses

influence the fit in only so far as they are either above or below the fitted value, not how far

above or below.

The statistical method comprises two main steps: trend analysis, and forecasting.

Trend analysis

A quantile regression model was fitted to the data to describe historical variation in the 95th

percentile concentration at each monitoring site. Temporal variation in water quality was

represented using a series of linear spline functions, which describe changes as a series of k

straight lines connected by k-1 ‘hinge points’. Each spline had to span at least 12 data points

and at least three years. Three years was chosen to strike a balance between flexibility (a

longer period would struggle to capture the timing of changes in water quality) and sensitivity

Environment Agency



(a shorter period would be unduly sensitive to short-term ‘blips’ in water quality). Hinge points

were constrained to occur at the end of calendar years (to average out seasonal variation),

and at least three years from the beginning or end of the time series (to avoid picking up short

term trends in water quality).

Following preliminary analyses, it was found that the slopes of the splines at the ends of the

time series could be sensitive to unusually high or low readings, so the first and last three

years of data were replicated at the ends of the time series to stabilise the fitted spline slopes.

The splines fitted to the replicated data were disregarded in the later forecasting phase.

The quantile regression trend analysis was undertaken using the ‘quantreg’ package

(Koenker 2015), with one appropriately defined dummy variable per spline.

All the linear spline terms were forcibly included, and carried through into the forecasting

phase, regardless of whether or not they were statistically significant. The justification for

including them in the latter case is that they are still the best estimates of the historical trends

– and if they are not statistically significant the sizes of the effects are likely to be relatively

slight.

Forecasting

Monte Carlo simulation was used to extrapolate the historical trend and forecast future water

quality at each site. The simulation assumes that the 25 years of historical data provides the

best (unbiased) evidence as to the likely future direction of water quality and is constructed by

selecting slopes at random from a defined population of spline slopes.

Suppose the historical quantile regression model has m splines. In general, some of these

slopes will have been estimated less precisely than others – as reflected by their standard

errors (SE). The simulation allows for this by defining the weight for spline slope j to be

proportional to 1/SE(j) and, using these weights, calculating the weighted mean (AvB) and

weighted standard deviation (SDB) of the m spline slopes.

The other inputs to the forecasting are:

the final spline slope;

the lag-1 autocorrelation of the spline slopes, R1, which defaults to 0 if there are fewer

than five splines;

the average spline length (the mean number of years spanned per spline); and

the quantile regression prediction of the 95th percentile at the end of the monitored

period (τ) and its standard error (τSE).

Environment Agency



Taking as its starting point a 95th percentile concentration drawn from the Normal distribution

N(τ, τSE) and the final spline slope, the Monte Carlo simulation generates a projection of how

the 95th percentile nitrate concentration might change in the future, starting from the year

immediately following the end of the monitoring record.

The future projection is constructed by selecting a series of spline slopes at random from the

Normal distribution N(AvB, SDB), and loosely associating each with the preceding slope to an

extent governed by the autocorrelation coefficient R1. Specifically, the slope for time step i+1

is generated from the previous time step’s slope (Bi) and the random slope (Brand) as follows:

Bi+1 = (1 – R1) AvB + R1 Bi + (1 – R12) ( Brand – AvB).

Taking the average spline length as the time step, the process is continued until the required

forecasting horizon is reached. Starting at the beginning of 2010, for example, if the average

spline length was 3.5 years, a series of three slopes would be generated to predict the 95th

percentile concentration in mid-2020.

The whole process was repeated 10,000 times to generate a range of possible ‘futures’. A

variety of summary statistics were harvested from each of the 10,000 simulated time series,

including the median, which represents the best estimate of the 95th percentile concentration

in mid-2015 and mid-2020. The uncertainty in this estimate was quantified by ranking the

10,000 forecasts for that year and determining the 10th, 25

th, 75

th and 90

th percentile values.

Figure 2.1 shows an example of a fitted quantile regression model and the forecasts

generated from it. The red dashed line at 11.29 mg N/l indicates the threshold value and the

solid blue zig-zag line indicates historical fluctuations in the 95th percentile TIN concentration.

Beyond the end of the monitoring record in 2014:

the blue line indicates the 95th percentile concentration forecast by Monte Carlo

simulation;

the blue shaded band indicates the 50% confidence interval around the 95th percentile

estimate; and

the wider, light blue shaded band indicates the 90% confidence interval around the 95th

percentile estimate.

In this example, the 95th percentile concentration is forecast to be 5.8 mg N/l in mid-2015 (the

first vertical dashed line) with a 90% confidence interval of 4.0-7.5 mg N/l, and 5.9 mg N/l in

mid-2020 (the second vertical dashed line) with a 90% confidence interval of 3.1-8.8 mg N/l.

Environment Agency



Figure 2.1 Historical trend and future forecast of 95th

percentile nitrate

concentration using quantile regression, and comparison to the 11.29 mg N/l (50 mg

NO3/l) standard (red dashed line)

2.3.5 Comparison with 2013 Review

Whilst the statistical methodology followed that used for the 2013 NVZ Review as closely as

possible, a small number of refinements were made to improve the accuracy and repeatability

of the results. These refinements are detailed in Table 2.7.

Environment Agency



Table 2.7 Summary of surface water methodological refinements

Refinement Justification

The number of simulated future time series

was increased from 1,000 to 10,000.

The greater number of shots provides greater

stability in the future forecasts.

A fixed seed was used in the random number

generator.

The use of a fixed seed makes the results of

the Monte Carlo simulation reproducible.

Uncertainty in the 95th percentile

concentration at the end of the monitored

period was incorporated into the Monte Carlo

simulation.

By not using a fixed starting concentration,

the simulation produces a wider range of

projections that more accurately represent the

uncertainty in future concentrations.

The minimum number of data points to be

spanned by each spline in the quantile

regression model was increased from 4 to

12.

A larger number of samples per spline

produces a better fitting quantile regression

model, particularly when there are there are

large gaps in the monitoring record.

A statistical test was added to test whether

the measured concentrations at each site

follow a bimodal or multi-modal distribution.

The test provides a useful indicator of sites

where erratic concentrations can give rise to

unreliable future forecasts.

The coefficient of variation (standard

deviation / mean) was calculated for each

site.

A high coefficient of variation can indicate

sites where highly variable concentrations

can give rise to unreliable future forecasts.

For each site, the % ammonium contribution

across all n samples was calculated as:

∑ 𝑇𝐼𝑁𝑖𝑛𝑖=1 − ∑ 𝑇𝑂𝑁𝑖

𝑛𝑖=1

∑ 𝑇𝐼𝑁𝑖𝑛𝑖=1

A high % ammonium contribution can indicate

a strong pollution contribution from point

source discharges.

For each site, the mean concentrations in

winter (Dec-Feb), spring (Mar-May), summer

(Jun-Aug) and autumn (Sep-Nov) were

calculated.

Peak concentrations in summer can indicate


source discharges.

2.3.6 Quality assurance of results

A series of automatic checks were undertaken to assess the performance of the statistical

methods and the accuracy of current and future status assessments. Sites were flagged for

more detailed checks if they displayed significant data quality issues (e.g. gaps, bi-modal

distribution), results that were markedly different from other sites (e.g. very high or low

concentrations) or results with high uncertainty.

Environment Agency



Visual checks of the Weibull estimates, quantile regression trends and future forecasts,

plotted against the raw data, were undertaken by WRc for sites:

with unusually high average or maximum concentrations;

with unusually high or low forecasts of current and future concentrations;

with an unusually high coefficient of variation;

with a statistically significant bimodal or multi-modal distribution;

with large gaps in the monitoring record, or no recent monitoring data;

that had an unusually wide confidence interval around the results;

that failed the current or future status tests despite no samples exceeding the 11.29

threshold;

that passed the current and future status tests but had exceedances of the 11.29

threshold since 2009; and

where there was a large discrepancy between the Weibull and quantile regression

estimates of the current 95th percentile concentration.

Checks focused on sites that met more than one criterion for checking, and those where

highly variable concentrations made it difficult to characterise accurately the historical water

quality trends. In general, the checks confirmed that the statistical methodology had been

applied correctly and that the current and future status assessments were reasonable at the

vast majority of sites. However, they also revealed a small minority of sites where:

a small number of high or low measurements were exerting a high degree of influence

on the results;

strongly fluctuating concentrations meant that historical trends could not be

characterised adequately, with the result that future forecasts were over/under-

estimated or very uncertain; or

large gaps in the monitoring record made the results particularly sensitive to historical

trends or a small number of recent samples.

Monitoring sites were flagged for the Environment Agency’s attention where the results were

judged not to be a reasonable representation of current and future water quality.

Environment Agency



3. Groundwater methodology

3.1 Source datasets

3.1.1 Environment Agency monitoring data

The EA extracted from its WIMS database all water quality data collected at routine

groundwater monitoring sites in England between 1980 and 2015. The determinand codes

extracted were:

0111 (Ammoniacal nitrogen as N);

0116 (Total oxidised nitrogen as N);

0117 (Nitrate as N);

0118 (Nitrite as N); and

9880 (Nitrate as NO3).

Measurements with purpose codes relating to reactive sampling of pollution incidents and

monitoring of waste sites were not provided as they were thought to be unrepresentative of

normal water quality. Table 3.1 shows which purpose codes were contained within the data

provided by the EA.

Table 3.1 Purpose codes included in groundwater dataset

Purpose

Code Purpose Code Description

Included

CA Compliance audit (permit) Yes

CF Compliance formal (permit) No

CI Statutory audit (operator data) Yes

CO Water quality UWWTD monitoring data Yes

CS Water quality operator self-monitoring compliance data Yes

IA IPPC/IPC monitoring (Agency audit – permit) Yes

IF IPPC/IPC monitoring (formal sample) No

II IPPC/IPC monitoring (Agency investigation) Yes

IO IPPC/IPC monitoring (operator self-monitoring data) Yes

Environment Agency



Purpose

Code Purpose Code Description

Included

IT Instrument trial No

MI Statutory failures (follow ups at designated points) No

MN Monitoring (national Agency policy) Yes

MP Environmental monitoring (GQA & RE only) Yes

MS Environmental monitoring statutory (EU directives) Yes

MU Monitoring (UK govt policy – not GQA or RE) Yes

PF Planned formal non-statutory (permit/env mon) No

PI Planned investigation (operational monitoring) No

PN Planned investigation (national Agency policy) Yes

SI Statutory failures (follow ups at non-designated points) No

UF Unplanned reactive monitoring formal (pollution incidents) No

UI Unplanned reactive monitoring (pollution incidents) No

WA Waste monitoring (Agency audit – permit) No

WF Waste monitoring (formal sample) No

WI Waste monitoring (Agency investigation) No

WO Waste monitoring (operator self-monitoring data) No

XO External organisation monitoring Yes

ZZ Unspecified at time of WIMS load No

The raw dataset contained 6663 unique monitoring points and 310,958 unique samples.

3.1.2 Water company monitoring data

Defra invited water companies to provide surface water quality data collected as part of their

routine monitoring programmes. Six companies provided valid data (including co-ordinates)

for a total of 259,004 samples from 706 unique surface water monitoring sites. Table 3.2

provides a summary for each water company.

Environment Agency



Table 3.2 Summary of water company surface water dataset

Water Company Determinands recorded Sites Samples Date span

Affinity Water Nitrate as N,

Nitrite as N,


150 13,023 Aug 1999 –

June 2014

Bristol Water Nitrate as NO3 17 14,369 Feb 1995 –

March 2015

Sutton and East

Surrey

Nitrate as NO3 16 1,050 Nov 1992 –

Dec 2014

United Utilities Ammoniacal Nitrogen as N

Nitrate as N

Nitrite as N


112 16,087 Feb 1980 –

April 2015

Wessex Water Nitrate as N 333 187,149 Dec 1979 –

April 2015

Yorkshire Water Nitrate as N 78 27,326 Jan 1980 –

April 2015

Totals 706 259,004

3.2 Data processing

3.2.1 Monitoring sites

Since the EA monitoring network includes some sites with the same ID in different regions, a

unique Site ID was created for each EA monitoring site by combining the Site ID and EA

Region fields. For water company data, a unique Site ID was created by combining the Site ID

and abbreviated water company name.

Data supplied to the EA by external organisations (Purpose Code = XO) was included in the

WIMS dataset, creating a risk that some water company monitoring data may have been

supplied by both the EA and the water company. The EA and water company datasets were

therefore compared to identify duplicate monitoring points. Site IDs were not always

consistent between the two datasets, so sites were matched on their co-ordinates. A total of

460 sites with matching co-ordinates were found in both datasets. We attempted to confirm

whether or not these pairs of sites were truly the same monitoring point but this was not

possible because:

clusters of boreholes frequently share the same co-ordinates; and

Environment Agency



some pairs of sites had the same site name in both datasets, but many pairs had subtle

differences (i.e. leading zeros or letters) or completely different names, which

prevented them being matched.

Rather than risk merging data from different monitoring points, the pairs of sites were retained

as two separate monitoring points and analysed separately. By adopting this approach, it is

likely that some sites and samples have been duplicated in the analysis, but because they

have identical locations this does not overstate the degree of evidence for or against

designation.

The co-ordinates of each monitoring site were checked for missing or inconsistent eastings

and northings, but no issues were found for the EA data. Some water company sites were

missing co-ordinates but these were retained in case valid grid references could be provided

at a later date.

The following meta-data were tabulated for each monitoring site:

unique Site ID;

site name;

co-ordinates (eastings and northings);

site operator (Environment Agency or Water Company);

EA sampling point type (where applicable); and

EA region (where applicable).

As far as possible, the EA and water company datasets were filtered to exclude monitoring

sites that sampled water after it had been treated to remove nitrate (and so is not

representative of raw water quality) or sampled water that had been blended from multiple

sources. To achieve this, the EA dataset was filtered according to the sampling point types

listed in Table 3.3. The same filter was applied to the Affinity Water and Sutton and East

Surrey Water datasets, which had also been extracted from the EA’s WIMS database. For the

remaining four water companies, all surface water monitoring sites were assumed to sample

untreated and unblended river water.

Environment Agency



Table 3.3 Sampling point types included in groundwater analysis

Sampling Point Type Code Sampling Point Description Included

BA Groundwater – borehole Yes

BB Groundwater – spring Yes

BC Groundwater – wells & adits Yes

BD Groundwater – pit Yes

BE Groundwater – composite Yes

BH Groundwater – landfill site No

BL Groundwater – external to landfill site No

BZ Groundwater – unspecified Yes

3.2.2 Samples

For the EA WIMS dataset, a Unique Sample ID was created for each sample by combining

the UniqueSiteID, Sample Reference Number and Date (yyyy/mm/dd) fields. For the water

company dataset a Unique Sample ID was created for each sample by combining the

UniqueSiteID, Date and Time fields (or, for United Utilities, the UniqueSiteID and Date fields

only as no Time field was available).

Samples collected before 1st January 1980 were excluded because they were not regarded as

indicative of recent nitrate trends. Samples collected after 31st December 2014 were also

excluded because a full calendar year of monitoring data was not available for 2015 and,

where water quality fluctuates seasonally, the use of incomplete years could cause current

nitrate concentration to be over- or under-estimated.

The data were screened to identify how many samples were taken from each site on each

day. In most cases, a single water sample was taken from a given site on a given day, but

multiple samples were sometimes taken. Autosamplers are not usually deployed at

groundwater monitoring sites, but multiple samples may be taken from different depths within

a borehole, from a set of adjacent boreholes (that share a Site ID), or for some other

investigative purpose. The presence of multiple samples means that any analysis of nitrate

concentrations is weighted towards days when more samples are taken so we adopted the

following rule to prevent this occurring.

Where two or more samples had been taken from the same site on the same day, the

sample with the highest TIN reading was retained and the other samples excluded. If

the two TIN readings were identical then the sample with the highest TON was retained

and if the TIN and TON readings were identical the sample with the highest sample

number was retained.

Environment Agency



In the case of United Utilities, the time of each sample was not recorded and so an average

concentration was calculated for each determinand on each day, and these daily average

values were used to calculate TIN and TON concentrations as described in Section 3.2.3

below.

If neither TON nor nitrate was measured (i.e. the sample only had measurements for nitrite

and/or ammonium) then the sample was excluded.

3.2.3 Determinands

Most water samples were measured for multiple determinands. These determinand results

were combined to calculate the concentration of total inorganic nitrogen (TIN) and total

oxidised nitrogen (TON) in each sample, as described below.

To make all determinands comparable, measurements recorded in mg/l as NO3 were

converted to mg/l as N by multiplying by 0.2258. Thus, 50 mg NO3/l corresponds to 11.29 mg

N/l.

Zero readings were removed. These can arise for a variety of reasons, but it cannot be

assumed that they represent readings below the Limit of Detection (LOD). Negative readings

were also removed as these are clearly erroneous.

“Less than” values were treated using the standard EA approach of dividing the recorded

concentration by two. “Greater than” values were not very prevalent in the data and were not

adjusted (i.e. the concentration measurement was used as reported).

Total Inorganic Nitrogen (TIN) was calculated by the following rules, listed in order of


= TON + ammonium;

= nitrate + nitrite + ammonium;

= TON;


= nitrate + ammonium;

= nitrate.

Total Oxidised Nitrogen (TON) was calculated by the following rules, listed in order of


= TON;


Environment Agency



= nitrate.

TIN and TON were not calculated if TON and nitrate were both missing, but were calculated if

nitrite and/or ammonium were missing because these determinands typically represent only a

small proportion of the nitrogen in the water sample.

3.2.4 Summary of cleaned dataset

The groundwater data processing is summarised in Table 3.4.

Table 3.4 Summary of groundwater data processing

Data processing stage No .of sites No. of samples

Raw data EA dataset 6,663 310,958

Water company dataset 706 259,004

Combined dataset 7,369 569,962

Data exclusion rules Out of scope SMPT_TYPE 0 0

Sample >2015 1 3,663

Sample <1980 0 1

Zero or negative readings 0 185

Multiple samples from the

same site on the same day

0 6,942

Nitrite and/or ammonium

only

56 11,081

Monitoring point in Wales 516 6,319

Total excluded 573 28,699

Final dataset Combined dataset 6,796 541,263

3.3 Data analysis methods

3.3.1 Overview

Total inorganic nitrogen (TIN) was used to measure the concentration of nitrogenous

compounds in water samples. TIN includes nitrate, nitrite and ammonium, of which nitrate is

usually the dominant fraction. Ammonium derives from both waste water treatment works

(WwTWs) and from agricultural sources and is rapidly oxidised to nitrate under normal riverine

conditions. To assess the contribution of ammonium to observed TIN concentrations, a

parallel analysis was performed using total oxidised nitrogen (TON), which comprises just

nitrate and nitrite.

Environment Agency



Each groundwater monitoring site with sufficient data was analysed to determine whether or

not:

the 95th percentile TIN concentration currently exceeds 50 mg NO3/l; or,

the 95th percentile TIN concentration is likely to exceed 50 mg NO3/l in the future,

assuming no preventative action is taken.

Following the 2013 Review methodology, a site was deemed to have failed the statistical test

if the current or future 95th percentile nitrate concentration exceeded 50 mg NO3/l with at least

95% confidence. In practice this meant testing whether the lower 90% confidence limit on the

95th percentile exceeded 50 mg NO3/l. This approach is slightly less stringent than that for

surface waters (which uses the best estimate of the 95th percentile instead of the lower

confidence limit) and amounts to setting the evidence bar slightly higher to reduce the risk of

falsely designating NVZs due to high sampling errors. In other words, this approach is a way

of taking account of the uncertainty that arises when limited monitoring data is available for

analysis.

The year 2027 was chosen as the time horizon for the future assessment because it (i) is

consistent with the approach used in the 2013 NVZ Review, (ii) allows a sufficient period of

time for pollution mitigation measures to take effect, and (iii) ties in with the Water Framework

Directive river basin planning cycle.

The statistical methods used to assess current and future status at each monitoring point

depended upon the amount of data available, as set out in Table 3.5. As in the 2013 NVZ

Review, most sites had insufficient data to robustly estimate the 95th percentile concentration.

For these sites, the mean concentration was estimated instead and a conversion factor of

1.16 was applied to convert the mean concentration into an estimate of the lower 90%

confidence limit on the 95th percentile (see Section 3.3.7 for details). This is equivalent to

testing whether or not the mean TIN concentration exceeds 43 mg NO3/l.

The assessment of current status used data from the last six calendar years (i.e. 2009-2014)

to estimate the 95th percentile concentration or, where less data were available, a statistical

extrapolation of the historical (1980-2014) time series was used to predict the mean

concentration in mid-2015. The assessment of future status used the same statistical

extrapolation methods to predict the mean concentration in mid-2027. To ensure that the

results of the analysis were based on sound monitoring evidence, sites were not assessed if

they had fewer than six water quality samples in total.

All statistical analyses were conducted using R v. 3.2.0 (R Core Team 2015).

Environment Agency



Table 3.5 Minimum data rules for groundwater analysis

Rule

Minimum

no. of

samples

Current status

assessment

Future status

assessment

Number

of sites

At least 24 samples

and at least 5 years

data for period 2009-

2014

24 Weibull estimate of 95th

percentile for 2009-2014

Mean concentration1 in

2027 forecast using

‘AntB’ method

953

At least 2 samples per

year for any six years

12 Mean concentration1 in

2015 forecast using

‘AntB’ method


2027 forecast using

‘AntB’ method

2,866

At least 1 sample per

year for six consecutive

years

6 Mean concentration1 in

2015 forecast using

‘AntC’ method


2027 forecast using

‘AntC’ method

511

At least 6 samples 6 Mean concentration1 for

period 1980-2014

Mean concentration1

for period 1980-2014

1,096

Sites with sufficient data for analysis 5,426

Less than 6 samples - No assessment No assessment 1,370

TOTAL 6,796

1 Estimates of mean concentration were multiplied by a conversion factor of 1.16 to estimate the lower 90% confidence limit on

the 95th percentile. If this exceeded 50 mg NO3/l the monitoring point was deemed to have failed the test with high (95%)

confidence. This is equivalent to testing whether or not the mean concentration exceeds 43 mg NO3/l.

3.3.2 Data screening

Prior to analysis, exceptionally high and low concentration measurements at each site were

identified using a Multiple Outlier Test (described in Appendix A) and excluded from

subsequent calculations. Excluding outliers from the analysis makes the results more robust

and less sensitive to isolated, atypical measurements. A relatively high exclusion threshold

was used, however, to minimise the risk of discarding samples that were genuinely

representative of water quality at that monitoring point. Further checks were carried out to

assess the sensitivity of the results to any high and low concentration measurements

remaining in the dataset (see Section 3.3.9 for details).

Next, the number of samples per calendar year was counted and the longest gap in the

monitored period was identified. This information enables appropriate input criteria to be set

for the AntB analysis.

Finally, each site was also screened for evidence of step-changes in water quality by counting

instances where there was more than a 2-fold difference in mean concentration between

consecutive pairs of calendar years.

Environment Agency



3.3.3 Weibull method

The Weibull method was used to estimate the current 95th percentile concentration at

monitoring points that had at least 24 water quality samples and five years with data between

2009 and 2014.

The Weibull protocol is a robust technique because it doesn’t make any prior assumption

about the underlying distribution of the data (i.e. it doesn’t require the data to follow a normal

or log-normal distribution). It is also relatively insensitive to outliers and provides an

assessment of conditions over a six year period, which averages out year to year variation

and makes the results insensitive to random, short-term fluctuations in water quality.

The Weibull method uses the rth ranked value within the observation dataset to provide an

estimate of the 95th percentile, where r = 0.95(n + 1) and n is the number of samples. When r

is not an integer, r is rounded down and up to the nearest whole number, and the

corresponding concentration values for these ranks are interpolated to estimate the 95th

percentile. Conservative 90% confidence intervals were calculated using binomial distribution

theory, as described in the EA Codes of Practice for Data Handling (Ellis et al., 1993). If the

lower 90% confidence limit exceeded 50 mg NO3/l, the monitoring point was deemed to have

failed the test.

3.3.4 AntB method

The AntB tool was used to forecast current (mid-2015) and future (mid-2027) mean nitrate

concentrations at monitoring points that had at least two samples per year for any six years

between 1980 and 2014.

AntB is a simplified version of ANTEATER, a statistical forecasting tool for groundwater nitrate

that was developed for the Environment Agency by WRc in the late 1990s. One limitation of

ANTEATER is that a substantial amount of monitoring data is needed for it to work

satisfactorily. A simplified version of ANTEATER called AntB was therefore developed in

2005. This applies a broadly similar methodology to that used by ANTEATER, but is aimed at

datasets that are not sufficiently extensive for a full ANTEATER analysis to be possible.

The operations undertaken by AntB fall into two main parts:

1. trend analysis – AntB characterises historical changes in mean nitrate concentration;

and

2. forecasting – AntB uses Monte Carlo simulation to extrapolate the historical trend and

forecast future nitrate concentration.

The statistical methodology used in AntB is discussed below under these headings.

Environment Agency



Trend analysis

A multiple regression model was fitted to the data to characterise historical variation in the

mean concentration at each monitoring site. The model consisted of two main elements: a

linear spline component, which represents inter-annual changes in mean nitrate concentration

as a series of k straight lines connected by k-1 ‘hinge points’, and a seasonal sinusoidal

component, which represents within-year changes in mean nitrate concentration.

Inter-annual trends

Hinge points were constrained to occur at the end of calendar years to avoid picking up

seasonal effects. The following rules were used to strike an appropriate balance between

having too many splines (whose estimation would then be excessively vulnerable to high-

influence points in the data) and too few splines (which might be unable to reflect genuine

historical zigzags).

So that future changes in nitrate concentration could be simulated using a fixed time

step, the splines must be of equal length (i.e. span the same number of calendar years.

The exception was where the number of years per spline is not an exact multiple of the

total number of years in the data set. In this situation any remainder years are included

in the final spline; for example, 4 splines over 18 years would span 4, 4, 4, and 6 years.

To prevent over-fitting, the maximum allowed number of splines was 10.

To make the trend results more robust to high-influence data points, the average

number of data points per linear spline must be at least 6. This minimum must be met

not just on average, but also for the two end splines.

The splines must be apportioned over the full time period of the monitoring data,

(including years with no data) and long enough to span any gaps in the monitoring

record. For instance, if the biggest gap in the data extends for Y calendar years, the

number of years per spline must be at least Y + 1.

Seasonal trends

AntB represents seasonality using a simple sinusoidal function: Sin(2X) + Cos(2X), where

X is the decimal fraction of the year (ranging from 0 to 1). If BS and BC are the resulting slope

estimates, the amplitude and phase of the seasonal term are then estimated by:

Amplitude = (BS2 + BC

2), and

Phase = ArcTan(-BC/BS) + c,

where c is an adjustment factor set to 0, +1 or -1 as appropriate to make Cos(Phase) have

the same sign as BS.

Environment Agency



Consideration was given to extending the analysis to allow the amplitude to vary from year to

year, but groundwater data is often too erratic or sparse in individual years for this to work

reliably, and so AntB uses a constant amplitude and phase.

Trend analysis

The relationship between nitrate concentration and time was calibrated using multiple

regression, with one appropriately defined dummy variable per spline and two additional

predictor variables to represent seasonality. The linear spline and seasonal terms were

forcibly included, and carried through into the forecasting phase, regardless of whether or not

they were statistically significant. The justification for including them in the latter case is that

they are still the best estimates of the historical trends – and if they are not statistically

significant the sizes of the effects are likely to be relatively slight.

Forecasting

Monte Carlo simulation was used to extrapolate the historical trend and forecast future water

quality at each site. The simulation assumes that the 35 years of historical data provides the

best (unbiased) evidence as to the likely future direction of water quality and is constructed by

selecting slopes at random from a defined population of spline slopes.

Suppose the historical multiple regression model has m splines. In general, some of these

spline slopes will have been estimated less precisely than others – as reflected by their

standard errors (SE). The simulation allows for this by defining the weight for spline slope j to

be proportional to 1/SE(j) and, using these weights, calculating the weighted mean (AvB) and

weighted standard deviation (SDB) of the m spline slopes. If the model contained a just single

spline, then SDB was estimated as double the standard error of the spline slope coefficient.

The other inputs to the forecasting are:

the final spline slope;

the lag-1 autocorrelation of the spline slopes, R1, which defaults to 0 if there are fewer

than 4 splines;

the number of years per spline (except possibly for the last spline); and

the predicted mean concentration at the end of the monitored period (τ) and its

standard error (τSE).

Taking as its starting point a mean concentration drawn from the Normal distribution N(τ, τSE)

and the final spline slope, the simulation generates a projection of how the mean nitrate

concentration might change in the future, starting from the year immediately following the end

of the monitoring record.

Environment Agency



The future projection is constructed by selecting a series of spline slopes at random from the

Normal distribution N(AvB, SDB), and loosely associating each with the preceding slope to an

extent governed by the autocorrelation coefficient R1. Specifically, the slope for time step i+1

is generated from the previous time step’s slope (Bi) and the random slope (Brand) as follows:

Bi+1 = (1 – R1) AvB + R1 Bi + (1 – R12) ( Brand – AvB).

Taking the spline length as the time step, the process is continued until the required

forecasting horizon is reached. Starting from mid-2011, for example, if the spline length was 4

years, a series of four slopes would be generated to predict the mean concentration in mid-

2027.



including the median, which represents the best estimate of the mean concentration in mid-

2015 and mid-2027. The uncertainty in this estimate was quantified by ranking the 10,000

forecasts for that year and determining the 10th, 25

th, 75

th and 90


Finally, a conversion factor was applied to convert the mean concentration to an estimate of

the 95th percentile concentration (see Section 3.3.7).

Figure 3.1 shows an example of a fitted AntB regression model and the forecasts generated

from it. The horizontal green dashed line at 11.29 mg N/l (equivalent to 50 mg NO3/l) indicates

the threshold value for the 95th percentile concentration and the horizontal blue dashed line at

9.71 mg N/l indicates the corresponding threshold value for the mean concentration

(equivalent to 43 mg NO3/l).

The solid blue wavy line indicates historical fluctuations in the mean concentration. Beyond

the end of the monitoring record in 2007:

the blue line indicates the mean concentration forecast by Monte Carlo simulation;

the blue shaded band indicates the 50% confidence interval around the mean estimate;

and

the wider, light blue shaded band indicates the 90% confidence interval around the

mean estimate.

In this example, the mean concentration is forecast to be 8.2 mg N/l in mid-2015 (the first

vertical dashed line) with a 90% confidence interval of 6.5-9.9 mg N/l, and 8.6 mg N/l in mid-

2027 (the second vertical dashed line) with a 90% confidence interval of 4.1-13.1 mg N/l.

Environment Agency



Figure 3.1 Historical trend and future forecast of mean nitrate concentration using

AntB, and comparison to the 9.71 mg N/l (43 mg NO3/l) standard (horizontal blue

dashed line)

3.3.5 AntC method

The AntC tool was used to forecast current (mid-2015) and future (mid-2027) mean nitrate

concentration at monitoring points that had at least one sample per year for six consecutive

years at some point between 1980 and 2014. If there was more than one suitable period

within the sample data for a selected site, the most recent period was analysed.

AntC is a simplified version of AntB, which is used when there are insufficient samples to

characterise seasonal variation. AntC uses a simplified version of AntB’s spline model to

Environment Agency



characterise historical trends in annual mean concentrations and, like AntB, uses Monte Carlo

simulation to extrapolate the historical trend and forecast future nitrate concentrations.

The operations undertaken by AntC fall into two main parts:

1. trend analysis – AntC characterises historical changes in mean nitrate concentration;

and

2. forecasting – AntC uses Monte Carlo simulation to extrapolate the historical trend and

forecast future nitrate concentration.

The statistical methodology used in AntC is discussed below under these headings.

Trend analysis

First, annual mean concentrations (Y1, Y2, etc. up to Yk,) are calculated for the k consecutive

years of data (i.e. replicate samples taken in the same year are averaged together). The k-1

successive differences between these annual means are then calculated as:

Di = Yi+1 – Yi, i running from 1 to k-1.

and used to quantify the direction and rate of change in water quality over time.

Forecasting

The inputs to the forecasting are:

the average (AvD) and standard deviation (SDD) of the k-1 successive differences

between the annual means; and

the lag-1 autocorrelation of the successive differences, R1.

Given the generally precarious nature of the data set being analysed, a precautionary

adjustment to the SDD was made by setting it to the upper 90% confidence limit. Thus:

SDD = SDD√[(k-2)/ChiSqd],

where ChiSqd is the 10% point of the chi-squared distribution with k-2 degrees of freedom.

For example, with 10 years of data, N = 10, and so the factor by which SDD is scaled up is

√[8/3.49] = 1.51.

Taking as its starting point the mean concentration in the last of the k consecutive years, the

simulation generates a projection of how the mean nitrate concentration might change in the

future. The future projection is constructed by selecting a series of annual difference values at

Environment Agency



random from the Normal distribution N(AvD, SDD), and loosely associating each with the

preceding difference to an extent governed by the autocorrelation coefficient R1. Specifically,

the difference for year i+1 is generated from the previous year’s difference (Bi) and the

random difference (Brand) as follows:

Bi+1 = (1 – R1) AvD + R1 Bi + (1 – R12) ( Brand – AvD).

The process is continued on an annual time step, cumulatively adding the differences

together, until the required forecasting horizon is reached. If the last of the consecutive years

was 2014, for example, a series of 13 annual differences would be generated to predict the

mean concentration in mid-2027.



including the median, which represents the best estimate of the mean concentration in mid-

2015 and mid-2027. The uncertainty in this estimate was quantified by ranking the 10,000

forecasts for that year and determining the 10th, 25

th, 75

th and 90




Figure 3.2 shows an example of a fitted AntC model and the forecasts generated from it. The

horizontal green dashed line at 11.29 mg N/l (equivalent to 50 mg NO3/l) indicates the

threshold value for the 95th percentile concentration and the horizontal blue dashed line at

9.71 mg N/l (equivalent to 43 mg NO3/l) indicates the corresponding threshold value for the

mean concentration.

The observed annual mean concentrations are shown by short horizontal blue lines. Beyond


the solid blue line indicates the mean concentration forecast by Monte Carlo simulation;


and


mean estimate.

In this example, the mean concentration is forecast to be 7.2 mg N/l in mid-2015 (the first

vertical dashed line) with a 90% confidence interval of 6.3-8.1 mg N/l, and 8.3 mg N/l in mid-

2027 (the second vertical dashed line) with a 90% confidence interval of 0.0-16.8 mg N/l.

Environment Agency




AntC, and comparison to the 9.71 mg N/l (43 mg NO3/l) standard (horizontal blue

dashed line)

3.3.6 Mean Concentration method

At sites where there were insufficient data to apply the Weibull, AntB or AntC methods, the

mean concentration of all the samples (1980-2014) was calculated and taken as an estimate

of the mean concentration in mid-2015 and mid-2027. The standard deviation of historical

data was used to calculate the 50% and 90% confidence intervals around the mean estimate.



Figure 3.3 shows an example of a monitoring point where the mean concentration of the

historical dataset is used to assess current and future water quality. The horizontal green

Environment Agency



dashed line at 11.29 mg N/l (equivalent to 50 mg NO3/l) indicates the threshold value for the

95th percentile concentration and the horizontal blue dashed line at 9.71 mg N/l (equivalent to

43 mg NO3/l) indicates the corresponding threshold value for the mean concentration.

The observed historical mean concentration is shown by the solid blue horizontal line. Beyond


the solid blue line indicates the future forecast of mean concentration;


and


mean estimate.

In this example, the mean concentration in mid-2015 (the first vertical dashed line) and mid-

2027 (the second vertical dashed line) is forecast to be 13.0 mg N/l with a 90% confidence

interval of 10.8-15.2 mg N/l.

Environment Agency




the ‘Mean Concentration’ method, and comparison to the 9.71 mg N/l (43 mg NO3/l)

standard (horizontal blue dashed line)

3.3.7 Conversion factor

The 95th percentile is the primary statistic used to measure nitrate concentrations in surface

waters and groundwater. Unlike surface waters, however, most groundwater monitoring sites

lack sufficient data to robustly estimate the 95th percentile concentration. Since the kriging

process (Section 3.4) requires the same summary statistic to be used for all monitoring points,

a conversion factor was used to make estimates of mean and 95th percentile concentrations

comparable.

During the 2008 Review the mean concentration was estimated for all monitoring points and

the compliance threshold of 50 mg NO3/l (or 11.29 mg N/l) was adjusted to take into account

the fact that the mean is always lower than the 95th percentile. Based on the sample

Environment Agency



distribution for all groundwater samples, a threshold of 45 mg NO3/l (or 10.16 mg N/l) was

chosen (Defra 2008). The choice of 45 mg NO3/l was informed by an analysis which showed

that the mean concentration was on average 5 mg NO3/l lower than the lower 90% confidence

limit on the 95th percentile for monitoring points with large numbers of samples. This provided

a comparable level of protection for groundwater to that achieved by the statistical test for

surface waters.

During the 2013 Review, a re-analysis of the “45 mg/l rule” using up-to-date monitoring data

recommended that a conversion factor of 50/43 (or 1.16) would be appropriate. However, the

Methodology Review Group opted to use a conversion factor of 50/45 (or 1.11) for

consistency with the 2008 methodology. Sensitivity analysis indicated that the proportion of

monitoring points failing the test was similar using 1.11 or 1.16. For consistency with the

surface water methodology, the Methodology Review Group took a decision to adopt the 95th

percentile as the primary statistic and to apply a conversion factor of 1.11 to the observed

mean concentrations, so that all results could be compared to a threshold concentration of

50 mg NO3/l.

For the 2017 Review, Defra opted to accept the recommendation of the 2013 Review and

adopt a conversion factor of 1.16. The central estimate of the mean concentration was

therefore multiplied by 1.16 to convert it into an estimate of the lower 90% confidence limit on

the 95th percentile. If this exceeded 50 mg NO3/l the monitoring point was deemed to have

failed the test with high (95%) confidence. This is equivalent to testing whether or not the

mean concentration exceeds 43 mg NO3/l (or 9.71 mg N/l).

3.3.8 Comparison with 2013 Review

Whilst the statistical methodology followed that used for the 2013 NVZ Review as closely as

possible, a small number of refinements were made to improve the accuracy and repeatability

of the results. These refinements are detailed in Table 3.6.

Table 3.6 Summary of groundwater methodological refinements


The number of simulated future time series

was increased from 1,000 to 10,000.

The greater number of shots provides greater

stability in the future forecasts.

A fixed seed was used in the random number

generator.

The use of a fixed seed makes the results of

the Monte Carlo simulation reproducible.

Uncertainty in the mean concentration at the

end of the monitored period was incorporated

into the AntB Monte Carlo simulation.

By not using a fixed starting concentration,

the simulation produces a wider range of

projections that more accurately represent the

uncertainty in future concentrations.

Environment Agency




A statistical test was added to test whether

the measured concentrations at each site

follow a bimodal or multi-modal distribution.

The test provides a useful indicator of sites

where erratic concentrations can give rise to

unreliable future forecasts.

The coefficient of variation (standard

deviation / mean) was calculated for each

site.

A high coefficient of variation can indicate

sites where highly variable concentrations

can give rise to unreliable future forecasts.

For each site, the % ammonium contribution

across all n samples was calculated as:

∑ 𝑇𝐼𝑁𝑖𝑛𝑖=1 − ∑ 𝑇𝑂𝑁𝑖

𝑛𝑖=1

∑ 𝑇𝐼𝑁𝑖𝑛𝑖=1

A high % ammonium contribution can indicate


source discharges.

For each site, the mean concentrations in

winter (Dec-Feb), spring (Mar-May), summer

(Jun-Aug) and autumn (Sep-Nov) were

calculated.

Peak concentrations in summer can indicate


source discharges.

A time series plot was produced for each

monitoring site, showing the historical data,

fitted trend, and future forecasts.

The plots allow close visual inspection of the

results to verify the assessments of current

and future status.

For AntC, consecutive years are identified

started from the most recent year of

monitoring

In cases where two or more runs of

consecutive years are available, the most

recent data is used, which should provide a

more reliable indicator of current and future

water quality.

To use the Weibull method, there must be at

least one sample in five of the last six years

(2009-14).

This rule brings the groundwater method into

line with the surface water method and

ensure that percentile estimates are based

on data from a small number of potentially

unrepresentative years.

Use of the AntC2 tool for assessing the

status of data-poor groundwater sites was

abandoned in preference of the ‘Mean

Concentration’ method.

Only 15 sites were analysed by AntC2 in the

2013 Review, and only 28 sites meet the

criteria for AntC2 in the 2017 Review. The

predictions from AntC2 typically have a high

degree of uncertainty, and provide little

additional information over the simpler Mean

Concentration method.

The data screening checks used by AntB

were applied to all sites, not just those

analysed using AntB.

The screening checks provide a more

comprehensive data quality check for every

site.

Environment Agency




A conversion factor of 50/43 (~1.16) was

applied to estimates of mean concentration to

produce an estimate of the lower 90%

confidence limit on the 95th percentile.

A conversion factor of 50/43 was

recommended at time of the previous NVZ

Review, based upon an analysis of

monitoring data that showed the mean

concentration to be, on average, 7 mg NO3/l

lower than the lower 90% confidence limit on

the 95th percentile.

3.3.9 Quality assurance of results

A series of automatic checks were undertaken to assess the performance of the statistical

methods and the accuracy of current and future status assessments. Sites were flagged for

more detailed manual checks if they displayed significant data quality issues (e.g. gaps, bi-

modal distribution), results that were markedly different from other sites (e.g. very high or low

concentrations) or results with high uncertainty.

In addition, the effect of the refinements described in Section 3.3.8 was assessed by

comparing the results of the original (2013 Review) and revised (2017 Review) methods.

Sites where current estimates showed discrepancies of more than 10% and 1 mg N/l were

reviewed to understand why the results were sensitive to small changes in methodology.

Visual checks of the Weibull estimates, quantile regression trend analysis and future

forecasts, plotted against the raw data, were undertaken by WRc for sites:

with unusually high average or maximum concentrations;

with unusually high or low forecasts of current and future concentrations;

with an unusually high coefficient of variation;

with a statistically significant bimodal or multi-modal distribution;

with large gaps in the monitoring record, or no recent monitoring data;

that had an unusually wide confidence interval around the results;

that failed the current or future status tests despite no samples exceeding the 11.29

threshold;

Environment Agency



that passed the current and future status tests but had exceedances of the 11.29

threshold since 2009;

where there was a large discrepancy between the results generated by the original

(2013 Review) and revised (2017 Review) methods.

Checks focused on sites that met more than one criterion for checking, and those where

highly variable concentrations made it difficult to characterise accurately the historical water

quality trends. In general, the checks confirmed that the statistical methodology had been

applied correctly and that the current and future status assessments were reasonable at the

vast majority of sites. However, they also revealed a small minority of sites where:

a small number of high or low measurements were exerting a high degree of influence

on the results;

strongly fluctuating concentrations meant that historical trends could not be

characterised adequately, with the result that future forecasts were over/under-

estimated or very uncertain; or

large gaps in the monitoring record made the results particularly sensitive to historical

trends or a small number of recent samples.

Monitoring sites were flagged for the Environment Agency’s attention where the results were

judged not to be a reasonable representation of current and future water quality.

3.4 Kriging

3.4.1 Introduction

Monitoring data from groundwater monitoring points only gave groundwater nitrate

concentrations at specific discrete locations within aquifers (the exact area of land

represented by a monitoring point will depend on the volume and depth of the abstraction and

geology). In order to estimate nitrate concentrations across the country and use the

methodology at a 1 km2 resolution, a statistical interpolation technique (kriging) was used to

understand spatial patterns in the groundwater dataset and predict nitrate concentrations at

unmonitored locations.

The advantages of this approach are that it (i) makes full use of the information contained in

the monitoring dataset, and (ii) generates an evidence stream with greater spatial coverage

and spatial resolution that can potentially improve the delineation of NVZ boundaries. The

disadvantages are that the use of a national analysis to integrate data from multiple

monitoring points provides limited ability to take account of local groundwater characteristics

or to represent very localised geological conditions. However, the use of expert review with

local staff is intended to provide an opportunity to refine the information and it is important to

Environment Agency



remember that the method is about developing a conceptual model of an area. The kriged

concentration data are an important line of evidence in building that understanding but they

are only one line of evidence

3.4.2 What is kriging?

Kriging is a well-established and widely used geostatistical interpolation technique. In essence

it takes a set of point measurements, characterises spatial patterns in the measured variable

and uses this information to estimate values of the measured variable at all unmonitored

locations. The result is a two-dimensional map of the measured variable.

Kriging works by quantifying the spatial correlation between pairs of measurements.

Measurements taken close together in space will tend to be more similar than those taken

from locations further apart, but the rate of change in similarity with distance will be specific to

each dataset. Kriging uses the entire dataset to characterise the relationship between pairs of

measurements with different degrees of separation, in the form of a variogram.

These spatial relationships are then used to estimate the measured variable at unmonitored

locations from the values observed at surrounding locations. This is achieved by taking a

weighted average of the measurements surrounding the unmonitored location. Decreasing

weight is given to measurements further away from the unmonitored location, as determined

by the spatial correlation relationship. This interpolation process is repeated for every cell of a

grid to build up a complete two-dimensional map.

3.4.3 Methodology

Trend analysis was used to estimate the current (2015) and future (2027) TIN concentrations

at each of 5,426 monitoring points, as detailed in Section 3.3. As it is not possible to use

different types of values within the kriging process (e.g. a 95th percentile concentrations at

some points and a mean concentrations at others), the lower 90% confidence limit on the 95th

percentile was used to characterise water quality at each monitoring point (see Section 3.3.7

for details of how this was calculated). Using a consistent summary statistic in this way allows

the kriging analysis to compare TIN concentrations among boreholes. The results of the

kriging analysis were compared to a threshold of 11.3 mg N/l.

At locations where a downward historical trend in nitrate concentrations gave rise to

predictions of negative concentrations in 2015 or 2027, these results were set to 0.1 mg N/l. A

power transformation was applied to make the data approximately normally distributed and to

reduce the influence of very high concentration estimates at a small number of monitoring

points (power coefficient λ = 0.19 for current concentrations and 0.13 for future

concentrations). After kriging, a reverse transformation was used to convert the results back

to mg N/l.

Environment Agency



Monitoring points with obviously incorrect grid references (e.g. in the sea) were excluded from

the kriging process, but it was not possible to verify that the co-ordinates of every monitoring

point were correct (these will be checked later as part of the local ground-truthing process).

Most co-ordinates were only accurate the nearest 100 m so, where two or more monitoring

points were within the same 100x100 m grid square, the monitoring point with the highest TIN

concentration was used in the analysis, and results for the other monitoring points were

excluded. Of the 5,426 monitoring points for which 95th percentile TIN concentrations had

been estimated, 1,225 were excluded due to duplication within a 100m grid square, and 41

were excluded due to incorrect grid references, leaving 4,160 for the kriging analysis. These

monitoring points were clustered in the main aquifers and provided excellent coverage across

the whole of England (Figure 3.4).

Environment Agency



Figure 3.4 Location of the 4,160 groundwater monitoring points used in the kriging

analysis

Separate variograms were calibrated for current (2015) and future (2027) TIN concentrations.

Although TIN concentrations were predicted more precisely at some monitoring points than

others (because some points have more complete, comprehensive and up-to-date sampling

data than others), each monitoring point was given an equal weighting in the analysis.

Exploratory analyses carried out as part of the 2013 review provided no clear evidence for

anisotropy, and so a simple, omnidirectional variogram model was produced in which the

spatial correlation structure is the same in all directions.

Environment Agency



Using the observed data and the calibrated variograms, kriging was then used to estimate

current and future TIN concentrations in each of 130,899 1 km2 grid squares covering the

whole of England. These point predictions were applied to the entire grid square – i.e.

predicted concentrations were assumed not to vary within a grid square.

Various models were fitted to the variograms with the goal of maximising the level of

agreement between the predicted (interpolated) values and the observed nitrate

concentrations at individual monitoring points. The relative performance of the models was

assessed visually by plotting the observed nitrate concentrations at individual monitoring

points against the predicted values for the 1 km2 grid cells containing those points. The

strength of agreement was then quantified by calculating (i) the root mean square error

(RSME) and (ii) the percentage of monitoring points where the level of pollution was

categorised correctly (low: 0.00 – 5.65 mg N/l; medium: 5.65 – 11.29 mg N/l; high: >11.29 mg

N/l).

All statistical analyses were conducted using R v.3.2.0 (R Core Team 2015).

3.4.4 Results

A nested exponential-spherical model with zero nugget variance was selected as the

preferred variogram model for kriging current and future 95th percentile TIN concentrations.

Figure 3.5 presents the variograms, showing how the (semi-)variance in TIN concentrations

increases with increasing separation (lag) between the monitoring points concerned. Details

of the calibrated variograms are provided in Table 3.7.

Table 3.7 Parameter estimates for current and future concentration variogram

models

Model component Current concentration Future concentration

Exponential – Partial sill 0.075 0.038

Exponential – Effective range 1,000 m 1,000 m

Spherical – Partial sill 0.045 0.020

Spherical – Range 30,000 m 40,000 m

When used for kriging, the exponential-spherical model produced the strongest agreement

between the predicted values and the observed nitrate concentrations at individual monitoring

points. Although high nitrate concentrations tended to be slightly under-predicted and low

concentrations tended to be slightly over-predicted, the kriged (interpolated) values correctly

categorised the level of pollution at 84-85% of individual monitoring points. Appendix C

provides a more in-depth comparison of alternative models and justifies the choice of the

preferred model.

Environment Agency



Figure 3.6 and Figure 3.7 show final, interpolated maps of current and future TIN

concentrations. Like all interpolation algorithms, kriging has the effect of smoothing the data,

and therefore tends to slightly under-predict the high concentrations and over-predict the low

concentrations. Nonetheless, the kriged maps provide a useful representation of how TIN

concentrations vary spatially, and provide a reasonable prediction of current and future

concentrations across the main aquifers.

The kriging results were used to inform the designation process by predicting whether or not

TIN concentrations exceed the 11.29 mg N/l threshold at locations between monitoring points.

This, in turn, allowed EA staff to use their professional judgement to decide how to define the

extent of pollution in aquifers and the size of the NVZ required. The results were also used to

help identify monitoring points with anomalously high or low nitrate concentrations; these may

then be screened out if they are deemed to be unrepresentative of the water bodies (e.g. if

they are influenced by a local point source discharge).

Environment Agency



Figure 3.5 Variogram for current (a) and future (b) 95th percentile TIN

concentrations

(a)

(b)

Environment Agency



Figure 3.6 Kriged predictions of current (2015) 95th

percentile TIN concentration

Environment Agency



Figure 3.7 Kriged predictions of future (2027) 95th

percentile TIN concentration

Environment Agency



4. Recommendations

For consistency and comparability, this review follows closely the methodology used for the

previous (2013) Review of NVZs in England and Wales (Environment Agency 2012a, 2012b).

Some refinements were made to the methodology to make better use of the available data

and to provide a more rigorous system of checks and balances, which are highlighted and

discussed in this document.

Despite these refinements, the methodology still has some limitations.

1. The use of kriging to estimate groundwater nitrate concentrations at unmonitored

locations is a purely statistical exercise that fails to incorporate any other sources of

information about pollution pressures. Land use has been shown to be a good predictor

of nitrate concentrations in rivers and an appropriately calibrated regression model

could be used to help improve predictions of groundwater nitrate concentration in parts

of the country with little or no monitoring data. The residuals from the regression model

could still be kriged to take account of any spatial pattern in unexplained variation.

2. Many groundwater monitoring points have limited numbers of samples and/or

significant gaps in the monitoring record, both of which make it difficult to estimate

current and future nitrate concentrations precisely. In particular, the AntC trend analysis

tool is prone to producing projections that have exceptionally wide confidence intervals

or which are unduly sensitive to high or low measurements in a single year. An

alternative would be to make greater use of the ‘Mean Concentration’ method, and to

build in algorithms or rules to account for the additional uncertainty that arises when the

monitoring record is not up to date. For example, a standard auto-correlation coefficient

could be calculated for a set of monitoring points with similar characteristics, rather than

attempting to estimate an auto-correlation coefficient for each monitoring point

separately from as few as six data points.

3. An empirical adjustment factor of 1.16 was used to convert estimates of mean

concentration into an estimate of the lower 90% confidence limit on the 95th percentile

(Section 3.3.7). Use of this simple scaling factor ensures that a common and

comparable summary statistic can be generated for every monitoring point, but it also

introduces an additional, unquantified source of uncertainty. Furthermore the use of a

single, universal factor falsely assumes the same level of variability in nitrate

concentrations at each monitoring point, with the consequence that it provides an

uneven level of protection to groundwaters across England. Previous analysis of data

from Anglian region suggested that monitoring points in chalk have a more stable water

quality than monitoring points in gravel (Environment Agency 2012b). If extended to the

whole country, this could be used to develop type-specific thresholds and scaling

factors.

Environment Agency



References

Defra, 2008. Description of the methodology applied in identifying waters and designating Nitrate

Vulnerable Zones in England.

Ellis, J.C., van Dijk, P.A.H. and Kinley, R.D. 1993. Codes of Practice for Data Handling – Version 1.

WRc Report No. NR 3203/1/4224 for The National Rivers Authority.

Environment Agency, 2012a. Method Statement for Nitrate Vulnerable Zone Review – Surface Waters.

Environment Agency report to Defra and Welsh Government – supporting paper for the Implementation

of the Nitrates Directive 2013 – 2016. April 2012.

Environment Agency, 2012b. Method Statement for Nitrate Vulnerable Zone Review – Groundwaters.

Environment Agency report to Defra and Welsh Government – supporting paper for the Implementation

of the Nitrates Directive 2013 – 2016. April 2012.

Koenker, R. and Hallock, K.F. 2001. Quantile regression. Journal of Economic Perspectives, 15 (4),

143-156.

Koenker, R. 2015. Quantile regression: the quantreg package.

http://cran.r-project.org/web/packages/quantreg/quantreg.pdf

R Core Team, 2015. R: A language and environment for statistical computing. R Foundation for

Statistical Computing, Vienna, Austria. http://www.R-project.org/

http://cran.r-project.org/web/packages/quantreg/quantreg.pdf

http://www.r-project.org/

Environment Agency



Appendix A Multiple Outlier Test

A1 Objective

The objective of MOT (Multiple Outlier Test) is to identify outliers in a given data set, on the

assumption of underlying Normality.

A2 Definitions

Outlier: A data value which has arisen from some statistical population that is more extreme

than the population from which the bulk of the values have arisen.

Suspected outlier: a data value which is so far above or below the bulk of the data values that

it causes surprise to the user of the data.

Although the definition of suspected outlier might appear rather subjective, it carries with it the

implication that the user must have some correct probability distribution in mind (however

vague), and believes that the suspected outliers are not consistent with that distribution. In

other words, he or she suspects that the sample has been contaminated by observations from

some statistical distribution other than the one expected.

A3 The single outlier test

A well-established statistical procedure is available when the data can be assumed to have

come from an underlying Normal distribution (or where the data can be transformed, for

example by taking logarithms) so as to make this assumption reasonable. The test proceeds

as follows. First calculate the mean (m) and standard deviation (s) of the data values. Then

calculate the quantity tmax as:

tmax = |(x? – m)|/s

where x? is the suspected outlier (that is, either the minimum or the maximum of the data set).

If tmax is greater than the value given in Table A.1, the outlier can be declared to be statistically

significant at the 1% level. In other words, the probability that a value as extreme as this could

have arisen by chance from a Normal population is only 1 in 100.

Environment Agency



Table A.1 Critical values (P = 1%) of the tmax statistic

No. of data values Critical value

4 1.49

5 1.75

6 1.94

7 2.10

8 2.22

9 2.32

10 2.41

12 2.55

14 2.66

16 2.75

18 2.82

20 2.88

30 3.10

40 3.24

50 3.34

60 3.41

80 3.53

100 3.60

120 3.66

150 3.72

200 3.81

300 3.91

400 3.97

500 4.03

A4 The multiple outlier test

What if several outliers are suspected to be present in the data? In that case a simple

generalisation of the preceding test can be used. This is known as the Multiple Outlier Test.

The test takes an 'outward consecutive' approach. First, a pool of k suspects is produced by

finding all the data values whose tmax value (as defined above) is greater than some arbitrary

Environment Agency



limit (see below). The k suspects are then tested one at a time, working from the least to the

most extreme. The details of the procedure are as follows:

Starting with the (n-k) 'reliable' data values, augment these by just the least extreme of

the k suspected outliers.

Calculate the mean and standard deviation of those (n-k+1) values, and perform the

single outlier test as usual.

If the suspect fails to be confirmed as an outlier, pool it with the (n-k) reliable values,

recalculate the mean and standard deviation, and then test the next least extreme

suspect.

Continue in this way until a suspected outlier is declared to be a genuine outlier. All the

remaining (and hence more extreme) values are then declared to be outliers also.

It might be thought that, as the multiple outlier test provides several opportunities for false

positives, the actual significance levels would be somewhat higher than the nominal 1% value

quoted in Table A.1 for the single outlier case. That is not the case, however. When the data

values really do come from a Normal population, the multiple outlier test very rarely produces

false positives unless the single outlier test does so also - a characteristic that has been

confirmed by computer simulation.

Following the 2008 and 2013 Review methodologies, the nitrate datasets were assumed to be

Normally distributed. A critical value of 7 was used to strike a reasonable balance between

identifying outliers that were plainly abnormal, and retaining data values that were merely

suspicious.

Environment Agency



Appendix B Meta Data Overview

Table B.1 contains metadata descriptions for each of the tables provided within the SQL

Server database containing the raw and cleaned versions of the surface water and

groundwater monitoring data. Table B.2 contains metadata descriptions and purposes for

each of the tables provided in the final file geodatabase and Table B.3 contains metadata

descriptions and purposes for all feature classes provided in the final file geodatabase.

Table B.1 Metadata containing descriptions for each SQL database table contained

within the SW and GW SQL Server Database

Database Table Title Description

Ground_water_v1 Table containing the raw groundwater quality

data provided by EA from the WIMS database

and from various water companies.

Ground_water_v9 Table containing the cleaned/processed

groundwater quality data. Samples and sites

have been screened under agreed criteria and

TIN and TON have been calculated for each

sample. This is the dataset used for the

statistical trend analysis (Tasks 2.1 and 2.2).

GW_DELETED_SAMPLES Table containing the unique IDs of all

groundwater samples removed from the raw

groundwater quality data during the cleaning

process, and the reason for removal.

GW_DELETED_SITES Table containing the unique IDs of all

groundwater monitoring sites removed from the

raw groundwater quality data during the cleaning

process, and the reason for removal.

Gw_site_summary_end Table containing summary information for all of

the groundwater monitoring sites which were

retained in the final cleaned dataset (see-

ground_water_v9).

Surface_water_combined_inclTRAC Table containing the cleaned/processed surface

freshwater quality data including quality data

from sites located on transitional waters.

Samples and sites have been screened under

agreed criteria and TIN and TON have been

calculated for each sample. This is the dataset

used for the statistical trend analysis (Tasks 2.1

and 2.2).

Environment Agency



Database Table Title Description

Surface_water_v1 Table containing the raw surface freshwater and

transitional water quality data provided by EA

from the WIMS database.

Sw_and_wc_site_summary_end Table containing summary information for all of

the surface freshwater and transitional water

monitoring sites which were retained in the final

cleaned dataset (see-

surface_water_combined_inclTRAC).

SW_WC_DELETED_SAMPLES Table containing the unique IDs of all surface

freshwater and transitional water samples

removed from the raw surface water quality data

during the cleaning process, and the reason for

removal.

SW_WC_DELETED_SITES Table containing the unique IDs of all surface

freshwater and transitional water monitoring sites

removed from the raw surface water quality data

during the cleaning process, and the reason for

removal.

Wc_surface_water_v1 Table containing the raw surface freshwater and

transitional water quality data provided by

various water companies.

Environment Agency



Table B.2 Metadata containing descriptions and a summary/purpose for each

Geodatabase Table included in the final File Geodatabase

Table Title Summary/Purpose Description

Eng_2000_Land_Cover_Local_Perc_ The percentage land cover in

2000 for each water body was

connected to a master water

body feature class and was

used for the site selection

process for Task 5

(Evaluation of Action

Programme Effectiveness).

Local land cover data from

2000 at water body level.

Provides the area (ha) and

percentage of land cover, split

between urban, grass,

woodland, sea, water, rough

grazing, arable and total

area/percentage.

Table includes 7816 rows (1

row per water body).

Land cover data (area or

percentage) with a value of -

999.999 represents water

bodies which had no land

cover data (mostly very small

water bodies).

Eng_2000_Land_Cover_US_Perc_ The percentage land cover in

2000 for each water body and

its upstream catchment area

was connected to a master

water body feature class and

was used for the site selection

process for Task 5



Upstream land cover data

from 2000 at water body level.



between urban, grass, wood,

sea, water, rough grazing,

arable and total

area/percentage.








water bodies).

Eng_2010_Land_Cover_Local_Perc_ The percentage land cover in

2010 for each water body was




process for Task 5

Local land cover data from

2010 at water body level.





Environment Agency







area/percentage.








water bodies).

Eng_2010_Land_Cover_US_Perc_ The percentage land cover in

2010 for each water body and




was used for the site selection

process for Task 5



Upstream land cover data

from 2010 at water body level.






area/percentage.








water bodies).

GW_site_summary_table This summary provides the

core information about each

groundwater monitoring point

and is combined with other

datasets (e.g. land cover, BFI,

Point Source Flow, NVZ

designations). This table was

used in Tasks 2.3, 2.4, 3.1,

and 5 to identify the


locations across England.

Summary of groundwater

monitoring site ID, name,

region, location and data

source.

UniqueSiteID is the site ID

combined with the sample

point region ID.

GW_TIN_Outliers_43mg_l

GW_TIN_Outliers_45mg_l

GW_TON_Outliers_43mg_l

List of groundwater samples

that were removed from the

trend analysis due to being

identified as outliers.

Contains all the GW samples

identified as outliers during

the trend analysis of the TIN

and TON data.

Environment Agency




GW_TON_Outliers_45mg_l These outliers are specifically

for the TIN and TON trend

analysis under the 43 and 45

mg/l pollution threshold rule.

GW_TIN_Results_43mg_l


GW_TON_Results_43mg_l

GW_TON_Results_45mg_l

These results detail the trends

in water quality at each


and include the assessment

of current and future status

used to identify polluted

waters for NVZ designation

(Tasks 2.1 and 2.2). The

results were analysed further

to: compare concentrations

among reporting periods

(Task 2.3), map the number of

sites passing and failing (Task

2.4), produce an interpolated

map of groundwater quality

(Task 3.1).

Trend analysis results for

each GW monitoring point,

using either TIN or TON data

and either the 43 or 45 mg/l

pollution threshold rule, plus

summary statistics for the

three reporting periods.

It included current and future

TIN and TON concentrations

and several summary

statistics for the three

reporting periods.

The ‘Key_GW_Output_Fields’

table provides a key to the

data fields.

Key_GW_Output_Fields

Key_SW_Output_Fields

To provide a key/definition to

all the fields in the GW and

SW results tables.

Table providing the name and

definition of all the fields both

the TIN and TON (43 and 45

mg/l rule for GW) results

tables.

Each row represents a single

field in either the GW or SW

results tables.

Land_Cover_Local_Comparison_ The local land cover change

per water body between 2000

and 2010 was connected to a

master water body feature

class and was used for the

site selection process for Task

5 (Evaluation of Action


Change in local land cover

data from 2000 to 2010 at

water body level.

Provides the percentage of

land cover for both 2000,

2010 and the change between

the two, split between urban,

grassland, woodland, sea,

inland water, rough grazing,

arable and total percentage.



Land cover percentage data

Environment Agency




with a value of -999.999

represents water bodies

which had no land cover data

(mostly very small water

bodies).

Land_Cover_US_Comparison_ The upstream land cover

change per water body

between 2000 and 2010 was




process for Task 5



Change in upstream land

cover data from 2000 to 2010

at water body level.

Provides the percentage of

land cover for both 2000,

2010 and the change between

the two, split between urban,

grassland, woodland, sea,

inland water, rough grazing,

arable and total percentage.



Land cover percentage data

with a value of -999.999

represents water bodies

which had no land cover data

(mostly very small water

bodies).

LOCAL_area_weighted_average_BFI The area-weighted average

BFI in each water body was


body feature class and used

for the site selection process

for Task 5 (Evaluation of

Action Programme

Effectiveness).

Local Base Flow Index (BFI)

values on a water body level.

Water bodies with a local BFI

value of zero are most likely

caused by one of two

reasons:

the water body is located in

Scotland

the water body is very small

(typically coastal water

bodies)

SW_Sites_Summary_table This summary provides the

core information about each

surface water monitoring point

and was combined with other

datasets (e.g. land cover, BFI,

Point Source Flow, NVZ

designations). This table was

Summary of surface water

monitoring site ID, name,

region, location and data

source.

UniqueSiteID is the site ID

combined with the sample

Environment Agency




used in Tasks 2.3, 2.4, 3.1,

and 5 to identify the


locations across England.

point region ID.

SW_TIN_Outliers

SW_TON_Outliers

List of samples that were

removed from the trend

analysis (Tasks 2.1 and 2.2)

due to being identified as

outliers.

Contains all the SW samples

identified as outliers during

the trend analysis of the TIN

and TON data.

SW_TIN_Results

SW_TON_Results

These results detail the trends

in TIN and TON at each

surface water monitoring point

and include the assessment

of current and future status

used to identify polluted

waters for NVZ designation

(Tasks 2.1 and 2.2). The

results were later analysed to:

compare concentrations

among reporting periods

(Task 2.3), and map the

number of sites passing and

failing (Task 2.4).

Trend analysis results for

each SW monitoring point

using either TIN or TON data,

plus summary statistics for the

three reporting periods.

The ‘Key_SW_Output_Fields’

table provides a key to the

data fields.

US_area_weighted_average_BFI The area-weighted average

BFI for each water body and





process for Task 5



Upstream Base Flow Index

(BFI) values on a water body

level.

Water bodies with a upstream

BFI value of zero are most

likely caused by one of two

reasons:

the water body is located in

Scotland

the water body is very small

(typically coastal water

bodies)

US_prop_STW The proportion of flow at each

river water body outlet

estimated to come from point

sources. This was transferred

into the master water body

and site summary feature

Proportion of the water body

surface flow that comes from

point sources.

Includes the mean natural

flows summed for upstream

water bodies and the

Environment Agency




classes which were used as

part of the site selection

process for Task 5



upstream flow coming from

point sources (both in

ML/day).

Table B.3 Metadata containing descriptions and a summary/purpose for each

Geodatabase feature class included in the final File Geodatabase

Class Title Summary/Purpose Description

Current_krige

Future_Krige

Difference_krige

TIN_1km_krige

These feature classes were

used to create several maps of

England representing the

current and future TIN

concentrations from the kriging

process and the difference

between the two (Task 3.1).

The TIN_1km_krige feature

class is the final results output

from the kriging analysis, from

which the other three feature

classes were created.

These point feature classes

represent the interpolated TIN

concentrations for every 1 km2

grid square in England.

DGWR_HYDROMETRIC

_AREAS_GB_NI_polygon

This feature class of Large

Hydrological Areas was used

predominantly in Task 2.4 for

assessing the trend analysis

results by LHA and in Task 5

for the site selection process.

This feature class represents

all 59 Large Hydrological

Areas (LHAs) in England and

Wales.

Eng_Outline

Wales_1km_grid_Outline

Eng_Wales_Outine


used for calculating the total

percentage area of England

and Wales designated as

NVZs (Task 6) and as an

outline of England and Wales

for all maps generated.

This feature class contains a

single polygon outline of

England and Wales.

GW_Change_Preceding_T2_3

GW_Change_Previous_T2_3

GW_Max_NO3_Current_T2_3

GW_Mean_NO3_Current_T2_3


used in Task 2.3 to create

maps of England showing, for

each groundwater monitoring

point analysed: (i) the change


included the final groundwater

results from the trend analysis

(Task 2.2).

Several fields have been

Environment Agency




in mean NO3 concentration

between the preceding (2006-

07) and current (2013-14)

reporting periods; (ii) the

change in mean NO3

concentration between the

previous (2010-11) and current

(2013-14) reporting periods;

(iii) the maximum NO3

concentration in the current

(2013-14) reporting period; and

(iv) the mean NO3


(2013-14) reporting period.

included into these results

(easting, northing, check for

water company data and

check for a minimum amount

of data).

Only sites which met the

criteria for Task 2.3 were

included in this feature class.

Rivers_Change_Preceding_T2_3

Rivers_Change_Previous_T2_3

Rivers_Max_NO3_Current_T2_3

Rivers_Mean_NO3_Current_T2_3




each river monitoring point

analysed: (i) the change in

mean NO3 concentration




change in mean NO3







(iv) the mean NO3




included the final surface

freshwater results from the

trend analysis (Task 2.2).




water company data, check for

canals and check for a

minimum amount of data).




TRaC_Change_Preceding_T2_3

TRaC_Change_Previous_T2_3

TRaC_Max_NO3_Current_T2_3

TRaC_Mean_NO3_Current_T2_3




each TraC monitoring point

analysed: (i) the change in

mean NO3 concentration





included the final transitional

waters results from the trend

analysis (Task 2.2).




water company data, check for

canals and check for a

Environment Agency




change in mean NO3







(iv) the mean NO3



minimum amount of data).




GW_LHA_Results_T2_4

GW_Results_T2_4

The GW_LHA_Results_T2_4

feature class was used to

create a map of England which

represented the trend analysis

results (Task 2.2) by LHA.

The GW_Results_T2_4 feature

class was used to create a

map of England which


results (Task 2.2) for each of

the groundwater monitoring

sites.

These feature classes contain

the trend analysis results

(Task 2.2) for all groundwater

monitoring sites included in

Task 2.4.


added to indicate if the change

between the current and future

predicted concentrations are

improving or worsening. As

well as fields to indicate the

combination of current and

future status and the direction

of change.

SW_LHA_Results_T2_4

SW_Results_T2_4

The SW_LHA_Results_T2_4


create a map of England which


results (Task 2.2) by LHA.

The SW_Results_T2_4 feature

class was used to create a

map of England which


results (Task 2.2) for each of

the surface water monitoring

sites.


the trend analysis results

(Task 2.2) for all surface

freshwater monitoring sites

included in Task 2.4.


added to indicate if the change

between the current and future

predicted concentrations are

improving or worsening. As

well as fields to indicate the

combination of current and

future status and the direction

of change.

Environment Agency




GW_Sites_BFI_T5 This feature class is the master

site summary information used

in the site selection process of

Task 5.

This feature class is the site

summary table from Task 2.2

combined with local BFI, land

cover, LHA, WB ID, NVZ

presence and year and EA

region.

SW_Sites_WB_T5 This feature class is the master

site summary information used

in the site selection process of

Task 5.

This feature class is the site

summary table from Task 2.2

combined with local and

upstream BFI, land cover,

upstream point source flows,

LHA, WB ID, NVZ presence

and year and EA region.

GW_TIN_Results_43mg_l_Task3_1 This feature class contains the

groundwater trend analysis

results which were used in the

kriging analysis (Task 3.1).

This feature class contains the

trend analysis results for

groundwater TIN

concentrations using the 43

mg/l rule.

The difference between this

feature class and the


table is that monitoring sites

were removed if they had

obviously erroneous locations

(e.g. located in the sea).

NEAP_N_2000

NEAP_N_2004

NEAP_N_2010_Adjusted

NEAPN_2014

The 2000, 2004, 2010 and

2014 NEAP-N results were

used in Task 6 for calculating

the N loading inside and

outside of NVZ areas

designated during the 1996,

2002, 2008 and 2013

designation rounds.

These feature classes contains

the four NEAP-N model results

from 2000 to 2014 on a 1sq

km grid for England.

There are 130,899 NEAP-N

result points in England.

The NEAP-N results are from

agricultural loading only.

Points with N loading of zero

either have no N loading from

agriculture (e.g. Urban areas)

or are in coastal areas where

the majority of the 1sq km is

covered by sea.

The 2010 NEAP-N results

were adjusted to correct some

Environment Agency




incorrect locations in the

original shapefile.

NVZ_1996_Dissolved

NVZ_2002_Dissolved

NVZ_2008_Dissolved

NVZ_2013_Dissolved


used to identify the total area

extent of NVZs from the 1996,

2002, 2008 and 2013

designation rounds. This

information was used in Task

6.


a single polygon representing

the total extent of NVZs from

the 1996, 2002, 2008 and

2013 designation rounds.

NVZ_1996_T6

NVZ_2002_T6

NVZ_2008_T6

NVZ_2013_T6


used during Task 6 to identify

the areas covered by each

NVZ type (i.e. SW, GW and

EW (and any areas overlapped

by multiple NVZs from these

designation rounds.

These polygon feature classes

represent all the individual

NVZs put in place during the

1996, 2002, 2008 and 2013

designation rounds.

NVZ_2013_GW_T2_4

NVZ_2013_SW_T2_4


used to depict the extent of

groundwater and surface water

NVZ designations from the

2013 designation round for two

maps of England in Task 2.4.

These polygon feature classes

represent the total

groundwater and surface water

NVZ extent from the 2013

designation round.

WFD_WB_Cycle1_Final This master water bodies


identify water body level

attributes for the site selection

process of Task 5.

This feature class is the

master water bodies dataset

which includes the WB

connectivity code, catchment

size, region, local and

upstream BFI, land cover, and

upstream point source flows.

Environment Agency



Appendix C Evaluation of alternative kriging models

C1 Model options

The goal of the kriging analysis was to accurately predict nitrate concentrations across

England at a 1 km2 resolution. This interpolation technique was used to help understand

spatial patterns in the groundwater dataset and predict water quality at unmonitored locations.

A variety of variogram models (Gaussian, exponential, spherical, double-exponential and

exponential-spherical, with and without a nugget variance) and data transformations (log and

power) were explored. Many of these were discarded at an early stage because (i) the

variogram model would not converge, (ii) the model clearly did not capture the shape of the

variogram, or (iii) the interpolated surface displayed an unacceptably high degree of

smoothing.

To explore and illustrate the sensitivity of the kriging results to the choice of data

transformation and variogram, three alternative models were compared in detail.

1. Model 1: an exponential model with a non-zero nugget variance and a log

transformation – a basic initial model;

2. Model 2: a double exponential model with a zero nugget variance and a power

transformation – to mirror the 2013 modelling approach; and

3. Model 3: a nested exponential-spherical model with zero nugget variance and a power

transformation – the preferred model (described in Section 3.4.4 above).

The relative performance of the three models was assessed visually by plotting the observed

nitrate concentrations at the 4,160 individual monitoring points against the predicted values

for the 1 km2 grid cells containing those points. The strength of agreement was then

quantified by calculating (i) the root mean square error (RSME3) and (ii) the percentage of

monitoring points where the level of pollution was categorised correctly (low: 0.00 – 5.65 mg

N/l; medium: 5.65 – 11.29 mg N/l; high: >11.29 mg N/l).

3 The Root Mean Square error (RMSE) is metric that quantifies the differences between the values predicted by a

model (in this case, the interpolated concentrations) and the values actually observed (in this case, the

concentrations estimated from the monitoring data). To enable a fair comparison between models, the RSME

was calculated using the back-transformed model predictions. Observed values <0 and >50 mg N/l were

considered to be outliers arising from unreliable trend analysis and truncated to 0.1 and 50 mg N/l, respectively,

to minimise their influence on the RSME calculation.

Environment Agency



C2 Comparison of results

Table C.1 compares the performance of the three models, as measured by the fit between the

observed and predicted values. For both current and future nitrate concentrations, Model 3

performed substantially better than both Models 1 and 2, having the lowest RSME and the

highest categorisation accuracy. An accuracy of 84-85% is considered to be high, given the

imprecision of the ‘observed’ values (i.e. the 95th percentile estimates at individual monitoring

points), and the fact a single prediction for each grid cell is unable to account for within-cell

variation in nitrate concentrations among neighbouring boreholes.

Table C.1 Comparative kriging performance of alternative variogram models

Time horizon Metric Model 1 Model 2 Model 3

Current (2015) RMSE 6.53 6.12 3.97

Categorisation accuracy 70.2% 71.0% 85.4%

Future (2027) RMSE 9.02 7.64 5.17

Categorisation accuracy 63.2% 70.3% 84.0%

Figure C.1 plots the observed nitrate concentrations at the 4,160 individual monitoring points

against the predicted values for the 1 km2 grid cells containing those points (x-axis truncated

at 0 and 40 mg N/l for clarity). All three models tended to under-predict at locations with high

nitrate concentrations and over-predict at locations with low nitrate concentrations, but this

tendency was much weaker for Model 3.

Environment Agency



Figure C.1 Comparison of observed and predicted nitrate concentrations

Current (2015) concentrations Future (2027) concentrations M

od

el 1

Mo

del 2

Mo

del 3

C3 Justification for the preferred model

Like all interpolation algorithms, kriging has the effect of smoothing the data, and therefore will

always tend to under-estimate the high concentrations and over-estimate the low

concentrations. This is potentially problematic if the interpolated surface displays such a high

degree of smoothing that it (i) fails to represent localised nitrate concentration peaks above

the 11.29 mg N/l pollution threshold or (ii) over-states the level of risk in unpolluted areas. It

was therefore imperative to select a model that was able to predict nitrate concentrations

across England as accurately as possible.

The choice of data transformation appeared to have relatively little influence on the results,

but a power transformation was preferred for consistency with the methodology used for the

2013 NVZ Review (see Environment Agency 2012b).

0

5

10

15

20

25

30

35

40

0 5 10 15 20 25 30 35 40

Pre

dic

ted

TIN

co

nce

ntr

atio

n f

or

grid

ce

ll (m

g N

/l)

Estimated TIN concentration at monitoring point (mg N/l)

0

5

10

15

20

25

30

35

40

0 5 10 15 20 25 30 35 40

Pre

dic

ted

TIN

co

nce

ntr

atio

n f

or

grid

ce

ll (m

g N

/l)


0

5

10

15

20

25

30

35

40

0 5 10 15 20 25 30 35 40

Pre

dic

ted

TIN

co

nce

ntr

atio

n f

or

grid

ce

ll (m

g N

/l)


0

5

10

15

20

25

30

35

40

0 5 10 15 20 25 30 35 40

Pre

dic

ted

TIN

co

nce

ntr

atio

n f

or

grid

ce

ll (m

g N

/l)


0

5

10

15

20

25

30

35

40

0 5 10 15 20 25 30 35 40

Pre

dic

ted

TIN

co

nce

ntr

atio

n f

or

grid

ce

ll (m

g N

/l)


0

5

10

15

20

25

30

35

40

0 5 10 15 20 25 30 35 40

Pre

dic

ted

TIN

co

nce

ntr

atio

n f

or

grid

ce

ll (m

g N

/l)


Environment Agency



There were at least two reasons to expect that the variograms should have a high nugget

variance4, namely:

1. the relatively high level of sampling error in estimates of TIN concentrations at

individual monitoring points; and

2. the tendency for neighbouring boreholes sampling groundwater at different depths to

record very different TIN concentrations.

Including a nugget variance in the variogram models, however, resulted in an unacceptably

high degree of smoothing in the interpolated surfaces. For this reason, a zero nugget model

was preferred.

Having opted for a zero nugget model, a nested formulation was found to be necessary to

fully capture the shape of the variogram. The double exponential model (Model 2), as used for

the 2013 NVZ Review, performed well, but the exponential-spherical model (Model 3) had the

lowest RSME and the highest categorisation accuracy. Model 3 was therefore selected as the

preferred model for kriging current and future 95th percentile TIN concentrations.

4 The nugget variance is the y-intercept of the variogram – i.e. the variance at lag zero. It represents the small-

scale variability in the data, which comprises genuine variation in nitrate concentrations (e.g. variation in water

quality between neighbouring deep and shallow boreholes), and sampling and measurements error (i.e. the error

in the estimated 95th percentile concentration at each monitoring point).

Statistical Methods for Nitrate Vulnerable Zone Review...

Documents

Transcript of Statistical Methods for Nitrate Vulnerable Zone Review...