7/27/2019 TF5651_ch07
Chapter 7
Analysis of an air quality
data set
In this chapter we will make a detailed analysis of a comprehensive set of air
pollution concentrations and their associated meteorological measurements. I
would like to thank Professor David Fowler and Dr Robert Storeton-West of the
Centre for Ecology and Hydrology (CEH) for their help in supplying the data.
The measurements were taken over the period 1 January to 31 December 1993
at an automatic monitoring station operated by CEH at their Bush Estate
research station, Penicuik, Midlothian, Scotland. Three gas analysers provided
measurements of O3, SO2, NO and NOx; NO2 was obtained as the difference
between NOx and NO. Windspeed, wind direction, air temperature and solar
radiation were measured by a small weather station. The signals from the
instruments were sampled every 5 s by a data logger, and hourly average values
were calculated and stored.
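As an illustration of the logging scheme just described, the sketch below averages 5 s samples into an hourly mean and derives NO2 by difference. The function names and the use of `None` for a lost sample are assumptions made for this example; the text does not describe the logger's own software.

```python
SAMPLE_INTERVAL_S = 5                          # sampling interval stated in the text
SAMPLES_PER_HOUR = 3600 // SAMPLE_INTERVAL_S   # 720 samples in each hourly mean


def hourly_mean(samples):
    """Mean of one hour of samples; None marks a lost sample (assumed convention)."""
    valid = [s for s in samples if s is not None]
    return sum(valid) / len(valid) if valid else None


def no2_by_difference(nox_ppb, no_ppb):
    """NO2 obtained as NOx minus NO; missing if either input is missing."""
    if nox_ppb is None or no_ppb is None:
        return None
    return nox_ppb - no_ppb
```

Because NO2 is derived from the other two channels, any hour lost by the NOx analyser is lost for NO2 as well, which is consistent with NO, NOx and NO2 sharing the same availability in Table 7.1.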
7.1 THE RAW DATA SET
There were 8760 hours in the year 1993. Hence a full data set would involve 8760
means of each of the nine quantities, or 78 840 data values in all. An initial
inspection of the data set showed that it was incomplete. This is quite normal for
air quality data: there are many reasons for loss of data, such as instrument or
power failures or planned calibration periods. Any missing values have to be
catered for in subsequent data processing. The data availability for this particular
data set is given in Table 7.1.
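The arithmetic above, and the percentage column of Table 7.1, can be checked with a few lines. This is a sketch only; `availability_percent` is a name chosen for the example.

```python
HOURS_IN_1993 = 365 * 24   # 1993 was not a leap year
N_QUANTITIES = 9           # five gas series plus four meteorological series


def availability_percent(hours_available, hours_possible=HOURS_IN_1993):
    """Percentage of the possible hourly means that were actually recorded."""
    return 100.0 * hours_available / hours_possible


print(HOURS_IN_1993)                          # 8760
print(HOURS_IN_1993 * N_QUANTITIES)           # 78840
print(round(availability_percent(8569), 1))   # ozone: 97.8
```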
Many of these lost values were due to random faults and were uniformly
distributed throughout the year. The reliability of gas analysers depends largely on
the simplicity of their design, and the ranking of the analysers in this example is
quite typical. Ozone analysers based on UV absorption are very straightforward
instruments that rarely fail provided that regular servicing is carried out. UV
fluorescence sulphur dioxide analysers are rather more complicated, and NOx
analysers even more so. The lower availability for the nitrogen oxides in this case
was in fact due to an extended period at the start of the year when problems were
being experienced with the analyser.
2002 Jeremy Colls
Table 7.1 Data availability for the 1993 CEH data set

Measurement        Number of hours available    Percentage of 8760
Ozone              8569                         97.8
Sulphur dioxide    8251                         94.2
Nitric oxide       7010                         80.2
Nitrogen oxides    7010                         80.2
Nitrogen dioxide   7010                         80.2
Windspeed          8443                         96.4
Wind direction     8459                         96.6
Air temperature    8650                         98.7
Solar radiation    8642                         98.7

The raw data for gas concentrations, windspeed and wind direction are shown
as time sequences in Figure 7.1. Plotting the data in this form is an excellent rapid
check on whether there are serious outliers (data values lying so far outside
the normal range that they are probably spurious). Differences in the general
trends of the concentrations through the year are also apparent. The O3
concentration (Figure 7.1(a)) rose to a general peak in April to May before declining
steadily through to November, and the most common values at any time were
around half the maxima of the hourly means. Sulphur dioxide concentrations
(Figure 7.1(b)) were typically much smaller, although the maxima were nearly
as great as for ozone. Typical NO concentrations (Figure 7.1(c)) were low
throughout the year, although the intermittent maxima were higher than for the
other gases. NO2 concentrations (Figure 7.1(d)) were high around May and
November, and particularly low in June/July. There were occasions when the
concentrations of any of these gases increased and declined very rapidly; they
appear as vertical lines of points on the time series. Superficially, there does not
appear to be any systematic relationship between the timing of these occasions
for the different gases. The windspeed (Figure 7.1(e)) declined during the first
half of the year and then remained low. The wind direction (Figure 7.1(f)) was
very unevenly distributed, being mainly from around either 200° (just west of
south) or 0° (north).
7.2 PERIOD AVERAGES
The plots shown in Figure 7.1 give an immediate impression of the variations
during the year, but are not of much use for summarising the values for
comparison with other sites or with legislated standards, for example. We
therefore need to undertake further data processing. The most straightforward
approach is to explore the time variations by averaging the hourly means over
different periods. First, we need to decide how to handle the missing values.
Figure 7.1 Time series of hourly means for the 1993 CEH data set: (a) ozone, (b) sulphur dioxide, (c) nitric oxide, (d) nitrogen dioxide, (e) windspeed, (f) wind direction. [Each panel is plotted against hour of the year (0 to 8000); the vertical axes are concentration/ppb for the gases, windspeed/(m/s) and wind direction/degrees.]
Figure 7.2 Time series of: (a) daily, (b) weekly and (c) monthly means for the 1993 CEH data set. [Each panel plots concentration/ppb for O3, SO2, NO, NOx and NO2 against (a) day, (b) week and (c) month of the year.]
When we plotted the (nominal) 8760 hourly means, a few missing values did not have
a big effect on the appearance. As we average over longer periods, the number of
values decreases to 365 daily means, 52 weekly means and only 12 monthly
means, so that the effect of missing values becomes proportionately greater.
Before we start averaging, we must decide on a protocol that includes as much of
the data as possible, but does not create an average value when there is simply not
enough data to justify it. For example, consider the calculation of a daily average
from the 24 individual hourly averages that contribute to it. If one value is
missing, and the sequence of values before and after is varying smoothly, then it
is legitimate to substitute the missing value with the average of the adjacent
values. If the sequence varies erratically, this procedure serves little purpose.
Instead, we can ignore the missing value, and calculate the daily mean as the
average of the remaining 23 h, arguing that the whole day was still well
represented. If only 12 or 8 h remain (as might happen if a faulty instrument was
reinstated in the early afternoon), then this argument loses credibility and the
whole day should be discarded. The same idea can be applied to the calculation
of weekly, monthly and annual averages, with a requirement that, say, 75% of the
contributing values be present if the average is to be calculated. This philosophy
must particularly be adhered to when the data are missing in blocks, so that no
measurements have been taken over significant proportions of the averaging
period. For example, it is clear from Figure 7.1(d) that the annual average NO2
concentration would not include any of January or February, and therefore might
not be representative of the year as a whole.
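The averaging protocol described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used for the CEH data: the function names, the use of `None` for a missing value and the default 75% threshold (the figure the text offers as an example) are all assumptions.

```python
def fill_single_gaps(hourly):
    """Replace an isolated missing value (None) by the mean of its two neighbours.

    Only appropriate when the series is varying smoothly, as the text notes.
    """
    filled = list(hourly)
    for i in range(1, len(hourly) - 1):
        if hourly[i] is None and hourly[i - 1] is not None and hourly[i + 1] is not None:
            filled[i] = (hourly[i - 1] + hourly[i + 1]) / 2
    return filled


def period_mean(values, min_fraction=0.75):
    """Mean of the valid values, or None if too few of them are present."""
    valid = [v for v in values if v is not None]
    if not values or len(valid) / len(values) < min_fraction:
        return None
    return sum(valid) / len(valid)


def daily_means(hourly, min_fraction=0.75):
    """Split a series of hourly means into 24 h blocks and average each block."""
    return [period_mean(hourly[i:i + 24], min_fraction)
            for i in range(0, len(hourly), 24)]
```

A day with 23 valid hours yields a mean, whereas a day with only 12 valid hours yields `None` and is discarded. The same `period_mean` function can be applied to weekly or monthly blocks.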
In Figure 7.2(a) to (c) the 1993 data for gas concentrations are presented as daily,
weekly and monthly averages respectively. The short-term variations are
successively reduced by the longer averaging periods, and the trends that we originally
estimated from the raw data become clearer.
7.3 ROSES
In Section 6.1.1 we discussed the influence of wind direction on the pollutant
concentration at a point. We have analysed the CEH data set specifically to
highlight any such dependencies. The 360° of the compass were divided into
16 sectors of 22.5° each. The hourly means taken when the wind was from
each sector were then averaged, and the values plotted in the form shown in
Figure 7.3. These diagrams are known as roses or rosettes. Figure 7.3(a) shows
that the wind direction was usually from between South and South-west, and
Figure 7.3(b) that these were also the winds with the highest average speeds.
Figure 7.3(c) gives the ozone rose for the year; the almost circular pattern is
expected because ozone is formed in the atmosphere on a geographical scale
of tens of km, rather than being emitted from local sources. Hence the
concentration should be relatively free of directional dependence. Sulphur
dioxide, on the other hand, is a primary pollutant which will influence
concentrations downwind of specific sources. Figure 7.3(d) indicates possible
sources to the North, South-east and West-north-west of the measurement site.
Concentrations from the sector between South and South-west (the predominant
wind direction) are the lowest. The roses for NO (Figure 7.3(e)) and NO2
(Figure 7.3(f)) are not so well defined. These gases are both primary pollutants
(dominated by NO) and secondary pollutants (dominated by NO2). Hence we
can see both patterns of directional dependence, with NO behaving more like
SO2, and NO2 more like O3.
Figure 7.3 Direction roses of wind frequency, wind speed and pollutant gas concentration.
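The sector averaging behind a pollutant rose can be sketched as below. The text specifies 16 sectors of 22.5° but not where the sector boundaries fall, so centring sector 0 on north is an assumption, as are the function names and the use of `None` for missing hours.

```python
SECTOR_WIDTH = 360 / 16   # 22.5 degrees
SECTOR_NAMES = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
                "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def sector_index(direction_deg):
    """Index (0-15) of the sector containing a wind direction in degrees.

    Assumption: sector 0 is centred on north, spanning 348.75 to 11.25 degrees.
    """
    return int(((direction_deg % 360) + SECTOR_WIDTH / 2) // SECTOR_WIDTH) % 16


def rose(directions, values):
    """Mean of `values` for each wind sector, skipping hours with missing data.

    Returns a dict mapping sector name to mean; sectors with no data map to None.
    """
    sums = [0.0] * 16
    counts = [0] * 16
    for d, v in zip(directions, values):
        if d is None or v is None:
            continue
        k = sector_index(d)
        sums[k] += v
        counts[k] += 1
    return {name: (sums[k] / counts[k] if counts[k] else None)
            for k, name in enumerate(SECTOR_NAMES)}
```

Passing hourly concentrations as `values` gives a pollutant rose, and passing windspeeds gives a speed rose. A wind frequency rose would use the per-sector counts instead of the means.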
Figure 7.4 shows the location of the monitoring site in relation to local
topographical features and pollution sources, knowledge of which can help to
understand the pollution data. Two factors strongly influence the wind rose: the
Pentland Hills run north-east to south-west, and the Firth of Forth estuary
generates north-south sea breezes. The combined effect of both factors produces
the strongly south to south-west wind rose which was seen in Figure 7.3(a). The
main sources responsible for the primary pollutants are urban areas and roads.
The city of Edinburgh, which lies 10 km to the north, generates the northerly
SO2 peak seen in Figure 7.3(d). Although the small town of Penicuik lies close
to the south, there is apparently no SO2 peak from that direction, nor is there an
identifiable source responsible for the south-east SO2 peak. Detailed
interpretation of such data cannot be made without a detailed source inventory, since
weak, low sources close to the monitor can produce similar signals to those from
stronger, higher, more distant sources.
Figure 7.4 The topography, urban areas and roads around the CEH measurement site at Penicuik.
7.4 DIURNAL VARIATIONS
Another way of gaining insight into pollutant occurrence is to average the
concentrations according to the hour of the day. We must be careful, though, to allow
for the ways in which the characteristics of the days themselves change during the
year. In Figures 7.5 and 7.6, we have carried out diurnal analyses for the months
of June and December respectively.
In June, the period of daylight is long and the peak solar radiation
high (Figure 7.5(a)). The air temperature is warm (Figure 7.5(b)) and shows a
pronounced afternoon increase in response to the solar radiation. Average
windspeeds are low, with convective winds increasing during daylight
hours (Figure 7.5(c)). The ozone concentration (Figure 7.5(e)) shows a
background level of about 23 ppb, with photochemical production increasing
this concentration to a peak of 30 ppb at around 1600. The diurnal variation of
SO2 (Figure 7.5(f)) is quite different: there are sharp peaks centred on 1000 and
1700 which result from local emissions, and no clear dependence on solar
radiation. As with the pollutant roses, the diurnal variations of NO and NO2 are
a blend of these two behaviours.
Figure 7.6(a) to (h) shows the corresponding variations during December. Now,
the days are short and solar energy input is at a minimum. Air temperatures are low
and barely respond to the sun; windspeeds are higher and almost constant through
the day. As a consequence of these changes, ozone shows no photochemical
production in the afternoon. Somewhat surprisingly, SO2 has lost all trace of
the 1000 and 1700 spikes, although these remain very clear for NO. The pattern
for NO2 is very similar in December and June.
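The diurnal averages discussed in this section can be computed along the following lines. The sketch assumes that the hourly series starts at 00:00 on 1 January 1993 and that element `i` is the mean for the hour beginning `i` hours later; the text does not state the logger's timestamp convention, so this is an assumption, as are the function name and the use of `None` for missing values.

```python
from datetime import datetime, timedelta

YEAR_START = datetime(1993, 1, 1)


def diurnal_means(hourly, month):
    """Average a year of hourly means by hour of day for one calendar month.

    `hourly[i]` is assumed to be the mean for hour i of 1993, counting from
    00:00 on 1 January, with None for missing values.
    """
    sums = [0.0] * 24
    counts = [0] * 24
    for i, value in enumerate(hourly):
        when = YEAR_START + timedelta(hours=i)
        if value is None or when.month != month:
            continue
        sums[when.hour] += value
        counts[when.hour] += 1
    return [sums[h] / counts[h] if counts[h] else None for h in range(24)]
```

Calling `diurnal_means(ozone, 6)` and `diurnal_means(ozone, 12)` on a hypothetical `ozone` list would give the kind of June and December curves shown in Figures 7.5 and 7.6.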
7.5 SHORT-TERM EVENTS
So far in this chapter, we have grouped data in different ways specifically to
smooth out short-term variations and clarify patterns. We can also benefit from
a detailed look at shorter periods of measurements. Figure 7.7 shows the
time series for one particular period of 300 h (between 4100 and 4400 h, or
roughly from 20 June to 3 July). The wind direction was generally
southerly, except for two periods of about 50 h each when it swung to the north
and back several times. When the wind direction changed, there were bursts of
higher concentrations of NO, NO2 and SO2, and the O3 background
concentration was disturbed. These changes were probably associated with emissions
from a local source that was only upwind of the measurement site when the wind
was from one particular narrow range of directions. This would not only bring
the primary pollutants, but also excess NO to react with the O3 and reduce the
concentration of the latter.
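The correspondence between hour of the year and calendar date quoted above (4100 to 4400 h, roughly 20 June to 3 July) can be checked with the standard library. The helper name is chosen for this example, and hour 0 is assumed to be 00:00 on 1 January.

```python
from datetime import datetime, timedelta


def hour_of_year_to_datetime(hour, year=1993):
    """Date and time reached `hour` hours after 00:00 on 1 January."""
    return datetime(year, 1, 1) + timedelta(hours=hour)


start = hour_of_year_to_datetime(4100)
end = hour_of_year_to_datetime(4400)
print((start.month, start.day), (end.month, end.day))   # (6, 20) (7, 3)
```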
Figure 7.5 Average diurnal variations in June.
Figure 7.6 Average diurnal variations in December.
Figure 7.7 Variations of gas concentration with wind direction over a single period of 300 h.
7.6 FREQUENCY DISTRIBUTIONS
As discussed in Chapter 4, the concentrations of air pollutants often show a
log-normal frequency distribution, i.e. the logarithms of the concentrations are
distributed normally. We have analysed the hourly and daily means from the Centre
for Ecology and Hydrology data set to test this. The overall range of the
concentrations that occurred over the year was divided into subranges, and the
number of values that fell within each subrange was counted. This is the frequency
distribution. These counts were then expressed as a percentage of the total number,
and summed by subrange to give the cumulative frequency distribution. If the
frequency distribution is log-normal, then the cumulative distribution plots as a
straight line on log-probability axes. In Figures 7.8 and 7.9, the distributions for the
hourly and daily means of the five gases are shown. Those for SO2, NO and NOx
are quite linear, NO2 less so, and O3 not at all. The O3 distribution is characteristic
of a variable that has a significant background component: the concentration was
not low for as high a proportion of the time as the log-normal form requires.
Figure 7.8 Cumulative frequency distributions of hourly-mean pollutant concentrations. [Log-probability plot of gas concentration/ppb against the proportion of the time for which the hourly mean concentration exceeded the value given on the axis/%, with curves for O3, NOx, NO2, NO and SO2.]
Certain statistical parameters can be derived from the log-probability curves.
Commonly quoted are the concentrations below which the value falls for 50%
(median), 90%, 98% and 99% of the time. The 98% value of daily means is used
in European Union Directives on air quality; it is equivalent to stating that the
concentration should not be exceeded for more than seven days in the year
(2% of 365 days). The values indicated by Figures 7.8 and 7.9 are extracted in
Table 7.2. It is clear that one such parameter alone is not sufficient to define the
distribution. If the distribution is linear, we can measure the gradient, which is
equivalent to the geometric standard deviation of the sample. Then the median
and gradient completely define the population distribution. There are more
complex formulations of the log-normal distribution that can be used to describe
non-linear data sets.
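The percentile and log-probability ideas above can be illustrated numerically. The sketch below is not the method used to draw Figures 7.8 and 7.9: it assumes a nearest-rank percentile, plotting positions of (i + 0.5)/n, natural logarithms and an ordinary least-squares line, and all function names are chosen for the example.

```python
import math
from statistics import NormalDist


def percentile(values, p):
    """Concentration below which the value falls for p per cent of the time.

    Uses the nearest-rank definition (an assumption; other definitions exist).
    """
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def lognormal_fit(values):
    """Fit a straight line to ln(concentration) against the normal quantile z.

    Returns (median estimate, geometric standard deviation estimate): the
    intercept at z = 0 gives ln(median) and the gradient gives ln(GSD).
    Non-positive values are ignored because their logarithm is undefined.
    """
    ordered = sorted(v for v in values if v > 0)
    n = len(ordered)
    zs = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    ys = [math.log(v) for v in ordered]
    zbar = sum(zs) / n
    ybar = sum(ys) / n
    slope = (sum((z - zbar) * (y - ybar) for z, y in zip(zs, ys))
             / sum((z - zbar) ** 2 for z in zs))
    intercept = ybar - slope * zbar
    return math.exp(intercept), math.exp(slope)
```

For 365 daily means, the nearest-rank 98th percentile is the 358th value in ascending order, so exactly seven days lie above it, in line with the seven-day statement in the text.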
Figure 7.9 Cumulative frequency distributions of daily-mean pollutant concentrations. [Log-probability plot of gas concentration/ppb against the proportion of the time for which the daily mean concentration exceeded the value on the axis/%, with curves for O3, NOx, NO2, NO and SO2.]
7.7 FURTHER STATISTICAL ANALYSES
Other standard statistical parameters can be used to describe the data. The
summary statistics for the 1993 data are given in Table 7.3.
Finally, we can apply the ideas on the relationships between the period
maxima that were outlined in Chapter 4. If the maximum 1-h concentration is
$C_{\max,1\,\mathrm{h}}$, and the maximum over any other period $t$ (in hours) is
$C_{\max,t}$, then we should find that
$C_{\max,t} = C_{\max,1\,\mathrm{h}}\, t^{q}$, where $q$ is an exponent for the
particular gas. For the 1993 data set, the maximum values over the different
averaging periods are shown in Table 7.4. Plotting $\log C_{\max,t}$ against
$\log t$ gives the results shown in Figure 7.10, in which the gradients of the lines
give the values of $q$ for the different gases. Because the maxima fall as the
averaging period lengthens, the gradients are negative; Table 7.4 lists the
magnitude of $q$.

Table 7.2 Percentiles of hourly and daily means

Pollutant   Hourly means (per cent)     Daily means (per cent)
            50     90     98     99     50     90     98     99
O3          24     34     40     42     24     32     36     38
SO2         1.5    6      13     15     2      5      9      11
NO          1      3      16     27     1      4      13     18
NOx         5      21     41     56     7      18     30     39
NO2         5      17     27     31     7      14     22     23

Table 7.3 Summary statistics for the 1993 CEH data set

Gas    Hourly means/ppb                Daily means/ppb         Weekly means/ppb
       Mean    Median   Standard       Median   Standard       Median   Standard
                        deviation               deviation               deviation
O3     23.6    24.7     9.2            23.8     7.4            23.4     5.8
SO2    2.6     1.6      3.1            2.0      2.2            2.7      1.4
NO     1.7     0.3      7.5            0.7      3.7            1.3      2.0
NOx    9.2     5.8      11.6           7.5      7.8            8.9      5.3
NO2    7.5     5.3      6.9            6.7      5.2            7.8      27.2

Table 7.4 Values of Cmax,t for the different pollutants

Cmax,t            Pollutant
                  O3       SO2      NO       NOx      NO2
Cmax, 1 h         57       47       185      186      72
Cmax, 1 day       43       16       38       57       37
Cmax, 1 week      34       7        11       24       16
Cmax, 1 month     31       5        7        20       13
q                 0.095    0.35     0.51     0.35     0.27

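The gradients can be estimated directly from the maxima in Table 7.4. The sketch below fits an ordinary least-squares line to log Cmax,t against log t; the choice of fitting method, the function name and the treatment of one month as 8760/12 = 730 h are assumptions, since the text does not say how the lines in Figure 7.10 were fitted.

```python
import math

# Averaging periods in hours; one month is taken as 8760/12 = 730 h (assumption).
PERIOD_HOURS = [1, 24, 168, 730]

# Maximum period means in ppb, copied from Table 7.4.
CMAX = {
    "O3":  [57, 43, 34, 31],
    "SO2": [47, 16, 7, 5],
    "NO":  [185, 38, 11, 7],
    "NOx": [186, 57, 24, 20],
    "NO2": [72, 37, 16, 13],
}


def fit_exponent(periods, maxima):
    """Least-squares gradient of log(Cmax,t) against log(t)."""
    xs = [math.log10(t) for t in periods]
    ys = [math.log10(c) for c in maxima]
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx


for gas, maxima in CMAX.items():
    print(f"{gas}: gradient = {fit_exponent(PERIOD_HOURS, maxima):.3f}")
```

Each fitted gradient is negative, because the maximum falls as the averaging period lengthens, and under these assumptions its magnitude should come out close to the corresponding entry in the q row of Table 7.4.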
Figure 7.10 Correlations between the maximum period average and the averaging period. [Log-log plot of maximum concentration during period/ppb against number of hours (1 hour, 1 day, 1 week, 1 month), with lines for O3, NOx, NO2, NO and SO2.]