
Atmospheric Pollution Research 4 (2013) 290–297

© Author(s) 2013. This work is distributed under the Creative Commons Attribution 3.0 License.

Atmospheric Pollution Research (www.atmospolres.com)

An analysis of ozone variation in the Greater Athens Area using Granger Causality

Athanasios Sfetsos, Diamando Vlachogiannis

National Centre for Scientific Research “Demokritos”, Environmental Research Laboratory, Agia–Paraskevi, Attikis, 153 10, Greece

ABSTRACT

Air pollution in urban areas is a topic of interest for many researchers, as it negatively impacts human health, the environment and the quality of life. As part of the effort to explore ways for efficient and timely assessment of urban air pollution patterns and their association with the local meteorology and photochemistry, an advanced statistical approach is proposed for the analysis of the spatiotemporal ozone (O3) variations and their interdependencies with other pollutants. The work focuses on investigating and determining the causality between the local and regional factors causing the observed ozone variability, by applying a holistic methodology to multiple–year meteorological data and air pollution monitoring data for Athens (Greece). The methodology includes Positive Matrix Factorization (PMF), for data scaling and reduction, a k–means clustering algorithm, for determining groups of data with common properties, and, importantly, the Granger Causality test, for obtaining the causal links between ozone, nitrogen oxides and the local meteorological conditions. The methodology revealed six dominant combined patterns of weather and air pollution. The application of Granger Causality allowed the determination of relationships across the pollution patterns of dispersed geographic locations and their interdependence with the local meteorological conditions and photochemistry effects.

Keywords: Granger Causality, ozone, NOX, Athens

Corresponding Author: Athanasios Sfetsos

+30 210 650 3403; +30 210 652 5004; [email protected]

Article History: Received: 11 April 2013; Revised: 23 May 2013; Accepted: 26 May 2013

doi: 10.5094/APR.2013.032

1. Introduction

Tropospheric ozone (O3) has significant oxidant capacity and is deemed hazardous to human health above certain defined levels, known as air quality limits. Since ozone is a secondary pollutant, its concentration depends on the emission sources of the primary pollutants (NOX and NMVOC) and the prevailing meteorological conditions. At a given location, apart from being produced by a series of complex photochemical reactions (Seinfeld and Pandis, 2006), ozone is also brought in by regional transport and downward stratospheric transport. Thus, both short and long–range non–linear processes could contribute to the formation and/or increase of ozone. The sinks of ozone are a series of kinetic and photolysis reactions involving nitrogen oxides (NOX), dry deposition and dissolution into sea water. Due to these sources and sinks, ozone can exhibit various concentration levels and diurnal and seasonal cycles among areas of different characteristics, e.g. urban, remote, rural, road–sites, etc.

Research efforts so far have investigated the factors that determine urban atmospheric pollution and population exposure based on extrapolation of data or surrogate information such as air quality networks, regression analysis, and emission and dispersion modeling (e.g. Kanaroglou et al., 2005; Ballesta et al., 2008; Fann et al., 2012). The underlying chemical processes in ozone formation and destruction are highly complex, as recent works show (Avino and Manigrasso, 2008; Movassaghi et al., 2012), and remain an open research issue.

In addition, various studies have shown close relations between the meteorological conditions and the concentrations of the air pollutants (e.g. Greene et al., 1999). These methods, however, are associated with uncertainties in assigning values to the parameters that describe the various complex processes of atmospheric chemistry and dynamics. There have also been approaches to attribute sources and meteorological factors to exposure concentrations based on statistical tools such as Multivariate Linear Regression and Principal Component Regression, but these methods cannot adequately model the non–linear relationships associated with lower atmospheric ozone. Alternative approaches such as Artificial Neural Networks (ANN) allow for non–linear relationships between the variables; nevertheless, they cannot address both linear and non–linear patterns equally well (Zhang, 2003). Combined multivariate techniques (e.g. PCA, PCR, Multiple Regression Analysis, ANN) have been applied to forecast ozone concentrations by deducing the groups of days with similar characteristics, relating the types of weather to air quality patterns (e.g. Al–Alawi et al., 2008; Vardoulakis and Kassomenos, 2008; Gvozdic et al., 2011; Ozbay et al., 2011).

Yet, so far, there is no statistical method in the literature for obtaining the causality between the variables that determine the ozone concentrations and, at the same time, quantifying their magnitude. To fill this gap, a new concept is applied based on Granger Causality. Granger Causality is a statistical approach that originated in the field of Econometrics. It has been applied in the exploration of the causal relationships between air pollution, meteorology, health effects and mortality rates (Wang et al., 2008). The mathematical formulation of Granger Causality is based on linear regression of stochastic processes (Granger, 1969; Pitard and Viel, 1999). Here, the proposed application of Granger Causality is aided by a dimension reduction algorithm (PMF) applied to a set of unlagged and lagged explanatory variables, which decreases the number of parameters and removes multi–collinearity, yielding a statistically significant variable subset (Sfetsos and Vlachogiannis, 2010). In general terms, the methodology offers significant advantages over simple correlation analysis by removing any spurious correlation of the examined time series. The aim of this study is to assess the ozone distribution at different locations in the city of Athens and at different time scales by identifying the factors that cause the spatial and temporal characteristics of the observed ozone concentrations. In particular, the current work focuses on the investigation of the causal relationship between observed NOX concentrations, meteorological variables and surface O3 in the Athens area by applying a multistep methodology which, most importantly, incorporates the Granger Causality test.

2. Description of Data Sets

The city of Athens and the peripheral boroughs are located in a basin of complex terrain characterized by an alternation of small hills and flat areas. The basin itself is surrounded by three mountain ranges with ridge heights up to 1 400 m and has an opening to the Saronic Gulf in the southwest (Figure 1). The region’s weather is dominated by the interaction of large and local scale circulation systems, and the climate is typically Mediterranean with hot dry summers and wet mild winters (e.g. Sindosi et al., 2003). Air pollution problems occur frequently in the densely populated city area and have been attributed to a number of reasons, such as the physiographic characteristics of the region, the emissions mainly from traffic and the industrial zones surrounding the area (Aleksandropoulou et al., 2011; Progiou and Ziomas, 2011), and the prevailing meteorological patterns (intense sunlight, temperature inversions, light winds, low mixing layer height, etc.). In particular, it has been found that intense anti–cyclonic conditions with weak sea breezes and weak flows, strong nocturnal temperature inversions and low pressure systems in the west of Greece enhance the concentration levels of air pollutants substantially (e.g. Kallos et al., 1993; Kassomenos and Koletsis, 2005).

Hourly average concentration data of O3 and NOX covering a six–year period (2000–2005) are the starting data of this study, available from observational ground stations of the air quality network operating under the supervision of the Air Pollution and Noise Control Division of the Hellenic Ministry of Environment (MEECC, 2011). The data selection was based on the criterion of a representative distribution of the measuring stations, in terms of urban and suburban sites, as well as on the availability of an adequate number of temporally common samples to reach a statistically meaningful analysis. The stations that fulfilled the selection criteria were (i) Patision (PAT), (ii) Agia Paraskevi (AGP), (iii) Lykovrisi (LYK), (iv) Thrakomakedones (THR) and (v) Marousi (MAR) (Figure 1).

These hourly data were used to estimate the daily maximum 8–h running mean O3 value and the daily maximum 3–h running mean NOX value, both used by national guidelines to set up alerting thresholds (MEECC, 2011).
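The daily indicators described above can be illustrated with a short sketch. It assumes that each day is processed separately from its 24 hourly values and that any window containing a missing hour is skipped; the function name and the within-day windowing are illustrative assumptions, not the procedure documented by the authors or by MEECC.

```python
import numpy as np

def daily_max_running_mean(hourly, window):
    """Maximum `window`-hour running mean within one day of hourly values.

    Hypothetical helper: NaN marks a missing hour, and every window that
    contains a NaN is ignored (an assumption made for this sketch).
    """
    hourly = np.asarray(hourly, dtype=float)
    kernel = np.ones(window) / window
    # mode="valid" keeps only windows that lie completely inside the day;
    # a NaN inside a window propagates to that window's mean
    means = np.convolve(hourly, kernel, mode="valid")
    if np.isnan(means).all():
        return float("nan")
    return float(np.nanmax(means))

# Synthetic day with concentrations rising linearly from 0 to 23
day = np.arange(24.0)
o3_indicator = daily_max_running_mean(day, 8)   # 8-h window, as used for O3
nox_indicator = daily_max_running_mean(day, 3)  # 3-h window, as used for NOX
```

For the synthetic day above the 8–h indicator is the mean of the last eight hourly values.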

The meteorological data were obtained from the National Centres for Environmental Prediction (NCEP) re–analysis data set (NCEP, 2013) over the respective period of time (6 years) on a daily basis at 00:00 and 12:00 UTC, complementary to the air quality information. The meteorological variables selected, shown in Table 1 together with their calculated mean statistical values, were: (i) the Mixing Layer Height (MLH), (ii) the Temperature (T) at 2 m, (iii) the E–W (u10) and (iv) N–S (v10) components of the wind speed at 10 m above ground level, (v) the E–W (u850) and (vi) N–S (v850) components of the wind speed at 850 hPa. Several other meteorological parameters, such as the E–W (u850) and N–S (v850) components of the wind speed at 850 hPa and the temperature at 2 m (T2) at 00:00 UTC, as well as humidity and cloudiness data (at both intervals), were also studied but excluded from the analysis, as no statistically significant correlation (at the 95% confidence level) with the air quality data was found.

The daily combined data set (Table 1: meteorological variables, daily maximum running 8–hour mean O3 values and daily maximum running 3–hour mean NOX values) was used for defining the clusters in the region and for applying the Granger Causality test to capture directional correlations between spatially monitored pollutants.

3. Methodology

The present study was based on a methodology designed to perform data handling and processing, comprising the following basic steps:

(1) Scaling and dimension reduction of the original data set by applying Positive Matrix Factorization (PMF).

(2) Application of a modified k–means clustering algorithm, enhanced with outlier removal, to determine groups of data with common properties.

(3) Application of the Granger Causality test to obtain the causal relationships between O3 and NOX concentrations and the prevailing meteorological conditions in the area of interest, for each determined cluster.

(4) Analysis of the linkages in each cluster, and the additional use of the Pearson correlation coefficient on the hourly O3 and NOX data to obtain short–term (sub–daily) correlations in the data.

At the first step, data reduction was performed by applying the condition that all datasets from the stations contained valid measurements, which returned a dataset significantly reduced in size. The selected variables were then transformed by applying Equation (1), so as to have zero mean and unit variance and thus a common scaling.

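$$x_i' = \frac{x_i - \bar{x}_i}{\sigma_i} \quad \text{and} \quad x_{i,\mathrm{transf}} = x_i' - \min(x_i') \qquad (1)$$

where $\bar{x}_i$ and $\sigma_i$ denote the mean and the standard deviation of variable $x_i$.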

At a subsequent step, the resulting variables were scaled to be non–negative to allow for the proper implementation of the Positive Matrix Factorization (PMF).
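The preprocessing described above can be sketched as follows. The sketch assumes that "valid measurements" means no missing value in any column for a given day, and that non–negativity is obtained by subtracting each column's minimum; both choices, and the function name, are assumptions made for illustration.

```python
import numpy as np

def standardize_and_shift(data):
    """Complete-case filtering, z-scoring and a non-negative shift.

    Hypothetical helper. Rows are days, columns are variables, and NaN
    marks an invalid measurement.
    """
    data = np.asarray(data, dtype=float)
    # keep only days on which every variable has a valid measurement
    complete = data[~np.isnan(data).any(axis=1)]
    # zero mean and unit variance per variable
    z = (complete - complete.mean(axis=0)) / complete.std(axis=0)
    # shift each column so that its minimum is zero (assumed reading of
    # "scaled to be non-negative")
    shifted = z - z.min(axis=0)
    return z, shifted

sample = np.array([[1.0, 10.0],
                   [2.0, np.nan],   # dropped: one invalid measurement
                   [3.0, 30.0],
                   [5.0, 20.0]])
z_scores, non_negative = standardize_and_shift(sample)
```

The shifted matrix keeps the relative spacing of the standardized values, which is what a non–negative factorization needs as input.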

3.1. Positive Matrix Factorization

The Positive Matrix Factorization (PMF) method was applied to the original data set to reduce its dimension (Paatero and Tapper, 1994; Lee and Seung, 1999). The solution of the PMF revealed well–formed patterns of the initial variables (Table 2). These were further processed by keeping only the most significant variables in each factor. The number of factors was extracted from the shape of the objective function (Yakovleva et al., 1999) and was further confirmed using PCA with the eigenvalue–greater–than–one rule, as suggested in Shahgedanova et al. (1999). The optimum number of factors returned by both approaches was five, and this number was used later in the clustering analysis.

Factor 1 corresponded to the O3 and NOX parameters at the PAT, LYK and MAR stations and to O3 at the suburban locations of THR and AGP. Factor 2 was associated with the N–S wind components at all intervals and with the noon E–W wind component at 850 hPa. Factor 3 comprised the O3 values at the city centre and the NOX values at the suburban stations (THR, AGP). Factor 4 was related to the E–W wind component at midnight and midday and to the MLH variable. Finally, Factor 5 was related to O3 at the AGP station and to the temperature and MLH at midday.
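As an illustration of the factorization step, the sketch below uses the multiplicative update rules for non–negative matrix factorization (the family of methods in Lee and Seung, 1999) with an unweighted squared–error objective. It is a stand–in for illustration only: PMF as formulated by Paatero and Tapper weights residuals by measurement uncertainties, which is not done here, and all function names are hypothetical.

```python
import numpy as np

def nmf(X, n_factors, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (days x variables) as W @ H.

    Multiplicative updates for the unweighted squared error; a simplified
    stand-in for PMF, not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = X.shape
    W = rng.random((n_rows, n_factors)) + eps   # daily factor scores
    H = rng.random((n_factors, n_cols)) + eps   # factor loadings
    for _ in range(n_iter):
        # element-wise updates keep W and H non-negative
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def reconstruction_error(X, W, H):
    """Squared Frobenius norm of the residual X - W @ H."""
    return float(np.linalg.norm(X - W @ H) ** 2)

# Synthetic non-negative data: 40 "days" of 8 "variables"
X = np.random.default_rng(1).random((40, 8))
errors = {k: reconstruction_error(X, *nmf(X, k)) for k in range(1, 7)}
```

Plotting `errors` against the number of factors gives the kind of objective–function curve from which a factor count can be read off.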


Figure 1. Location of the air quality monitoring stations and topography of the region (Athens, Greece).

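Table 1. Selected meteorological and air quality variables with calculated statistical properties

| Group | Variable | Mean | STD | Units |
|---|---|---|---|---|
| 00:00 | u10 | 0.92 | 2.48 | m/s |
| 00:00 | v10 | –2.23 | 3.9 | m/s |
| 12:00 | T2 | 292.67 | 6.76 | K |
| 12:00 | MLH | 1 115.29 | 444.9 | m |
| 12:00 | u10 | –1.48 | 2.25 | m/s |
| 12:00 | v10 | –1.42 | 5.01 | m/s |
| 12:00 | u850 | 2.26 | 4.43 | m/s |
| 12:00 | v850 | –2.25 | 6.34 | m/s |
| Max 8–h O3 mean | PAT | 108.61 | 37.51 | µg/m3 |
| Max 8–h O3 mean | LYK | 45.18 | 21.79 | µg/m3 |
| Max 8–h O3 mean | MAR | 53.65 | 23.65 | µg/m3 |
| Max 8–h O3 mean | THR | 16.93 | 14.71 | µg/m3 |
| Max 8–h O3 mean | AGP | 28.7 | 15.63 | µg/m3 |
| Max 3–h NOX mean | PAT | 263.92 | 112.34 | µg/m3 |
| Max 3–h NOX mean | LYK | 153.81 | 64.53 | µg/m3 |
| Max 3–h NOX mean | MAR | 165.95 | 94.66 | µg/m3 |
| Max 3–h NOX mean | THR | 122.99 | 39.77 | µg/m3 |
| Max 3–h NOX mean | AGP | 131.42 | 37.04 | µg/m3 |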


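Table 2. The factor loadings

| Variable | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 |
|---|---|---|---|---|---|
| PAT O3 | 0.21 | 0 | 0.35 | 0 | 0 |
| PAT NOX | 0.43 | 0 | 0 | 0 | 0 |
| LYK O3 | 0.35 | 0 | 0 | 0 | 0 |
| LYK NOX | 0.37 | 0 | 0 | 0 | 0 |
| MAR O3 | 0.36 | 0 | 0 | 0 | 0 |
| MAR NOX | 0.43 | 0 | 0 | 0 | 0 |
| THR O3 | 0.3 | 0 | 0 | 0 | 0 |
| THR NOX | 0 | 0 | 0.62 | 0 | 0 |
| AGP O3 | 0.3 | 0 | 0 | 0 | 0.21 |
| AGP NOX | 0 | 0 | 0.6 | 0 | 0 |
| 00:00 u10 | 0 | 0 | 0 | 0.51 | 0 |
| 00:00 v10 | 0 | 0.49 | 0 | 0 | 0 |
| 12:00 u850 | 0 | 0.26 | 0 | 0.44 | 0 |
| 12:00 v850 | 0 | 0.59 | 0 | 0 | 0 |
| 12:00 T2 | 0 | 0 | 0 | 0 | 0.89 |
| 12:00 MLH | 0 | 0 | 0 | 0.43 | 0.27 |
| 12:00 u10 | 0 | 0 | 0 | 0.57 | 0 |
| 12:00 v10 | 0 | 0.52 | 0 | 0 | 0 |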

3.2. Clustering

The clustering approach adopted in the current investigation was a modified k–means algorithm (e.g. Jain et al., 1999) that included an outlier removal process. The latter was based on the Grubbs (1969) test and was applied during the phase of calculating the Cluster centers. In line with that test, the data of each Cluster were checked at each computational iteration for the presence of outliers, which were removed. A normalized “compactness and separation” criterion was used for determining the number of Clusters, following that of Kim et al. (2001). The application of the described methodology to the five factors returned by the PMF yielded six distinct Clusters in total (Figure 2), whose percentage distribution, calculated over the time span of the study, is shown in Table 3b.
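A minimal sketch of k–means with an outlier screen inside each iteration is given below. It is not the authors' algorithm: a fixed threshold `z_max` on the standardized distance to the cluster centre stands in for the Grubbs critical value (which depends on the cluster size and a t–distribution quantile), and the initial centres are passed in explicitly. All names are hypothetical.

```python
import numpy as np

def kmeans_with_outlier_screen(X, init_centers, z_max=3.0, n_iter=50):
    """k-means in which flagged outliers do not contribute to the centres.

    Simplified stand-in for a Grubbs-type screen: a point is flagged when
    its distance to its own centre exceeds the cluster mean distance by
    more than `z_max` sample standard deviations.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    k = len(centers)
    labels = np.zeros(len(X), dtype=int)
    keep = np.ones(len(X), dtype=bool)
    for _ in range(n_iter):
        # assign every point to its nearest centre
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        own = dist[np.arange(len(X)), labels]
        # screen each cluster for unusually distant members
        keep = np.ones(len(X), dtype=bool)
        for c in range(k):
            members = labels == c
            d = own[members]
            if d.size > 2 and d.std(ddof=1) > 0:
                keep[members] = (d - d.mean()) / d.std(ddof=1) <= z_max
        # recompute centres from the retained members only
        new_centers = centers.copy()
        for c in range(k):
            selected = (labels == c) & keep
            if selected.any():
                new_centers[c] = X[selected].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels, keep

# Two synthetic groups plus one isolated point
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 0.3, size=(30, 2))
group_b = rng.normal(10.0, 0.3, size=(30, 2))
points = np.vstack([group_a, group_b, [[0.0, 6.0]]])
start = np.array([[0.0, 0.0], [10.0, 10.0]])
centers, labels, keep = kmeans_with_outlier_screen(points, start)
```

Excluding flagged points from the centre update keeps a few extreme days from dragging a Cluster centre away from the bulk of its members.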

3.3. Granger Causality

The Granger causality test (Granger, 1969) was applied once the data were grouped into Clusters, to establish those parameters that cause the spatial and temporal characteristics of the observed ozone concentrations. The process was based on the specification of a bivariate kth order vector auto–regressive model. In this process, a generic variable y was said to “Granger cause” a variable x if the lagged values of both x and y modelled x better than the lagged values of x alone. A generic vector autoregressive equation for the Granger causality is:

$$\mathbf{x}_{n,t} = A_0 + A_1 \mathbf{x}_{n,t-1} + A_2 \mathbf{x}_{n,t-2} + \dots + A_k \mathbf{x}_{n,t-k} + \boldsymbol{\varepsilon}_{n,t} \qquad (2)$$

where $t = 1, 2, \dots, T$, with $T$ the number of samples for each Cluster, $k$ was the order of the distributed lag, and $\boldsymbol{\varepsilon}_{n,t}$ was a random error term with a standard normal distribution, for the nth Cluster. The vector $\mathbf{x}$ contained the parameters used in the analysis and $A_i$ ($i = 0, \dots, k$) were matrices of coefficients.

The null hypothesis of the Granger causality test was that a variable $x_i \in \mathbf{x}$ does not cause $x_j \in \mathbf{x}$, represented by all the coefficients $A_j = 0$ in $A$ (jth row), while the alternative hypothesis was that $A_j \neq 0$ for at least one $A_j$. The interpretation of the regression based on the F–test implied that the x data should be stationary. The variable $x_i$ was interpreted as Granger causing $x_j$ once the null hypothesis was rejected at the 95% confidence level (pGC–value < 0.05 in the section below). For the sake of a concise discussion of the causality analysis results, only those of the examined variables that bear a causal relationship are discussed here.
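The test can be illustrated for a single pair of series with two ordinary least–squares regressions: one using only the lags of the target and one that also includes the lags of the candidate cause. The sketch below returns the F statistic and its degrees of freedom; the function names are hypothetical, and it is a bivariate illustration rather than the authors' implementation.

```python
import numpy as np

def lagged(series, k):
    """Columns [s[t-1], ..., s[t-k]] for t = k, ..., len(series) - 1."""
    n = len(series)
    return np.column_stack([series[k - j:n - j] for j in range(1, k + 1)])

def residual_sum_of_squares(design, target):
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)

def granger_f(target, cause, k):
    """F statistic for H0: the k lags of `cause` do not help predict `target`."""
    target = np.asarray(target, dtype=float)
    cause = np.asarray(cause, dtype=float)
    y = target[k:]
    intercept = np.ones((len(y), 1))
    restricted = np.hstack([intercept, lagged(target, k)])
    unrestricted = np.hstack([restricted, lagged(cause, k)])
    rss_r = residual_sum_of_squares(restricted, y)
    rss_u = residual_sum_of_squares(unrestricted, y)
    dof = len(y) - unrestricted.shape[1]
    f_stat = ((rss_r - rss_u) / k) / (rss_u / dof)
    return f_stat, k, dof

# Synthetic example: x follows y with a one-step delay, not the reverse
rng = np.random.default_rng(0)
y_series = rng.normal(size=500)
x_series = np.zeros(500)
x_series[1:] = 0.8 * y_series[:-1] + 0.5 * rng.normal(size=499)
f_y_to_x, _, _ = granger_f(x_series, y_series, 2)
f_x_to_y, _, _ = granger_f(y_series, x_series, 2)
```

The p–value follows from the upper tail of an F distribution with (k, dof) degrees of freedom; the null hypothesis is rejected when it falls below the chosen threshold.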

4. Analysis of Results

Data on the obtained clusters are presented in Tables 3a and 3b. The O3 and NOX values were classified as high, medium and low relative to the legislation limits (120 µg/m3 and 200 µg/m3, respectively) and the urban background values. Following this reasoning, the O3 classes were taken as high (>90 µg/m3), medium (40–90 µg/m3) and low (<40 µg/m3). Similarly, the NOX values were classified as high (>150 µg/m3), medium (80–150 µg/m3) and low (<80 µg/m3), according to the recommendations of MEECC (2011).
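The class boundaries quoted above translate into a small lookup, sketched here with hypothetical names. Values exactly on a boundary are treated as medium, which is an assumption, since the text does not say how ties are handled.

```python
# Class boundaries in µg/m3 as quoted in the text: (low limit, high limit)
O3_LIMITS = (40.0, 90.0)
NOX_LIMITS = (80.0, 150.0)

def classify(value, limits):
    """Return 'low', 'medium' or 'high' for a concentration in µg/m3."""
    low, high = limits
    if value > high:
        return "high"
    if value < low:
        return "low"
    return "medium"  # boundary values fall here (assumption)

# Cluster 1 centres taken from Table 3a
cluster1_o3 = {"PAT": 98.7, "LYK": 40.72, "MAR": 48.66, "THR": 14.35, "AGP": 24.79}
o3_classes = {station: classify(v, O3_LIMITS) for station, v in cluster1_o3.items()}
```

Applied to the Cluster 1 centres, this reproduces the high, medium and low pattern described for the city centre, the LYK and MAR stations and the suburban stations.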

Cluster 1. The data of Cluster 1 were mainly characterized by high O3 concentration values in the city centre (PAT), of the order of 100 µg/m3, medium values at the LYK and MAR stations and relatively low values at the suburban stations (THR and AGP). This pattern was also associated with high NOX values in the city centre and medium concentration values at the suburban stations. The meteorological conditions were characterized predominantly by low S–SW winds near the surface and stronger winds at 850 hPa. Those days were also found to be characterized by high temperatures and predominantly clear skies (from the inspection of the cloudiness parameter). The application of the Granger causality yielded the following:

- LYK O3 was Granger caused by MAR O3.
- LYK NOX was Granger caused by MAR NOX.
- MAR NOX was Granger caused by MAR O3 and LYK NOX.

The associations were attributed to the proximity of the stations and to the photochemical reactions related to the local production of O3 and nitrogen oxides from local emission sources (traffic). Furthermore, the correlation analysis of the hourly O3 data at LYK and MAR indicated a high positive correlation (0.6) on a very short time scale (of 1–2 hours), attributed to the slow transportation of ozone between the stations, supporting the findings of the Granger causality analysis. Moreover, a negative correlation between the hourly concentrations of NOX and O3 at MAR was found, indicating that the depletion of O3 was related to the local increase of NOX and vice–versa.
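The short–term correlation analysis of hourly series can be sketched as a Pearson correlation computed at several hourly lags. The function below is a hypothetical helper, shown on synthetic data in which one series repeats the other two hours later.

```python
import numpy as np

def lagged_correlation(a, b, max_lag):
    """Pearson correlation between a[t] and b[t - lag], lag = 0..max_lag."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    result = {}
    for lag in range(max_lag + 1):
        earlier = b[:len(b) - lag]   # b at time t - lag
        later = a[lag:]              # a at time t
        result[lag] = float(np.corrcoef(later, earlier)[0, 1])
    return result

# Synthetic hourly series: station A repeats station B two hours later
rng = np.random.default_rng(0)
station_b = rng.normal(size=300)
station_a = np.concatenate([np.zeros(2), station_b[:-2]])
corr_by_lag = lagged_correlation(station_a, station_b, 5)
best_lag = max(corr_by_lag, key=corr_by_lag.get)
```

The lag with the highest correlation gives an indication of the delay between the two stations, in the spirit of the 1–2 hour time scale quoted above.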


Table 3a. Summary information on Cluster centres of O3 and NOX concentrations (all values in µg/m3)

| Cluster | PAT–O3 | PAT–NOx | LYK–O3 | LYK–NOx | MAR–O3 | MAR–NOx | THR–O3 | THR–NOx | AGP–O3 | AGP–NOx |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 98.7 | 235.53 | 40.72 | 139.8 | 48.66 | 142.03 | 14.35 | 112.82 | 24.79 | 122.35 |
| 2 | 79.57 | 148.99 | 27.09 | 96.82 | 33.28 | 68.72 | 6.5 | 103.03 | 16.79 | 113.55 |
| 3 | 140.21 | 327.44 | 55.19 | 185.46 | 65.01 | 219.89 | 22.7 | 163.82 | 34.85 | 168.19 |
| 4 | 116.19 | 361.78 | 60.6 | 202.52 | 71.11 | 248.98 | 25.82 | 107.14 | 39.37 | 117.25 |
| 5 | 161.03 | 503.75 | 82.96 | 273.05 | 96.35 | 369.26 | 38.71 | 148.97 | 53.84 | 154.84 |
| 6 | 99.13 | 228.84 | 39.66 | 136.48 | 47.48 | 136.36 | 13.75 | 115.71 | 25.06 | 124.95 |

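Table 3b. Summary information on Cluster centres of meteorological variables (wind speed in m/s, wind direction in degrees)

| Cluster | 00:00 wind speed, 10 m | 00:00 wind direction, 10 m | 12:00 wind speed, 850 hPa | 12:00 wind direction, 850 hPa | 12:00 T2 (K) | 12:00 MLH (m) | 12:00 wind speed, 10 m | 12:00 wind direction, 10 m | Percentage (%) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 3.76 | 238.58 | 10.12 | 234.71 | 297.12 | 1 507.08 | 4.24 | 191.7 | 13 |
| 2 | 5.9 | 1.4 | 9.48 | 9.37 | 295.14 | 977.34 | 6.81 | 22.08 | 23 |
| 3 | 1.4 | 342.3 | 2.08 | 284.87 | 292.35 | 1 040.73 | 1.99 | 83.14 | 20 |
| 4 | 2.07 | 279.4 | 5.28 | 254.68 | 287.14 | 1 243.19 | 1.14 | 162.95 | 13 |
| 5 | 1.47 | 292.59 | 4.02 | 256.26 | 298.49 | 1 116.51 | 1.3 | 127.43 | 5 |
| 6 | 3.2 | 350.27 | 4.21 | 344.53 | 293.03 | 1 064.03 | 3.25 | 34.98 | 25 |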

Figure 2. Determination of the number of Clusters.

Cluster 2. The data belonging to Cluster 2 were characterized by medium to low O3 values over the entire city. The same pattern also appeared in the NOX values. A strong northern wind dominated Cluster 2 days, transporting pollutants away and thus ventilating the urban area. The application of Granger causality gave the following results:

- PAT O3 was Granger caused by PAT NOX, and vice–versa, showing local production and destruction of O3 due to photochemistry.
- LYK and MAR NOX Granger caused each other. That was a result of the temporal co–variability of the hourly NOX concentrations between the two stations, where NOX was transported regionally from the industrial zone located far to the north of the area of interest.
- MAR NOX was also Granger caused by THR NOX, attributed to the northern wind conditions.
- AGP O3 and AGP NOX were Granger caused by temperature T2, which was the dominant meteorological factor. This result demonstrated the temperature dependence of NOX concentrations and the large impact of temperature on ozone production (e.g. Vogel et al., 1999).

Cluster 3. The data belonging to Cluster 3 were characterized by high O3 values in the city centre (PAT), medium range concentrations (50–60 µg/m3) at the LYK and MAR stations, and by approximately 20 µg/m3 lower concentration values at the suburban stations. The NOX spatial variation exhibited similar patterns and the concentration values remained high. The prevailing meteorological conditions were very low winds during the night time, almost calm and stagnant, followed during the day by light eastern (E) winds near the surface and very low western (W) winds in the upper atmosphere. The application of the Granger causality revealed the following:

- PAT O3 was Granger caused by PAT NOX, and vice–versa, showing local production and destruction of O3 due to photochemical reactions. Figure 3 shows a statistically significant positive correlation between the hourly NOX and O3 values at PAT, where an increase of NOX, attributed to local emission sources, resulted in an increase of locally produced O3.
- MAR NOX was Granger caused by MAR O3 and temperature values, T2.
- THR O3 was Granger caused by AGP O3, which was evidence of transported O3 due to E surface winds. Indeed, Figure 4 reveals a positive correlation between those two variables, particularly in the time frame of 1–3 hours.
- THR NOX was Granger caused by AGP NOX and LYK NOX, following a similar reasoning as previously.

Cluster 4. The data belonging to Cluster 4 were characterized by rather high levels of NOX and O3 in the city centre and at the suburban stations, and medium to low values of the compounds at the suburban–background stations. NOX was found to be higher at the PAT, LYK and MAR stations. The prevailing meteorological conditions were moderate W winds at night time and moderate to high W winds in the upper atmosphere, with a prevailing southern (S) wind component near the surface at midday. The temperature was of the order of 13 °C and the MLH was rather high. The application of the Granger causality revealed that:

- LYK NOX was Granger caused by PAT NOX, an indication of transported NOX.
- MAR O3 was Granger caused by MAR NOX. The hourly data showed a positive correlation between the two variables, indicating that locally emitted NOX dominated the O3 production and that the correlation was statistically significant in the very short term.
- AGP O3 was Granger caused by MAR O3, revealing transportation by the W–dominated wind component. The analysis of the hourly concentrations at the two stations yielded a travel time of the order of 1–3 hours.

Cluster 5. The data belonging to Cluster 5 were characterized by the highest values of all compounds at all stations. The weather pattern comprised light W winds during the night and light wind conditions during midday. Clear and warm conditions prevailed, an indication of light sea breeze conditions, typically associated with high ozone values in the area. The application of the Granger causality revealed that:

- THR O3 was Granger caused by LYK O3 and MAR O3, produced locally in the vicinity. Ozone at THR exhibited similar patterns to both the LYK and MAR stations, displaying significant and persistent positive correlation in the short term (lags of 1–2 hours).
- AGP O3 was Granger caused by MAR O3, both showing transported ozone to the hilly areas of Attica. A similar pattern of positive correlation was found between the two stations as in the previous case, although the correlation coefficient (of the hourly values) was higher between the AGP and MAR stations, reaching a peak value of 0.6.

Cluster 6. The data belonging to Cluster 6 were mainly characterized by medium O3 concentration values in the city centre (PAT), low values at MAR and LYK and relatively low values at the suburban stations of THR and AGP. Additionally, those days were characterized by medium NOX values all around the city. The weather conditions during those events were characterized predominantly by medium N–NW winds and relatively cold temperatures (10 °C), pointing clearly to the winter season. The application of the Granger causality revealed that:

LYK NOX and MAR NOX Granger caused each other. Therefore, the influence of local emissions was evident.

AGP O3 and AGP NOX were Granger caused by THR O3 and THR NOX, respectively, due to the transported concentrations. A short–term positive correlation was also found between the two stations.
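The lagged correlations and travel times reported for the clusters above can be illustrated with a minimal sketch. It assumes a plain Pearson correlation between the hourly series of an upwind station and the series of a downwind station shifted by 0 to `max_lag` hours; the function names and the synthetic series are hypothetical and are not taken from the study's data or code.

```python
from math import sqrt
from statistics import mean

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)

def lagged_correlation(upwind, downwind, max_lag):
    """Correlate downwind[t] with upwind[t - lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        n = len(upwind) - lag
        result[lag] = pearson(upwind[:n], downwind[lag:])
    return result

# Synthetic hourly concentrations (illustrative only, not measured data):
# the downwind station repeats the upwind signal two hours later.
upwind = [40, 42, 45, 60, 85, 110, 120, 100, 80, 65, 50, 45, 42, 40, 38, 37]
downwind = [36, 36] + upwind[:-2]

corr = lagged_correlation(upwind, downwind, max_lag=4)
best_lag = max(corr, key=corr.get)  # lag (hours) with the highest correlation
```

The lag that maximizes the correlation gives a rough estimate of the travel time between the two stations; on real data the peak is usually broader and should be read together with the wind direction.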

5. Conclusions

The present work introduces the application of an integrated methodology centered on the Granger causality for analyzing spatially distributed O3 and NOX patterns combined with meteorological observations at various monitoring stations. It allowed a statistical identification of the variables that Granger cause variability in another parameter. The analysis used daily maximum 8–h O3 and 3–h NOX running mean concentrations in the city of Athens, which has a well established pattern of high O3 values.
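The daily maximum of a running mean can be sketched as follows. This is only an illustration under stated assumptions: a trailing window that ends at the current hour, a value reported only when the full window is available, and each running mean assigned to the day in which its window ends. The text does not specify these conventions, and the function names and sample values are hypothetical.

```python
def running_mean(values, window):
    """Trailing running mean; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out

def daily_max(values, hours_per_day=24):
    """Maximum of the non-missing values in each consecutive block of 24 hours."""
    maxima = []
    for start in range(0, len(values), hours_per_day):
        day = [v for v in values[start:start + hours_per_day] if v is not None]
        maxima.append(max(day) if day else None)
    return maxima

# Two synthetic days of hourly O3 values (illustrative only).
day1 = [20] * 8 + [40] * 4 + [80] * 8 + [40] * 4
day2 = [10] * 24
o3_hourly = day1 + day2

o3_8h = running_mean(o3_hourly, window=8)   # 8-h running mean for O3
o3_daily_max = daily_max(o3_8h)             # one value per day

nox_3h = running_mean([10, 20, 30, 40], window=3)  # 3-h running mean for NOX
```

Because the window is trailing, the first hours of the second day still include values from the previous evening, which is why its daily maximum is higher than the constant value of 10 observed on that day.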

Figure 3. Correlation of O3 PAT to NOX PAT lags.



Figure 4. Correlation of O3 THR to O3 AGP lags.

As the spatial and temporal ozone variability in urban areas depends on the meteorological conditions and photochemistry, distinct Clusters with similar properties were estimated and analyzed for the area. The process employed the Positive Matrix Factorization (PMF), as a dimension reduction algorithm, coupled with the k–means clustering algorithm to determine the groups of data with common properties, accounting for the spatial differences among the monitoring stations. The data analysis revealed a total of six Clusters of distinct weather and air pollution types in the region. In a subsequent step, the Granger Causality analysis was applied to each calculated Cluster, investigating and establishing the causal links between the meteorological conditions and air pollution concentration patterns at various locations of the city of Athens.
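The two-stage idea of dimension reduction followed by clustering can be sketched in a few lines. The example below is an assumption-laden stand-in, not the procedure used in the study: it replaces PMF with a plain non-negative matrix factorization using the multiplicative updates of Lee and Seung (1999), which, unlike PMF, does not weight the residuals by measurement uncertainties. The helper names, the rank, and the synthetic days-by-variables matrix are all hypothetical.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, rank, iters=500, seed=0):
    """Factorize non-negative V (n x m) as W (n x rank) times H (rank x m)
    with multiplicative updates that reduce the squared error."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    eps = 1e-9  # guards against division by zero
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H

def kmeans(points, k, iters=50, seed=0):
    """Basic k-means (Lloyd's algorithm); returns one label per point."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster becomes empty
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Synthetic matrix: rows are days, columns are O3 at two stations and NOX at
# two stations (illustrative values only, not data from the study).
V = [
    [90.0, 85.0, 20.0, 25.0],
    [95.0, 80.0, 22.0, 24.0],
    [88.0, 90.0, 18.0, 27.0],
    [30.0, 35.0, 80.0, 70.0],
    [28.0, 33.0, 85.0, 75.0],
    [32.0, 38.0, 78.0, 72.0],
]
W, H = nmf(V, rank=2)       # W holds the reduced representation of each day
labels = kmeans(W, k=2)     # days grouped by their factor contributions
```

In this sketch the clustering is applied to the rows of `W`, so days with similar factor contributions end up in the same group. The number of factors and clusters is fixed by hand here, whereas a real application would select them with a validity criterion.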

The Granger Causality aided in the establishment of the links between the explanatory variables and helped to investigate whether the examined set of variables could provide information for improving the prediction of another set of variables. The application of the Granger tools to the daily maximum of running 8–hour mean O3 values and daily maximum of running 3–hour mean NOX values in the Attica region showed that the meteorological conditions played the most significant role in the observed concentration patterns. Depending on the prevailing weather type, the dominant component was either regional or local, and different patterns of causality (bi–directional or uni–directional) were established. Provided that observational data are available from monitoring stations, the discussed statistical methodology can be applied to study efficiently, reliably and in a timely manner the complex inter–relationship of photochemistry and local climate in urban areas, particularly during joint efforts of air pollution control and urban development.
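The notion that one variable improves the prediction of another can be made concrete with a minimal sketch of a bivariate Granger test. It assumes the common F-test formulation, which compares an autoregression of y on its own lags with one that also includes lags of x; the exact test variant, lag order, and software used in the study are not stated here. All names and the synthetic series are hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    return sum((t - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, t in zip(X, y))

def granger_f(x, y, lags):
    """F statistic for the null hypothesis that lags of x do not help predict y."""
    restricted, unrestricted, target = [], [], []
    for t in range(lags, len(y)):
        own = [y[t - k] for k in range(1, lags + 1)]
        other = [x[t - k] for k in range(1, lags + 1)]
        restricted.append([1.0] + own)            # intercept + own lags
        unrestricted.append([1.0] + own + other)  # plus lags of x
        target.append(y[t])
    rss_r, rss_u = rss(restricted, target), rss(unrestricted, target)
    dof = len(target) - (2 * lags + 1)
    return ((rss_r - rss_u) / lags) / (rss_u / dof)

# Synthetic series (illustrative only): y follows x with a one-step delay.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 0.5) for t in range(1, 300)]

f_x_to_y = granger_f(x, y, lags=2)  # large: x Granger causes y
f_y_to_x = granger_f(y, x, lags=2)  # small: no evidence that y Granger causes x
```

A large statistic in one direction and a small one in the other corresponds to uni-directional causality; large values in both directions correspond to the bi-directional case. To decide significance, each statistic would be compared with the critical value of an F distribution with `lags` and `dof` degrees of freedom, a step omitted from this sketch.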

Acknowledgements

This work was supported partially by EC FP7 under Grant Agreement No 229773 (PERL).

References

Al–Alawi, S.M., Abdul–Wahab, S.A., Bakheit, C.S., 2008. Combining principal component regression and artificial neural networks for more accurate predictions of ground–level ozone. Environmental Modelling & Software 23, 396–403.

Aleksandropoulou, V., Torseth, K., Lazaridis, M., 2011. Atmospheric emission inventory for natural and anthropogenic sources and spatial emission mapping for the greater Athens area. Water, Air & Soil Pollution 219, 507–526.

Avino, P., Manigrasso, M., 2008. Ten–year measurements of gaseous pollutants in urban air by an open–path analyzer. Atmospheric Environment 42, 4138–4148.

Ballesta, P.P., Field, R.A., Fernandez–Patier, R., Madruga, D.G., Connolly, R., Caracena, A.B., De Saeger, E., 2008. An approach for the evaluation of exposure patterns of urban populations to air pollution. Atmospheric Environment 42, 5350–5364.

Fann, N., Lamson, A.D., Anenberg, S.C., Wesson, K., Risley, D., Hubbell, B.J., 2012. Estimating the national public health burden associated with exposure to ambient PM2.5 and ozone. Risk Analysis 32, 81–95.

Granger, C.W.J., 1969. Investigating causal relations by econometric models and cross–spectral methods. Econometrica 37, 424–438.

Greene, J.S., Kalkstein, L.S., Ye, H., Smoyer, K., 1999. Relationships between synoptic climatology and atmospheric pollution at 4 US cities. Theoretical and Applied Climatology 62, 163–174.

Grubbs, F.E., 1969. Procedures for detecting outlying observations in samples. Technometrics 11, 1–21.

Gvozdic, V., Kovac–Andric, E., Brana, J., 2011. Influence of meteorological factors NO2, SO2, CO and PM10 on the concentration of O3 in the urban atmosphere of Eastern Croatia. Environmental Modeling & Assessment 16, 491–501.

Jain, A.K., Murty, M.N., Flynn, P.J., 1999. Data clustering: a review. ACM Computing Surveys 31, 264–323.

Kallos, G., Kassomenos, P., Pielke, R.A., 1993. Synoptic and mesoscale weather conditions during air–pollution episodes in Athens, Greece. Boundary–Layer Meteorology 62, 163–184.



Kanaroglou, P.S., Jerrett, M., Morrison, J., Beckerman, B., Arain, M.A., Gilbert, N.L., Brook, J.R., 2005. Establishing an air pollution monitoring network for intra–urban population exposure assessment: a location–allocation approach. Atmospheric Environment 39, 2399–2409.

Kassomenos, P.A., Koletsis, I.G., 2005. Seasonal variation of the temperature inversions over Athens, Greece. International Journal of Climatology 25, 1651–1663.

Kim, D.J., Park, Y.W., Park, D.J., 2001. A novel validity index for determination of the optimal number of clusters. IEICE Transactions on Information and Systems E84d, 281–285.

Lee, D.D., Seung, H.S., 1999. Learning the parts of objects by non–negative matrix factorization. Nature 401, 788–791.

MEECC (Ministry of Environment, Energy and Climate Change), 2011. Annual Report of Air Pollution during 2011, Department of Air Quality, the Ministry’s Directorate, Greece, 78 pages.

Movassaghi, K., Russo, M.V., Avino, P., 2012. The determination and role of peroxyacetil nitrate in photochemical processes in atmosphere. Chemistry Central Journal 6, S2–S8.

NCEP (National Centers for Environmental Prediction), 2013. http://rda.ucar.edu/datasets/ds083.2, accessed in May 2013.

Ozbay, B., Keskin, G.A., Dogruparmak, S.C., Ayberk, S., 2011. Multivariate methods for ground–level ozone modeling. Atmospheric Research 102, 57–65.

Paatero, P., Tapper, U., 1994. Positive matrix factorization: a non–negative factor model with optimal utilization of error estimates of data values. Environmetrics 5, 111–126.

Pitard, A., Viel, J.E., 1999. A model selection tool in multi–pollutant time series: the Granger–Causality diagnosis. Environmetrics 10, 53–65.

Progiou, A.G., Ziomas, I.C., 2011. Road traffic emissions impact on air quality of the greater Athens area based on a 20 year emissions inventory. Science of the Total Environment 410, 1–7.

Seinfeld, J.H., Pandis, S.N., 2006. Atmospheric Chemistry and Physics: From Air Pollution to Climate Change. John Wiley & Sons, Inc., New York.

Sfetsos, A., Vlachogiannis, D., 2010. A new approach to discovering the causal relationship between meteorological patterns and PM10 exceedances. Atmospheric Research 98, 500–511.

Shahgedanova, M., Burt, T.P., Davies, T.D., 1999. Carbon monoxide and nitrogen oxides air pollution in Moscow. Water, Air & Soil Pollution 112, 107–131.

Sindosi, O.A., Katsoulis, B.D., Bartzokas, A., 2003. An objective definition of air mass types affecting Athens, Greece; the corresponding atmospheric pressure patterns and air pollution levels. Environmental Technology 24, 947–962.

Vardoulakis, S., Kassomenos, P., 2008. Sources and factors affecting PM10 levels in two European cities: implications for local air quality management. Atmospheric Environment 42, 3949–3963.

Vogel, B., Riemer, N., Vogel, H., Fiedler, F., 1999. Findings on NOy as an indicator for ozone sensitivity based on different numerical simulations. Journal of Geophysical Research–Atmospheres 104, 3605–3620.

Wang, Q.X., Liu, Y., Pan, X.C., 2008. Atmosphere pollutants and mortality rate of respiratory diseases in Beijing. Science of the Total Environment 391, 143–148.

Yakovleva, E., Hopke, P.K., Wallace, L., 1999. Receptor modeling assessment of particle total exposure assessment methodology data. Environmental Science & Technology 33, 3645–3652.

Zhang, G.P., 2003. Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 50, 159–175.