
Page 1: Polluted or nonpolluted - a fuzzy logic approach determing ...mmc2.geofisica.unam.mx/cursos//gest/Articulos/Geostatistics/Pollut… · Polluted or nonpolluted - a fuzzy logic approach

Polluted or nonpolluted - a fuzzy logic approach to determining soil pollution

Marko KOMAC, Robert ŠAJN

Geological Survey of Slovenia Dimiceva 14, SI-1000 Slovenia

[email protected] ; [email protected]

1. ABSTRACT

In this paper a combination of GIS and DSS methods is used to produce pollution maps of the sampling area. In total, 512 samples of urban soil were collected in Slovenia's capital, Ljubljana, and its vicinity. The samples were analysed for 36 chemical elements, including eight heavy metals (Cd, Co, Cr, Cu, Hg, Ni, Pb, and Zn). A pollution index (PI) was calculated at each sampling point for each of the eight elements. Fuzzy logic was used to classify the PI into six classes: non-polluted, partially polluted, marginally polluted, polluted, very polluted, and extremely polluted. Finally, maps of the pollution in the sampled area were produced using the kriging method.

2. INTRODUCTION

Defining something as polluted has always been a thankless task. To avoid a subjective approach as far as possible, fuzzy logic methods were used to determine the degree of pollution in the urban area of Slovenia's capital, Ljubljana, and its vicinity (Figure 1). Fuzzy logic is widely used in many fields of research. Since it deals with the gradual transition between two states, it can also be used to classify a specific medium as polluted or non-polluted. For the purpose of this paper, urban geochemical samples (soil) were used.

Figure 1 - The sampling area: Slovenia's capital Ljubljana and its vicinity

Using the combined methods mentioned above, pollution maps of the sampling area were produced for each of the eight elements. Maps describing levels of pollution are easier for the wider public to understand than maps describing percentiles or percentages of a specific element in a given medium or, even worse, factor values produced by a statistical approach.

3. GEOCHEMICAL DATA

In the sampling area, 512 soil samples were collected (data courtesy of the Department of Geochemistry, Geological Survey of Slovenia [1]). Samples were collected on a 500 × 500 m grid in the dense urban area and on a 1000 × 1000 m grid in the sparsely inhabited area. They were analysed for 36 elements, including some of the most toxic ones (Cd, Co, Cr, Cu, Hg, Ni, Pb, and Zn). Slovenian legislation defines limit, alarm and critical values for heavy metals in soil [2]. Table 3.1 shows the legislative levels for each chemical element (i).

Table 3.1 - Normative levels of heavy metals in soils (after [2])

Chemical element (i)   Limit (ppm)   Alarm (ppm)   Critical (ppm)
Cd                     1             2             12
Co                     20            50            240
Cr                     100           150           380
Cu                     60            100           300
Hg                     0.8           2             10
Ni                     50            70            210
Pb                     85            100           530
Zn                     200           300           720

By comparing the levels of toxic elements in the soil of the sampling area with those set by the legislation, a pollution index (PIi)xy was calculated at every sampling point for each of the eight elements. The pollution index was the basis for deriving a pollution map of the sampling area using fuzzy logic methods. Expression (1) shows this relationship:

$$(PI_i)_{xy} = \frac{(c_i)_{xy}}{ll_i}, \qquad (1)$$

where (PIi)xy is the pollution index, (ci)xy is the concentration of element i at point xy, and lli is the limit level for the same element as defined by the legislation. The alarm and critical levels were used for defining the fuzzy classes.

4. STATISTICS

Basic statistics for the sample PIs, standardised according to the limit value, are shown in Table 4.1. The mean is highest for Co, which is not surprising, since more than half of the samples exceed the limit normative for that element. The mean is also high for Pb, where it falls just below 1. Cd and Hg have the smallest mean values. The standard deviation is relatively high for Cr, Pb and Hg and a little smaller for Cu and Zn (≈ 0,8). Ni has the smallest standard deviation (≈ 0,2), while for Cd and Co it is around 0,35.


Table 4.1 - Basic statistics for the standardised sample data

Statistic     Cd_PI           Co_PI           Cr_PI           Cu_PI           Ni_PI           Pb_PI           Zn_PI           Hg_PI
Range         0,2 - 4,7       0,25 - 3,8      0,19 - 23,49    0,166 - 9,8166  0,14 - 2,44     0,176 - 14,07   0,22 - 16,1     0,075 - 19,087
Mean          0,59277         1,09941         0,81900         0,73711         0,60207         0,98649         0,79954         0,52337
St. deviat.   0,367           0,358           1,048           0,818           0,184           1,116           0,834           1,328
n > limit     34              312             89              35              13              17              49              29
n > alarm     2               2               8               36              3               101             39              4
n > critical  0               0               1               5               0               5               3               2
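The statistics in Table 4.1 follow from expression (1) applied to every sample. The sketch below shows one way to compute them, assuming the limit values of Table 3.1. The function names, the sample concentrations and the use of the sample (n - 1) standard deviation are illustrative assumptions, not details given in the paper.

```python
from statistics import mean, stdev

# Limit values in ppm from Table 3.1 (after [2]).
LIMIT_PPM = {"Cd": 1, "Co": 20, "Cr": 100, "Cu": 60,
             "Hg": 0.8, "Ni": 50, "Pb": 85, "Zn": 200}


def pollution_index(element, concentration_ppm):
    """Expression (1): concentration divided by the limit level."""
    return concentration_ppm / LIMIT_PPM[element]


def summarise(element, concentrations_ppm):
    """Range, mean, standard deviation and number of PI values above 1."""
    pis = [pollution_index(element, c) for c in concentrations_ppm]
    return {
        "range": (min(pis), max(pis)),
        "mean": mean(pis),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "st_dev": stdev(pis),
        # A PI above 1 means the concentration exceeds the limit value.
        "n_above_limit": sum(pi > 1 for pi in pis),
    }


# Hypothetical lead concentrations in ppm (not taken from the survey data).
pb_samples = [17.0, 42.5, 85.0, 170.0]
stats = summarise("Pb", pb_samples)
print(stats)
```

Because the PI is a ratio to the limit value, a PI of 1 marks the limit for every element, which is what makes the eight elements comparable on one scale.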

The standard deviation values are important for defining the transition between total membership and non-membership (zero membership) in fuzzy logic methods. The last three rows show the number of samples that exceed the normative named in the first column.

5. FUZZY LOGIC

Decision Support Systems (DSS), of which fuzzy logic is a constituent part, are a branch of Artificial Intelligence methods. They rely on expert knowledge that governs the decisions and actions arising in different situations. Fuzzy logic is a kind of logic that uses soft, fuzzy values instead of crisp ones. This means that the transition between two numbers, or sets of numbers, is gradual, with the degree of membership ranging from 1 (total membership) to 0 (no membership) [3]. The functions that govern the degree of membership can be predefined or customised. Several predefined functions describe transitions very well and can be used for different purposes: sigmoidal, bell-shaped, linear [4], �, Gauss, exponential [5], J-bell [6], etc.

Permissible levels of heavy metals in soils are given as crisp values. It is hard to define a narrow margin between polluted and unpolluted soil, or any medium for that matter. Does the fact that a soil contains 98,6 ppm (parts per million) of some toxic element mean that the soil is unpolluted when the threshold is set at 100 ppm? Certainly not! Geochemical data also suffer from several other problems: the accuracy, precision, and variation of results between different analytical methods. How, then, can one define threshold levels, permissible levels or even levels of pollution? Fuzzy logic may hold the key to this problem. Hence, for the purpose of this article, fuzzy logic was used to classify the PI into six classes: non-polluted, partially polluted, marginally polluted, polluted, very polluted, and extremely polluted.

Using the standard deviation of the PI values to define the membership function seems the most reasonable way to classify sampling points into sets according to their PI values. The sigmoidal membership function, also known as the S-shaped function, was chosen for the soil pollution data because it best represents the natural distribution. It can be written as expression (2):

$$\mu = \cos^2 \alpha, \qquad (2)$$

where μ is the degree of membership and $\alpha = \left(1 - \frac{x - a}{b - a}\right)\frac{\pi}{2}$, so that μ is 0 at x = a and 1 at x = b. This holds only where the function is increasing. When x is greater than b, μ equals 1. The function can also be symmetrical, with a single maximum or a range of maximum values, and can be defined with two, three or four values: a is the value where the function rises above 0, b where it reaches 1, c where it starts to drop, and d where it reaches 0 again. Figure 2 shows different types of the sigmoidal membership function.
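A minimal sketch of a four-point sigmoidal membership function may make the roles of a, b, c and d concrete. It assumes the cosine-squared form of expression (2) on the rising limb and a mirrored cosine-squared form on the falling limb. The paper only describes the rising limb, so the falling limb and the function name are assumptions made for illustration.

```python
import math


def sigmoidal_membership(x, a, b, c, d):
    """Cosine-based sigmoidal membership with control points a <= b <= c <= d.

    Membership is 0 at or below a, rises to 1 at b, stays 1 up to c
    and falls back to 0 at d.
    """
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        # Rising limb: alpha runs from pi/2 at a down to 0 at b.
        alpha = (1.0 - (x - a) / (b - a)) * math.pi / 2.0
    else:
        # Falling limb (assumed mirror image): alpha runs from 0 at c to pi/2 at d.
        alpha = (x - c) / (d - c) * math.pi / 2.0
    return math.cos(alpha) ** 2


# Illustrative symmetric set with a single maximum at x = 1 (b equals c).
example_set = (0.6327, 1.0, 1.0, 1.3673)
for pi_value in (0.5, 0.81635, 1.0, 1.18365, 1.5):
    print(pi_value, round(sigmoidal_membership(pi_value, *example_set), 3))
```

Setting b equal to c gives a symmetric function with a single maximum, and passing math.inf for both c and d gives a monotonically increasing function defined by a and b alone.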

Figure 2 - Different types of the sigmoidal membership function [4]

The values of the pollution indices and their distribution were the reason for defining six classes, or fuzzy sets, of pollution levels. These six classes (sets) are described in Table 5.1.

Table 5.1 - Properties of the defined fuzzy sets or classes

Fuzzy class           Cd       Co       Cr       Cu       Ni       Pb       Zn       Hg
I non-pltd        a   -0,1673  -0,1083  -0,8581  -0,6511  -0,0436  -0,9391  -0,6144  -1,2527
                  b   0,2000   0,2500   0,1900   0,1667   0,1400   0,1765   0,2200   0,0750
                  c   0,2000   0,2500   0,1900   0,1667   0,1400   0,1765   0,2200   0,0750
                  d   0,5673   0,6083   1,2381   0,9845   0,3236   1,2920   1,0544   1,4027
II partially pltd a   0,0000   0,0000   -0,4531  -0,2345  0,1164   -0,5273  -0,2244  -0,7977
                  b   0,3673   0,3583   0,5950   0,5833   0,3000   0,5882   0,6100   0,5300
                  c   0,6327   0,6417   0,5950   0,5833   0,8164   0,5882   0,6100   0,5300
                  d   1,0000   1,0000   1,6431   1,4011   1,0000   1,7038   1,4444   1,8577
III margin. pltd  a   0,6327   0,6417   -0,0481  0,1822   0,8164   -0,1155  0,1656   -0,3277
                  b   1,0000   1,0000   1,0000   1,0000   1,0000   1,0000   1,0000   1,0000
                  c   1,0000   1,0000   1,0000   1,0000   1,0000   1,0000   1,0000   1,0000
                  d   1,3673   1,3583   2,0481   1,8178   1,1836   2,1155   1,8344   2,3277
IV polluted       a   1,0000   1,0000   1,0000   1,0000   1,0000   1,0000   1,0000   1,0000
                  b   1,3673   1,3583   2,0481   1,8178   1,1836   2,1155   1,8344   2,3277
                  c   2,0000   2,5000   2,0481   1,8178   1,4000   2,1155   1,8344   2,5000
                  d   2,3673   2,8583   3,0962   2,6356   1,5836   3,2311   2,6688   3,8277
V very pltd       a   1,6327   2,1417   1,4519   1,1822   1,2164   1,3845   0,6656   1,1723
                  b   2,0000   2,5000   2,5000   2,0000   1,4000   2,5000   1,5000   2,5000
                  c   3,1663   2,9708   3,8000   5,0000   2,2564   6,2350   3,1000   12,5000
                  d   3,5337   3,3292   4,8481   5,8178   2,4400   7,3505   3,9344   13,8277
VI extrem. pltd   a   2,7990   2,6125   2,7519   4,1822   -        5,1195   2,2656   11,1723
                  b   3,1663   2,9708   3,8000   5,0000   -        6,2350   3,1000   12,5000
                  c   4,3327   3,4417   22,4419  8,9989   -        12,9551  15,2656  17,7598
                  d   4,7000   3,8000   23,4900  9,8167   -        14,0706  16,1000  19,0875

The fuzzy sets were defined as follows. For the first set (non-polluted), the lower bound of maximum membership, given by the variable b in expression (2), was set to the data minimum, which had to be smaller than 1. Since the minimum value always occurred several times, it may also indicate the lower limit of analytical detection. The transition range between non-membership and total membership of a sample's element concentration in a given set was always governed by the standard deviation for that element. The maximum membership of the third set (marginally polluted) was set to the value 1, where half of the data are classified as non-polluted and the other half as polluted. The second set (partially polluted) was placed midway between these two sets. The fourth set was defined so that all the data, including the scattered values, fall above the value 1. Element concentrations in this set are certainly above the limit normative and can be classified as polluted. The fifth set (very polluted) was defined with the alarm normative as its minimum membership value, and the last set (extremely polluted) was defined with the critical normative. The membership function of the highest set reaches 0 at the maximum value of the data. This step automatically eliminates the upper data extremes.

Converting the numerous fuzzy results (one map per class per element, 48 maps) to crisp results is an almost unavoidable step, since crisp results are easily understood by users and the public. To produce crisp results, the fuzzy disjunction method was used. In the language of fuzzy logic, disjunction, or union, means that the maximum value of several data sets is extracted [7].

6. RESULTS AND DISCUSSION

The pollution maps of the investigated area were produced by applying the kriging method to the classified samples.
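The classification and hardening described in Section 5 can be illustrated for a single PI value. The sketch below evaluates the six cadmium sets of Table 5.1 and keeps the class with the largest membership (the fuzzy union). The falling limb of the membership function, the function names, and the choice to return no class when every membership is 0 are assumptions made for illustration. The paper may equally have taken the maximum cell by cell over interpolated membership maps.

```python
import math

# Control points (a, b, c, d) of the six cadmium fuzzy sets from Table 5.1.
CD_SETS = {
    "non-polluted":        (-0.1673, 0.2000, 0.2000, 0.5673),
    "partially polluted":  (0.0000, 0.3673, 0.6327, 1.0000),
    "marginally polluted": (0.6327, 1.0000, 1.0000, 1.3673),
    "polluted":            (1.0000, 1.3673, 2.0000, 2.3673),
    "very polluted":       (1.6327, 2.0000, 3.1663, 3.5337),
    "extremely polluted":  (2.7990, 3.1663, 4.3327, 4.7000),
}


def membership(x, a, b, c, d):
    """Cosine-squared sigmoidal membership: 0 at a, 1 on [b, c], 0 at d."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        alpha = (1.0 - (x - a) / (b - a)) * math.pi / 2.0  # rising limb
    else:
        alpha = (x - c) / (d - c) * math.pi / 2.0          # falling limb (assumed)
    return math.cos(alpha) ** 2


def harden(pi_value, fuzzy_sets):
    """Fuzzy union: return the class with the largest membership degree."""
    degrees = {name: membership(pi_value, *pts) for name, pts in fuzzy_sets.items()}
    best = max(degrees, key=degrees.get)
    if degrees[best] == 0.0:
        # Assumption: a value that belongs to no set (e.g. the data maximum)
        # is left unclassified.
        return None, degrees
    return best, degrees


for pi_value in (0.5, 1.0, 2.2, 4.5, 4.7):
    print(pi_value, harden(pi_value, CD_SETS)[0])
```

The last value, 4,7, is the cadmium data maximum. It receives zero membership in every set, which is how the upper extreme drops out of the classification.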
For details on kriging, refer to [8]. Such methods are very useful in geology and pedology, since spatial relations exist between the data [9]. Post-kriging (kriging after the fuzzy classification) was chosen because kriging is prone to errors when anomalous data are used for the interpolation [10], and geochemical data certainly fall into this category. With fuzzy classification, all the data were stratified before kriging was performed on them. The shape of the semi-variogram determines the type of kriging used for the interpolation of specific data. For all the elements except mercury, the linear kriging method of interpolation was chosen. For mercury, an exponential kriging method was used.

The pollution maps for all the elements, derived with the procedure described above, are shown below. The urban area of Ljubljana is represented by a grey-hatched area. Figures 3 to 10 show the pollution maps of Cd, Co, Cr, Cu, Hg, Ni, Pb, and Zn respectively. Darker green colours represent non-polluted or partially polluted areas, bright green and bright yellow represent marginally polluted areas, yellow and red represent polluted and very polluted areas, and magenta represents extremely polluted areas.
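The interpolation step can be sketched with ordinary kriging and a linear semi-variogram. This is a stand-in written in plain Python, not the GIS software the authors used. The variogram slope, the absence of a nugget, the use of all samples for every estimate, and the sample values are assumptions made for illustration.

```python
import math


def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def linear_variogram(h, slope=1.0):
    """Linear semi-variogram model without nugget: gamma(h) = slope * h."""
    return slope * h


def ordinary_kriging(points, values, target, variogram=linear_variogram):
    """Estimate the value at `target` from (x, y) `points` and their `values`."""
    n = len(points)
    # Kriging system: semi-variances between samples, bordered by the
    # unbiasedness constraint (weights sum to 1) and a Lagrange multiplier.
    matrix = [[variogram(math.dist(p, q)) for q in points] + [1.0] for p in points]
    matrix.append([1.0] * n + [0.0])
    rhs = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(matrix, rhs)[:n]
    return sum(w * v for w, v in zip(weights, values)), weights


# Hypothetical class numbers at the corners of one 500 m grid cell.
points = [(0.0, 0.0), (500.0, 0.0), (0.0, 500.0), (500.0, 500.0)]
classes = [2.0, 3.0, 3.0, 5.0]
centre_estimate, centre_weights = ordinary_kriging(points, classes, (250.0, 250.0))
print(centre_estimate, centre_weights)
```

Without a nugget the estimate at a sampled location equals the sample value there, so the interpolated surface passes through the classified samples.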

Figure 3 - The pollution map of the Ljubljana urban area for cadmium

Figure 4 - The pollution map of the Ljubljana urban area for cobalt

Figure 5 - The pollution map of the Ljubljana urban area for chromium

Figure 6 - The pollution map of the Ljubljana urban area for copper

Figure 7 - The pollution map of the Ljubljana urban area for mercury

Figure 8 - The pollution map of the Ljubljana urban area for nickel

Figure 9 - The pollution map of the Ljubljana urban area for lead

Figure 10 - The pollution map of the Ljubljana urban area for zinc


The pollution tendencies of Pb and Zn (Figures 9 and 10 respectively) show that the spatial distributions of these two heavy metals are related. Higher values, and thus higher pollution, are concentrated in the old city centre and decline towards its edges. Zn and especially Pb show a spatial relation with traffic, which is naturally denser in the centre. The distribution of Hg (Figure 7) shows high values throughout the whole old city centre, while the metal concentration decreases towards the city edges. The majority of the urban area is classified as marginally polluted. The higher values of Cu (Figure 6) are also concentrated in the area of the city centre. If the marginally polluted areas are taken into account, the situation is the same as for mercury. The values of Cd (Figure 3) are low throughout the whole area, with some exceptions. The distribution pattern looks irregular, but in fact it shows some correlation with Cu, Pb and Zn. Cr and Ni (Figures 5 and 8 respectively) show no patterns of spatial distribution; these elements are distributed more or less randomly in the sampled area. Co is very randomly distributed, but its values are relatively high throughout the whole city (Figure 4).

High values of some elements coincide with several locations of former or present polluters. The locations relevant to the occurrence of higher element values are shown in Figure 11.

Figure 11 - The locations of the polluters in Ljubljana


The map shows the locations of four polluters: an old buried landfill, an abandoned shooting practice ground, the heating plant, and the institute for turbine engine research. The blue area represents the old landfill, which was probably the cause of the high values of Co, Cu, Pb and Zn at that location. At the abandoned shooting practice ground, marked with a green dot, high values of Pb and Zn were found. The heating plant, marked with a red dot, was probably the cause of the high values of Cu, Pb and Zn in its vicinity. High values of Cu, Hg, Pb and Zn were found at the location of the institute for turbine engine research, marked with a cyan dot.

7. CONCLUSIONS

Fuzzy classification of the pollution index (PI) values was used here to produce eight pollution maps, one for each chemical element (Cd, Co, Cr, Cu, Hg, Ni, Pb, and Zn). Normative values were used to convert absolute concentrations of the chemical elements into pollution indices, which show the pollution levels more clearly. Fuzzy classification was used to stratify the geochemical data into six classes, from non-polluted to extremely polluted. The hardening method of disjunction was used to convert the numerous fuzzy results into easily understandable crisp results, the pollution maps.

The results show that the values of Pb and Zn are associated with older urban areas and areas of denser traffic. The distributions of Hg and Cu also show higher concentrations in the city centre, which decrease towards the edges of the urban area. The occurrence of these four chemical elements is evidently associated with anthropogenic activity. This is supported by the coincidence of high values of the four elements with the locations of the old landfill, the heating plant and the institute for turbine engine research. High values of Pb and Zn are also found at the abandoned shooting practice ground.

The concentrations of Cd, Cr and Ni are randomly distributed and, with some exceptions, low throughout the whole sampled area. Co is also randomly distributed, but its values are relatively high. Although the values of Cd show some correlation with Pb, Zn and Cu, they are, with a few exceptions, not high.

The fuzzy classification methods have shown that the uncertainties arising in the data analysis can be dealt with successfully, since the results represent the most likely membership of a sampling point in one of the six classes. This means that a sampling point can be assigned to a given class with a certain degree of confidence. When interpreting the results, one has to bear in mind that point data were used and some form of interpolation was necessary to produce continuous spatial results. Although interpolation always casts some doubt on the results, the general trends of the original data are preserved.

Like other cities in the world, Ljubljana faces the problems of anthropogenic pollution. Decades of industry and traffic have left scars of pollution on Slovenia's capital that will not heal easily.


8. REFERENCES

[1] Šajn, R. 1995: Geochemical Properties of soils in the area of Ljubljana city. Master's Thesis, University in Ljubljana, Faculty for Natural and Technical Sciences, 94 p., Ljubljana.
[2] Decree on limit, alarm and critical values of hazardous substances in the soil. Official Gazette RS No. 68/96.
[3] Zadeh, L. A. 1965: Fuzzy sets. Information and Control, Vol. 8(3), p. 338-353, AP Professional, Boston.
[4] IDRISI for Windows - Version 2.007: Help System. Clark University, Main St. Worcester.
[5] Pedrycz, W. 1998: Computational intelligence: an introduction. CRC Press, 284 p., Boca Raton.
[6] Burrough, P. A. 1989: Fuzzy Mathematical Methods for Soil Survey and Land Evaluation. Journal of Soil Science, Vol. 40, p. 477-492, Blackwell Science Ltd., Oxford.
[7] McNeil, F. M. & Thro, E. 1994: Fuzzy logic: a practical approach. AP Professional, 292 p., Boston.
[8] Davis, J.C. 1986: Statistics and data analysis in geology. Wiley & Sons, 646 p., New York.
[9] ESRI, 1996: ArcView Spatial Analyst - Advanced Spatial Analysis Using Raster and Vector Data. Environmental Systems Research Institute, 148 p., Redlands.
[10] ESRI, 1999: ARC/INFO for Windows - Version 8.0: Help System. Environmental Systems Research Institute, inc., Redlands.