Mapping and classification of spatial data using machine learning: algorithms and software tools...


description

Mapping and classification of spatial data using machine learning: algorithms and software tools. Vadim Timonin, Institute of Geomatics and Risk Analysis (IGAR), University of Lausanne (Switzerland). Intelligent Analysis of Environmental Data (S4 ENVISA Workshop 2009).


Page 1: Mapping and classification of spatial data using machine learning: algorithms and software tools Vadim Timonin – Institute of Geomatics and Risk Analysis (IGAR), University of Lausanne

Institute of Geomatics and Analysis of Risk, University of Lausanne, Switzerland

Vadim Timonin

Vadim.Timonin@UNIL.ch

Mapping and classification of spatial data using Machine Learning Office software tools

Page 2:

Contents

1. Short description of the Machine Learning Office

2. SIC 2004: Application to the automatic cartography of radioactivity

3. Case study: Wind field mapping with a neural network and a regularization technique

Page 3:

Machine Learning Office

Part of the book:

EPFL Press, June 2009

Page 4:

Practical work session using Machine Learning software

June 20, 09:00 – 12:00, Room T120

Page 5:

Machine Learning Office: Supervised

Regression
• Multilayer Perceptron (MLP)
• General Regression Neural Networks (GRNN)
• Radial Basis Function Neural Networks (RBFNN)
• K-Nearest Neighbour (KNN)
• Support Vector Regression (SVR)

Classification
• Multilayer Perceptron (MLP)
• Probabilistic Neural Networks (PNN)
• K-Nearest Neighbour (KNN)
• Support Vector Machines (SVM)

Page 6:

Machine Learning Office: Unsupervised

Clustering & density estimation
• K-Means & EM algorithms
• Gaussian Mixture Model (GMM)
• Self-Organizing (Kohonen) Maps (SOM)

Page 7:

Machine Learning Office: Mixture of supervised and unsupervised

Joint density estimation
• Mixture Density Networks (MDN)

Page 8:

Automatic Mapping of Pollution Data

The procedure should be:

1. Simple, without difficult tuning of the models (usable by non-experts in machine learning)
2. Unique in its result (independent of the training algorithm, initial values, etc.)

Page 9:

Automatic Mapping of Pollution Data

Good candidates:

1. KNN
2. GRNN / PNN

Not so good candidates (?):

1. MLP
2. RBFNN
3. SVM / SVR

Page 10:

Spatial Interpolation Comparison 2004

Automatic Mapping with Prior Knowledge in situations of Routine and Emergency

http://www.ai-geostats.org/

Official report: Automatic mapping algorithms for routine and emergency monitoring data. EUR 21595 EN EC. Dubois G. (Ed.), Office for Official Publications of the European Communities, Luxembourg, 150 p., November 2005.

Page 11:

Spatial Interpolation Comparison 2004

Introduction

Description of the concept of SIC 2004: participants are invited to use 200 observations (left, circles) to estimate (predict) the values at 1008 locations (right, crosses).

Page 12:

Results of the GRNN models with cross-validation tuning

Routine scenario

Emergency (joker) scenario

Epicentre of accident (hot spot)
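As a rough illustration of why GRNN fits the "simple and unique" requirements: a GRNN prediction is a Gaussian-kernel weighted average of the training values, so the only free parameter is the kernel width, which can be chosen by leave-one-out cross-validation over a grid. The sketch below is a minimal version under stated assumptions: an isotropic kernel, a hypothetical grid of widths, and synthetic data standing in for the SIC 2004 measurements. The function names are illustrative and are not the Machine Learning Office API.

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, sigma):
    """Gaussian-kernel weighted average of the training values (isotropic GRNN)."""
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ y_train) / np.maximum(w.sum(axis=1), 1e-300)

def loo_mse(x_train, y_train, sigma):
    """Leave-one-out error: each point is predicted from all the others."""
    d2 = ((x_train[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)  # a point must not vote for itself
    pred = (w @ y_train) / np.maximum(w.sum(axis=1), 1e-300)
    return float(np.mean((pred - y_train) ** 2))

def tune_sigma(x_train, y_train, grid):
    """Pick the kernel width with the lowest leave-one-out error."""
    errors = [loo_mse(x_train, y_train, s) for s in grid]
    return float(grid[int(np.argmin(errors))])

# Synthetic stand-in data: 200 observations, 1008 prediction locations.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(200, 2))
y = np.sin(4.0 * x[:, 0]) + np.cos(3.0 * x[:, 1]) + rng.normal(0.0, 0.05, 200)
grid = np.linspace(0.02, 0.5, 25)  # hypothetical search grid
sigma = tune_sigma(x, y, grid)
x_new = rng.uniform(0.0, 1.0, size=(1008, 2))
y_new = grnn_predict(x, y, x_new, sigma)
```

There is no random initialization and no iterative training here, so repeating the tuning on the same data returns the same width and the same map.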

Page 13:

Results

The following table presents the participants’ results for both scenarios (routine and emergency).

The results are sorted by the Mean Absolute Error (MAE) obtained in the emergency scenario. The other statistics shown are the Mean Error (ME), which indicates the bias of the results, the Root Mean Squared Error (RMSE), and Pearson’s correlation coefficient (Ro) between true and estimated values.

• GEOSTATS denotes geostatistical techniques
• NN denotes Neural Networks
• SVM denotes Support Vector Machines

In each column, the best results have been bolded.
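For reference, the four statistics can be computed as in the short sketch below. The function name and the toy numbers are illustrative, and the sign convention for ME (estimated minus true) is an assumption, since the slides do not state it.

```python
import numpy as np

def sic_statistics(true, estimated):
    """MAE, ME, RMSE and Pearson correlation between true and estimated values."""
    true = np.asarray(true, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    err = estimated - true  # assumed sign convention: positive means overestimation
    return {
        "MAE": float(np.mean(np.abs(err))),
        "ME": float(np.mean(err)),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "Ro": float(np.corrcoef(true, estimated)[0, 1]),
    }

# Toy values only, not SIC 2004 data.
stats = sic_statistics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 2.5, 4.5])
```

MAE and RMSE measure the size of the errors, while ME can be close to zero even when individual errors are large, because positive and negative errors cancel.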

Page 14:

Results of the SIC 2004 exercise

Participant Method          MAE              ME             RMSE             Ro
                       routine   joker routine   joker routine   joker routine   joker
Timonin     NN            9.40   14.85   -1.25   -0.51   12.59   45.46    0.78    0.84
Fournier    GEOSTATS      9.06   16.22   -1.32   -8.58   12.43   81.44    0.79    0.27
Pozdnoukhov SVM           9.22   16.25   -0.04   -6.70   12.47   81.00    0.79    0.28
Saveliev    SPLINES       9.60   17.00    3.00   10.40   13.00   82.20    0.77    0.23
Dutta       NN            9.92   17.50    0.20    5.10   13.10   80.60    0.76    0.29
Ingram      GEOSTATS      9.10   18.55   -1.27   -4.64   12.46   54.22    0.79    0.86
Hofierka    SPLINES       9.10   18.62   -1.30    0.41   12.51   73.68    0.79    0.50
Hofierka    SPLINES       9.10   18.62   -1.30    0.41   12.51   73.68    0.79    0.50
Fournier    GEOSTATS      9.22   19.43   -0.89   -0.22   12.51   73.50    0.78    0.48
Fournier    OTHERS        9.29   19.44   -1.12   -0.12   12.56   71.87    0.78    0.53
Savelieva   GEOSTATS      9.11   19.68   -1.39   -2.18   12.49   69.08    0.78    0.56
Palaseanu   GEOSTATS      9.05   19.76    1.40    2.33   12.46   74.54    0.79    0.50
Rigol S.    NN           12.10   20.30   -1.20   -9.40   15.80   84.10    0.67    0.12
Pebesma     GEOSTATS      9.11   20.83   -1.22    0.92   12.44   73.73    0.79    0.50
Pebesma     OTHERS        9.94   21.03   -1.35    4.50   13.32   72.12    0.78    0.51
Ingram      GEOSTATS      9.08   21.77   -1.44    0.72   12.47   79.57    0.79    0.35
Lophaven    GEOSTATS      9.70   22.20    1.20   -4.10   13.10   71.20    0.76    0.54
Saveliev    SPLINES       9.30   22.20    1.60    0.60   12.60   76.40    0.78    0.41
Ingram      GEOSTATS      9.47   22.53   -1.15    3.09   12.75   79.16    0.78    0.33
Pebesma     GEOSTATS      9.11   23.26   -1.22    4.00   12.44   76.19    0.79    0.42
Rigol S.    NN           16.00   25.30   -1.70  -11.10   20.80   87.50    0.55    0.02
Hofierka    SPLINES       9.38   26.52   -1.27    4.29   12.68   77.98    0.78    0.38
Dutta       NN            9.62   28.20    0.90   -0.22   12.70   80.10    0.78    0.31
Pebesma     GEOSTATS      9.11   28.45   -1.22   12.01   12.44   81.41    0.79    0.38
Dutta       NN           12.20   28.90    1.50   -1.29   15.90   79.90    0.64    0.33
Rigol S.    NN           21.40   30.50    5.30    3.80   45.80   96.60    0.24    0.20
Ingram      NN            9.72   38.29   -1.54    8.38   13.00   84.24    0.76    0.30
Dutta       NN            9.93   38.50    2.18   17.98   13.30   87.30    0.76    0.27
Ingram      NN            9.48   48.41   -1.22   -3.01   12.73   90.89    0.78    0.38
Pebesma     GEOSTATS      9.11  146.36   -1.22   19.71   12.44  212.10    0.79   -0.27

Page 15:

Modeling of wind fields with MLP and regularization technique

(pp. 168-172 of the book)

Monitoring network: 111 stations in Switzerland (80 training + 31 for validation)

Mapping of daily:
• Mean speed
• Maximum gust
• Average direction

Page 16:

Modeling of wind fields with MLP and regularization technique

Monitoring network: 111 stations in Switzerland (80 training + 31 for validation)

Mapping of daily:
• Mean speed
• Maximum gust
• Average direction

Input information:
• X, Y geographical coordinates
• DEM (resolution 500 m)
• 23 DEM-based « geo-features »

Total: 26 features

Model: MLP 26-20-20-3
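To make the notation concrete: 26-20-20-3 reads as 26 inputs, two hidden layers of 20 neurons each, and 3 outputs (mean speed, maximum gust, average direction). The sketch below counts the trainable parameters of such a network and runs one forward pass on random stand-in inputs. The tanh hidden activations, linear outputs, and initialization scale are assumptions, since the slides do not state them.

```python
import numpy as np

layers = [26, 20, 20, 3]  # inputs, hidden 1, hidden 2, outputs

def init_mlp(layers, rng):
    """One (weights, biases) pair per layer; the 0.1 scale is an arbitrary choice."""
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layers[:-1], layers[1:])]

def forward(params, x):
    """Forward pass with tanh hidden layers and a linear output layer (assumed)."""
    h = x
    for w, b in params[:-1]:
        h = np.tanh(h @ w + b)
    w, b = params[-1]
    return h @ w + b

rng = np.random.default_rng(0)
params = init_mlp(layers, rng)
n_params = sum(w.size + b.size for w, b in params)  # 540 + 420 + 63
out = forward(params, rng.normal(size=(80, 26)))    # 80 stations, 26 features
```

Under these assumptions the network has 1023 weights and biases, compared with only 80 training stations, which helps explain why overfitting is a concern for this model.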

Page 17:

Training of the MLP

Model: MLP 26-20-20-3

Training:
• Random initialization
• 500 iterations of the RPROP algorithm
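For readers unfamiliar with RPROP (resilient backpropagation): each weight keeps its own step size, which grows while the gradient keeps its sign and shrinks when the sign flips, and only the sign of the gradient is used for the update. The sketch below is a minimal version of one common variant (often called iRprop-) applied to a toy quadratic rather than to an MLP. The hyperparameter values are typical textbook defaults, not settings taken from the slides.

```python
import numpy as np

def rprop_minimize(grad_fn, w0, n_iter=500, step0=0.1,
                   eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """Minimize a function from its gradient using per-weight adaptive steps."""
    w = np.array(w0, dtype=float)
    step = np.full_like(w, step0)
    prev_grad = np.zeros_like(w)
    for _ in range(n_iter):
        grad = grad_fn(w)
        same = grad * prev_grad
        # Same sign as last time: accelerate. Sign flip: we overshot, so slow down.
        step = np.where(same > 0, np.minimum(step * eta_plus, step_max), step)
        step = np.where(same < 0, np.maximum(step * eta_minus, step_min), step)
        grad = np.where(same < 0, 0.0, grad)  # skip the update right after a flip
        w -= np.sign(grad) * step
        prev_grad = grad
    return w

# Toy problem: a quadratic bowl with very different curvature per coordinate.
target = np.array([3.0, -2.0, 0.5])
scales = np.array([1.0, 100.0, 0.01])

def grad_fn(w):
    return 2.0 * scales * (w - target)

w_opt = rprop_minimize(grad_fn, np.zeros(3))
```

Because only gradient signs are used, the badly scaled coordinates in this toy problem converge at a similar rate, which is one reason the method needs little tuning.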

Page 18:

Results: naïve approach

Page 19:

Results: Noise injection regularization

Page 20:

Results: summary

Noise injection regularization

Without regularization (overfitting)
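The slides do not spell out how the noise is applied. A common form of this regularization, assumed here, is to add fresh Gaussian noise to the training inputs at every epoch while leaving the targets unchanged, so the network never sees exactly the same sample twice. The sketch below shows only that data-perturbation step. The function name, the noise level, and the random stand-in data are hypothetical.

```python
import numpy as np

def noisy_epochs(x_train, y_train, n_epochs, noise_std, rng):
    """Yield a freshly jittered copy of the inputs for each epoch; targets are unchanged."""
    # Scale the noise per feature so that noise_std is relative to each feature's spread.
    scale = noise_std * x_train.std(axis=0, keepdims=True)
    for _ in range(n_epochs):
        noise = rng.normal(0.0, 1.0, size=x_train.shape) * scale
        yield x_train + noise, y_train

rng = np.random.default_rng(0)
x_train = rng.normal(size=(80, 26))  # stand-in for 80 stations, 26 features
y_train = rng.normal(size=(80, 3))   # stand-in for the 3 wind targets
batches = list(noisy_epochs(x_train, y_train, n_epochs=5, noise_std=0.1, rng=rng))
```

In a real training loop each yielded pair would be passed to one pass of the optimizer. The noise level acts as the regularization strength and would normally be chosen on the validation stations.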

Page 21:

Next stop is:

Practical work session using Machine Learning software

June 20, 09:00 – 12:00, Room T120

Thank you for your attention!