Mapping and classification of spatial data using machine learning: algorithms and software tools...
Institute of Geomatics and Analysis of Risk, University of Lausanne, Switzerland
Vadim Timonin
Vadim.Timonin@UNIL.ch
Mapping and classification of spatial data using
Machine Learning Office software tools
Contents
1. Short description of the Machine Learning Office
2. SIC 2004: Application to the automatic cartography of radioactivity
3. Case study: Wind fields mapping with neural network and
regularization technique.
Machine Learning Office
Part of the book (EPFL Press, June 2009)
June 20
09:00 – 12:00
Room T120
Practical work session using Machine Learning software
Machine Learning Office: Supervised
Regression
• Multilayer Perceptron (MLP)
• General Regression Neural Networks (GRNN)
• Radial Basis Function Neural Networks (RBFNN)
• K-Nearest Neighbour (KNN)
• Support Vector Regression (SVR)
Classification
• Multilayer Perceptron (MLP)
• Probabilistic Neural Networks (PNN)
• K-Nearest Neighbour (KNN)
• Support Vector Machines (SVM)
Machine Learning Office: Unsupervised
Clustering & density estimation
• K-Means & EM algorithms
• Gaussian Mixture Model (GMM)
• Self-Organizing (Kohonen) Maps (SOM)
Machine Learning Office: Mixture of supervised and unsupervised
Joint density estimation
• Mixture Density Networks (MDN)
Automatic Mapping of Pollution Data
Procedure should be:
1. Simple, without difficult tuning of the models (can be used by non-experts in machine learning)
2. Unique in its result (independent of training algorithms, initial values, etc.)
Automatic Mapping of Pollution Data
Good candidates:
1. KNN
2. GRNN / PNN
Not so good candidates (?):
1. MLP
2. RBFNN
3. SVM / SVR
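To illustrate why KNN fits the two requirements, here is a minimal sketch of a KNN interpolator whose only tunable parameter, the number of neighbours, is picked by leave-one-out cross-validation. The function names, the candidate range for k, and the synthetic data are assumptions made for illustration; this is not the Machine Learning Office implementation.

```python
import numpy as np

def knn_loo_error(xy, z, k):
    """Leave-one-out mean absolute error of a k-nearest-neighbour average."""
    # Pairwise Euclidean distances between all measurement locations.
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # a point may not be its own neighbour
    nearest = np.argsort(d, axis=1)[:, :k]  # indices of the k closest other points
    pred = z[nearest].mean(axis=1)          # plain average of the neighbours' values
    return float(np.abs(pred - z).mean())

def choose_k(xy, z, k_values):
    """Return the candidate k with the smallest leave-one-out error."""
    errors = [knn_loo_error(xy, z, k) for k in k_values]
    return k_values[int(np.argmin(errors))]

# Synthetic illustration data (hypothetical, not the SIC 2004 measurements).
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 1.0, size=(200, 2))
z = np.sin(4.0 * xy[:, 0]) + np.cos(3.0 * xy[:, 1]) + rng.normal(0.0, 0.1, 200)

best_k = choose_k(xy, z, list(range(1, 11)))
```

Because nothing is initialized at random inside the model, repeating the search on the same data returns the same k, which is the kind of uniqueness the procedure asks for.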
http://www.ai-geostats.org/
Official report: Automatic mapping algorithms for routine and emergency monitoring data. EUR 21595 EN EC. Dubois G. (Ed.), Office for Official Publications of the European Communities, Luxembourg, 150 p., November 2005.
Spatial Interpolation Comparison 2004
Automatic Mapping with Prior Knowledge in situations of Routine and Emergency
Introduction
Description of the concept of SIC 2004
Participants are invited to use 200 observations (left, circles) to estimate (predict) the values at 1008 locations (right, crosses).
Results of the GRNN models with cross-validation tuning
Routine scenario
Emergency (joker) scenario
Epicentre of accident (hot spot)
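As a rough illustration of a GRNN with cross-validation tuning, the sketch below treats the GRNN as a Gaussian-kernel weighted average and selects a single isotropic kernel width by leave-one-out cross-validation. The isotropic kernel, the search grid for the width, and the synthetic data are assumptions; the slides do not state the settings used for the SIC 2004 entry.

```python
import numpy as np

def grnn_predict(xy_train, z_train, xy_new, sigma):
    """Gaussian-kernel weighted average of the training values."""
    d2 = ((xy_new[:, None, :] - xy_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ z_train) / w.sum(axis=1)

def grnn_loo_mse(xy, z, sigma):
    """Leave-one-out mean squared error for a given kernel width."""
    d2 = ((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)               # exclude each point from its own estimate
    pred = (w @ z) / w.sum(axis=1)
    return float(((pred - z) ** 2).mean())

# Synthetic stand-in for 200 observations and 1008 prediction locations.
rng = np.random.default_rng(1)
xy_obs = rng.uniform(0.0, 1.0, size=(200, 2))
z_obs = 100.0 + 20.0 * np.sin(3.0 * xy_obs[:, 0]) + rng.normal(0.0, 2.0, 200)
xy_pred = rng.uniform(0.0, 1.0, size=(1008, 2))

sigmas = np.linspace(0.02, 0.5, 25)        # hypothetical search grid
loo = [grnn_loo_mse(xy_obs, z_obs, s) for s in sigmas]
best_sigma = float(sigmas[int(np.argmin(loo))])
z_pred = grnn_predict(xy_obs, z_obs, xy_pred, best_sigma)
```

The only quantity tuned is the kernel width, and the grid search is deterministic, so the same data always lead to the same map.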
Results
The following table presents the participants' results for both scenarios (routine and emergency).
The results are sorted by the Mean Absolute Error (MAE) obtained in the emergency scenario. The other statistics shown are the Mean Error (ME), which measures the bias of the results, the Root Mean Squared Error (RMSE), and Pearson's Correlation Coefficient (Ro) between true and estimated values.
• GEOSTATS: Geostatistical techniques
• NN: Neural Networks
• SVM: Support Vector Machine
In each column, the best results have been bolded.
Results of the SIC 2004 exercise
Participant Method MAE(routine) MAE(joker) ME(routine) ME(joker) RMSE(routine) RMSE(joker) Ro(routine) Ro(joker)
Timonin NN 9.40 14.85 -1.25 -0.51 12.59 45.46 0.78 0.84
Fournier GEOSTATS 9.06 16.22 -1.32 -8.58 12.43 81.44 0.79 0.27
Pozdnoukhov SVM 9.22 16.25 -0.04 -6.70 12.47 81.00 0.79 0.28
Saveliev SPLINES 9.60 17.00 3.00 10.40 13.00 82.20 0.77 0.23
Dutta NN 9.92 17.50 0.20 5.10 13.10 80.60 0.76 0.29
Ingram GEOSTATS 9.10 18.55 -1.27 -4.64 12.46 54.22 0.79 0.86
Hofierka SPLINES 9.10 18.62 -1.30 0.41 12.51 73.68 0.79 0.50
Hofierka SPLINES 9.10 18.62 -1.30 0.41 12.51 73.68 0.79 0.50
Fournier GEOSTATS 9.22 19.43 -0.89 -0.22 12.51 73.50 0.78 0.48
Fournier OTHERS 9.29 19.44 -1.12 -0.12 12.56 71.87 0.78 0.53
Savelieva GEOSTATS 9.11 19.68 -1.39 -2.18 12.49 69.08 0.78 0.56
Palaseanu GEOSTATS 9.05 19.76 1.40 2.33 12.46 74.54 0.79 0.50
Rigol S. NN 12.10 20.30 -1.20 -9.40 15.80 84.10 0.67 0.12
Pebesma GEOSTATS 9.11 20.83 -1.22 0.92 12.44 73.73 0.79 0.50
Pebesma OTHERS 9.94 21.03 -1.35 4.50 13.32 72.12 0.78 0.51
Ingram GEOSTATS 9.08 21.77 -1.44 0.72 12.47 79.57 0.79 0.35
Lophaven GEOSTATS 9.70 22.20 1.20 -4.10 13.10 71.20 0.76 0.54
Saveliev SPLINES 9.30 22.20 1.60 0.60 12.60 76.40 0.78 0.41
Ingram GEOSTATS 9.47 22.53 -1.15 3.09 12.75 79.16 0.78 0.33
Pebesma GEOSTATS 9.11 23.26 -1.22 4.00 12.44 76.19 0.79 0.42
Rigol S. NN 16.00 25.30 -1.70 -11.10 20.80 87.50 0.55 0.02
Hofierka SPLINES 9.38 26.52 -1.27 4.29 12.68 77.98 0.78 0.38
Dutta NN 9.62 28.20 0.90 -0.22 12.70 80.10 0.78 0.31
Pebesma GEOSTATS 9.11 28.45 -1.22 12.01 12.44 81.41 0.79 0.38
Dutta NN 12.20 28.90 1.50 -1.29 15.90 79.90 0.64 0.33
Rigol S. NN 21.40 30.50 5.30 3.80 45.80 96.60 0.24 0.20
Ingram NN 9.72 38.29 -1.54 8.38 13.00 84.24 0.76 0.30
Dutta NN 9.93 38.50 2.18 17.98 13.30 87.30 0.76 0.27
Ingram NN 9.48 48.41 -1.22 -3.01 12.73 90.89 0.78 0.38
Pebesma GEOSTATS 9.11 146.36 -1.22 19.71 12.44 212.10 0.79 -0.27
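For reference, the four statistics reported in the table can be computed as in the sketch below. The function name is hypothetical, and the sign convention for ME (estimate minus true value) is an assumption, since the slides do not state it. The four sample values are made up for the example.

```python
import numpy as np

def sic_scores(true, est):
    """MAE, ME, RMSE and Pearson correlation between true and estimated values."""
    true = np.asarray(true, dtype=float)
    est = np.asarray(est, dtype=float)
    err = est - true                        # assumed convention: estimate minus truth
    return {
        "MAE": float(np.abs(err).mean()),
        "ME": float(err.mean()),
        "RMSE": float(np.sqrt((err ** 2).mean())),
        "Ro": float(np.corrcoef(true, est)[0, 1]),
    }

# Made-up values, only to show the call.
scores = sic_scores([100.0, 110.0, 120.0, 130.0], [102.0, 108.0, 123.0, 131.0])
```

A large gap between MAE and RMSE, as seen in the joker columns, indicates that a few very large errors (here, around the hot spot) dominate the squared-error statistic.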
Modeling of wind fields with MLP and regularization technique
(pp. 168-172 of the book)
Monitoring network: 111 stations in Switzerland (80 training + 31 for validation)
Mapping of daily:
• Mean speed
• Maximum gust
• Average direction
Input information:
• X, Y geographical coordinates
• DEM (resolution 500 m)
• 23 DEM-based "geo-features"
Total: 26 features
Model: MLP 26-20-20-3
Training:
• Random initialization
• 500 iterations of the RPROP algorithm
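A quick count of the adjustable parameters helps explain why this model is prone to overfitting. The sketch assumes a fully connected network with one bias per neuron, which the slides do not confirm; the function name is hypothetical.

```python
def mlp_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected MLP (one bias per neuron assumed)."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# 26 inputs, two hidden layers of 20 neurons, 3 outputs.
n_params = mlp_parameter_count([26, 20, 20, 3])   # 540 + 420 + 63 = 1023
n_training_stations = 80
```

Under these assumptions the network has 1023 free parameters against 80 training stations, so it has ample capacity to fit the training data exactly unless it is regularized.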
Training of the MLP
Results: naïve approach
Results: Noise injection regularization
Results: summary
Noise injection regularization
Without regularization (overfitting)
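The regularization named on these slides is read here as noise injection: at each training pass the inputs are perturbed with small random noise, so the network never sees exactly the same sample twice. The sketch below shows only that perturbation step. The Gaussian noise, its standard deviation of 0.1, the standardized inputs, and the synthetic data are all assumptions, not settings taken from the case study.

```python
import numpy as np

def inject_noise(x, noise_std, rng):
    """Return a jittered copy of the inputs: x plus zero-mean Gaussian noise."""
    return x + rng.normal(0.0, noise_std, size=x.shape)

rng = np.random.default_rng(0)
# Synthetic placeholder for 80 training stations with 26 standardized features.
x_train = rng.normal(size=(80, 26))

for epoch in range(3):
    x_noisy = inject_noise(x_train, noise_std=0.1, rng=rng)
    # A training step on (x_noisy, targets) would go here; the original
    # x_train is left untouched so that fresh noise is drawn every epoch.
```

The noise level acts as the regularization strength: with zero noise the procedure reduces to the unregularized training that overfits, while too much noise smooths away real structure.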
Next stop is:
June 20
09:00 – 12:00
Room T120
Practical work session using Machine Learning software
Thank you for your attention!