Extending ArcGIS with R. Mark Janikas, PhD, mjanikas@esri.com

Page 1

Extending ArcGIS with R

Mark Janikas, PhD

mjanikas@esri.com

Page 2

Outline

• Introduction
  – What is R? Why should I use it?
• Application
  – Point Clustering
• Integration options
  – R versus RPy
• Conclusions and Future Directions

Page 3

What is R? Why should I use it?

• R (The R Project for Statistical Computing) is an open-source data analysis package (GNU S).
  – Widely used
    • Over 60 CRAN sites across 30+ countries
  – It's free
    • GNU General Public License
  – Base is powerful
    • Statistics, linear algebra, visualization, etc.
  – It's extendible
    • 1800+ contributed extensions
    • splancs, spatstat, spdep, rgdal, maptools, shapefiles

Page 4

R Point Clustering Tools for ArcGIS

• Resource Center (Code Gallery)
• Contains two tools… that do the same thing!

Page 5

Application: Point Clustering

• Cluster a given set of point locations by:
  – Spatial proximity
  – Attribute values
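The slide does not name a clustering algorithm, so the following is only a minimal sketch that assumes k-means, with each point treated as a hypothetical (x, y, attribute) record so that both spatial proximity and attribute values enter the distance. The function name, the sample records, and the starting centers are illustrative and not taken from the tools.

```python
import math

def kmeans(points, centers, iterations=20):
    """Assign each point to its nearest center, then move each center to the mean of its members."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest center for every point.
        labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        # Update step: mean of the members, or the old center if a cluster is empty.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(old)
        if new_centers == centers:
            break  # converged: centers stopped moving
        centers = new_centers
    return labels, centers

# Hypothetical records: (x, y, attribute value)
points = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.2), (0.0, 1.0, 0.9),
          (10.0, 10.0, 5.0), (11.0, 10.0, 5.5), (10.0, 11.0, 4.8)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels)
```

Mixing coordinates and attribute values in one Euclidean distance presumes the columns are on comparable scales; in practice they would usually be standardized first.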

Page 6

Integration with ArcGIS

• Two (Three) Integration Options With ArcGIS
• Both require Python
• Both have pros and cons
• ESRI UC Plenary 2008
  – Predicting plant species in unknown areas

Page 7

Integration: R Option

• Decouples R and Python
• Python
  – Retrieves and organizes parameters from ArcGIS
  – Converts data (interchange)
    • Shapefiles, netCDF, img, etc.
  – Spawns R given the *.r file with the provided parameters
• R
  – Does the analysis

[Diagram: ArcGIS -> Python Script -> R Script]
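The "spawn R with the *.r file and parameters" step can be sketched as below. This is an illustration under stated assumptions, not the code from the tools: it assumes the `Rscript` executable is on the PATH, and the script name `cluster_points.r`, the shapefile names, and both function names are hypothetical.

```python
import subprocess

def build_r_command(r_script, params, r_executable="Rscript"):
    """Return the argument list that runs r_script with params as trailing arguments."""
    return [r_executable, "--vanilla", r_script] + [str(p) for p in params]

def run_r_script(r_script, params, r_executable="Rscript"):
    """Spawn a fresh R process and return its standard output; raises if R exits with an error."""
    command = build_r_command(r_script, params, r_executable)
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout

# In a script tool the parameters would come from ArcGIS; literals stand in for them here.
command = build_r_command("cluster_points.r", ["points.shp", 3, "clusters.shp"])
print(command)
```

On the R side, the script can read the trailing arguments with `commandArgs(trailingOnly = TRUE)`. Every call to `run_r_script` starts a new R process, which is the "out of process" cost discussed later in the deck.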

Page 8

Integration: RPy Option

• R and Python closely coupled
• RPy (RPy2)
  – Python interface to the R programming language
• Python
  – Retrieves and organizes parameters from ArcGIS
  – RPy module is imported and R commands are executed within the Python script file

[Diagram: ArcGIS -> Python Script -> R Processing]
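A rough sketch of the in-process pattern, assuming the RPy2 package (`rpy2.robjects`) rather than the original RPy, and assuming R's built-in `kmeans()` as the analysis step. The helper names are hypothetical, and the import is guarded so the file still loads on a machine without rpy2.

```python
try:
    import rpy2.robjects as robjects  # RPy2; needs R plus a matching rpy2 build
except ImportError:
    robjects = None

def kmeans_r_code(matrix_name, n_clusters):
    """Return R source text that runs kmeans() on a matrix already defined in the R session."""
    return "kmeans({0}, centers = {1})$cluster".format(matrix_name, int(n_clusters))

def cluster_in_process(xs, ys, n_clusters):
    """Run k-means inside the embedded R session and return cluster ids as a Python list."""
    if robjects is None:
        raise RuntimeError("rpy2 is not installed")
    # Copy the Python coordinates into R's global environment.
    robjects.globalenv["x"] = robjects.FloatVector(xs)
    robjects.globalenv["y"] = robjects.FloatVector(ys)
    robjects.r("xy <- cbind(x, y)")
    # Evaluate the R expression in the same process; no separate R program is spawned.
    return list(robjects.r(kmeans_r_code("xy", n_clusters)))

print(kmeans_r_code("xy", 3))
```

Because R is embedded in the Python process, the R session stays alive between calls, and only one file is needed for the script tool.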

Page 9

Which One Should I Use?

• R Option
  – Attractive to R programmers
  – “Out of Proc”: spawns R on every execution
  – Use Copy Features!!!
    • Selection sets
    • Projections and other environment variables
  – You must use an R library for handling shapefiles
    • maptools, shapefiles
  – Two files per script tool (*.py and *.r)

Page 10

R Option Code Snippet

Page 11

Which One Should I Use? Cont…

• RPy Option
  – For more advanced users (Python and R knowledge)
  – “In Process”
    • Will be MUCH faster after the first call
  – Honors selection sets
  – A robust choice of database formats
  – Will honor environment settings (GP Functions)
  – Only a single file associated with your script tool

Page 12

RPy Option Code Snippet

[Code screenshot with callouts: Source R Libraries; Create Output; Cluster Analysis; NumPy and R Interchange]

Page 13

Which One Should I Use? Cont…

• Wait… Why would I go with the R Option?
  – Doesn't have as many dependencies/layers
  – RPy
    • Python, R, and RPy builds have to play nice!
    • You must know Python, some R, and now RPy.
  – Currently there is an open bug in RPy that must be fixed in order to run in the “In Process” mode in ArcGIS
    • Manual fix in the portal tool documentation
• Both methods require the editing of Environment Variables in order to run properly
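The slide does not say which environment variables need editing. As a hedged illustration only, the check below assumes the two that commonly matter: `R_HOME` (where RPy/RPy2 look for the R installation) and a `PATH` entry that lets the R executable be found. The function name and messages are hypothetical.

```python
import os
import shutil

def missing_r_settings(env=None, which=shutil.which):
    """Return a list of human-readable problems with the R-related environment."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("R_HOME"):
        problems.append("R_HOME is not set")
    # Look for the R executable only on the PATH of the environment being checked.
    if which("Rscript", path=env.get("PATH", "")) is None:
        problems.append("Rscript was not found on PATH")
    return problems

# Report on the current machine; an empty list means both assumed settings look fine.
print(missing_r_settings())
```

Running a check like this at the top of a script tool gives a clearer message than the import or spawn error that would otherwise surface.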

Page 14

Conclusions

• R
  – Contains “cutting edge” data analysis techniques from a wide body of academic and applied fields
  – Extendible
  – Open-source
• Can be integrated with ArcGIS using Python
  – R versus RPy (RPy2)
    • Pros and cons

Page 15

Future Directions

• RPy2
• Web Portal: RTools
  – Could be expanded upon
• Calling Python from R
  – Leveraging geoprocessing within the R environment
  – RSPython: http://www.omegahat.org/RSPython/

Page 16

Links

• R
  – http://www.r-project.org/index.html
• RPy (Link to RPy2)
  – http://rpy.sourceforge.net/
• Python
  – http://www.python.org/
• NumPy
  – http://www.numpy.org/

Page 17

Related Sessions

• Developing Python Scripts for Data Analysis Tips and Tricks
  – Geoprocessing Demo Theater – W, 5:00 – 6:00
• Spatial Statistics: Using Spatial Statistics
  – TH 1:30 – 2:45
• Regression Analysis for Spatial Data with ArcGIS 9.3
  – TH 3:15 – 4:30