FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT –...

62
FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – (MONTH 12) Deliverable D4.3.1 Circulation: PU ‐ Public Lead partner: UCL Contributing partners: UCL, CNR‐IMATI‐GE, TUDelft Authors: Jan Boehm, Simon Julier, Kun Liu, Ziyi Jiang , Geoffrey Jones (UCL); Michela Spagnuolo, Silvia Biasotti, Andrea Cerri, Simone Pittaluga (CNR‐ IMATI‐GE); Roderik Lindenbergh, Beril Sirmacek (TUDelft) Quality Controller: Ewald Quak (SINTEF) Version: 1.0 Date: 01.11.2013

Transcript of FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT –...

Page 1: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

FEATUREEXTRACTIONANDCLASSIFICATIONTOOLKIT–

(MONTH12)DeliverableD4.3.1

Circulation: PU‐ Public

Leadpartner: UCL

Contributingpartners: UCL, CNR‐IMATI‐GE,TUDelft

Authors: JanBoehm,SimonJulier,KunLiu,ZiyiJiang,GeoffreyJones(UCL);MichelaSpagnuolo,SilviaBiasotti,AndreaCerri,SimonePittaluga(CNR‐IMATI‐GE);RoderikLindenbergh,BerilSirmacek(TUDelft)

QualityController: EwaldQuak(SINTEF)

Version: 1.0

Date: 01.11.2013

Page 2: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

1

©Copyright2013:TheIQmulusConsortiumConsistingof

SINTEF STIFTELSENSINTEF,DepartmentofAppliedMathematics,Oslo,Norway

Fraunhofer FraunhoferInstituteforComputerGraphicsResearch,Darmstadt,Germany

CNR‐IMATI‐GE Institute for Applied Mathematics and Information Technologies of theNationalResearchCouncil(CNR‐IMATI),Genova,Italy

MOSS M.O.S.S.ComputerGrafikSystemeGmbH(MOSS),Munich,Germany

HRW HRWallingfordLtd(HRW),Wallingford,UK

FOMI Hungarian National Mapping and Cadastral Agency (FÖMI), Institute ofGeodesy,CartographyandRemoteSensing,Budapest,Hungary

UCL University College London (UCL), Research centre for Photogrammetry, 3DImagingandMetrology,London,UK

TUDelft DelftUniversityofTechnology(TUDelft),DepartmentofGeoscience&RemoteSensing,Delft,TheNetherlands

IGN Institut National de l’Information Géographique et Forestière (IGN), Paris,France

UBO Université de Bretagne Occidentale (UBO), European Institute for MarineStudies,Brest,France

Ifremer L’Institut Francais de Recherche pour l´Exploitation de la Mer (Ifremer),Brest,France

Liguria RegioneLiguria,Genova,Italy

Thisdocumentmaynotbecopied,reproduced,ormodifiedinwholeorinpartforanypurposewithout written permission from the IQmulus Consortium. In addition to such writtenpermissiontocopy,reproduce,ormodifythisdocumentinwholeorpart,anacknowledgementoftheauthorsofthedocumentandallapplicableportionsofthecopyrightnoticemustbeclearlyreferenced.

Allrightsreserved.

Thisdocumentmaychangewithoutnotice.

Page 3: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

2

DOCUMENTHISTORY

Version IssueDate Stage ContentandChanges

0.1 Sep232013 FirstDraft

0.3 Oct042013 Commentscollected

0.5 Oct132013 Integrateservicemetadata

0.7 Oct242013 MergingContributions

0.8 Oct252013 MergingContributions

0.9 Oct262013 Proofreading

0.92 Oct282013 MergingContributions

0.94 Oct312013 QualityControl

0.96 Nov12013 QualityControl

1.0 Nov12013 VersiontobesubmittedtotheProjectOfficer

Page 4: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

3

EXECUTIVE SUMMARY

ThisreportpresentstheprogressmadeduringthefirstyearintheTask4.3Featureextraction,classification and correlation. It documents the development process, choices of tools andlibraries. It lists the software prototypes which were implemented and released, documentstheirunderlyingalgorithmsandpresentstheresultsoftheirtesting.

By the end of the first year 10 processing services were developed. The processing servicescontaintwointernalandexpertservices,whichareneededforpre‐processingofdataandasanaidindevelopment. ThesearethestatictilingofLASfilesandthequickvisualizationofpointclouds.Theremainingeightprocessingservicesaretargetedtowardstheenduser.Theyweremotivatedby theuser requirements andwere chosen to support the implementationof a fullscenariowork‐flow. The processes include point cloud to regular grid conversion, peak pointdetection,outlierclassification,3Dlocalkeypointextraction,tree‐crowndetection,detectionofflow lines and drainage basins, extraction of critical points and filtering point clouds byattributes.

TwoofthesoftwaretoolsarealreadyimplementedintheMapReduceparadigm,thechoicefortheIQmulusCloudcomputingarchitecture.Theprocessingserviceschosenforthisarethestatictiling of LAS files and the point cloud to regular grid conversion. These are typical exampleprocessesformassdataprocessing.

Mainlythreedatasetswereusedtotest theprocessingservices.Theyareallpointclouddatasets.OnedatasetisamobilemappingdatasetandtwoareairborneLiDARdatasets.

Page 5: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

4

TABLE OF CONTENTS

Executivesummary..........................................................................................................................3 

Tableofcontents..............................................................................................................................4 

1  Introduction...............................................................................................................................5 

2  BaseLibraries&DevelopmentTools................................................................................6 2.1  Hadoop ........................................................................................................................................ 6 

2.2  liblas ............................................................................................................................................. 8 

2.3  PCL ............................................................................................................................................... 9 

2.4  GitHub ....................................................................................................................................... 10 

2.5  CINDER ....................................................................................................................................... 11 

2.6  Programming Languages & Compilers ...................................................................................... 12 

2.6.1  C++ ..................................................................................................................................... 12 

2.6.2  Boost .................................................................................................................................. 12 

2.6.3  MATLAB ............................................................................................................................. 13 

3  TestDataSets..........................................................................................................................14 3.1  Bloomsbury ............................................................................................................................... 14 

3.2  London ....................................................................................................................................... 15 

3.3  Liguria ........................................................................................................................................ 16 

4  IndividualServices................................................................................................................18 4.1  Internal/Expert Services ............................................................................................................ 18 

4.1.1  Static tiling of LAS file ........................................................................................................ 18 

4.1.2  Quick Visualization for Point Clouds .................................................................................. 21 

4.2  End‐User Services ...................................................................................................................... 22 

4.2.1  Point Cloud to Regular Grid ............................................................................................... 22 

4.2.2  Peak Points ........................................................................................................................ 24 

4.2.3  Outlier Classification in Point Clouds ................................................................................. 25 

4.2.4  3D Local keypoint extraction from point clouds ................................................................ 28 

4.2.5  Tree crown recognition from mobile mapping point clouds ............................................. 29 

4.2.6  Detection of flow lines and drainage basins ...................................................................... 33 

4.2.7  Extraction of Critical Points, lines and regions from grids and triangulations .................. 35 

4.2.8  Filter Point Cloud by Attribute & Coordinate ..................................................................... 37 

5  References................................................................................................................................40 

6  Appendix–ServiceInformationTables........................................................................41 

Page 6: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

5

1 INTRODUCTION

Thisreportpresentstheprogressmadeduringthefirstyearinthetask4.3“Featureextraction,classification and correlation”. It describes the processing services developed for IQmulus aspartofworkpackage4.Task4.3istargetedatasemanticenrichmentofthedatasetsaslaidoutintheDescriptionofWork.Semanticenrichmentreferstoautomaticextractionandannotationofhigh‐levelinformationbysegmentationorclassificationprocesses.

Withinthefirstyeartenprocessingservicesweredevelopedandarenowreleasedattheendofthe first period. The processing services contain two internal and expert services, which areneededforpre‐processingofdataandasanaidindevelopment.Theremainingeightprocessingservicesaretargetedtowardstheenduser.Theyweremotivatedbytheuserrequirementsandwerechosentosupporttheimplementationofafullscenariowork‐flow.

Allprocessingservicesaredescribedinauniformwayusingstandardizedserviceinformationtables.Thesetablesgivetheservicename,adescriptionofitsfunctionality,adescriptionofthealgorithm,theconnectiontotheusecasesandfurthercharacteristics.ThesetablesarecontainedintheAppendixofthisdocument.

Amoredetaileddescriptionof theprocessing services is given in Section4of this document.Herethedetailsoftheimplementationandresultsobtainedonthetestdatasetsarepresented.For each process figures of the processed data and visualizations of the results are given toillustratethealgorithmanddemonstrateitscapabilities.

ThetestdatasetsusedtoobtaintheseresultsaredescribedindetailinSection3.Mainlythreedatasetswereusedtotesttheprocessingservices.Theyareallpointclouddatasets.OnedatasetisamobilemappingdatasetoftheLondonBloomsburyarea.TwoofthedatasetsareairborneLiDAR data sets. One data set was acquired over London, the other over the coastal area ofLiguria.Thesizeofthedatasetsisintheorderof4millionpoints.

To successfully implement the algorithms several base libraries and other tools for softwaredevelopmentwereused.Tocompletelydocument thedevelopmentprocess themain librariesandtoolsareintroducedinSection2.HadoopisthechoicefortheIQmulusarchitectureandisthereforeacruciallibraryfortheservicesimplementedintheMapReduceparadigm.Twoofthesoftwareprototypes are already implemented in theMapReduceparadigm, the choice for theIQmuluscloudcomputingarchitecture.Wechosetwotypicalexampleprocessesformassdataprocessing, namely static tilingof LAS files and thepoint cloud to regular grid conversion forimplementationintheMapReduceparadigm.

Otherlibrariesthatwereusedinthedevelopmentofsoftwareintask4.3includelibLASalibraryto read point clouds in the standardized LAS format and PCL a general purpose point cloudprocessing library. Section 2.6 further discusses the situation onprogramming languages andcompilersandrecommendationsmadeforthedevelopmentswithintask4.3.

Page 7: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

6

2 BASE LIBRARIES & DEVELOPMENT TOOLS

In the following, we briefly review the main libraries used to build the services for featureextraction, classification, and correlation. These libraries have been collected and extensivelydiscussed in collaboration with other tasks in WP4 and will be further reviewed during thedevelopmentoftheprocessingservices.

2.1 HADOOP

TheApacheHadooplibraryisanopen‐sourcesoftwarelibrarythatprovidesaframeworkforthedistributed processing of large data sets across clusters of computers. It supports a simpleprogrammingmodel, known asMapReduce.MapReducewas originally derived fromGoogle’spaperontheprogrammingmodel(DeanandGhemawat,2008).Thestrengthofthisframeworkis that itprovidesautomatedparallelizationof thecomputationandautomateddistributionofdata, while it possesses excellent scaling characteristics. Hadoop consists of a number ofmodules. For the processing in IQmulus, we focus on only two components ‐ the HadoopDistributedFileSystem(HDFS)andtheMapReduceengine.

HDFS is a distributed, scalable and rack‐aware file system. It is therefore able to handle verylargedatasets,e.g.severalterabytes.ItisthedefaultfilesystemassumedforallprocessingjobsunderHadoop.AlthoughHadoopcansupportotherdistributedfilesystemsfromdifferentcloudcomputingservicevendors,suchasAmazonS3andIBMGPFS,HDFSisexclusivelyusedforthecurrentWP4serviceimplementations.ThemainprogrammingAPIforHDFSisinJava.ACAPIisalsoprovided,knownaslibhdfs,basedonJavaNativeInterface(JNI),andimplementsasubsetoftheJavaHDFSAPI.

ForIQmulus,adistributedpointcloudLASfilereader/writerforHadoophasbeenimplementedforTask4.3usingtheJavainterface.Testingusingcurrentlyavailabledatashowsstability.Moretestingofthefilehandlerisexpectedinfutureoncetheprocessingframeworkhasbeendecided.

HadoopcurrentlysupportstwoversionsoftheMapReduceengine:MRv1andMRv2.MRv1isthedefaultoneusedformostcommerciallysupportedHadoopdistributions.MRv2,alsoknownasYARN,isunderactivedevelopmentbytheHadoopcommunityandisexpectedtobethefuturedefault. The main improvement in YARN is the replacement of the JobTracker andTaskerTracker in MRv1 with a new set of daemons such as ResourceManager andApplicationManager.Thenewdaemonsaremoreagnostictotheapplication,whichenablestheapplication to implementprogrammingmodelsbeyondMapReduce, such asmachine learningand general cluster computing. YARN maintains to a large extent the API backwardscompatibilitywithMRv1.

The currentTask4.3 implementation follows theAPIprovidedbyMRv1,becausenodecisionhasyetbeenmadeforthespecificprocessingframeworkandversionatthepointofwritingthisdocument.AsmallpercentageofcodeadaptationforusingthenewYARNengineisexpectedforfuturework.

HadoopisdevelopedinJava,andJavaoffersthemostflexibleapproachtoimplementdistributedjobsunderHadoop.ThereareinterfacesforotherprogramminglanguagesinHadoop,knownasPipesandStreaming.WhilePipesaredesignedforjobimplementations(map/reducefunctions)in C++, Streaming is more intended for text processing using applications produced in otherlanguages.

Page 8: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

7

TheC++applicationslinkedagainstHadoopPipesinheritsimilarclassinterfacesasinJava,andeachapplicationisrunasasub‐processoftheJavaVirtualMachine.Thecommunicationisviasockets,andhencethekeyandvalueclassesinPipesarebytebuffers,representedinC++asSTLstrings. The Hadoop Pipes only support custom Mappers and Reducers in C++, moresophisticateddistributedfilehandlingandmapoutputconsolidationusingPartitionerstillneedtobeimplementedinJava.ApointcloudtoregulargriddemohasbeenproducedusingHadoopPipesforTask4.3.

TheHadoopStreamingAPIusesUNIXstandardstreamsas the interfacebetweenHadoopandtheapplication.Thereforeanyapplicationthatreadsfromstandardinputandwritestostandardoutput can support Mapper or Reducer functions in a MapReduce job. A combination ofStreaming and Pipes is expected for future service implementation depending on the specificformofserviceinput.

Figure 1: ClouderaManager showing the services “LAS file tiling” and “Point Cloud to regularGrid”onafournodetestcluster.

Page 9: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

8

Figure2:Hadoopstartscreen.

There are several existing commercially supported Hadoop distributions. Testing for Hadoopclusters and its upper infrastructure services is performed using the Cloudera CDH4distributionswithdeploymentthroughVagrant.A4‐nodeHadoopclusterrunningHadoop2.0isusedtotestallcurrentTask4.3serviceimplementations(startscreeninFigure2).TheClouderaclustermanagement(seeFigure1)providescontrolthroughitsownweb‐basedinterfaces,withrich access to job management, HDFS file management, etc., in addition to existing Hadoopcommandlinefunctionsandawebinformationpage.WP3willprovideaHadoopclusterforthefuturedevelopment. Still it canbe reasonable tousea local cluster in the followingphasesofIQmulus for development and testing. The local cluster can even be connectedwith themainclustertoformabiggercluster.

2.2 LIBLAS

libLAS is an open‐source C/C++ library for reading and writing the LAS LiDAR point cloudformat (Butler et al., 2013). The library is implemented in C/C++. In addition, libLAS alsoprovidesaPythonAPIandcanbeusedinvariousotherlanguagessuchasC#andVB.NET.libLASis distributed under the BSD license. libLAS currently provides full support for all LAS fileformats up to version 1.2. Newer versions of the LAS file format include additions such aswaveformdataandalongerheader.However,thesearenotcurrentlysupportedbylibLASandthereforethefurtherdevelopmentsofthislibraryneedclosemonitoring.Potentiallythelibrarycouldbereplacedwithlidarformat(“lidarformat‐C++pointcloudlibrary,”2013)orbothcouldbeusedincombination.

AdvancedLASdatamanipulationcanbeachievedboththrough thetoolapplicationsprovidedbylibLASorthroughitsprogrammaticAPI.FunctionsprovidedbylibLASincludeinput/outputwithLASbinarydata,aswellasvariouspointrecordbasedoperations includingdata filteringand statistics production. The prerequisite libraries for building libLAS include Boost version1.38orhigher,withGDAL, libgeotiff and laszip toprovideadditional functionalities.CMake isusedasthedefaultbuildingtool.

The latest stable release for libLAS is version 1.7. Services in Task 4.3 with single nodeimplementationsalluselibLAStodealwithLASfileinput/output.Version1.7oflibLASisstableforcurrentlyplannedTask4.3servicesusingthetestdata.Futuredevelopment isexpectedtousethesameversionoflibLAS.

Page 10: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

9

2.3 PCL

ThePointCloudLibrary(PCL)isaBSD‐licensed,open‐sourceC++libraryforn‐Dpointcloudand3Dgeometryprocessing.Thelibrarycontainsvariousstate‐of‐the‐artalgorithms,whichsupportarangeofoperationsincludingfiltering,featureestimation,surfacereconstruction,registration,model fitting and segmentation. These algorithms can be efficiently used to implement theservicesinTask4.3,e.g.,outlinerclassificationandfeaturepoints.

PCLiswritteninC++andisfullytemplated.Itisacross‐platformlibrarythatcanbebuiltunderWindows, Linux and Mac OS X. In addition, the library heavily utilizes Streaming SIMDExtensions(SSE)toobtainhighperformanceonmodernCPUs.MostofmathematicaloperationsareimplementedusingtheC++linearalgebralibraryEigen.Moreover,OpenMPissupportedinPCL,which provides parallel implementation formany algorithms such as normal estimationandmoving least square fitting. In geometryprocessing, k‐nearestneighbour (KNN) search isfrequentlyusedandthusafastKNNmoduleiscrucialtothewholeframework.PCLmainlyusesthe library FLANN, i.e., Fast Library for Approximate Nearest Neighbours (Muja and Lowe,2009),which is about one order ofmagnitude faster in query time than previously availableapproximate near neighbour search software (Mount and Arya, 1998). Algorithms for spatialdata management, including octrees, are also provided. All the modules in PCL pass data bymeansofBoostsharedpointerwithoutexplicitlycopying.CurrentlythesharedpointerisapartoftheC++11standard.

PCLisdevelopedwithefficiencyandperformanceinmind,anditincludesmanytechniquesandlibrariestooptimiseperformance.Nonetheless,itcanbeveryeasilyintegratedintoauser’sownproject. The same basic interface is used in all the algorithms as shown in Figure 3. First, anobjectiscreatedforprocessing.Oneexamplewouldbeanobjecttoperformnormalestimation.Subsequentlytheinputpointcloudispassedtothecreatedobject.Parametersaresetthereafter.Finally the processing is executed and the results are obtained.A snippet of example code toestimate normals is presented in Figure 4. As illustrated in Figure 4, the basic interface iscompactandsimple.Moreover,sinceallthedatastructuresinPCLarewrittenusingtemplates,this provides enough flexibility to modify the library to satisfy special requirements, e.g.,introducingnewpointtypes,butwithouttheneedtocreateintermediatewrappercode.

Figure3:BasicinterfaceschemeforPCLcode.

Duetoitsefficiencyandconvenience,PCLisusedasacorelibraryforpointcloudprocessinginTask 4.3. In our current implementation, PCL version 1.6.0 is used. The library is wellmaintainedbyalargedevelopergroup.Thelateststablereleaseistheversion1.7.0andversion2.0isexpectedpossiblyin2014.Therefore,inthefuturewewillcontinueusingthelibraryasacorelibraryforpointcloudprocessing.

Page 11: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

10

// Create the normal estimation class, and pass the input dataset to itpcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne; ne.setInputCloud (cloud);  // Create an empty kdtree representation, and pass it to the normal estimation object. // Its content will be filled inside the object, based on the given input dataset (as no other search surface is given). pcl::search::KdTree<pcl::PointXYZ>::Ptr tree (new pcl::search::KdTree<pcl::PointXYZ> ()); ne.setSearchMethod (tree);  // Output datasets pcl::PointCloud<pcl::Normal>::Ptr cloud_normals (new pcl::PointCloud<pcl::Normal>);  // Use all neighbors in a sphere of radius 3cm ne.setRadiusSearch (0.03);  // Compute the features ne.compute (*cloud_normals); 

Figure4:AC++examplewhichshowshowtocomputenormalsinapointcouldusingPCL

2.4 GITHUB

Duringthefirstyearofdevelopmentitwasnecessarytoexchangecodewiththepartners.Thiswas for example needed to distribute example code for the first code camp (Darmstadt, June2013),which explains the combined use of libLAS and PCL, or the normal estimation from apointcloud.WechosetouseGitHubtosharecodeamongstpartners.Thesamemechanismwasused to distribute the first code demonstrating the use of MapReduce and Hadoop as apreparationofthesecondcodecamp(Paris,October2013).GitHubisaweb‐basedsoftwareprojecthostingservicethatusestheGitversioncontrolsystem.GitHuboffersbothprivateandpublicrepositories.AlthoughGitHubactivelyencouragesuserstochoose theopen‐source‐licensingmodel for theirproject, thedefault copyright laws apply forprojectsinabsenceofalicense.GitHubprovides rich codeviewanddevelopment functionality through itsweb interface.Thebasicconceptofbranchingandversioncontrol forthecode isbasedon itsunderlyingversioncontrolsystemGit.WithproperSSHsetup,theGitHubrepositorycanbeaccessedthroughlocalGit applications on any authorised development machine. In addition to the basic Gitfunctionality,theGitHubwebinterfacealsoprovidesothereasy‐to‐accessfunctionssuchasWikipages for code documentation; a live issue tracking system allows developer interaction andvisualbranchandpullrequestmanagement.

Page 12: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

11

Figure5:GitHubWebinterfacewithTask4.3codesamples

Git isadistributedversioncontrolsystemusedbyGitHubfor itsbackbonecodemanagement.GitwasoriginallydesignedanddevelopedforLinuxkerneldevelopmentin2005.Forworkingwith GitHub, Task 4.3 is currently using a private GitHub repository maintained by UCL toprovidecodesharingandhostingfortheservicedevelopmentactivities(Figure5).FuturecodesharingactivityisexpectedtocontinuethroughtheGitHubservice.

2.5 CINDER

“Cinderprovidesapowerful,intuitivetoolboxforprogramminggraphics,audio,video,networking,imageprocessingandcomputationalgeometry.Cinderiscross‐platform,andingeneraltheexactsamecodeworksunderMacOSX,Windowsandagrowinglistofotherplatforms—mostrecentlytheiPhoneandiPad.”–sourcedfromhttp://libcinder.org/about

Cinder in the context of IQmulus is a fast (yet elegant)way to visualise the point cloud data.Cinder hasminimal dependence on third party libraries and is designed to leverage platformnativecapabilitieswhereavailable,assuchitislighterweightthanothercross‐platformfront‐end frameworks such asQt. Internally Cinder utilises aspects of the latest C++11 standard tocreate an internallymemorymanaged framework for efficientlymarshalling againstmemoryleaks, both for framework objects and, via the use of transparent C++ wrappers, of OpenGLresourcesaswell.

Cinderwasoriginallydesignedasacross‐platformframeworkforgraphicsartists,andalthoughithasmaturedintoamorebroadlyapplicabletoolkit,thecorepremiseofflexibilityandeaseofuse has remained. This was apparent by the straightforward integration to make the Cinder

Page 13: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

12

interfacewithPCLworkfordrawingandloadingpointclouds.Therearealsoseveraloptional,yet actively maintained, interface classes for leveraging other mainstream libraries such asOpenCV (current minimum version 2.1). The down sides of this open‐ended design are thatbeyond text rendering, the UI construction andmanagement is up to the user to implement,although a positive of this is that it is almost guaranteed not to look like a cookie cutterapplication which can happen when using frameworks that have morecomprehensive(/inflexible)off‐the‐shelfofferings.

The Cinder library is an industry‐strength tool originally developed for internal use by theBarbarianGroupand it continues tobedevelopedandmaintainedbyanactivecommunity. IthasrecentlyreceivedtheendorsementofHerbSutterforbeinga“flexibleandfun”frameworkfor rapid development. For the code developed in Task 4.3 version 0.8.5 was used, whichdependsonOpenGLandBoost.

2.6 PROGRAMMING LANGUAGES & COMPILERS

In the following sectionswedocument the guidelines and recommendationswemade for thesoftwaredevelopedinTask4.3andwetrytoprovidetherationalesforthechoices.

2.6.1 C++

In this phase of the project many processes were implemented in C++. This is perfectlyreasonableforfastprocessesoftenimplementedassinglethreadedprocesses.However,asthepartnersareusingdifferentplatformsandconsequentlydifferentcompilers, thiscreatessomeproblems when exchanging code. We try to give some recommendations to harmonize codedevelopmentandmaketheresultingcodemore(re‐)usable.

For one we recommend the use of a C++ style guide. We chose the Google C++ Style Guide(“Google C++ Style Guide,” 2013), as it is awell‐written, comprehensive and easily accessiblestyleguide.Itisimportantthatweusethisasaguidenotarule.

WefurtherrecommendtheuseofexistinggeneralpurposelibrariessuchasBoost(seebelow)as it is peer‐reviewed code andnicely documented.Manyof the other libraries suggested forTask4.3useBoostalready.

AtthemomentC++11(“TheStandard :StandardC++,”2013)isnotsupportedbyallcompilers.AgaininordertoimprovetheexchangeofcodewerecommendswitchingtoaC++11compatiblecompilerwhentheybecomeavailable.Inparticulartheseare

1. VisualStudioC++20132. GNUGCC>=4.8(providesfullC++11support)

2.6.2 Boost

“Boost isasetof librariesfortheC++programming languagethatprovidesupportfortasksandstructures such as linear algebra, pseudorandom number generation, multithreading, imageprocessing, regular expressions, and unit testing. Release 1.52 contains over eighty individuallibraries”.[Wikipediapage]

Givenitsbroadrangeoffunctionality,BoostisrequiredbylibLAS,PCLandCinderandisusedbycommercial packages includingMatlab and Adobe Acrobat. The libraries aremostly released

Page 14: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

13

under the Boost Software License, which allows for the use of the libraries in commercialproducts.

Given its widespread use bymultiple IQmulus libraries, this librarywill continue to be usedthroughoutthedurationoftheproject.

2.6.3 MATLAB

MATLAB (“MATLAB ‐ The Language of Technical Computing,” 2013) is both a numericalcomputing tool and a programming language. Within Task 4.3 MATLAB will not be used todeliver processes. However, it is an important prototyping tool. Particularly the statisticaltoolbox provides operators that are useful for Task 4.3. The machine learning tools are animportant reference and benchmark for processes developed within IQmulus. The class ofoperatorswith special importance are supervised classification algorithms.MATLABprovidesthefollowingstate‐of‐the‐artimplementations:

Boostedandbaggeddecisiontrees Supportvectormachines(SVMs) NaïveBayesclassifiers Nearestneighbours(kNN) Discriminantanalysis Neuralnetworks

Accessing readily implemented classification algorithms allows testing and benchmarkingdifferent classification approaches, without the implementation overhead. Since theMATLABoperators can currently be limited in performance and memory, they are not suited fordissemination.Forthealgorithmsselectednativeimplementationsarepreferred.

Page 15: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

14

3 TEST DATA SETS

During development several data sets were used to test the implementations. The followingsectionsgiveanoverviewofthemaindatasets.Wherepossiblewegivethesource,extentandsizeofthedataanddescribetheapproximatelocation.

3.1 BLOOMSBURY

TheBloomsburydatasetisamobilemappingLiDARpointcloudacquiredbyABASurveying.TheABAmappingvanisequippedwiththreelaserscannerheads.ThetrajectoryiscomputedfromGPS, IMU andwheel sensor data. Each scanner head can acquire up to 1,000,000 points persecond.Figure6showsthemappingvanandanoverviewofthepointcloud.ThetestsceneisatrajectoryintheBloomsburyareaofLondonaroundthecampusofUCL.

Figure6:MobileMappingvan(left)andcaptureddataoverBloomsbury,London(right).

Thepointcloudhasaspacingofroughly10cmalongthetrajectoryofthevanandonaverageof2 cm across the driving direction. A detail of the point cloud demonstrating the high pointdensity is given inFigure7.Besides the3Dcoordinates thebackscatteredenergyof the laserbeamisalsorecordedforeachpoint.Thiscanbedisplayedasanintensityvalue.Forthetestasingletileofthedatasetwith4448634pointswasused.

Page 16: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

15

Figure7:Detailofthepointcloud.

3.2 LONDON

TheLondondatasetisanairborneLiDARpointcloudoverLondon.ThedataisstoredinaLASfileversion1.2.Asingletilecontainsabout2,000,000points.Thepointdensityisabout4pointsper square meter. The dataset represents the current standard product for commerciallyavailable fixedwingairbornedataacquisition.Thefeaturesstoredperpointaretheattributestypicallyexpectedforanairbornedataset:

'X' 'Y' 'Z' 'Intensity' 'ReturnNumber' 'NumberofReturns' 'ScanDirection' 'FlightlineEdge' 'Classification' 'ScanAngleRank' 'UserData' 'PointSourceID' 'Time'

TheClassificationattributeconformstotheASPRSstandard.ThevisualizationofatileisgiveninFigure8.Asanexample,thespecifictileconsistsofthefollowingclasses(numberofpointsgivenfirst):

5840Unclassified 584307Ground

Page 17: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

16

316287LowVegetation 225053MediumVegetation 974272HighVegetation 2752LowPoint(noise) 1635ReservedforASPRSDefinition

Figure8:SingletileofLondonairborneLiDARdatasetusingClassificationattributeforcolouring.

3.3 LIGURIA

TheLiguriadatasetisanotherairborneLiDARdataset.Figure9showsthedatasetusingthescanangleforcolouringwhichvisualizesthedifferentflightstripsusedfortheacquisition.Thesingletiledatasetcontains4134193points.Thepointcloudcoversabout0.9squarekilometres.Onaveragethepointdensity is4pointspersquaremeter.Thedatasetcontainsonly twodistinctclassesforclassification:

3726654Unclassified 407539Ground

Page 18: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

17

Figure9:AirborneLiDARdatasetoverLiguriausingScanAngleforcolouring.

Page 19: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

18

4 INDIVIDUAL SERVICES

OneofthecoreoutputsofTask4.3inthefirstyearisthedevelopmentandimplementationofatoolkitforfeatureextractionandclassification.Thissectiondocumentsthedevelopmentofthetoolkitandmotivatesthedecisionsthatweremadeduringtheimplementation.Foreachprocesswe demonstrate the results on one of the example data sets. The following table gives anoverviewofalltenprocessingservicesincludedinthetoolkit.

Processing Task  Lead Partner 

Static Tiling of a LAS File  UCL 

Quick Visualization  UCL 

Point Cloud to Regular Grid  UCL 

Peak Points  UCL 

Outlier Classification in Point Clouds  UCL 

3D Local Keypoint Extraction from Point Clouds  TUDelft 

Tree crown recognition from mobile mapping point clouds  UCL 

Detection of Flow Lines and Drainage Basins  IMATI 

Critical Points  IMATI 

Filter Point Cloud by Attribute & Coordinate  UCL 

4.1 INTERNAL/EXPERT SERVICES

Asetofinternalprocessesisneededforpre‐processingofdataandasanaidfordevelopment.Sincetheresultsofmostprocessescanbeeasilyassessedbyvisualization,aquickvisualizationtool is essential for the developers. This is outside of the scope of WP5, which targets thedevelopmentofvisualisationtoolsforendusers.Statictilingof(potentially)largeLASfileshelpstokeepdatasizeforlocalnodeslow.Itisanessentialtoolfordistributingloadacrossacluster.

4.1.1 StatictilingofLASfile

This service divides a single large LAS file intomultiple smaller tiles, each ofwhich is bettersuitedforprocessingatanindividualnodeinthecluster.Theinputisanunorderedpointcloudin LAS format. The input parameters are the tile size and the tile‐overlapping ratio; both areindicatedinthex‐andy‐axisdirection.TheoutputisasetofmultipleunorderedpointcloudsinLASformat,oneforeachtile.

Page 20: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

19

Thisserviceisintendedtofacilitateotherservices.Itisespeciallyhelpfulforthoseservicesthatare not able to handle data sizes beyond the memory size. It effectively reduces the size ofindividualLASfiles.Usingoverlappingtilesmakesitpossibletoaddressedgeeffectswhichcanariseatthetileborders(e.g.,filtering).

Theserviceis implementedusingtheMapReduceframeworkinJavaandthereforescaleswiththesizeoftheHadoopcluster.TheLASfileI/OfunctionalityisastandalonepackagethatcanbeusedbyotherTask4.2andTask4.3serviceswhichuseLASfilesforpointcloudstorage.

Abriefdescriptionoftheserviceimplementation:

## Pre-Mapper: Input: x_tile_size Input: y_tile_size Input: x_overlap_ratio Input: y_overlap_ratio # Read LAS file header for point cloud boundary x_max, x_min, y_max, y_min = readHeader(); # Calculate number of tiles in x- and y-axis x_tile_num = ceil((x_max – x_min) / x_tile_size) y_tile_num = ceil((y_max – y_min) / y_tile_size) set_num_of_reduce_task(x_tile_num * y_tile_num) Output: x_tile_num Output: y_tile_num ## Mapper: (Key1, Value1) -> (Key2, Value2) Input: k1 # Key1, the offset of LAS point record in the file, Bytes Input: v1 # Value1, the input LAS point record # Calculate tile number in x- and y-axis x_tile = ceil((v1.x – x_min) / x_tile_size) y_tile = ceil((v1.y – y_min) / y_tile_size) # Calculate the tile id tile_id = x_tile_num * (y_tile – 1) + x_tile; k2 = tile_id v2 = (v1.x, v1.y, v1.z) Output: k2 # Key2, the tile id for individual point Output: v2 # Value2, the point # Calculate tile number with inflated tile_size x_overlap = ceil((v1.x – x_min) / (x_tile_size * (1 + x_overlap_ratio))) y_overlap = ceil((v1.y – y_min) / (y_tile_size * (1 + y_overlap_ratio))) # Calculate the inflated tile id tile_id_overlap = x_tile_num * (y_overlap – 1) + x_overlap; # If the point ends up in different tile, output the new id as well

Page 21: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

20

if (tile_id_overlap != tile_id) k2’ = tile_id_overlap v2’ = v2 Output: k2’ Output: v2’ ## Partitioner: (Key2, Value2) -> Reducer id Input: k2 Output: k2 – 1 # tile id corresponds to the reducer id ## Reducer: (Key2, Value2s) -> Output Input: k2 Input: v2s # All the values with the same k2 # Send point to the output file print(v2) Output: Tiled LAS file TheexecutionofthecodeiswrappedwithinBashscripttoprovideHadoopjobsetupandafter‐job data retrieval in addition to the main code functionality. Figure 10 and Figure 11demonstrate a tiling example of the Liguria data set with 500m by 500m tile size and 10%overlapping ratio. The original point cloud is coloured in grey in the middle of the figure,whereasthetilesarecolouredandscatteraroundtheoriginaldatafordemonstration.

The performance of current service implementation depends on the specific Hadoop clusterdeploymentanduserinputs.Forexample,theexecutiontimeforasingle‐nodeHadooprunontheLiguriadatasetwiththeuserinputsasshowninFigure11isabout50seconds.

Figure10:Dividingapointclouddataset(left)intosixseparatetiles(right).

Page 22: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

21

Figure11:Three‐dimensionalrenderingoftiling(originaldatagrey,resultingtilesincolour).

4.1.2 QuickVisualizationforPointClouds

Asexplainedabove,foralgorithmdevelopmentthereisarequirementtoprovidetechniquestorapidlyvisualizedatasuchasintermediateresults.Thisisnotthesameasthefullvisualizationsystemwhichwill be developed underWP5 to support end users. Rather,we require simpletoolswhichwillsupportdebuggingandrapidprototyping.Asstatedpreviously inSection3.3,PCL is utilized as the core library for point cloud processing. PCL supports several ways tovisualizecloudsinitsnativeformat.However,thereisnosupportwhenanexternalformat(suchasLAS)mustbevisualized.

Given these issues, a tool for quick visualizationhasbeen implementedusing the lightC/C++libraryAntTweakBar.VisualizationisachievedusingOpenGLandGLUT.Theframeworkissuchthat it is easy to introduce custom code to provide further visualization and interactioncapabilities.Moreover, AntTweakBar facilitates the creation of graphical user interfaces (GUI)for theapplication.BythemeansofAntTweakBar,bar,slider,andbuttoncanbeaddedto theparameterpanelwithoutalotofcodingwork,whichacceleratestheprocedureofprototyping.

Figure12 shows thegraphicaluser interface (GUI)providedby the currentvisualization tool.ThefigureintheLIGURIApointclouddatasetisvisualizedinthecentralwindowoftheUI.Thesemi‐transparentbluepanelatthetop‐leftcanbeusedtosetparametervalues.Threewaystoset parameter values are provided: rotating slider, increasing or decreasing by buttons andtyping in input fields. In thebottom‐left, thehelppanel isdisplayed includingshortcutsandasimpledescriptionoftheUI.

Page 23: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

22

Figure12:Visualizationofanexamplepointcloud.

4.2 END‐USERSERVICES

Thefollowingprocessesaretargetedattheendusers.Theydrawtheirmotivationfromtheuserrequirementsandtheyarebuildingblocksinaccommodatingtherequiredworkflows.

4.2.1 PointCloudtoRegularGrid

Thisserviceconvertsanunorderedpointcloudintoagridrepresentationin2.5D.Theinputisanunorderedpointcloud inLAS format.The inputparametersare thegridsamplingdistance(“spacing”)inbothx‐andy‐direction.TheoutputisagriddedpointcloudinLASformat.ItisalsopossibletooutputGeoTIFFforcompatibilitywithcertainGISsoftwarepackages.

ThisserviceisimplementedusingtheMapReduceframeworkinJavaandthereforescaleswiththe size of the Hadoop cluster. The approach is to generate keys based on point coordinateswhichidentifythegridpositions.TheLASfileI/Ofunctionalityisimplementedusingthesharedlibrary. This service has also been implemented in C++ using Hadoop Pipes to test thecombinationofC++andJavaforHadoopparallelisationusage.

Abriefdescriptionoftheserviceimplementation:

## Mapper: (Key1, Value1) -> (Key2, Value2) Input: k1 # Key1, the offset of LAS point record in the file, Bytes Input: v1 # Value1, the input LAS point record Input: grid_res_x # the x grid sampling distance, in meters Input: grid_res_y # the x grid sampling distance, in meters # Calculate lower tile coordinates in x- and y-axis lower_grid_x = floor(v1.x / grid_res_x) * grid_res_x lower_grid_y = floor(v1.y / grid_res_y) * grid_res_y k2 = (lower_grid_x, lower_grid_y) v2 = v1.z Output: k2 # Key2, the tile coordinates Output: v2 # Value2, the height information

Page 24: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

23

## Reducer: (Key2, Value2s) -> Output Input: k2 Input: v2s # All the values with the same k2 # Loop through all the value with the same key for v2 in v2s: sum_z += v2.z z_out = sum_z / v2s.size() # Taking average # Send point to the output file print(k2.lower_grid_x, k2.lower_grid_y, z_out) Output: Gridded LAS file

TheexecutionofthecodeiswrappedwithBashscripttoprovideHadoopjobsetupandafter‐jobdata retrieval in addition to the main code functionality. Figure 13 demonstrates a griddingexampleoftheLiguriadatasetwithasamplingdistanceof10mby10m.Theoriginalpointcloudiscolouredingrey,andthegeneratedgridiscolouredinyellow.

TheperformanceofthecurrentserviceimplementationdependsonthespecificHadoopclusterdeployment,anduserinputs.Forexample,theexecutiontimeforasingle‐nodeHadooprunontheLiguriadatasetwiththeuserinputsasshowninFigure13isabout40seconds.

Figure13:Regulargriddisplayedontopofdensepointcloud.

Page 25: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

24

4.2.2 PeakPoints

Thisprocessingservicedetects localpeakpoints.For thisprocessapeak isnotdefinedby itslocalshape,butbytherelationofitsheightwithrespecttoitsneighbours.Thedetectorforpeakpointscanbestbedescribedasanon‐maximumsuppressionapproach.Eachpoint inthelocalneighbourhood is inspected. If theheightof thepoint is larger thananyof itsneighbours it iskeptasapeak.Itisdiscardedotherwise.

Theimplementationoftheapproachisbasedonakd‐treeusingtheFLANNlibrary.Tobuildakd‐treerequiresO(nlogn)timewithrespecttothesizeoftheinputcloud(n).Thesearchofthetree is of complexity log n for a fixed search radius (Preparata and Shamos, 1985). Theparameters for the search are the radius and a sensitivity parameter. This is a heuristicintroducedtoavoiddetectingpeaksinpredominantlyflatregions.ResultsareshowninFigure14.

Thepseudocodedescriptionoftheimplementedapproachisthefollowing

Foreach point p in cloud

Q = RadiusSearch (p, Radius)

If height(p)> height(every q in Q) AND

height(p)> height(lowest q in Q + Sensitivity)

Add p to Peaks

Endif

Endfor

Theoperator is currently implemented fora singleCPU ina sequentialmanner.Theoperatorcanbeextendedtolargedatasets,usingtilinganddistributionoverseveralnodes.TheOperatorsynopsisonthecommandlineis

C:\>iQmulus_peak.exe --help

Allowed options:

--help produce help message

--input arg (=test.las) set input file

--output arg (=test.shp) set output file

--radius arg (=5) set search radius

--sensitivity arg (=1) set sensitivity

Page 26: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

25

Figure14:Peakpointsshowninwhitefromanunorderedpointcloud.

4.2.3 OutlierClassificationinPointClouds

This processing service performs statistical outlier detection to identify isolatedpointswhichareusuallyspuriousandarisefromerrorsinthecaptureprocess.Oneinputisthethresholdofhow many standard deviations away from the global mean neighbourhood distance isacceptabletoqualifyinliers.Thesizekreferstothekneighbourstosampleinordertogeneratethe statistical profile of the input data. This separates the input point cloud into inliers andoutliers.ResultsofthisprocessingareshowninFigure16andFigure17.

Algorithminpseudocode:

Inputk#nearestneighbourhoodsizeInputsd_tol#tolerancethresholdInputcloud#inputpointcould#calculatethemeandistancebetweenpointsinallneighbourhoodsForallpiincloud

ui=mean_dist(pi,nearest_neighbours(k,pi))#calculateglobalstatisticsforneighbourhoodsinthiscloudU=mean(u)SD=standard_deviation(u)Threshold=U+(SD*sd_tol)#dividethecloudintoinliersandoutliersForalluiinu

Ifui<=Thresholdinliers.push_back(pi)

Page 27: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

26

elseoutliers.push_back(pi)

Outputinliers#pointswhoseneighbourhoodiswithintoleranceOutputoutliers#pointswhoseneighbourhoodisoutsidetolerance

Results

Figure 15: Problematic outliers in the Bloomsbury data set, due to laser beam reflection(pointsbelowstreetlevel).

Figure 16: Automatic removal of outliers (marked red). Some vegetation points were falselyclassifiedasoutliers.

Page 28: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

27

Figure17:Exampleofremovingpower‐linesovertheterrainbyautomatedoutlierdetection.

Implementation

TheserviceisimplementedinC++.ThecorealgorithmusesPCLwhichinturndependson

Boost>1.46 Flann>1.71 Eigen>3 Vtk>5.6

The service is connected with an optional visualization tool for development and debuggingwhichdependsonCinderandrequires

OpenGL Boost

ThecodeoftheservicebuildsandrunsonLinuxandWindows.

Thecomputational time inrelationtodatasizescalesapproximatelyat(n logn)basedonthetimetakenfortheconstructionofthekd‐treeandnearestneighbourqueries.

Since the algorithmcomputes the globalmeandistance for the inputdata this could result invaryingidentificationofoutliersacrossdifferentinputdatasets.Thisconstrainsthelocalityofthealgorithmandcomplicatesdistributionofthedata.

Page 29: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

28

Toovercomethisproblemeitherthemeanandstandarddeviation(SD)couldbecalculatedassamplemeanandSD.Alternativelyasynchronisationstepcouldgatherall thetile localmeansandvarianceandcombinethemtocreateaglobalthresholdwhichisthenusedforalltiles.Thisresultsin:

1) Tiledata2) Computemean+SDofneighbourhoodsoneachtile3) Gather+combinetogettheglobalmean+SD4) Computeglobaltolerancethreshold5) Foreachtileseparateoutliersusingglobalthreshold

Fora4millionpointdatasettheexecutiontimeis intherangeof5‐20secondsdependingonmachinecapabilities.

The current implementation runs on a single node single thread implementation and couldpotentially process larger data sets in shorter times through parallelisation across tileconfigurations.

FutureDevelopment

nanoflann(https://code.google.com/p/nanoflann/)maybeanalternativetousingthePCLkd‐tree construction and approximate nearest neighbour (ANN) search interfaces. This wouldreducethedependencies(nanoflannhasonlySTLdependencies),furtheritcouldpossiblyofferupto2xspeeduponsearchandconstructionofthekd‐tree.

Thiswould be reliant on a successful tiling of the input data into contiguous tiles and on thesynchronisationstepmentionedabove.

4.2.4 3DLocalkeypointextractionfrompointclouds

Featuresingeneralaredistinctive locationsofboth2Dand3Dinputdata. Theycanbeeitherpoints given by certain coordinates, or small regions. In our case we consider points to berepresentedbytheir3Dcoordinatesandarethusreferredtoaskeypoints.Werefertothemaslow‐levelfeaturestoindicatethattheextractedkeypointmightbemeaninglessfortheenduser,however it contains crucial information for other internalmodules of the IQmulus system forfurtherprocessingandtogenerateendresults.Onesuchmoduleistheregistrationoftwodatasetsforwhichcorrespondingkeypointsarecrucialinput.

Theprocedurecanbesummarizedasfollows:

1. UsingtheLibLASlibrarytoreadthepointclouddatawhichisin“.las”fileformat2. CalculatethesurfacenormalsusingthePCLlibrary3. Extractthelocalmaximaofthesurfacenormalvaluesaslocalfeatures4. UsetheVTKlibrarytodisplaytheinputpointcloud(optional)5. UsetheVTKlibrarytolabelthelocationsonthe3Ddisplaywherethelocalfeaturesare

extracted(optional)

Page 30: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

29

Figure18:ExampleoutputoflocalkeypointsontheBrittanypointcloudbasedonanalysisoflocalnormals(DoN).

4.2.5 Treecrownrecognitionfrommobilemappingpointclouds

Thepurposeofthisprocessingserviceistoclassifythepointclouddataaccordingtounderlyingobject type. For this initial implementation, the classification attempts to separatemanmadefrom naturally occurring features. It exploits the fact thatmanmade objects tend to bemoreplanar.

AstandardapproachtoinspectthelocalspatialdistributionofpointsistocomputeaPrincipleComponentAnalysis(PCA)(Demantkéetal.,2011).Thisresults in threeEigenvectorsandthe

correspondingEigenvalues λi. TheEigenvalues relate to the semi axis lengths ii of the

corresponding ellipsoid. Therearethreeextremecases:

321 :mainlydistributedalongoneaxisdirection

321 :mainlydistributedalongtwoaxisdirections

321 : distributedalongalldirections

Dimensionalitydescriptorsnormalizetheseeigenvaluerelationships,anddefinehowclosetheindividualneighbourhoodshapeistooneofthethreecasesdescribedabove:

1211 / D

1322 / D

133 /D

Twoclassificationalgorithmshavebeendeveloped.ClassificationalgorithmAisahand‐crafteddecisiontreeusingdescriptorsD2andD3.Thisservesasabaselinealgorithmtoseeifmachine

Page 31: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

30

learningcanactuallyimproveclassificationresults.TheclassificationalgorithmAwasshowntoevaluatethetwomanuallylabelledtestdatasetswithonlyplanarsurfacesoronlytree‐crowns.ResultsonthefulldatasetareshowninFigure21.

PlanarSurface TreeCrowns

Totalpoints 77566 58955

Correctlylabelled 96% 83%

Falselylabelled 4% 17%

Classification algorithm B is a random forest classification using all three descriptors. WecurrentlyusetheMATLABimplementationtoperformrandomforesttrainingandclassification.TheclassifierBwasevaluatedonthesametestdatasets.

PlanarSurface TreeCrowns

Totalpoints 77566 58955

Correctlylabelled 99% 96%

Falselylabelled 1% 4%

For random forest classification the training phase already gives inside into classificationaccuracy. The out‐of‐bag error (Breiman, 2001) is a vital indicator of this and also helps tochoose the number of treeswhich are needed. Figure 19 shows that the graph saturates foraround 15 trees. While the classification results on the test data look satisfactory, we canobserve misclassifications when the classifier B is applied to the full tile (Figure 22). Weattributethistoover‐fitting,butfurtherinvestigationsareneeded.Atthemomentwedelivertheclassicaldecisiontree,butweseeastrongpotentialintherandomforestapproach.

Page 32: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

31

Figure19:Out‐of‐bagerrorversusthenumberoftreesgrown.

Figure20:MobilemappingdatasetBloomsbury. The grey values correspond to the amount ofbackscatteredenergyroughlysimilartointensity.

0 5 10 15 20 25 30 35 40 45 500.05

0.1

0.15

0.2

0.25

0.3

number of grown trees

out-

of-b

ag c

lass

ifica

tion

erro

r

Page 33: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

32

Figure 21: Classification using Decision Tree. Man‐made planar surfaces (Grey), tree crowns(Green)andunclassified(Red).

Figure 22: Classification using Random Forest. Man‐made planar surfaces (Grey) and treecrowns(Green).MisclassificationcanbeobservedontheStreetPlane

Thecurrentimplementationisasingle‐threadedsequentialimplementationofthedecisiontreeinC++.While theMATLAB random forest showsbetter statistics on the test data,weobservesomeartefactsonthefulldataset.Thisrequiresfurtherinvestigation.

There are documented approaches in the literature using parallelization to speed upclassification (Low et al., 2012) with some using MapReduce as a paradigm (Lin and Kolcz,2012).AnApachemachine learning library isavailable that is implementedon topofHadoopwhich provides high‐level machine learning algorithms (“Apache Mahout: Scalable machine

Page 34: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

33

learninganddatamining,”2013).TheuseofHadoopwouldmakeitanidealfitfortheIQmulusarchitecture.WewillinvestigatetheuseofthelibrarytobringthisalgorithmtotheCloud.

4.2.6 Detectionofflowlinesanddrainagebasins

Thisserviceanswersthespecificneedsofhydrologiststoknowtheextentandtheshapeoftheareathatdrainstowardsanoutletsection.Theknowledgeofdrainagebasinsisakeystepinthehydrologicalmodellingforprotectionfromflood.Thisisespeciallytrueforsmallbasins,wherethe flooding is fast and the overlap of rainfall datawith the shape of catchment can help thedecision making pipeline in civil protection. This specific requirement comes from RegioneLiguriaanditsdepartmentsinchargeofcivilprotectionandenvironmentmonitoring.Givenitspeculiarorography, theLiguria region is indeedparticularly critical asmost of thedangerousriversflowoutfromsmallbasins,usuallywithasurfacelessof10km2.

Thereferenceuserstoryisthefollowing:“IwouldliketohaveashapefileoftheLiguriansmallbasins (< 10 km2) derived from a high resolutionDTM exploiting the available Lidar data ofRegioneLiguriatoderivetheDTMandacutinterpolatedrainfallfieldovertheshapefile.”

Functionality

Theservice releasedwith the“watershedDEM”provides the functionality toextract flow linesand drainage basins from terrain models. In the current release of the service, we haveconsideredasinputtheterrainrepresentedasagriddedDEM,andinthenextprojectperiodwewilldeveloptheserviceprovidingthesamefunctionalitywhenstartingfromtriangulations.Theservicetakesathresholdparameterthat filterstheminimumsizeofthebasinthattheservicewillextract.

Considering the agreed data representation types, the service accepts as input gridded pointclouds(format:geoTIFF).Theoutputisagriddedpointcloud(format:geoTIFF),wherethecellsstoretheX,YcoordinatesofthepointandtheZvaluestorestheuniqueintegeridentifierofthebasinthecellbelongsto.Asoutput,theservicealsoproducesashapefile(format:.shp.shx.dbf)storing the mainstream reaches. The services can save, as intermediate output, slope,accumulationareaandflowdirectionsmap.

Librariesused

The service is based on the TauDEM toolkit for gridded data released under GPL by D.G.Tarboton (Tarboton, 2005). The original software is implemented in C++ and usesparallelizationbasedonMPI. TheTauDEM toolkit doesnotneed external libraries except thestandardC++andtheMPIcompiler.TogettheshapefileofbasinsitrequirestheGDALlibrary.

UnderlyingAlgorithm

Theserviceexecutioncanbedescribedbythefollowingpseudocode:

#Read and load the input DEM, and the optional threshold following #the constant drop law (Broscoe, 1959)

Read_input (DEM, threshold);

Find_and_remove_pits_from (DEM);

#For each cell in the DEM, compute the flow directions, using a 8-#window neighbourhood. This step writes two geoTiff files: one for

Page 35: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

34

#direction map and one for slope activity. Find_flow_directions (DEM);

#For each cell in the DEM, compute the cumulative number of cells #that drains into downslope cells. The output is a geoTIFF with #the computed value as attribute. Compute_contributing_area (DEM);

Run a Peuker-Douglas filter over topography (DEM);

With this step produces a skeleton of streams network as a grid of {0,1} values.

Apply a threshold to delineate stream network;

#The threshold is related to the similarity between the mean stream #drop of first order streams and the mean stream drop of higher #streams according to the constant drop law.

Extract and write watershed and streamnet (output a watershed geoTiff and a streamnet shapefile);

Executionarchitectureofcurrentimplementation

The different steps explained in the previous section are implemented in parallel using aMessagePassing Interface(MPI).The inputgriddeddata isdivided inequallysizedhorizontalstrips.Thenumberofstripscorrespondstothenumberofprocesses.Foreachstrip,thecorrectsizeofmemoryisallocatedwithanoverlapoftwomorelinesforeachstripinordertoexchangedatawiththeneighbouringstrip;thistaskisperformedbyaspecificfunction(Tesfaetal.,2011).

ExampleOutput

AsshowninFigure23eachextractedbasinrepresents thecontributingarea forriverreachesbetweenintersections.Eachcolourdisplaysadifferentdrainagearea.Theblackthicklinesarethemainstreams.ThisbasinextractionhasbeentestedontheLiguriadatasetandinmoredetailusingaDigitalSurfaceModelderivedfromaninversedistanceweightinginterpolationofLiDARdataoftheVernazzaarea.

Page 36: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

35

Figure23:BasinsextractionontheLiguriadataset.

PerformanceonTestDataset

Theparallelizationofthecodeallowsexecutingthecodeondatasetsoftensofmillionsofcellsin less than aminute on an eight core desktop computer. The service has been tested on theLiguriadata set andon thedatasetprovided for the IQmulusProcessingContest (seeD8.8.1).Someresultsobtainedfromthecontestdatasetsarenotasgoodasexpected.Whentheprocessisconfrontedwithalargeflatarea(6.4*10^7cells),suchasencounteredoverthesea,itspendsalotoftime(tensofminutes)tofindapossibledownslopedirection.Thisissuehastobefixedinthenextreleasetoimprovethebehaviouroverflatareas.

SuitabilityforMap/Reduce

ThisalgorithmmaybeimplementedfollowingaMapReduceparadigm.Amappermayemiteachcell index as a key and pass as value the neighbourhood list of characteristic values, e.g., theelevation value. The reducer may reduce each cell index with a value, for example, of flowdirectionandsoonforeverylocalcomputationstep(directionmap,contributingarea...).

4.2.7 ExtractionofCriticalPoints,linesandregionsfromgridsandtriangulations

Thisserviceforsurfacecharacterizationisbasedontheextractionofthecriticalpoints(inthecase of a differentiable function points where all partial derivatives are zero) from either atrianglemesh or a grid. The knowledge about critical points is crucial for understanding andorganizingthetopologicalstructureofasurface.Unfortunatelythehypothesisthatasurfaceisonly continuousdoesnotguarantee that theassociatedheight function isMorsenor that it isderivable.

The approach proposed in this service is based on the popular characterisation of trianglemeshesproposedby(Banchoff,1970).Suchamethodusesalocalpoint‐wisecriteriontodetect

Page 37: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

36

and classify critical points: for example, trianglemeshes are analysed by checking the heightdifference(z‐value)betweenavertexandtheadjacentonesinitsstar‐neighbourhood.

Theoriginalalgorithmisthenextendedby:

consideringcriticallinesandtrianglesthataremaximaorminimawithrespecttothez‐value;

addingan(optional)thresholdtoestablishiftwopointscanbeconsideredatthesameheightornot.

Thealgorithmextractsalistofcriticalpoints(maxima,minimaandsaddles)withrespecttothez‐coordinates and flat non‐isolatedmaxima/minima lines and regions represented in a shapefile.ThecomputationalcostoftheserviceforthesurfacecharacterizationisO(n),wherenisthenumber of input vertices. Figure 24 shows two examples of the surface characterization theserviceextractsfromatrianglemesh.

Figure24Twoexamplesoffeatureextractionwithtwodifferentchoicesofthethresholdvalueforcriticalpoints.

The number of the detected critical points is usually very high and pruning or simplificationstepsarenecessarytomaketheresultingstructuresunderstandable.Forthisreason,afurtherserviceisplannedtofilterouttheirrelevantcriticalpointsuptoathreshold,expressedintermof“persistence”.

Thealgorithmisoutlinedbythefollowingpseudocode:

Foreach vertex v in triangulation

Star = StarofaVertex(v);

Foreach edge e in Star

If (abs(height(e->v[0] - height(v)) < -threshold && (abs(e->v[1] - height(v)) > threshold) up++

Page 38: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

37

If (abs(height(e->v[0] - height(v)) > -threshold && (abs(e->v[1] - height(v)) < threshold) down++

Endfor

If (up == 0) v is a Pits

If (down == 0) v is a Peaks

If (down >= 2) && (up >=2) v is a saddle

Endfor

Theservicecurrentlyrunsontriangulations,butwillbeextendedtorunongrids.TheInputisaTrianglemesh(inPLYformat)oraGriddedpointcloud(geoTIFF).Anoptionalparameteristhethresholdforpointstobeconsideredatthesameheight.TheOutputisaListoffeaturepoints(inshapefile format).Theservice is implemented inC++anddoesnothavedependenciesonanyother libraries. It is compiled under Linux and it is planned towork onWindows and Linuxoperativesystems.Furtherpropertiesare:

Visualization: the input model, with a colour map of the height, and visualization(possiblyincolour)ofthefeaturesintheshapefile.

Accuracy:featuresaredetectedatthesamelevelofthemachineprecisionusedtostoretheheightfunctionvalues.

Robustness: the robustness is measured by the threshold used, that is, smallperturbationsoftheinputwithinthethresholddonotchangetheresult

Scalability:Themethodislocalandthereforethecharacterizationcanrunwithdomainpartitioning(onagrid trivial,ontrianglemeshesmorecomplex); therefore, it iseasilyscalable.

4.2.8 FilterPointCloudbyAttribute&Coordinate

Thisservice filtersanunorderedpointcloudbytheattributesattachedtoeachpoint.TheLASfilescontainmanyattributes(includingpulsenumberandclassification).Thepointcloudcanbefiltered according to any of the attributes (including by coordinate value and pointclassification).TheInput isanunorderedpointcloudinLASformat.Theinputparametersarethepointattributenameandthepointattributeboundaryconditionstobeselected.TheoutputisanewunorderedpointcloudinLASformatinwhichonlyasubsetofpoints,whichmatchthefilteringconditions,aremaintained.

This service is implementedusing theMapReduce framework in Java and can therefore scalewiththeHadoopcluster.TheLASfileI/Ofunctionalityisimplementedasasharingpackagewithother Task 4.3 services that use LAS files for point cloud storage. In addition a single CPUimplementation depending on libLAS is also done using C++. The main loop for this serviceiteratesthroughallpointrecordsintheLASfile.OnlypointsmeetingtheattributesspecifiedbytheuserwillbewrittentotheoutputLASfile(ormultiplelayersofdifferentLASfilesasfortheHadoopimplementation).Thecomplexityislinearinthesizeoftheinputcloud.

Abriefdescriptionoftheserviceimplementation:## Mapper: (Key1, Value1) -> (Key2, Value2)

Page 39: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

38

Input: k1 # Key1, the offset of LAS point record in the file, Bytes Input: v1 # Value1, the input LAS point record Input: attribute # the user specified point attribute Input: attribute_bound # the attribute boundary # Examine point attribute switch (attribute): if ( v1.attribute not meets attribute_bound) break k2 = (attribute, attribute_bound) v2 = (v1.x, v1.y, v1.z) Output: k2 # Key2, the tile coordinates Output: v2 # Value2, point record ## Partitioner: (Key2, Value2) -> Reducer id Input: k2 Output: k2 # each reducer writes points with one attribute ## Reducer: (Key2, Value2s) -> Output Input: k2 Input: v2s # All the values with the same k2 # Loop through all the value with the same key for v2 in v2s: # Send point to the output file print(v2) Output: Filtered LAS files TheexecutionofthecodeiswrappedwithaBashscripttoprovideHadoopjobsetupandafter‐jobdataretrievalinadditiontothemaincodefunctionality.Figure25demonstratesafilteringexample of the London LiDAR data set using point classification. The original point cloud iscoloured in grey on the top left, and layers of different point classification are coloureddifferently.

TheperformanceofthecurrentserviceimplementationdependsonthespecificHadoopclusterdeploymentanduserinputs.Forexample,theexecutiontimeforasingle‐nodeHadooprunontheLondonLiDARdatasetusingpointclassificationasshowninFigure25isabout30seconds.

Page 40: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

39

Figure25:Exampleoutputofattributefiltering.TheclassificationlabelstoredintheLASfileisused for selection. Labels include ground, low vegetation, medium vegetation and highvegetation.

Page 41: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

40

5 REFERENCES

Apache Mahout: Scalable machine learning and data mining [WWW Document], 2013. URLhttp://mahout.apache.org/(accessed10.28.13). 

Banchoff,T.F.,1970.Criticalpointsandcurvatureforembeddedpolyhedralsurfaces.AmMathMon.77,475–485. 

Breiman,L.,2001.Randomforests.Mach.Learn.45,5–32. Broscoe, A.J., 1959. Quantitative analysis of longitudinal stream profiles of small watersheds.

DTICDocument. Butler,H.,Loskot,M.,Vachon,P.,Vales,M.,Warmerdam,F.,2013.libLAS:ASPRSLASLiDARData

Toolkit[WWWDocument].URLhttp://www.liblas.org/(accessed10.28.13). Dean,J.,Ghemawat,S.,2008.MapReduce:simplifieddataprocessingonlargeclusters.Commun.

ACM51,107–113. Demantké,J.,Mallet,C.,David,N.,Vallet,B.,2011.Dimensionalitybasedscaleselectionin3dlidar

pointclouds.Int.Arch.Photogramm.RemoteSens.Spat.Inf.Sci.LaserScanning. Google C++ Style Guide [WWW Document], 2013. URL http://google‐

styleguide.googlecode.com/svn/trunk/cppguide.xml(accessed10.28.13). lidarformat ‐ C++ point cloud library [WWW Document], 2013. URL

http://code.google.com/p/lidarformat/(accessed10.28.13). Lin,J.,Kolcz,A.,2012.Large‐scalemachinelearningattwitter,in:Proceedingsofthe2012ACM

SIGMOD International Conference on Management of Data. ACM, Scottsdale, Arizona,USA,pp.793–804. 

Low, Y., Bickson, D., Gonzalez, J., Guestrin, C., Kyrola, A., Hellerstein, J.M., 2012. DistributedGraphLab:a framework formachine learninganddatamining in thecloud.ProcVLDBEndow5,716–727. 

MATLAB ‐ The Language of Technical Computing [WWW Document], 2013. URLhttp://www.mathworks.co.uk/products/matlab/(accessed10.28.13). 

Mount,D.M.,Arya,S.,1998.ANN:LibraryforApproximateNearestNeighbourSearching. Muja, M., Lowe, D.G., 2009. Fast Approximate Nearest Neighbors with Automatic Algorithm

Configuration.PresentedattheVISAPP(1),pp.331–340. Preparata,F.P.,Shamos,M.I.,1985.GeometricSearching,in:ComputationalGeometry.Springer. Tarboton,D.G.,2005.Terrainanalysisusingdigitalelevationmodels(TauDEM).UtahWater. Tesfa, T.K., Tarboton, D.G., Watson, D.W., Schreuders, K.A., Baker, M.E., Wallace, R.M., 2011.

Extraction of hydrological proximity measures from DEMs using parallel processing.Environ.Model.Softw. 

TheStandard :StandardC++[WWWDocument],2013.URLhttp://isocpp.org/std/the‐standard(accessed10.28.13). 

Page 42: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

41

6 APPENDIX – SERVICE INFORMATION TABLES

IQmulusServiceinformation#7

Name of the metadata  Content expected Motivation/comments 

ServiceAcronym StatictilingofLASfile

Uniqueidentifieroftheservice;necessarytocallitwithinuser‐definedworkflows

Description Break up a large LAS file into tiles

Brieftextualdescriptionoftheservice:whatitprovides,whatcanbeusedfor.Thistextcouldbeusedasashort“helptext”intheUserInterfacesofIQmulus

Servicefunctionality Input:<datarepresentationandformat>

Pointcloud,format:LAS

Includeheretheinput/outputoftheservice,thenameofthefunctionality(e.g.,registration,fusion),andinputparameters(ifany).Inthefuture,wemayconsiderdefiningataxonomyofthefunctionalitiestoharmonizetheterminologyusedtofillthisfield.

Inputparameters:[optional]

TileSize;TileOverlappingPercentage;

Output:<datarepresentationandformat>

Multiple point clouds, format: LAS

Functionalityoftheservice:<text>

ReducethesizeofindividualLASfiles

Algorithm Mapinputs:pointrecordfromLASfile

Mapoutputs/Reduceinputs:Key–tileid,Value–pointrecord

Reduceoutputs:multiplexyzfiles

Finaloutputs:translatingxyzintoLAS

Thesamefunctionalitymayinprinciplebeimplementedbydifferentalgorithms.

Implementationdetails Implementationlanguage

JAVA,C++,Bash Includeinformationrelatedtotheimplementationoftheservice,suchaslanguage(e.g.,C,C++,etc.);dependencieswithotherlibraries(e.g.,ANNlibrary);constraintsontheoperatingsystems,andvisualizationmodalitiesoftheoutput.

Dependencieswithotherlibraries

Hadoop,libLAS

Operatingsystem Unix/Linux

Page 43: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

42

Visualizationmodalitiesoftheoutput

pcl_viewer,

meshlab

IQmulusData

AvailableIQmulusinputdata:

<IDofthedatasetasintheeRoomtables>

23,Liguria

Ifthereareexamplesofdatathatcouldserveasanexample,thenincludeheretheiridentifiersasinthedatatableineRoom(usefultotesttheservice]

Servicecharacteristics Accuracy:N/A Thesefieldsarenecessarytodocumentallthecharacteristicsoftheservicesthatareimportanttoassessthequalityoftheresults.Also,thesefieldscouldbeusedtoselectaspecificserviceamongmoreservicesthatimplementthesamefunctionalitybutwithdifferentcharacteristics.Seethefollowingbox.

Robustness:Robust

Computationaltimeinrelationtodatasize:O(n),Linear

Locality/globalityofthealgorithm:

NativeHadoopimplementationunderMapReduceframework.

Alternatives <ListofServiceIDs> Listotherservices(ifany)thathavethesamefunctionality,havethesameinput/outputbutuseadifferentalgorithmorhavedifferentfeatures

Relatedusecases <UsecasesIDasinRedMine> Mentiontheuserstoriesrelatedtotheusageoftheservice(seeD4.1.1)andrelatedusecasesuploadedonRedMine

ResponsiblePartner UCL PartnerIDandresponsibleperson(includeemail)

InvolvedPartners

Page 44: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

43

IQmulusServiceinformation#16

Name of the metadata  Content expected Motivation/comments

ServiceAcronym Quickvisualization Quickdisplayandrenderingofthe3Dinputdata.

Description The3Ddataisreadfromthefolderpathwhichisprovidedbytheuser.Dataisautomaticallydisplayedonthescreenand3Drendering,zoominglikemouseactionsaretakenforvisualizationofthedifferentviewsandthedetails.

Usercanhaveaquickoverviewofthedatatobeprocessed.

Servicefunctionality Input: .pcdpointcloudfileformat Quickdisplayofthe3Dinputdata.Verylargepointclouddatacanbedisplayedafterdownsamplinginordertoincreasethespeedofzoomingandrendering.

Inputparameters:[optional]

Pathofthefolderwheretheinput3Ddataislocated.

Output:<datarepresentationandformat>

Screendisplay

Functionalityoftheservice:<text>

Quickvisualizationofthe3Ddata.

Algorithm ReadingpointclouddatausingPointCloudLibrary(PCL)functions.

Creating3DdisplayscreenbyusingVisualizationKit(VTK)libraryandaddingdatarenderingfeatures.

PCLandOpenGLalsoprovidesfunctionsfor3Ddisplay.

Implementationdetails Implementationlanguage

C++ ItisalsopossibletodisplaydatabyusingcommercialsoftwarelikeQuickTerrain,MeshLab,CloudCompare,etc.Dependencies

withotherlibraries

PCL,VTK,BoostLibrariesareneeded.

Operatingsystem Windows

Visualizationmodalitiesoftheoutput

Visualizationandrenderingofthepointcloud.

Renderingoftrianglemeshesoptioncanbeprovided.

IQmulusData

AvailableIQmulusinputdata:

Fortestswehaveusedthefollowingdata;

eRoomid’sareprovided.

Page 45: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

44

(1) UBO– TLSTopographyofasmallbeachinBrittany(ID‐10‐11)

(2) UBO–BathymetryinBrittany(ID‐12‐13)

(3) Linguria–LidardataforthestudyareaofVernazza,Liguria,Italy(ID‐22‐23)

Servicecharacteristics Accuracy:‐

Robustness:reliabletoworkwithlargepointclouds

Computationaltimeinrelationtodatasize:==lessthantwosecondsevenforpointclouddatawith~600000points.

Locality/globalityofthealgorithm:theprogramcanbeusedtoseetheinputdataortheprocesseddataquickly.

Alternatives Using3DdisplayfunctionsfromotherlibrarieslikePCLandOpenGLatthecodegeneration.

Anotheroptionisusingcommercialsoftwarepackagesforpointclouddisplay;i.e.QuickTerrain,MeshLab,CloudCompare.

Relatedusecases EachuserstoryintheRedMine listcanbenefitfromquick3Dvisualizationapplicationfordisplayingtheinputortheoutputdata.

ResponsiblePartner UCL

InvolvedPartners TUDelft,BerilSirmacek[[email protected]]

Page 46: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

45

IQmulusServiceinformation#2

Name of the metadata  Content expected Motivation/comments 

ServiceAcronym PointCloudtoRegularGrid Uniqueidentifieroftheservice;necessarytocallitwithinuser‐definedworkflows

Description Fromanunorderedpointclouddevelopagridrepresentationin2.5D

Brieftextualdescriptionoftheservice:whatitprovides,whatcanbeusedfor.Thistextcouldbeusedasashort“helptext”intheUserInterfacesofIQmulus

Servicefunctionality Input:<datarepresentationandformat>

Unorderedpointcloud,format:LAS

Includeheretheinput/outputoftheservice,thenameofthefunctionality(e.g.,registration,fusion),andinputparameters(ifany).Inthefuture,wemayconsiderdefiningataxonomyofthefunctionalitiestoharmonizetheterminologyusedtofillthisfield.

Inputparameters:[optional]

Gridsize(resolution)

Output:<datarepresentationandformat>

Griddedpointcloud,format:LAS;2.5Draster,format:GeoTIFF

Functionalityoftheservice:<text>

Algorithm Mapinputs:LASpointrecords

Mapoutputs/Reduceinputs:Key–Gridid;Value:Pointrecords;

Reduceoutputs:multiplexyzfiles

Finaloutputs:translatingxyzintoLAS,andGeoTIFF

Thesamefunctionalitymayinprinciplebeimplementedbydifferentalgorithms.

Implementationdetails Implementationlanguage

JAVA,C++,Bash Includeinformationrelatedtotheimplementationoftheservice,suchaslanguage(e.g.,C,C++,etc);dependencieswithotherlibraries(e.g.,ANNlibrary);constraintsontheoperatingsystems,andvisualizationmodalitiesoftheoutput.

Dependencieswithotherlibraries

Hadoop,libLAS

Operatingsystem Unix/Linux

Visualizationmodalitiesoftheoutput

pcl_viewer

meshlab

Page 47: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

46

IQmulusData

AvailableIQmulusinputdata:

<IDofthedatasetasintheeRoomtables>

23,Liguria

Ifthereareexamplesofdatathatcouldserveasanexample,thenincludeheretheiridentifiersasinthedatatableineRoom(usefultotesttheservice]

Servicecharacteristics Accuracy:N/A Thesefieldsarenecessarytodocumentallthecharacteristicsoftheservicesthatareimportanttoassessthequalityoftheresults.Also,thesefieldscouldbeusedtoselectaspecificserviceamongmoreservicesthatimplementthesamefunctionalitybutwithdifferentcharacteristics.Seethefollowingbox.

Robustness:Robust

Computationaltimeinrelationtodatasize:O(n),Linear

Locality/globalityofthealgorithm:

NativeHadoopimplementationunderMapReduceframework.

Alternatives <ListofServiceIDs> Listotherservices(ifany)thathavethesamefunctionality,havethesameinput/outputbutuseadifferentalgorithmorhavedifferentfeatures

Relatedusecases <UsecasesIDasinRedMine> Mentiontheuserstoriesrelatedtotheusageoftheservice(seeD4.1.1)andrelatedusecasesuploadedonRedMine

ResponsiblePartner UCL PartnerIDandresponsibleperson(includeemail)

InvolvedPartners

Page 48: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

47

IQmulusServiceinformation#3

Name of the metadata  Content expected Motivation/comments 

ServiceAcronym Peakpoints Unique identifier of theservice; necessary to call itwithinuser‐definedworkflows

Description Detectpeakpoint (highestpoint) in anunorderedpointcloud,withrespecttoagivenregionsize.

Brieftextualdescriptionoftheservice:whatitprovides,whatcanbeusedfor.Thistextcouldbeusedasa short “help text”in the User Interfaces ofIQmulus

Servicefunctionality Input: <datarepresentationandformat>

UnorderedpointcloudinLASfileformat

Includehere the input/outputoftheservice,thenameofthefunctionality (e.g.,registration,fusion),andinputparameters (if any). In thefuture, we may considerdefining a taxonomy of thefunctionalities to harmonizethe terminology used to fillthisfield.

Inputparameters:[optional]

Radiusofregion,sensitivity

Output: <datarepresentationandformat>

Sparsepointcloud(i.e.listof)peakpointsasASCIIorshapefile

Functionality ofthe service:<text>

Detecthighestpointwithinasetsearchradius.

Algorithm Thealgorithmloopsoverallpointsoftheinputcloud(O(n)).Foreachpointthek‐neighbourhoodisinspectedwithinthegivenradius(O(logm)).Ifthepointisthehighestpointitisstored,otherwisediscarded.

Thesamefunctionalitymayinprinciple be implemented bydifferentalgorithms.

Implementationdetails Implementationlanguage

C++ Include informationrelatedtothe implementation of theservice,suchaslanguage(e.g.,C,C++,etc);dependencieswithother libraries (e.g., ANNlibrary); constraints on theoperating systems, andvisualizationmodalitiesof theoutput.

Dependencieswith otherlibraries

PCL1.6,Shapelib,libLAS

Operatingsystem MicrosoftWindows

Visualizationmodalities of theoutput

Sparsefeaturepointstogetherwithdensepointcloud

Page 49: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

48

IQmulusData

AvailableIQmulusinputdata:

Liguria

If there are examples of datathat could serve as anexample, then include heretheir identifiersas in thedatatable in eRoom (useful to testtheservice]

Servicecharacteristics Accuracy:Onlypointsfromtheoriginalpoint cloud will be returned, nointerpolationisperformed.

These fields are necessary todocument all thecharacteristics of the servicesthat are important to assessthequalityoftheresults.Also,these fields could be used toselectaspecificserviceamongmore services that implementthe same functionality butwith different characteristics.Seethefollowingbox.

Robustness: The currentimplementation is not robust againstoutliers. Outliers need to be removedprior to this process. See Process“outlierremoval”.

Computationaltimeinrelationtodatasize:O(nlogm)withnthesizeoftheinputcloudandmthesizeoftheregion(#ofpoints)

Locality/globality of the algorithm:Thealgorithmworkslocallyonasubsetof the point cloud there are no globalinterdependencies.

CurrentlytheserviceisavailableonlyasasingleCPUimplementation.Externaltilingisrequiredtoenableparallelism.

Alternatives #44CriticalPoints Listotherservices(ifany)thathave the same functionality,have the same input/outputbut use a different algorithmorhavedifferentfeatures

Relatedusecases Userstory(IQmulus)#1072 Mention the user storiesrelated to the usage of theservice (see D4.1.1) andrelatedusecasesuploadedonRedMine

ResponsiblePartner UCLJanBoehm([email protected]) Partner ID and responsibleperson(includeemail)

InvolvedPartners

Page 50: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

49

IQmulusServiceinformation#10

Name of the 

metadata 

Content expected Motivation/comments

ServiceAcronym OutlierDetection Uniqueidentifieroftheservice;necessarytocallitwithinuser‐definedworkflows

Description Statisticaloutlierdetectionofpointsinlaserscandata,usefulinremovingerroneousdatacausedbyavarietyofeffectinherentinthecaptureprocess

Brieftextualdescriptionoftheservice:whatitprovides,whatcanbeusedfor.Thistextcouldbeusedasashort“helptext”intheUserInterfacesofIQmulus

Servicefunctionality

Input:<datarepresentationandformat>

pcl::PointCloud<pcl::PointXYZ>

apointcloudinpclpointcloudformat

WillbechangedtoLAS

Includeheretheinput/outputoftheservice,thenameofthefunctionality(e.g.,registration,fusion),andinputparameters(ifany).Inthefuture,wemayconsiderdefiningataxonomyofthefunctionalitiestoharmonizetheterminologyusedtofillthisfield.

Inputparameters:[optional]

Thethresholdofhowmanystandarddeviationsawayfromtheglobalmeanneighbourhooddistanceisacceptabletoqualifyasinliers.

Thekneighbourstosampleinordertogeneratethestatisticalprofileoftheinputdata

Output:<datarepresentationandformat>

Twopcl::PointCloud<…>(asabove)relatingtothe‘outliers’andthe‘inliers’.WillbechangedtoLAS

Functionalityoftheservice:<text>

Thisseparatestheinputpointcloudintoinliersandoutliers

Algorithm Inputk#nearestneighbourhoodsizeInputsd_tol#tolerancethresholdInputcloud#inputpointcould#calculatethemeandistancebetweenpointsinallneighbourhoodsForallpiincloud

ui=mean_dist(pi,nearest_neighbours(k,pi))#calculateglobalstatisticsforneighbourhoodsinthiscloudU=mean(u)SD=standard_deviation(u)Threshold=U+(SD*sd_tol)

Thesamefunctionalitymayinprinciplebeimplementedbydifferentalgorithms.

Page 51: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

50

#dividethecloudintoinliersandoutliersForalluiinu

Ifui<=Thresholdinliers.push_back(pi)

elseoutliers.push_back(pi)

Outputinliers#pointswho’sneighbourhoodarewithintoleranceOutputoutliers#pointswho’sneighbourhoodareoutsidetolerance

Implementationdetails

Implementationlanguage

C++ Includeinformationrelatedtotheimplementationoftheservice,suchaslanguage(e.g.,C,C++,etc);dependencieswithotherlibraries(e.g.,ANNlibrary);constraintsontheoperatingsystems,andvisualizationmodalitiesoftheoutput.

Dependencieswithotherlibraries

CorealgorithmPcl+Requires

Boost>1.46Flann>1.71Eigen>3Vtk>5.6

Debug/developmentviewerCinder+Requires

OpenGLBoost

Operatingsystem

CorealgorithmBuild&runsonLinuxandWindowsDebug/developmentviewerBuildandrunsonwindows,howeverCindernotyetsupportedforlinux.

Visualizationmodalitiesoftheoutput

Arudimentarydebugging/developmentGUIvisualizertoeyeballthedatainputandgenerated

IQmulusData

AvailableIQmulusinputdata:CanreadPLYandLASfilesdrag‐n‐dropintothevisualisationwindowtoload(maytakeafewmomentsasoutlierdetectionisperformedonload).

Ifthereareexamplesofdatathatcouldserveasanexample,thenincludeheretheiridentifiersasinthedatatableineRoom(usefultotesttheservice]

Servicecharacteristics

Accuracy:variabledependingonparametersandinnatecharacteristicsoftheinputcloud

Thesefieldsarenecessarytodocumentallthecharacteristicsoftheservicesthatareimportanttoassess

Page 52: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

51

Robustness:variabledependingonparametersandinnatecharacteristicsoftheinputcloud

thequalityoftheresults.Also,thesefieldscouldbeusedtoselectaspecificserviceamongmoreservicesthatimplementthesamefunctionalitybutwithdifferentcharacteristics.Seethefollowingbox.

Computationaltimeinrelationtodatasize:approximatelyscalesat(nlogn)basedonthetimetakenforconstructionofthekdtreeandnearestneighbourqueries.Locality/globalityofthealgorithm:sincethealgorithmcomputestheglobalmeandistancefortheinputdatathiscouldresultinvaryingidentificationofoutliersacrossdifferentinputdatasets.Toovercomethisproblemeitherthemeanandstandarddeviation(SD)couldbecalculatedassamplemeanandSD.Alternativelyasynchronisationstepcouldgatherallthetilelocalmeanandvarianceandcombinethemtocreateaglobalthresholdwhichisthenusedforalltiles.Resultingin:

1) Tiledata2) Computemean+SDofneighbourhoodsoneach

tile3) Gather+combinetogettheglobalmean+SD4) Computeglobaltolerancethreshold5) Foreachtileseparateoutliersusingglobal

thresholdFora4millionpointdatasettheexecutiontimeisintherangeof5‐20secondsdependingonmachinecapabilities.Thecurrentimplementationrunsonasinglenodesinglethreadimplementationandcouldpotentiallyprocesslargerdatasetsinshortertimesthroughparallelisationormultimodeconfigurations.Thiswouldbereliantonasuccessfultilingoftheinputdataintocontagioustilesandonthesynchronisationstepmentionedabove.

Alternatives nanoflann(https://code.google.com/p/nanoflann/)maybeanalternativetousingthepclkdtreeconstructionandapproximatenearestneighbour(ANN)searchinterfaces.Thiswouldreducethedependencies(nanoflannhasonlySTLdependencies),furtheritcouldpossiblyofferupto2xspeeduponsearchandconstructionofthekdtree.

Listotherservices(ifany)thathavethesamefunctionality,havethesameinput/outputbutuseadifferentalgorithmorhavedifferentfeatures

Relatedusecases Datacleanuppre‐processingstepforuseaspartofotheralgorithms.

ResponsiblePartner

UCL PartnerIDandresponsibleperson(includeemail)

InvolvedPartners OtherPartnersandresponsiblepersons(ifany,emails)involvedinthedevelopmentand/ortestingoftheservice

Page 53: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

52

IQmulusServiceinformation#17

Name of the metadata  Content expected Motivation/comments

ServiceAcronym 3DLocalkeypointextractionfrompointclouds(low‐levelfeature)

Extractinglow‐levelfeaturesfrom3Ddatawhichdoesnotgivevaluableinformationtotheenduserhowevercanbeusedatrecognition,detection,registration,modelling,datacompression,etc.modulesoftheIQmulussystem.

Description Featuresaredistinctiveandinformationcontaininglocationsofan2Dor3Dinputdata.Theycanbeeitherpointsgivenbycertaincoordinates,orsmallregions.Inourcaseweconsiderpointsrepresentedbytheircoordinatesandwecallthemkeypoint.Low‐leveltermisusedtoindicatethattheextractedkeypointsmightbemeaninglessfortheenduser,howeveritcontainscrucialforothermodulesoftheIQmulussystemforfurtherprocessandtogeneratefurtherendresults.

Servicefunctionality Input: .laspointcloudfileformat Quickdisplayofthe3Dinputdatawithkeypointlocations.Input

parameters:[optional]

Pathofthefolderwheretheinput3Ddataislocated.

Output:<datarepresentationandformat>

.matmatrixfilewhichcontainsx,y,zcoordinatesoftheextractedkeypoints.

Functionalityoftheservice:<text>

Extractinglow‐levelfeaturesoftheinputpointcloudfile.

Algorithm ReadingthepointclouddatausingLibLasfunctions.

Creating3DdisplayscreenbyusingVisualizationKit(VTK)libraryandaddingdatarenderingfeatures.

Extractinglow‐levelfeaturesthepointcloudusingPCLandotherlibrarieswhicharegoingtobeexploredbyexperiments.

Visualizationoftheextractedkeypointsontheoriginalpointcloud.

Page 54: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

53

Savingthekeypointlocationsintoa.matfile.

Implementationdetails Implementationlanguage

C++ ItisalsopossibletodisplaydataandtheextractedkeypointlocationsbyusingcommercialsoftwarelikeQuickTerrain,MeshLab,CloudCompare,MatLabetc.

Dependencieswithotherlibraries

PCL,VTK,BoostLibrariesareneeded.(Morelibrariesmightbeadded.)

Operatingsystem Windows

Visualizationmodalitiesoftheoutput

Visualizationandrenderingofthepointcloudwiththeextractedkeypoints.

IQmulusData

AvailableIQmulusinputdata:

Fortestswehaveusedinternaldataset.

Wecontinuetotheexperimentsbyusingthefollowingdatasets;

(4) UBO–TLSTopographyofasmallbeachinBrittany(ID‐10‐11)

(5) UBO–BathymetryinBrittany(ID‐12‐13)

(6) Linguria–LidardataforthestudyareaofVernazza,Liguria,Italy(ID‐22‐23)

eRoomid’sareprovided.

Servicecharacteristics Accuracy:Dependsonthecorrectscaleselection.Automaticscaleselectionwillbedevelopedinthefuture.

Robustness:reliabletoworkwithlargepointcloudsComputationaltimeinrelationtodatasize:~5minutesforthepointclouddatawith~100000points.Locality/globalityofthealgorithm:theprogramcanbeusedwithobjectdetection,reconstruction,classification,registration,etc.modulesoftheIQmulusproject.

Alternatives Anotheroptionisusingcommercialsoftwarepackagesforpointclouddisplayandkeypointextraction;i.e.QuickTerrain,Terrasolid.

Relatedusecases UserStory(IQmulus)#1037,LandShowcaseUserStory2(1.2.2_SC2_2);Usercanregisterup‐to‐datesceneontothe3Dmodels.Thekeypointsareneededfortheregistrationprocess.

ResponsiblePartner TUDelft,BerilSirmacek[[email protected]]

InvolvedPartners TUDelft,RoderikLindenbergh(algorithmdevelopmentsupport)

TUDelft,JinhuWang(testingthedevelopedalgorithms)

Page 55: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

54

IQmulusServiceinformation#27

Name of the metadata  Content expected Motivation/comments

ServiceAcronym Treecrownrecognitionfrommobilemappingpointclouds

Unique identifier of theservice; necessary to call itwithinuser‐definedworkflows

Description Detectingtreecrownsandseparatethemfromplanarsurfacesinpointclouddatacollectedfrommobilemapping

Brieftextualdescriptionoftheservice:whatitprovides,whatcan be used for. This textcouldbeusedasashort“helptext” in theUser InterfacesofIQmulus

Servicefunctionality Input: <datarepresentationandformat>

Unordered pointcloudsfromLiDARinLASformat

Includehere the input/outputoftheservice,thenameofthefunctionality (e.g.,registration, fusion), andinput parameters (if any). Inthe future, we may considerdefining a taxonomy of thefunctionalities to harmonizethe terminology used to fillthisfield.

Inputparameters:[optional]

Radiusofinfluenceforfeaturecomputation

Output: <datarepresentationandformat>

Labelledunorderedpointcloud

Functionality ofthe service:<text>

The service readsan unorderedpoint cloudcaptured frommobile mappingand classifies thedata into treecrown and planarsurfaceclasses.

Algorithm Inthefirststagethealgorithmcomputesper point features based on PCA toestimate local point entropy. In thesecond stage these features are used toclassifydata into treecrownandplanarsurface classes. The classifier must betrainedontrainingdatapriortothat.

Service #66 “Point CloudDimensionnality (PCD)” canbe used to compute featuresfromyear2on.

Implementationdetails Implementationlanguage

C++ usingpropriety decisiontreeimplementation

In the next period a cloud‐based classificationframework will be used, e.g.Mahout.

Dependencieswith otherlibraries

PCLforfeaturecomputation

Operatingsystem Microsoft

Page 56: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

55

Windows

Visualizationmodalities of theoutput

Pointcloudandscalarlabel

IQmulusData

AvailableIQmulusinputdata:

If there are examples of datathat could serve as anexample, then include heretheir identifiersas inthedatatable ineRoom (useful to testtheservice]

Servicecharacteristics Accuracy:Classificationcorrectness(currently99%and96%inlimitedtests)

These fields are necessary todocument all thecharacteristics of the servicesthat are important to assessthequalityoftheresults.Also,these fields could be used toselectaspecificserviceamongmore services that implementthe same functionality butwith different characteristics.Seethefollowingbox.

Robustness:Misclassification,falsepositives(currently1%and4%inlimitedtests)

Computational time in relation todata size: Feature computation is thebottle neck, but complexity is generallyO(n)albeitwithlargeconstant.

Locality/globality ofthealgorithm:Perpointclassificationisindependentforallpoints,somaximumparallelismcanbeexploited.Featurecomputationdependsonaregionsurroundingthepoint,whichiscontrolledbytheradiusparameter.Againgoodparallelizationcanbeachieved.

AtthemomentonlyasingleCPUimplementationisprovided.Tilingisrequiredtobeperformedexternally.

Alternatives #27 “Tree recognition from pointclouds”and

#59 “Multi‐object classification of 3Dpointclouds[objectmustbespecified]”

Listotherservices(ifany)thathave the same functionality,have the same input/outputbut use a different algorithmorhavedifferentfeatures

Relatedusecases User story (IQmulus) #1097 “As a GISexpert (geodata provider/integrator)working at local level, I want toautomatically detect individual treesfrom a LiDAR point cloud in a urbanarea, so that I canmonitor growth andforeseepruningworks.”

Mention the user storiesrelated to the usage of theservice (see D4.1.1) andrelatedusecasesuploadedonRedMine

ResponsiblePartner UCL‐[email protected]

InvolvedPartners

Page 57: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

56

IQmulus Service information #42 

Name of the metadata  Content expected Motivation/comments

Service Acronym  watershedDEM Unique identifier of the 

service; necessary to 

call it within user‐

defined workflows 

Description   Extract watershed and flow directions from digital 

elevation model. 

 

As an optional output, it is possible to get the 

accumulation area and the slope map. 

Brief textual 

description of the 

service: what it 

provides, what can be 

used for. This text 

could be used as a 

short “help text” in the 

User Interfaces of 

IQmulus 

Service functionality  Input: <data 

representation and 

format> 

DEM gridded point cloud 

(GeoTIFF) 

Include here the 

input/output of the 

service, the name of 

the functionality (e.g., 

registration, fusion), 

and input parameters 

(if any). In the future, 

we may consider 

defining a taxonomy of 

the functionalities to 

harmonize the 

terminology used to fill 

this field. 

Input parameters: 

[optional]  

Threshold size of the basin.

Degree of the river network. 

Shapefile representing the 

outlet point of the basin 

Output: <data 

representation and 

format> 

Shapefile, Gridded point cloud 

(GeoTIFF) 

Functionality of the 

service: <text> 

Extraction of watershed and 

flow directions 

Algorithm  This algorithm uses both D8 – Dinf flow directions for 

routing the flow to outlet. (D8 refers to flow direction 

calculated in the steepest direction and only cell 

centre to cell centre – Dinf refers to a more flexible 

routing that give a more real direction) 

 

The same functionality 

may in principle be 

implemented by 

different algorithms.  

Implementation details  Implementation 

language 

C++ Include information 

related to the 

implementation  of the 

service, such as 

language (e.g., C, C++, 

etc.); dependencies 

with other libraries 

(e.g., ANN library); 

constraints on the 

operating systems, and 

visualization 

Dependencies with 

other libraries 

Read and write libraries for 

GeoTIFF and shapefile, MPI, 

Taudem 

Operating system Linux‐like

Visualization 

modalities of the 

3D visualization of the DEM 

with watersheds and each 

different basin visualized with 

Page 58: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

57

output  different colours, possibly in 

relation to their size; 

visualization of the river net, 

possibly with different colours 

according to their degree 

modalities of the 

output. 

IQmulus Data 

 

Available  IQmulus input data: 

Dataset 22; Dataset 6; 

If there are examples 

of data that could 

serve as an example, 

then include here their 

identifiers as in the 

data table in eRoom 

(useful to test the 

service] 

Service characteristics  Accuracy:  These fields are 

necessary to document 

all the characteristics 

of the services that are 

important to assess 

the quality of the 

results. Also, these 

fields could be used to 

select a specific service 

among more services 

that implement the 

same functionality but 

with different 

characteristics. See the 

following box. 

Robustness: 

Computational time in relation to data size: 

Locality/globality of the algorithm: 

 

Alternatives    List other services (if 

any) that have the 

same functionality, 

have the same 

input/output but use a 

different algorithm or 

have different features 

Related use cases   Userstory(IQmulus)#1076

 

Mention the user 

stories related to the 

usage of the service 

(see D4.1.1) and 

related use cases 

uploaded on Red Mine 

Responsible Partner  CNR‐IMATI, Simone Pittaluga ([email protected]

Partner ID and 

responsible person 

(include email) 

Involved Partners  Regione Liguria    

Page 59: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

58

IQmulusServiceinformation#64

Name of the metadata  Content expected Motivation/comments

ServiceAcronym PersistentCriticalPoints Uniqueidentifieroftheservice;necessarytocallitwithinuser‐definedworkflows

Description Extraction of critical points (isolated and grouped 

in  lines/regions) from grids and triangulations Brieftextualdescriptionoftheservice:whatitprovides,whatcanbeusedfor.Thistextcouldbeusedasashort“helptext”intheUserInterfacesofIQmulus

Servicefunctionality Input:<datarepresentationandformat>

Trianglemesh(PLY)

Griddedpointclouds(geoTIFF)

Includeheretheinput/outputoftheservice,thenameofthefunctionality(e.g.,registration,fusion),andinputparameters(ifany).Inthefuture,wemayconsiderdefiningataxonomyofthefunctionalitiestoharmonizetheterminologyusedtofillthisfield.

Inputparameters:[optional]

The threshold, expressed 

in term of “persistence”, 

which is used to filter out 

irrelevant critical points. If 

no threshold is given, a 

default value is chosen.

Output:<datarepresentationandformat>

Listoffeatures(Shapefile)

Functionalityoftheservice:<text>

Persistent critical points

Algorithm The local characterization uses Banchoff’s 

definition of critical points, extended to non‐

isolated critical points. Critical points within the z‐

value threshold are filtered out using 

persistence/significance evaluation.

Thesamefunctionalitymayinprinciplebeimplementedbydifferentalgorithms.

Implementationdetails Implementationlanguage

C++ Includeinformationrelatedtotheimplementationoftheservice,suchaslanguage(e.g.,C,C++,etc.);dependencieswithotherlibraries(e.g.,ANNlibrary);constraintsontheoperatingsystems,andvisualizationmodalitiesofthe

Dependencieswithotherlibraries

==

Operatingsystem Windows,Linux

Visualizationmodalitiesoftheoutput

Visualizationoftheinputmodel,incasewithacolourmapoftheheight

Visualization

Page 60: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

59

(possiblyincolour)ofthefeaturesintheshapefile

output.

IQmulusData

AvailableIQmulusinputdata: 7(FOMI) 17,22,23(RegioneLiguria) 8,9,12,13(UBO)

Ifthereareexamplesofdatathatcouldserveasanexample,thenincludeheretheiridentifiersasinthedatatableineRoom(usefultotesttheservice]

Servicecharacteristics Accuracy:features are detected at the same 

level of the machine precision used to store the 

height function values, and within the threshold 

indicated.

Thesefieldsarenecessarytodocumentallthecharacteristicsoftheservicesthatareimportanttoassessthequalityoftheresults.Also,thesefieldscouldbeusedtoselectaspecificserviceamongmoreservicesthatimplementthesamefunctionalitybutwithdifferentcharacteristics.Seethefollowingbox.

Robustness:depends on the method 

implemented for the persistence evaluation

Computationaltimeinrelationtodatasize:The computational cost of the service for the 

surface characterization is O(n)+O(c log c), where 

n is the number of input vertices and c are the 

critical points (they are ordered for the 

persistence evaluation).

Locality/globalityofthealgorithm:

the choice of a threshold makes the algorithm not local anymore

Alternatives Service#3:Peakpoints Service#44:CriticalPoints:computationof

thecriticalpointswithoutpersistencefiltering.

Thefollowingserviceshavesimilargoalbutwithlowlevelfeatures:

Service#17:3DLocalkeypointextractionfrompointclouds(low‐levelfeature)

Service#37:Low‐levelfeaturePoint

Listotherservices(ifany)thathavethesamefunctionality,havethesameinput/outputbutuseadifferentalgorithmorhavedifferentfeatures

Relatedusecases Userstory(IQmulus)#1035 Userstory(IQmulus)#1074? Userstory(IQmulus)#1086

Mentiontheuserstoriesrelatedtotheusageoftheservice(seeD4.1.1)andrelatedusecasesuploadedonRedMine

ResponsiblePartner CNR‐IMATI,

[email protected],[email protected]

PartnerIDandresponsibleperson(includeemail)

InvolvedPartners PossibletestsondatasetsprovidedbyRegioneLiguria,UBO,andFOMI.PossiblecomparisonwiththeservicesprovidedbyTuDelftandUCL.

Page 61: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

IQmulus(FP7‐ICT‐2011‐318787) D4.3.1

60

IQmulusServiceinformation#54

Name of the metadata  Content expected Motivation/comments 

ServiceAcronym Filter Point Cloud by Attribute & Coordinate

Uniqueidentifieroftheservice;necessarytocallitwithinuser‐definedworkflows

Description Filter an unordered point cloud by the attributes attached to each point. LAS files contain various attributes (pulse number, classification). The point cloud can be filtered according to any of the attributes (including by coordinate value).

Brieftextualdescriptionoftheservice:whatitprovides,whatcanbeusedfor.Thistextcouldbeusedasashort“helptext”intheUserInterfacesofIQmulus

Servicefunctionality Input:<datarepresentationandformat>

Pointcloud,format:LAS

Includeheretheinput/outputoftheservice,thenameofthefunctionality(e.g.,registration,fusion),andinputparameters(ifany).Inthefuture,wemayconsiderdefiningataxonomyofthefunctionalitiestoharmonizetheterminologyusedtofillthisfield.

Inputparameters:[optional]

Point Attribute Name; Point Attribute Boundary Conditions;

Output:<datarepresentationandformat>

Pointcloud,format:

LAS

Functionalityoftheservice:<text>

FilteringpointsinaLASfilebasedonitspointformatattributes.

Algorithm LoopingthroughtheLASfilepointrecordsection,onlypointsmeetingtheattributesspecifiedbyuserwillbewritetotheoutputLASfile.

Thesamefunctionalitymayinprinciplebeimplementedbydifferentalgorithms.

Implementationdetails Implementationlanguage

C++,BASH Includeinformationrelatedtotheimplementationoftheservice,suchaslanguage(e.g.,C,C++,etc);dependencieswithotherlibraries(e.g.,ANNlibrary);constraintsontheoperatingsystems,andvisualizationmodalitiesoftheoutput.

Dependencieswithotherlibraries

libLAS

Operatingsystem Unix/Linux

Visualizationmodalitiesoftheoutput

pcl_viewer

meshlab

Page 62: FEATURE EXTRACTION AND CLASSIFICATION TOOLKIT – …iqmulus.eu/dynamic/document/D4.3.1_Feature_Extraction_and_Classification_Toolkit... · MOSS M.O.S.S. Computer Grafik Systeme GmbH

D4.3.1 IQmulus(FP7‐ICT‐2011‐318787)

61

IQmulusData

AvailableIQmulusinputdata:

<IDofthedatasetasintheeRoomtables>

23,Liguria

Ifthereareexamplesofdatathatcouldserveasanexample,thenincludeheretheiridentifiersasinthedatatableineRoom(usefultotesttheservice]

Servicecharacteristics Accuracy:N/A Thesefieldsarenecessarytodocumentallthecharacteristicsoftheservicesthatareimportanttoassessthequalityoftheresults.Also,thesefieldscouldbeusedtoselectaspecificserviceamongmoreservicesthatimplementthesamefunctionalitybutwithdifferentcharacteristics.Seethefollowingbox.

Robustness: Robust

Computationaltimeinrelationtodatasize:O(n),linear

Locality/globalityofthealgorithm:

CanberunasaHadoopstreamingutility.

Alternatives Listotherservices(ifany)thathavethesamefunctionality,havethesameinput/outputbutuseadifferentalgorithmorhavedifferentfeatures

Relatedusecases Mentiontheuserstoriesrelatedtotheusageoftheservice(seeD4.1.1)andrelatedusecasesuploadedonRedMine

ResponsiblePartner UCL PartnerIDandresponsibleperson(includeemail)

InvolvedPartners OtherPartnersandresponsiblepersons(ifany,emails)involvedinthedevelopmentand/ortestingoftheservice