OParl Developer Presentation: Fusepool LocationMapper (Andreas Kuckartz)

1. Fusepool - LocationMapper
   - Andreas Kuckartz
   - Fusepool 2014, Brussels, 2014-06-25

2. Why OParl? (I)
   - A new building site appeared across the street from where Marian Steinbach was living; he had not been aware of the plan beforehand
   - But the plan was available in the RIS (Ratsinformationssystem, council information system)
   - Offenes Köln (Open Cologne): scraping the RIS and adding geo-references
   - But a standard would be much easier and better

3. Why OParl? (II)
   - Tried to find information about the use of Open Source and Microsoft in municipalities in Metropole Ruhr
   - The information is available, but it is extremely time-consuming to find
   - Some cities include JavaScript calls instead of URLs
   - There are no permanent links to documents!
   - No meta search!
   - Google and Bing cannot help!

4. Workshop Cologne 2013-04-17 (I)
   - Representatives of RIS manufacturers (RIS = Ratsinformationssystem)
   - City administrations
   - IT service providers
   - Vitako (55 IT service providers, services for 10,000 municipalities)
   - Fraunhofer FOKUS
   - All declared support for the OParl initiative

5. Workshop Cologne 2013-04-17 (II)
   - Intention to release OParl 1.0 by 2013-06-30 at the latest
   - Ignore Linked Data for that version

6. Workshop Bielefeld 2014-03
   - Many issues discussed and resolved
   - Decision to use JSON-LD

7. Workshop Düsseldorf 2014-03-26
   - Several issues resolved
   - New OParl release plan:
     - Prepare draft for discussion by April 30
     - Review before May 31
     - Publish OParl 1.0 before June 6

8. What is OParl? (I)
   - Vocabulary using JSON-LD
   - Additional requirements such as support for JSONP and querying for objects within a time frame using URL parameters
   - No full-text search planned for version 1.0
   - No query language
   - SPARQL cannot be required

9. What is OParl? (II)
   - Person
   - Document (single file)
   - AgendaItem
   - Organization
   - Body
   - Meeting
   - Paper (= DocumentContainer)
   - Consultation

10. What is JSON-LD?
   - JSON = JavaScript Object Notation
   - LD = Linked Data
   - Can be used as an RDF serialization
   - => Unites both worlds: web developers and semantic technology
   - W3C Recommendation since 2014

11. Issue: GeoJSON
   - Recent initiative: GeoJSON-LD, but not finished, and blank nodes ...
   - Not necessarily future proof
   - INSPIRE regulations and specifications
   - Well-Known Text could be an alternative

12. Issue: Compatibility with future
   - Other relevant specifications: Popolo Project, W3C Open Government Community Group, JSON-LD, W3C Linked Data Platform, W3C Linked Data Platform Paging, GeoJSON-LD, HTTP, JSONP, ...
   - Manufacturers are using conventional relational databases, legacy technology

13. Impact of OParl (I)
   - OParl will be implemented by RIS manufacturers
   - Some cities (Bonn, Aachen) have already decided to implement it; others, such as Cologne, are very likely to follow
   - No real alternative

14. Impact of OParl (II)
   - Users:
     - Citizens
     - Journalists
     - Scientists
     - Politicians
   - Usefulness of Linked Data will be demonstrated!

15. Vision until end of June 2014 (I)
   - OParl issues resolved
   - OParl 1.0 draft finished (and hopefully accepted without many substantial changes)
   - OParl platform based on Fusepool results, providing representative scraped data of one or two municipalities converted to OParl Linked Data, with personalized access
   - User interfaces for desktop and mobile browsers
   - Integrate NER for texts (for example, person names linked to the person's OParl IRI)
   - Full-text search for OParl data and documents
   - Recommending similar documents?
   - Integration of the SPARQL GUI Squebi for experiments

16. Vision until end of June 2014 (II)
   - Potentially relevant components:
     - Apache Stanbol
     - Apache Jena (with spatial extension)
     - Apache Marmotta
     - Fusepool ECS (for PDF etc.)
     - Fusepool data life cycle
     - Fusepool SILK linking
     - Fusepool dictionary matching algorithm

17. LocationMapper
   - Fusepool technology stack is huge
   - OParl Vocabulary/Ontology
   - Some data from city
   - Recognize street names in text
   - Align with real street names
   - Find street coordinates
   - Show location on map linked with document

18. OParl Vocabulary/Ontology
   - Creating a standard is a lot of work
   - Still not finished
   - JSON-LD is more complex than JSON
   - 80% solutions not possible
   - Even 99% solutions lead to issues
   - But enormous potential impact

19. Some data from city
   - PDF files containing street names
   - Metadata using the OParl vocabulary

20. Recognize street names in text
   - Apache Stanbol
   - Using Apache Tika for extracting text from PDF
   - Simple enhancement algorithm
   - Example text: "Hier geht es um einen Bebauungsplan für die Aachener Straße. Und hier um die Brüsseler Str. ... und die Neue Ruhr Straße" (English: "This is about a development plan for Aachener Straße. And this one about Brüsseler Str. ... and Neue Ruhr Straße")

21. Align with real street names
   - Where to get street names for Cologne in structured form?
   - HTML page ... really?
   - CSV! (also a JSON API, but with paging)
   - Use OpenRefine with the RDF plug-in to convert to RDF

22. Align with real street names
   - Aachener Straße
   - Brüsseler Str.
   - Neue Ruhr Straße

23. Find street coordinates
   - Use the official street name (found using SILK)
   - Call Nominatim to get the location (adding the city name)

24. Find street coordinates
   - Use the official street name (found using SILK)
   - Call Nominatim to get the location (adding the city name)

25. Align with real street names
   - Use SILK to align the found text variants Aachener Straße, Aachener Str. and Aachener Strasse with a skos:Concept with prefLabel Aachener Str.
   - Potential for improvement

26. Show location on map
   - MapBox / Leaflet
   - MarkerCluster extension (show example)

27. Thesaurus
   - Two major multilingual thesauri: GEMET and EuroVoc
   - Not yet used

28. Future potential
   - A future version of OParl could become the standard for about 19,000 municipalities in Germany
   - The German constitution was changed to enable this: IT Planungsrat
   - Fusepool funding mentioned
   - LocationMapper will be used for tests
   - Linked Data Fragments (now part of Hydra!)
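
The LocationMapper steps from slides 20 to 25 (recognize street names, align them with official names, look up coordinates) can be illustrated with a minimal Python sketch. This is not the Fusepool implementation: a naive regular expression stands in for the Apache Stanbol enhancer, and simple suffix normalization stands in for SILK. The function names, the default city, and the official labels other than "Aachener Str." are assumptions made for illustration. The last step only builds a request URL for the public Nominatim search endpoint (parameters `q`, `format`, `limit`) and sends nothing over the network.

```python
import re
from typing import List, Optional, Sequence
from urllib.parse import urlencode

# Sample text from slide 20.
SAMPLE = ("Hier geht es um einen Bebauungsplan für die Aachener Straße. "
          "Und hier um die Brüsseler Str. ... und die Neue Ruhr Straße")

# Assumed official labels. Only "Aachener Str." appears in the slides (slide 25);
# the other two are hypothetical and follow the same pattern.
OFFICIAL_STREETS = ["Aachener Str.", "Brüsseler Str.", "Neue Ruhr Str."]

# One or more capitalized words followed by a street suffix in any spelling.
# Deliberately naive: a capitalized article directly before the name would
# also be captured.
STREET_PATTERN = re.compile(r"\b(?:[A-ZÄÖÜ]\w+\s)+(?:Straße|Strasse|Str\.)")


def find_street_names(text: str) -> List[str]:
    """Return candidate street names in the order they appear in the text."""
    return STREET_PATTERN.findall(text)


def normalize(name: str) -> str:
    """Map the spelling variants Straße / Strasse / Str. to one lowercase key."""
    key = re.sub(r"\s+", " ", name.strip().lower())
    return re.sub(r"(straße|strasse|str\.)$", "str", key)


def align(found: str, official: Sequence[str]) -> Optional[str]:
    """Return the official label whose normalized form matches, else None."""
    index = {normalize(label): label for label in official}
    return index.get(normalize(found))


def nominatim_url(street: str, city: str = "Köln") -> str:
    """Build (but do not send) a Nominatim search request for street + city."""
    query = urlencode({"q": f"{street}, {city}", "format": "json", "limit": 1})
    return f"https://nominatim.openstreetmap.org/search?{query}"


if __name__ == "__main__":
    for candidate in find_street_names(SAMPLE):
        label = align(candidate, OFFICIAL_STREETS)
        if label is not None:
            print(candidate, "->", label, "->", nominatim_url(label))
```

Aligning to the official label before the lookup mirrors slides 23 and 24: the geocoder is queried with one canonical spelling plus the city name rather than with whatever variant appeared in the PDF text.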