LinDA newsletter, issue 1, December 2014


  1. The LinDA outcomes

     The components developed during the LinDA project can be deployed in a common, integrated environment called the LinDA Workbench, which hosts all available LinDA tools. They can also operate independently of each other; only the front-end components depend on their respective back-end components to fulfil their specific business functionality. The main LinDA tools are the following:

     • The Transformation Engine, which converts data from various input formats to the RDF format
     • The Vocabulary and Metadata Repository, which contains a collection of vocabularies covering various thematic areas
     • The Visualisation Tool, which provides a largely automatic visualisation workflow that enables SMEs to visualise data in different formats and modalities
     • The LinDA Analytics and Data Mining Ecosystem, a library of basic and robust data analytics functionality that enables SMEs to use and share analytic methods on linked data for the discovery and communication of meaningful new patterns
     • The Query Builder, which enables users to explore open RDF data and reuse it easily
     • The RDF2Any converter, which supports conversion of RDF to various serialisation outputs such as RDB (relational database script), JSON and CSV
     • The Consumption Apps

     IN THIS ISSUE

     • The LinDA outcomes
     • The LinDA Publication and Consumption framework
     • Cost-effective access to linked data for SMEs
     • Events

     LinDA Consumption Apps

     Three Consumption Apps are being developed in the LinDA project, corresponding to the three pilots run by the Consortium:

     • Business Intelligence Analytics Pilot
     • Environmental Analytics Pilot
     • Media Analytics Pilot

     QUARTERLY NEWSLETTER, ISSUE 1, December 2014

     The LinDA quarterly newsletter

     This is the first issue of the LinDA newsletter, which collects in a freely downloadable document some of the articles and news items that have appeared on the LinDA website.
     The newsletter will be published quarterly and is intended to reach a broad audience, focusing more on exploitation and adoption by the project stakeholders (from Public Sector Information (PSI) providers to SMEs and enterprises) than on purely technological issues.
  2. Due to the proliferation of public sector data sources and initiatives, the interlinking and combination of such datasets has become a topic of major interest among SME information managers. More agile options for data integration are being requested, yet conventional methods of data integration are not feasible because of the massive size of the available data. Most of this data is also currently unstructured, which makes it either inaccessible to SMEs or too costly for them to use. This calls for tools that support users in re-using such data, whilst hiding the underlying complexity and allowing the re-use of existing software applications.

     The LinDA Publication and Consumption Framework aims to assist SMEs, data publishers and data consumers in analysing and interlinking public sector information with enterprise data. The framework's main approach to encouraging data re-use is the conversion of RDF to a number of other formats, including but not limited to CSV, XLS, XML, JSON and RDB. This conversion allows users to import open RDF data in the original format of their current system. The data can then be interlinked with the entity's own data, enabling the identification of patterns and possibly even predictions.

     Through a user interface, users of the framework can select which data to access from the available open datasets. This is possible either through a SPARQL endpoint or, if the user is not familiar with SPARQL, through a Query Builder. The latter provides drag-and-drop and auto-complete features that allow a user to easily build the desired query and access the required data. The user can then access an API server and, through RESTful calls, pass both the SPARQL query that returns the data to be converted and the conversion format to be used.
     The generated results can then be downloaded to the user's desktop.

     Let us take German Tours as a use case. German Tours is an SME that provides various tours for tourists visiting Germany. The tours are offered in a number of languages, including English, Italian and German. Alice, the manager, thinks the tours provided by German Tours need to be updated to reflect current tourist trends. In her opinion, this requires other relevant data in addition to the information the SME already has, which includes the tour type, the language used and the number of tourist bookings. She therefore starts looking for relevant information on the web.

     She discovers the statistical data published by the German NSO, which contains relevant tourism information such as the most common nationalities of tourists visiting Germany and the time of year in which they visited the country. This data is ideal for her purpose, as it would enable her to compare the SME's tour bookings with the actual tourists who visited Germany. Unfortunately, the available data is in RDF, while her data is in RDB format.

     Alice therefore uses the LinDA Publication and Consumption Framework to import the required data into the SME's system. The German NSO data can be easily accessed through LinDA's easy-to-use and intuitive interface, which allows Alice to create a SPARQL query for the desired data. Alice then converts the required data into RDB format and downloads it into the SME's system. By linking the SME's data with the German NSO open data, Alice realises that a large number of Spanish tourists visited Germany. Since German Tours has no Spanish tours available, Alice has discovered a niche in the market that the SME does not cater for.
     Thus, through the LinDA Publication and Consumption Framework, Alice is given the opportunity to enhance the SME's services so that they better reflect current tourist demands.

     LinDA Publication and Consumption Framework, by Judie Attard

     Recent news

     Joint SDI4APPS/LinDA Code Camp

     The SDI4APPS project hosted a Code Camp from 1st to 3rd December 2014 in Brno, and the LinDA project was invited to take part and show the results achieved to the invited SMEs. The Code Camp was an excellent opportunity to demonstrate the feasibility and impact of the LinDA approach to European SMEs, which is one of the project objectives, and to move forward in both its stakeholder involvement strategy and its collaboration with the development community.

     38th Conference of the Italian Association of Epidemiology (Naples, 5-7 November 2014)

     The conference focused on the role of epidemiology in improving the planning, management and evaluation of prevention programmes that support citizens' wellbeing. LinDA contributed a poster, "Linked Data Analytics for the Identification of Epidemiological Trends", focusing on the opportunity offered by the availability of data sources in the health and environmental sectors and by the production of advanced analytics to extract meaningful conclusions regarding the causes of specific diseases.
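To make Alice's workflow from the framework article above more concrete, here is a minimal Python sketch. It is not the LinDA API or the RDF2Any implementation: the sample result, the `language` column, the visitor figures and all function names are hypothetical and chosen only for illustration. It assumes the query result arrives in the standard SPARQL JSON results shape, flattens it to CSV, loads it into a relational table and looks for visitor groups that no existing tour language serves.

```python
import csv
import io
import sqlite3

# Hypothetical SPARQL SELECT result in the W3C SPARQL JSON results shape.
# Variable names and figures are invented for illustration only.
SAMPLE_RESULT = {
    "head": {"vars": ["nationality", "language", "visitors"]},
    "results": {"bindings": [
        {"nationality": {"type": "literal", "value": "British"},
         "language": {"type": "literal", "value": "English"},
         "visitors": {"type": "literal", "value": "1500"}},
        {"nationality": {"type": "literal", "value": "Spanish"},
         "language": {"type": "literal", "value": "Spanish"},
         "visitors": {"type": "literal", "value": "1200"}},
        {"nationality": {"type": "literal", "value": "Italian"},
         "language": {"type": "literal", "value": "Italian"},
         "visitors": {"type": "literal", "value": "900"}},
    ]},
}

# Hypothetical bookings held by the SME: (tour language, number of bookings).
TOURS = [("English", 340), ("Italian", 120), ("German", 210)]


def bindings_to_rows(result):
    """Flatten SPARQL JSON bindings into a header and plain string rows."""
    header = result["head"]["vars"]
    rows = [
        [binding.get(var, {}).get("value", "") for var in header]
        for binding in result["results"]["bindings"]
    ]
    return header, rows


def rows_to_csv(header, rows):
    """Serialise the flattened result as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def find_unserved(rows, tours):
    """Load both datasets into an in-memory relational database and return
    (nationality, visitors) pairs whose language has no matching tour."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE visitors (nationality TEXT, language TEXT, visitors INTEGER)"
    )
    con.executemany(
        "INSERT INTO visitors VALUES (?, ?, ?)",
        [(nat, lang, int(count)) for nat, lang, count in rows],
    )
    con.execute("CREATE TABLE tours (language TEXT, bookings INTEGER)")
    con.executemany("INSERT INTO tours VALUES (?, ?)", tours)
    unserved = con.execute(
        """
        SELECT v.nationality, v.visitors
        FROM visitors AS v
        LEFT JOIN tours AS t ON t.language = v.language
        WHERE t.language IS NULL
        ORDER BY v.visitors DESC
        """
    ).fetchall()
    con.close()
    return unserved


if __name__ == "__main__":
    header, rows = bindings_to_rows(SAMPLE_RESULT)
    print(rows_to_csv(header, rows))
    print(find_unserved(rows, TOURS))
```

With this invented sample, the only group without a matching tour language is the Spanish one, which mirrors the niche Alice finds in the use case. In a real deployment the conversion would be done by the framework itself, and the join would run inside the SME's own database.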
  3. Given the huge amount of information available in heterogeneous public and private data sources worldwide, advanced analysis of the available data is considered crucial for Small and Medium Enterprises (SMEs) to properly exploit this information and turn it into competitive advantage.

     Data analytics can help SMEs identify the data that is most important to current and future business decisions, provide insights based on the analysis, answer specific business questions and facilitate and guide the decision-making process. Combining publicly available data (e.g. governmental open data, environmental data) with privately transformed data maintained by SMEs can help enhance their experience of managing and processing data in ways not available before.

     It should be noted that many major companies are already moving from business intelligence to business analytics approaches, since they consider analytics to be the scientific process of transforming data into insight for making better decisions. In this direction, the engagement of data scientists in addition to information scientists in the data analysis process is considered a must.

     Given the need to interlink concepts described in different datasets in order to prepare meaningful datasets for analysis, the modern approach adopted is Linked Data, a set of best practices for representing and connecting structured information on the web. Following these practices enables the creation of a web of data: a large interconnected web consisting of integrated data elements.
     Within the Linked Data domain, the LinDA (Linked Data Analytics) project is going to provide a set of tools that will assist SMEs in efficiently developing novel data analytics services linked to the available public data, thereby helping to improve their competitiveness and stimulating the emergence of innovative business models.

     The exploitation perspective

     The proposed approach builds upon the collection of data from available data sources, their transformation into a suitable format (e.g. RDF) and their interlinking to create extended linked datasets, which are fed as input into the analytics extraction process. The analysis can then be carried out, and its output can be fed to visualisation tools.

     LinDA is handling several technical challenges regarding the creation, publication and consumption of Linked Dat