Toward Semantic Sensor Data Archives on the Web

Toward Semantic Sensor Data Archives on the Web Jean-Paul Calbimonte – Karl Aberer LSIR EPFL MEPDAW, ESWC Heraklion, Greece. June 2016 @jpcik


2

Sensor Data on the Web

http://mesowest.utah.edu/
http://earthquake.usgs.gov/earthquakes/feed/v1.0/
http://swiss-experiment.ch

• Monitoring
• Alerts
• Notifications
• Hourly/daily updates

• Myriad of formats
• Ad-hoc access points
• Informal descriptions
• Convention-based semantics
• Uneven use of standards
• Manual exploration

Sensor Archives: Challenges

3

Discoverability:
• Subject of sensing identified and searchable
• Explicit semantics on the sensor metadata
• Common understanding of the objects of sensing
• Agreed models, e.g. ontologies

Storage:
• Persistence not always required
• Sensor data is (sometimes) consumed live
• Aggregations stored permanently
• Different archival options available
• Reduce volume as much as possible, using compressed formats
• Querying and transactional requirements often less critical
• Silos of sensor data in the form of compressed files
• Replication or backup

Sensor Archives: Challenges

4

Reusability:
• Reusing the data for other purposes
• Compare data from other locations
• Use for calibration purposes
• Finding correlations
• Historical and batch analysis
• Benchmarking
• Training datasets for mining algorithms
• Feed numerical models

Accessibility:
• Data access through APIs
• Consumption by people and software applications
• De-referenceable URIs
• Simple but effective retrieval of sensor data
• SPARQL -> selecting relevant parts of the data
• Complex queries not always required
• Simple time intervals and filters are often enough

Interoperability & Standardization:
• RDF/SPARQL: building blocks for publishing data
• Specific ontologies and vocabularies, such as the SSN ontology
• Represent both sensor metadata and observations

Sensor Data & Linked Data

5

Zip Files

Number of Triples

Example: Nevada dataset
• 7.86 GB in N-Triples format
• 248 MB zipped

An example: Linked Sensor Data

http://wiki.knoesis.org/index.php/LinkedSensorData

Sensor Data & Linked Data

6

<http://knoesis.wright.edu/ssw/MeasureData_Precipitation_4UT01_2003_3_31_5_10_00> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#MeasureData> .
<http://knoesis.wright.edu/ssw/MeasureData_Precipitation_4UT01_2003_3_31_5_10_00> <http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#floatValue> "30.0"^^<http://www.w3.org/2001/XMLSchema#float> .
<http://knoesis.wright.edu/ssw/MeasureData_Precipitation_4UT01_2003_3_31_5_10_00> <http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#uom> <http://knoesis.wright.edu/ssw/ont/weather.owl#centimeters> .
<http://knoesis.wright.edu/ssw/Observation_Precipitation_4UT01_2003_3_31_5_10_00> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://knoesis.wright.edu/ssw/ont/weather.owl#PrecipitationObservation> .
<http://knoesis.wright.edu/ssw/Observation_Precipitation_4UT01_2003_3_31_5_10_00> <http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#observedProperty> <http://knoesis.wright.edu/ssw/ont/weather.owl#_Precipitation> .
<http://knoesis.wright.edu/ssw/Observation_Precipitation_4UT01_2003_3_31_5_10_00> <http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#procedure> <http://knoesis.wright.edu/ssw/System_4UT01> .
<http://knoesis.wright.edu/ssw/Observation_Precipitation_4UT01_2003_3_31_5_10_00> <http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#samplingTime> <http://knoesis.wright.edu/ssw/Instant_2003_3_31_5_10_00> .
<http://knoesis.wright.edu/ssw/Instant_2003_3_31_5_10_00> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2006/time#Instant> .
<http://knoesis.wright.edu/ssw/Instant_2003_3_31_5_10_00> <http://www.w3.org/2006/time#inXSDDateTime> "2003-03-31T05:10:00-07:00^^http://www.w3.org/2001/XMLSchema#dateTime" .

What do we get in these datasets?

Nice triples

Do we care about all the rest?

What is measured?

Measurement

Unit

Sensor

When is it measured?
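
The nine triples of the Linked Sensor Data example carry only a handful of facts of interest. A minimal Python sketch of pulling them out follows. The regular expression handles only the simple N-Triples shapes seen in that example, and the key names in WANTED are illustrative choices, not part of the dataset.

```python
import re

# One statement per line: <s> <p> (<o> | "literal"[^^<datatype>]) .
# Covers only the simple shapes in the slide example, not full N-Triples.
TRIPLE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)"(?:\^\^<[^>]*>)?)\s*\.\s*$'
)

# Predicate local name -> illustrative key (names chosen for this sketch).
WANTED = {
    "observedProperty": "property",
    "floatValue": "value",
    "uom": "unit",
    "procedure": "sensor",
    "inXSDDateTime": "time",
}

SAMPLE = """
<http://knoesis.wright.edu/ssw/MeasureData_Precipitation_4UT01_2003_3_31_5_10_00> <http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#floatValue> "30.0"^^<http://www.w3.org/2001/XMLSchema#float> .
<http://knoesis.wright.edu/ssw/MeasureData_Precipitation_4UT01_2003_3_31_5_10_00> <http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#uom> <http://knoesis.wright.edu/ssw/ont/weather.owl#centimeters> .
<http://knoesis.wright.edu/ssw/Observation_Precipitation_4UT01_2003_3_31_5_10_00> <http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#observedProperty> <http://knoesis.wright.edu/ssw/ont/weather.owl#_Precipitation> .
<http://knoesis.wright.edu/ssw/Observation_Precipitation_4UT01_2003_3_31_5_10_00> <http://knoesis.wright.edu/ssw/ont/sensor-observation.owl#procedure> <http://knoesis.wright.edu/ssw/System_4UT01> .
<http://knoesis.wright.edu/ssw/Instant_2003_3_31_5_10_00> <http://www.w3.org/2006/time#inXSDDateTime> "2003-03-31T05:10:00-07:00^^http://www.w3.org/2001/XMLSchema#dateTime" .
"""


def local_name(uri):
    """Return the part of a URI after the last '#' or '/'."""
    return re.split(r"[#/]", uri)[-1]


def summarize(ntriples):
    """Collect the wanted facts from N-Triples text into a small dict."""
    facts = {}
    for line in ntriples.splitlines():
        match = TRIPLE.match(line.strip())
        if not match:
            continue  # blank line or a shape this sketch does not handle
        _subject, predicate, obj_uri, obj_literal = match.groups()
        key = WANTED.get(local_name(predicate))
        if key is None:
            continue
        if obj_literal is not None:
            # The example embeds the datatype inside the quotes; drop it.
            facts[key] = obj_literal.split("^^")[0]
        else:
            facts[key] = local_name(obj_uri)
    return facts


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Five short values remain once the URIs are stripped away, which is the point of the slide: most of the bytes in the N-Triples file are repeated structure.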

Semantic Sensor Data Archives

7

How to address these challenges?

Discoverability

Reusability

Accessibility

Interoperability & Standardization

Storage

How to use existing Semantic Web technologies appropriately?
Need for new standards and techniques?

Localization: GNSS fused with odometry

GPRS

• packet parser
• system logging
• database server
• GPS interpolation
• advanced filtering
• fault detection
• system health monitor
• automatic reporting

10 buses in Lausanne

CO, NO2, O3, CO2, UFP, temperature, humidity

OpenSense2 @ Lausanne

8

Reference station

Crowd sensing

Public transportation

Raw Data Acquisition

Air Pollutants Time Series

Temporal and Spatial Aggregations

Pollution Maps

Pollution Models

Air Quality Recommendations

Health Studies

Air Quality Products & Applications

From Sensing to Actionable Data

9

Running example for discussing a Semantic Sensor Data Archive

An Architecture for a Sensor Archive

10

Disclaimer: Work in Progress

• RDF for sensor and catalog metadata
• Native format for sensor observations (time series)
• CSV archive for sensor observations
• RDF-unpack of CSV archived data
• Mappings for native format-to-RDF live transformation

Data characteristics

Sensor data characteristics

11

Sensor data regularity
• Raw sensor data typically collected as time series
• Very regular structure
• Patterns can be exploited

E.g. mobile NO2 sensor readings

29-02-2016T16:41:24,47,369,46.52104,6.63579
29-02-2016T16:41:34,47,358,46.52344,6.63595
29-02-2016T16:41:44,47,354,46.52632,6.63634
29-02-2016T16:41:54,47,355,46.52684,6.63729
...

Sensor data order
• Order of sensor data is crucial
• Time is the key attribute for establishing an order among the data items
• Important for indexing
• Enables efficient time-based selection, filtering and windowing

Columns: Timestamp, Sensor, Observed Value, Coordinates
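
Because the readings are ordered by time, a time-interval selection reduces to two binary searches. The Python sketch below illustrates this on the four sample rows. It assumes the column order timestamp, sensor, observed value, latitude, longitude, and a day-month-year timestamp format; the function names are made up for this example.

```python
import bisect
import csv
import io
from datetime import datetime

# The four mobile NO2 readings shown on the slide.
RAW = """29-02-2016T16:41:24,47,369,46.52104,6.63579
29-02-2016T16:41:34,47,358,46.52344,6.63595
29-02-2016T16:41:44,47,354,46.52632,6.63634
29-02-2016T16:41:54,47,355,46.52684,6.63729
"""

TS_FORMAT = "%d-%m-%YT%H:%M:%S"  # assumed: day-month-year, then time of day


def load(text):
    """Parse CSV text into (timestamp, sensor, value, lat, lon) tuples."""
    rows = []
    for ts, sensor, value, lat, lon in csv.reader(io.StringIO(text)):
        rows.append(
            (datetime.strptime(ts, TS_FORMAT), sensor,
             float(value), float(lat), float(lon))
        )
    return rows


def window(rows, start, end):
    """Return rows with start <= timestamp < end.

    Relies on rows being sorted by timestamp, which holds for an
    append-only time series.
    """
    times = [row[0] for row in rows]  # a real archive would keep this index
    lo = bisect.bisect_left(times, start)
    hi = bisect.bisect_left(times, end)
    return rows[lo:hi]


if __name__ == "__main__":
    readings = load(RAW)
    selected = window(
        readings,
        datetime(2016, 2, 29, 16, 41, 30),
        datetime(2016, 2, 29, 16, 41, 50),
    )
    for row in selected:
        print(row)
```

The same ordering assumption is what makes windowing and indexing cheap on the archived CSV files, without loading the data into a triple store first.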

An Architecture for a Sensor Archive

12

Catalog, Dataset & Sensor Metadata

Sensor Dataset Metadata

13

:sensorCatalog a dcat:Catalog ;
    dct:title "OpenSense data catalog" ;
    dct:language iso639-1:en ;
    dct:publisher :LSIR-EPFL ;
    foaf:homepage <http://opensense.epfl.ch/data/> ;
    dcat:dataset :geo-osanm, :geo-osfpm, :geo-oso3m .

:geo-osanm-csv a dcat:Distribution ;
    dcat:downloadURL <http://opensense.epfl.ch/data/api/sensors/geo_osanm> ;
    dct:title "CSV distribution of NO2 measurements" ;
    dcat:mediaType "text/csv" ;
    dcat:byteSize "5534530"^^xsd:decimal .

• Dataset distribution: different accessible formats
• Multiple distributions for the same dataset

Using DCAT
• W3C Recommendation
• Organizing the sensor archive in datasets

Sensor Dataset Metadata

14

:geo-osanm a dcat:Dataset ;
    dct:title "OpenSense NO2 measurements" ;
    dcat:theme :NO2 ;
    dct:issued "2015-12-05"^^xsd:date ;
    dct:temporal g-interval:1977-11-01T12:22:45/P1Y ;
    dct:spatial <http://www.geonames.org/6695072> ;
    dct:publisher :LSIR-EPFL ;
    dct:accrualPeriodicity sdmx:freq-W ;
    ssn:isProducedBy :NO2VsensorBox ;
    dcat:distribution :geo-osanm-csv .

:NO2VsensorBox a ssn:Sensor ;
    rdfs:label "NO2 Virtual Sensor Lausanne" ;
    ssn:observes :NO2 ;
    ssn:hasMeasurementCapability [
        a ssn:Accuracy ;
        ssn:forProperty :NO2 ;
        ssn:inCondition ... ;
        ssn:hasValue ...
    ] .

Using DCAT + SSN
• W3C Recommendation
• Dataset description
• Sensor description
• Observed property
• Feature of interest
• Accuracy
• Measurement capabilities
• Location, extension, context

An Architecture for a Sensor Archive

15

Sensor Observations

R2RML

Semantic Sensor Network Ontology

16

[Diagram: excerpt of the SSN ontology with domain extensions]

Classes: ssn:Sensor, ssn:SensingDevice, ssn:Platform, ssn:Deployment, ssn:FeatureOfInterest, ssn:Property, ssn:MeasurementCapability, ssn:MeasurementProperty, ssn:Accuracy, ssn:Latency, dul:Place, qu:QuantityKind

Properties: ssn:observes, ssn:onPlatform, ssn:inDeployment, ssn:ofFeature, ssn:hasMeasurementProperty, dul:hasLocation, geo:lat, geo:lng (xsd:double)

Domain terms: aws:TemperatureSensor, aws:Thermistor, aws:CapacitiveBead …; dim:Temperature, dim:VelocityOrSpeed; cf-prop:air_temperature, cf-prop:soil_temperature, cf-prop:wind_speed, cf-prop:rainfall_rate; cf-feat:Wind, cf-feat:Surface, cf-feat:Medium, cf-feat:air, cf-feat:soil

Sensor Observations

17

:no2obs1 a :NO2Observation ;
    ssn:observedProperty :NO2 ;
    ssn:featureOfInterest aq:AirMedium ;
    ssn:observedBy :NO2SensorBox ;
    ssn:observationResult :no2obs1result ;
    ssn:observationResultTime :instant_20160331232000 .

:no2obs1result a :NO2ObservationValue ;
    qu:numericalValue "345.00"^^xsd:float ;
    qu:unit :ppm .

:instant_20160331232000 a time:Instant ;
    time:inXSDDateTime "2016-03-31T23:20:00"^^xsd:dateTime .

Type of Measurement

Sensor

Observed Value

Unit

Generated only on demand through mappings

R2RML Mappings

18

:ObsValueMap
    rr:subjectMap [
        rr:template "http://opensense.epfl.ch/data/ObsResult_NO2_{sensor}_{time}" ] ;
    rr:predicateObjectMap [
        rr:predicate qu:numericalValue ;
        rr:objectMap [ rr:column "no2" ; rr:datatype xsd:float ] ] ;
    rr:predicateObjectMap [
        rr:predicate obs:uom ;
        rr:objectMap [ rr:parentTriplesMap :UnitMap ] ] .

:ObservationMap
    rr:subjectMap [
        rr:template "http://opensense.epfl.ch/data/Obs_NO2_{sensor}_{time}" ] ;
    rr:predicateObjectMap [
        rr:predicate ssn:observedProperty ;
        rr:objectMap [ rr:constant opensense:NO2 ] ] ;

URI of subject

URI of predicate

Object: column name

Column names in a template

Can be used for mapping both databases and CSVs
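
To make the mapping idea concrete, here is a minimal Python sketch of how a template-based mapping can turn one CSV row into triples on demand. It is not an R2RML processor: the dictionaries only mirror the parts of :ObsValueMap and :ObservationMap shown above, the output keeps prefixed names instead of expanding them, the row values are made up, and the IRI-safe escaping that R2RML applies to template values is left out.

```python
import re

# Simplified stand-ins for the two triples maps on the slide.
# The dictionary layout is an assumption of this sketch, not R2RML syntax.
OBS_VALUE_MAP = {
    "subject_template":
        "http://opensense.epfl.ch/data/ObsResult_NO2_{sensor}_{time}",
    "predicate_objects": [
        {"predicate": "qu:numericalValue",
         "column": "no2", "datatype": "xsd:float"},
    ],
}

OBSERVATION_MAP = {
    "subject_template":
        "http://opensense.epfl.ch/data/Obs_NO2_{sensor}_{time}",
    "predicate_objects": [
        {"predicate": "ssn:observedProperty", "constant": "opensense:NO2"},
    ],
}


def expand(template, row):
    """Replace each {column} placeholder with the row's value."""
    return re.sub(r"\{(\w+)\}", lambda m: str(row[m.group(1)]), template)


def triples(mapping, row):
    """Yield one Turtle-style statement per predicate-object entry."""
    subject = "<" + expand(mapping["subject_template"], row) + ">"
    for entry in mapping["predicate_objects"]:
        if "column" in entry:
            # Column-valued object: typed literal taken from the row.
            obj = '"%s"^^%s' % (row[entry["column"]], entry["datatype"])
        else:
            # Constant-valued object: same term for every row.
            obj = entry["constant"]
        yield "%s %s %s ." % (subject, entry["predicate"], obj)


if __name__ == "__main__":
    # Hypothetical row; the time is written without ':' to sidestep escaping.
    row = {"sensor": "47", "time": "20160229164124", "no2": "369"}
    for mapping in (OBS_VALUE_MAP, OBSERVATION_MAP):
        for statement in triples(mapping, row):
            print(statement)
```

Since the mapping is applied row by row, the RDF view never has to be stored: the archive keeps the compact rows and produces triples only for the rows a client asks for.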

Discussion: Preliminary Experimentation

19

E.g. comparing with ERI (RDF data compression): what is the resulting size, and how long does it take?

Live filtering: how long do we wait to get the data?
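
One way to start answering the size and time questions is a small measurement harness. The Python sketch below is only an illustration under loud assumptions: the rows are synthetic (not OpenSense data), the predicate URIs are example.org placeholders, and zlib stands in for a dedicated RDF compressor such as ERI, so the numbers it prints say nothing about ERI itself.

```python
import time
import zlib

BASE = "http://opensense.epfl.ch/data/"
VOCAB = "http://example.org/vocab#"  # placeholder vocabulary for this sketch


def synthetic_rows(n):
    """Made-up readings: (timestamp, sensor id, NO2 value)."""
    return [
        ("2016-02-29T16:%02d:%02d" % (i // 60 % 60, i % 60), 47, 350 + i % 10)
        for i in range(n)
    ]


def as_csv(rows):
    """Serialize rows as compact CSV bytes."""
    return "".join("%s,%d,%d\n" % row for row in rows).encode("utf-8")


def as_ntriples(rows):
    """Serialize the same rows as three N-Triples statements each."""
    lines = []
    for ts, sensor, value in rows:
        subject = "<%sObs_NO2_%d_%s>" % (BASE, sensor, ts)
        lines.append('%s <%svalue> "%d" .' % (subject, VOCAB, value))
        lines.append('%s <%ssensor> <%sSensor_%d> .'
                     % (subject, VOCAB, BASE, sensor))
        lines.append('%s <%stime> "%s" .' % (subject, VOCAB, ts))
    return ("\n".join(lines) + "\n").encode("utf-8")


def measure(data):
    """Return (raw size, compressed size, seconds spent compressing)."""
    start = time.perf_counter()
    packed = zlib.compress(data, 9)
    elapsed = time.perf_counter() - start
    return len(data), len(packed), elapsed


if __name__ == "__main__":
    rows = synthetic_rows(1000)
    for name, data in (("csv", as_csv(rows)), ("n-triples", as_ntriples(rows))):
        raw, packed, seconds = measure(data)
        print("%-10s raw=%d B  compressed=%d B  time=%.4f s"
              % (name, raw, packed, seconds))
```

Swapping the zlib call for another codec, or the synthetic rows for a real archive file, keeps the harness unchanged and makes the comparison repeatable.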

CSV on the Web Standards

20

{
  "@context": ["http://www.w3.org/ns/csvw", ... ],
  "tableSchema": {
    "columns": [
      { "name": "no2",
        "titles": "NO2 concentration",
        "aboutUrl": "ObsResult_NO2_{sensor}_{time}",
        "propertyUrl": "qu:numericalValue" },
      { "name": "sensor",
        "titles": "Bus sensor",
        "aboutUrl": "Obs_NO2_{sensor}_{time}",
        "propertyUrl": "ssn:observedBy",
        "valueUrl": "Sensor_{sensor}" },
      { "name": "obsProperty",
        "virtual": true,
        "aboutUrl": "Obs_NO2_{sensor}_{time}",
        "propertyUrl": "ssn:observedProperty",
        "valueUrl": "opensense:NO2" }
    ]
  }
}

http://www.w3.org/TR/csv2rdf/

URI of subject

Predicate

URI Value

Convenient alternative to R2RML mappings?

Constant URI

Thanks a lot!

Jean-Paul Calbimonte
LSIR EPFL

@jpcik