Publish and use your data

LD4SC Summer School, 7th - 12th June, Cercedilla, Spain

1st Summer School on Smart Cities and Linked Open Data (LD4SC-15)

Hands-on 4: Publish and use your data

Álvaro Sicilia, Filip Radulovic

Background

•  Álvaro Sicilia ([email protected])
•  Background: Computer Science
•  From: Architecture, Representation & Computation (ARC), Engineering and Architecture La Salle (FUNITEC), Universitat Ramon Llull, Barcelona, Spain
•  Since 2008 working with Semantic Web technologies

Background

-  IntUBE, 2008-2011, 7th Framework Programme
   Intelligent use of building's energy information
   Project Coordinator: VTT, Finland
-  RÉPENER, 2009-2012, Spanish National RDI Plan
   Control and improvement of energy efficiency in buildings through the use of repositories
   Project Coordinator: ARC Engineering and Architecture La Salle, Spain
-  SEMANCO, 2011-2014, 7th Framework Programme
   Semantic Tools for Carbon Reduction in Urban Planning
   Project Coordinator: ARC Engineering and Architecture La Salle, Spain
-  OPTIMUS, 2013-2016, 7th Framework Programme
   Optimising the energy use in cities with smart decision support system
   Project Coordinator: National Technical University of Athens, Greece

Index

•  Requirements
•  Guidelines for the Publication of Linked Data
•  Guidelines for the Exploitation of Linked Data
•  Hands-on Session

Linked Data life cycle

[Figure: Linked Data life cycle with the phases Specification, Modelling, Generation, Publication, Exploitation, and Linking]

Requirements

•  Legal framework of the dataset
The publication of energy-related data requires that the data carry a proper license so that they can later be re-used and exploited by the wider public.
The legal compliance of Linked Open Data is closely related to data protection, IPR (Intellectual Property Rights), copyright (legal entitlements for creative works), and privacy.

Requirements

Licenses by requirements type, with comments and suitability for LOD principles

Requirements type: Public Domain
Suitability for LOD principles: Ideal
-  Creative Commons CCZero (CC0)
   -  The least restrictive CC license.
   -  Removes all copyright restrictions from content.
   -  Can only be applied by the authors or holders of neighbouring rights over the content.
   -  The Public Domain Mark can be applied by anyone to content that is already free of copyright restrictions.
-  Open Data Commons Public Domain Dedication and Licence (PDDL)
   -  Allows unrestricted sharing, reuse, reproduction and adaptation of data and databases.

Requirements type: Attribution
Suitability for LOD principles: attribution requirements may lead to attribution stacking
-  Creative Commons Attribution 4.0 (CC-BY-4.0)
   -  Allows the users to copy or remix the work in any way.
   -  The users must attribute the work to the original creator.
   -  The creator should provide a link to the Creative Commons page explaining the user's responsibilities and provide an easily accessed list of creators.
-  Open Data Commons Attribution License (ODC-BY)
   -  Allows users to copy, distribute, and use the database.
   -  Allows users to produce works from the database, and to modify, transform and build upon the database.
   -  Users must attribute any public use of the database or works produced from the database and make clear to others the license of the database.

Requirements

Licenses by requirements type, with comments and suitability for LOD principles

Requirements type: Attribution Share-Alike
Suitability for LOD principles: attribution requirements may lead to attribution stacking; share-alike requirements may lead to interoperability issues
-  Creative Commons Attribution Share-Alike 4.0 (CC-BY-SA-4.0)
   -  Attribution restrictions are applied.
   -  Users transforming the work into something new must distribute that work under the CC BY-SA license, or a similarly open license.
-  Open Data Commons Open Database License (ODbL)
   -  Users must keep the database open technologically and offer any adapted version of the database or works produced from it under the ODbL.
   -  Limited commercial reuse of the database or its contents.
   -  A separate Database Contents License (DbCL) can be applied to the contents of a database licensed under ODbL; it waives all rights in the individual contents.

Requirements

+  Easy to use
+  Widely adopted
+  Flexible
+  Available in human- and machine-readable forms
+  Direct links between the resource and its license
+  Symbolic representation of the license to recognize usage terms

-  CC licenses are copyright based and designed to protect creative works (content); databases are not creative works but collections of facts
-  Third-party rights material included in the data may require additional clearances, and this is not stated in the license
-  Cannot be revoked once applied: CC licenses, ODC-By, and ODC-ODbL are irrevocable

Requirements

•  Quality requirements according to ISO/IEC 25012
-  Accuracy:
   -  The dataset to be published must be semantically and syntactically accurate.
   -  The dataset should not contain redundant values.
   -  Accuracy is also expected when the published dataset is reused.
-  Completeness:
   -  The published data items must be those needed to support the application for which the dataset is intended.
-  Consistency:
   -  Before publication, the dataset should be complete and consistent.
   -  Conflicting statements and errors should be detected before publication.
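The redundancy and consistency requirements can be checked mechanically before publication. The following is only a minimal sketch: the statements are hypothetical, and it assumes that the checked property is meant to be single-valued, which the slides do not state.

```python
from collections import defaultdict


def find_duplicates(statements):
    """Return statements that occur more than once (redundant values)."""
    seen, duplicates = set(), []
    for statement in statements:
        if statement in seen and statement not in duplicates:
            duplicates.append(statement)
        seen.add(statement)
    return duplicates


def find_conflicts(statements):
    """Return (subject, property) pairs that carry more than one distinct value.

    Assumption: every property is single-valued, so two different values
    for the same subject and property are treated as a conflict.
    """
    values = defaultdict(set)
    for subject, prop, value in statements:
        values[(subject, prop)].add(value)
    return sorted(key for key, vals in values.items() if len(vals) > 1)


# Hypothetical statements as (subject, property, value) tuples.
statements = [
    ("building/1", "use", "School"),
    ("building/1", "use", "School"),       # exact duplicate
    ("building/2", "use", "Residential"),
    ("building/2", "use", "Office"),       # conflicts with the previous one
]

print(find_duplicates(statements))
print(find_conflicts(statements))
```

A real dataset would run such checks over the generated triples, for example with SPARQL queries against the repository, rather than over in-memory tuples.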

Requirements

•  Quality requirements according to ISO/IEC 25012
-  Credibility:
   -  Linked Open Data must be credible and fully compliant with the license, policy and terms of use derived from the provenance source.
   -  Agreements and attributions should be defined where appropriate to make clear to users whether they can trust the data.
-  Timeliness:
   -  The publication process should be designed with maintenance in mind.
   -  The dataset should be handled in a timely manner by the responsible dataset supporter in order to maintain and update the data and to enable their usage and exploitation.
   -  The processes and tools should be able to support maintaining the dataset over time.

Requirements

•  Requirements according to the AENOR PNE 178301 standard on Smart Cities and Open Data
-  Open data must be published through persistent URLs and using standard, structured, open, and non-proprietary formats that allow the unique identification of resources.
-  Vocabularies used in open data must be published online through persistent URLs.
-  Open datasets must be included in relevant open data catalogues.
-  The organization must promote the reuse of open data by providing supporting documents and materials.
-  Open data must be available by downloading the respective files and through web APIs or SPARQL endpoints.
-  Non-discriminatory access to open data must be ensured by not requiring administrative procedures or user registration and by guaranteeing equal rights, non-discrimination and accessibility. Administrative procedures or user registration could be allowed in justified cases.
-  The access to and use of open data must be periodically measured.
-  Open data must be documented using metadata.
-  Vocabularies used in open data must be documented using metadata.
-  Open data licenses and use conditions must be documented and published online.
-  Open data licenses must be standard, self-documented, based on existing standards, and preferably in a machine-processable format.

Requirements

Requirements summary:
•  Legal compliance aspects → rights protection, license terms
•  Quality of data and metadata → accuracy, completeness, consistency, credibility, and sustainability
•  Publication requirements → data and vocabularies accessible
•  Social requirements → maintenance and promotion

Index

•  Requirements
•  Guidelines for the Publication of Linked Data
•  Guidelines for the Exploitation of Linked Data
•  Hands-on Session


Guidelines for publication

Ensure legal compliance
Two possible cases:
a)  The RDF dataset is generated from a data source that permits the use of the data, and the publication of the produced RDF dataset complies with the license and legal terms of the original data source
    → the RDF dataset can be published without obstacles
b)  Otherwise, legal compliance has to be ensured before publication. Usually, legal aspects can be addressed by preserving the privacy of the data
    → data anonymization

Guidelines for publication

Ensure legal compliance → Data anonymization
Privacy-preserving data publishing:
a)  Explicit identifiers are attributes that explicitly identify the entity of interest (e.g., id of a person, property number of a building)
b)  Quasi identifiers (QID) are sets of attributes that can potentially identify the entity of interest (e.g., date of birth, postal code, gender…)

Radulovic F., García-Castro R., Gómez-Pérez A.: Towards the Anonymisation of RDF Framework. In Proceedings of the 27th International Conference on Software Engineering and Knowledge Engineering (SEKE2015), Pittsburgh, Pennsylvania, USA. July 2015.

Guidelines for publication

Ensure legal compliance → Data anonymization
Techniques:
•  Suppression is a technique in which some values or complete records in a dataset are replaced with some other specific value or record.

Original data:
Id   Building use   Consumption
1    School         12352
2    Residential    2334
3    School         15121
4    Office         5252

After suppressing the Consumption attribute:
Id   Building use
1    School
2    Residential
3    School
4    Office
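Suppression of an attribute, as in the tables above, can be sketched in a few lines. This is an illustrative sketch, not the tooling used in the course; the field names `id`, `building_use`, and `consumption` are assumptions chosen to mirror the table headers.

```python
def suppress(records, attributes, placeholder=None):
    """Suppress the given attributes in every record.

    With placeholder=None the attributes are dropped entirely; otherwise
    their values are replaced with the placeholder (e.g. "*").
    """
    result = []
    for record in records:
        if placeholder is None:
            result.append({k: v for k, v in record.items() if k not in attributes})
        else:
            result.append(
                {k: (placeholder if k in attributes else v) for k, v in record.items()}
            )
    return result


buildings = [
    {"id": 1, "building_use": "School", "consumption": 12352},
    {"id": 2, "building_use": "Residential", "consumption": 2334},
    {"id": 3, "building_use": "School", "consumption": 15121},
    {"id": 4, "building_use": "Office", "consumption": 5252},
]

published = suppress(buildings, {"consumption"})
for row in published:
    print(row)
```

The function returns new records and leaves the input untouched, so the original data remain available to the data owner.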

Guidelines for publication

Ensure legal compliance → Data anonymization
Techniques:
•  Generalization is a technique that transforms pieces of data into more general data or sets of data, and is suited for the transformation of categorical attributes and discrete numerical attributes → use of postal codes instead of addresses, use of ranges of values instead of specific values

Original data:
Id   Postal Code   Consumption
1    08006         12352
2    08022         2334
3    08021         15121

After generalizing the Postal Code attribute:
Id   Postal Code   Consumption
1    0800*         12352
2    0802*         2334
3    0802*         15121

After generalizing the Consumption attribute:
Id   Postal Code   Consumption
1    08006         10k-14k
2    08022         1k-5k
3    08021         14k-20k
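Both generalizations shown in the tables can be reproduced with two small helpers. This is a sketch under assumptions: the interval bounds are inferred from the three ranges on the slide, and the 5k-10k interval is added here only to close the gap between them.

```python
def generalize_postal_code(code, keep=4):
    """Keep the first `keep` characters and mask the rest with '*'."""
    return code[:keep] + "*" * (len(code) - keep)


# Interval bounds inferred from the slide (1k-5k, 10k-14k, 14k-20k);
# the 5k-10k interval is an assumption that fills the gap.
BOUNDS = [1000, 5000, 10000, 14000, 20000]


def generalize_to_range(value, bounds=BOUNDS):
    """Map a number to the label of the [low, high) interval containing it."""
    for low, high in zip(bounds, bounds[1:]):
        if low <= value < high:
            return f"{low // 1000}k-{high // 1000}k"
    raise ValueError(f"{value} is outside the configured intervals")


rows = [("08006", 12352), ("08022", 2334), ("08021", 15121)]
for postal_code, consumption in rows:
    print(generalize_postal_code(postal_code), generalize_to_range(consumption))
```

Wider intervals or shorter postal code prefixes give stronger anonymity at the cost of less precise data.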

Guidelines for publication

Ensure legal compliance → Data anonymization
Techniques:
•  Data aggregation: for example, aggregate the energy consumption of buildings by type of building

Original data:
Id   Building use   Consumption
1    School         12352
2    Residential    2334
3    School         15121
4    Office         5252
5    Office         5623
6    Residential    3452

Aggregated by building use (total):
Building use   Total Consumption
School         27473
Residential    5786
Office         10875

Aggregated by building use (average):
Building use   Ave. Consumption
School         13736.5
Residential    2893
Office         5437.5
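The aggregation by building use can be computed directly from the six example records. The sketch below uses only the standard library; the tuple layout of the records is an assumption made for brevity.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_use(records):
    """Group consumption values by building use and return total and average."""
    groups = defaultdict(list)
    for building_use, consumption in records:
        groups[building_use].append(consumption)
    return {
        use: {"total": sum(values), "average": mean(values)}
        for use, values in groups.items()
    }


records = [
    ("School", 12352),
    ("Residential", 2334),
    ("School", 15121),
    ("Office", 5252),
    ("Office", 5623),
    ("Residential", 3452),
]

aggregated = aggregate_by_use(records)
for use, stats in aggregated.items():
    print(use, stats["total"], stats["average"])
```

Only the aggregated rows would be published; groups containing very few buildings may still need suppression, since a group of one reveals the individual value.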

Guidelines for publication

Ensure legal compliance → Data anonymization
Techniques:
•  Anatomization and permutation are techniques that, unlike suppression and generalization, do not modify the data but rather remove the relationship between the quasi identifiers and sensitive values.
•  Perturbation is a technique in which original data are replaced with noise or synthetic data in such a way that statistical analyses based on the perturbed data do not significantly differ from the statistical analysis of the original data → e.g., replace observed values with the average computed on a small group of units
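The perturbation example (replacing observed values with the average of a small group of units) can be sketched as follows. Grouping neighbouring values after sorting is an assumption made here; the slides do not say how the groups are formed.

```python
def microaggregate(values, group_size=2):
    """Replace each value with the mean of its group.

    Groups are built from values that are adjacent after sorting
    (an assumption for this sketch), so each value is replaced by
    the mean of similar values. The overall sum is preserved.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for start in range(0, len(order), group_size):
        group = order[start:start + group_size]
        group_mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = group_mean
    return result


consumption = [12352, 2334, 15121, 5252, 5623, 3452]
perturbed = microaggregate(consumption, group_size=2)
print(perturbed)
print(sum(consumption) == sum(perturbed))
```

Because each group keeps its own sum, totals and overall means computed on the perturbed data match those of the original data, while individual values are hidden.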

Guidelines for publication

Ensure legal compliance → Data anonymization
Steps to ensure legal aspects of the dataset:
1.  To identify explicit identifiers, quasi identifiers, and sensitive attributes in the dataset.
2.  To select the techniques to use on the previously identified attributes in order to ensure legal compliance.
3.  To apply the selected techniques over the identified attributes.
4.  To modify the ontology, in the case that data anonymization implies some changes in the data model.
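The four steps above can be sketched as a small pipeline. Everything in it is hypothetical: the attribute names, their classification, and the technique chosen for each role are assumptions for illustration only.

```python
# Step 1: identify the role of each attribute (hypothetical classification).
ROLES = {"id": "explicit", "postal_code": "quasi", "consumption": "sensitive"}


# Step 2: select one technique per role.
def drop(value):
    """Suppress the value entirely."""
    return None


def mask_last_character(value):
    """Generalize a string by masking its last character."""
    return value[:-1] + "*"


def keep(value):
    """Leave the value unchanged."""
    return value


TECHNIQUES = {"explicit": drop, "quasi": mask_last_character, "sensitive": keep}


# Step 3: apply the selected techniques to every attribute of a record.
def anonymize(record):
    result = {}
    for attribute, value in record.items():
        new_value = TECHNIQUES[ROLES[attribute]](value)
        if new_value is not None:
            result[attribute] = new_value
    return result


# Step 4 is manual: here the "id" attribute disappears, so a property
# describing it would also have to be removed from the ontology.
print(anonymize({"id": "1", "postal_code": "08006", "consumption": 12352}))
```

Keeping the classification and the technique selection in separate tables makes it easy to review the decisions of steps 1 and 2 before anything is applied.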

Guidelines for publication

Ensure legal compliance → Examples
Leeds City Council example
The energy consumption data of council sites within the Leeds jurisdiction are licensed under the Open Government License, which permits the use and modification of the data
→ it is not necessary to perform any additional step in order to ensure legal compliance

Guidelines for publication

Ensure legal compliance → Examples
BECA energy consumption example
Needed anonymization (QIDs and sensitive attributes, with the anonymization technique applied to each):

-  {evaluation number}, {tenant identifier}, {residence identifier}
   QIDs. Those attributes will be generalized (e.g., value 35698456 is generalized to 356*****).
-  {building identifier, residence identifier, tenant number}
   QID. Since residence identifier is already generalized, additional generalization will be performed for tenant number.
-  {comment}
   QID because in some cases it contains information about tenant identifiers in natural language. Therefore, this attribute will be completely suppressed.
-  {evaluation number, residence size}
   The residence size attribute will be generalized by taking into account the interval to which the size belongs and then by assigning the mean value of the interval (e.g., if the size of residence is 38 square meters, the corresponding interval is 30-49 and, therefore, value 35 will be assigned).
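The identifier generalization and the suppression of the comment attribute can be illustrated as below. The record layout and field names are hypothetical, and keeping three leading digits is taken from the 35698456 → 356***** example.

```python
def mask_identifier(value, keep=3):
    """Keep the first `keep` characters of an identifier and mask the rest."""
    text = str(value)
    return text[:keep] + "*" * max(len(text) - keep, 0)


def anonymize_evaluation(record):
    """Generalize the identifier attributes and suppress the free-text comment.

    The field names are assumptions made for this sketch.
    """
    result = dict(record)
    for field in ("evaluation_number", "tenant_identifier", "residence_identifier"):
        result[field] = mask_identifier(result[field])
    result.pop("comment", None)  # complete suppression
    return result


record = {
    "evaluation_number": 35698456,
    "tenant_identifier": 11223344,
    "residence_identifier": 99887766,
    "comment": "free text that may mention a tenant",
}
print(anonymize_evaluation(record))
```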

Guidelines for publication

Publish the dataset and the ontology
The goal of this step is to make the ontology and the RDF dataset accessible through the Web following the Linked Data principles:
1.  Use URIs to name (identify) things.
2.  Use HTTP URIs so that these things can be looked up (interpreted, "dereferenced").
3.  Provide useful information about what a name identifies when it's looked up, using open standards such as RDF, SPARQL, etc.
4.  Refer to other things using their HTTP URI-based names when publishing data on the Web.

Guidelines for publication

Publish the dataset and the ontology
Steps:
1.  To store the RDF data into a persistent repository where data can then be accessed and queried → RDF repository
2.  To enable resolvable HTTP URIs and content negotiation, i.e., the mechanisms for accessing the data through the Web.
3.  To enable a SPARQL HTTP endpoint.

Guidelines for publication

Publish the dataset and the ontology
Steps:
1.  To store the RDF data into a persistent repository where data can then be accessed and queried → RDF repository

-  dataset → RDF dump → triple store → SPARQL queries
   •  Fast
   •  Not up to date
-  dataset → SQL → RDF wrapper → SPARQL queries
   •  Not fast
   •  Updated

[Figure labels: R2RML mappings, relational database, Virtuoso server]

Guidelines for publication

Publish the dataset and the ontology
Steps:
2.  To enable resolvable HTTP URIs and content negotiation, i.e., the mechanisms for accessing the data through the Web

Pubby (image taken from: http://wifo5-03.informatik.uni-mannheim.de/pubby/)
-  Humans → HTML content
-  Computers → RDF content

Pubby configuration
Go to: tomcat/webapps/pubby/WEB-INF/config.n3

Content negotiation: HTML → for humans
Content negotiation: RDF → for computers

Alternatives:
•  Linked Data API → Elda or Puelia
•  W3C Linked Data Platform (LDP) specification → LDP4j or Apache Marmotta
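From the client side, content negotiation means requesting the same URI with a different `Accept` header. The sketch below only builds the two requests with the standard library and does not send them; the resource URI is a hypothetical placeholder, and the media types chosen for RDF are an assumption about what a server might offer.

```python
from urllib.request import Request


def build_request(uri, want_rdf):
    """Build an HTTP request asking for RDF (computers) or HTML (humans)."""
    if want_rdf:
        accept = "text/turtle, application/rdf+xml;q=0.9"
    else:
        accept = "text/html"
    return Request(uri, headers={"Accept": accept})


# Hypothetical resource URI, used only for illustration.
RESOURCE = "http://example.org/resource/building/1"

html_request = build_request(RESOURCE, want_rdf=False)
rdf_request = build_request(RESOURCE, want_rdf=True)

print(html_request.get_header("Accept"))
print(rdf_request.get_header("Accept"))
# urllib.request.urlopen(rdf_request) would send the request to a live server.
```

A Linked Data frontend such as Pubby typically answers such requests by redirecting to an HTML page or an RDF document describing the resource.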

Guidelines for publication

Publish the dataset and the ontology → Examples
Leeds City Council & BECA energy consumption examples
1.  RDF dataset → OpenLink's Virtuoso Open Source repository.
    Ontology → http://smartcity.linkeddata.es/lcc/ontology/EnergyConsumption#, http://smartcity.linkeddata.es/BECA/ontology/EnergyConsumption#
2.  HTTP access to the data → Linked Data API (ELDA)
3.  SPARQL endpoint → Virtuoso at http://smartcity.linkeddata.es/lcc/sparql and http://smartcity.linkeddata.es/BECA/sparql
    RDF dumps → http://smartcity.linkeddata.es/lcc/lcc-dataset.ttl, http://smartcity.linkeddata.es/BECA/BECA-dataset.ttl
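A SPARQL endpoint such as the ones listed above accepts queries over HTTP; in the SPARQL 1.1 Protocol, a GET request carries the query in the `query` parameter. The sketch below only builds the request URL and does not contact the server, and the query itself is a generic example rather than one from the course.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint taken from the slide; whether it is still online is not checked here.
ENDPOINT = "http://smartcity.linkeddata.es/lcc/sparql"

# Generic example query (an assumption, not part of the original material).
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def sparql_get_url(endpoint, query):
    """Build the URL of a SPARQL 1.1 Protocol query via HTTP GET."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Round trip: decoding the URL gives the original query back.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```

The resulting URL can be opened with any HTTP client, usually together with an `Accept` header such as `application/sparql-results+json` to choose the result format.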

Guidelines for publication

Publish metadata and online documentation
The goal of this step is to create and publish the documentation of the RDF dataset and the ontology.
This documentation is oriented to both humans and machines, and its purpose is to facilitate the usage of the dataset that is being made available.

To create and publish human-readable metadata descriptions:
Create and publish human-readable documentation of the dataset and the ontology. Providing this documentation makes the data easier for consumers to use.

To create and publish machine-readable metadata descriptions:
Two vocabularies published by the W3C allow describing datasets and data catalogues in RDF:
-  VoID (Vocabulary of Interlinked Datasets)
-  DCAT (Data Catalog Vocabulary)

Guidelines for publication

Publish metadata and online documentation
To create and publish machine-readable metadata descriptions:
-  DCAT (Data Catalog Vocabulary)

    @prefix os:   <http://a9.com/-/spec/opensearch/1.1/> .
    @prefix dct:  <http://purl.org/dc/terms/> .
    @prefix dcat: <http://www.w3.org/ns/dcat#> .
    @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
    @prefix api:  <http://purl.org/linked-data/api/vocab#> .
    @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix xhv:  <http://www.w3.org/1999/xhtml/vocab#> .

    # dcat:Dataset is the DCAT class for datasets
    <http://smartcity.linkeddata.es/lcc>
        a            dcat:Dataset ;
        dct:license  <http://purl.org/NET/rdflicense/ukogl1.0> ;
        dct:source   "We acknowledge that this dataset uses data coming from the Leeds City Council by including, please check the machine-readable licenses provided here and further information at http://www.leeds.gov.uk/opendata/pages/developer-datasets.aspx" ;
        <http://www.w3.org/2002/07/owl#sameAs>
                     <http://datahub.io/dataset/lcc-leeds-city-council-energy-consumption-linked-data> ;
        dct:publisher  "The publisher of the dataset" ;
        dct:language   <http://id.loc.gov/vocabulary/iso639-1/en> ;
        dct:accrualPeriodicity  <http://purl.org/linked-data/sdmx/2009/code#freq-W> .

Guidelines for publication

Publish metadata and online documentation
To create and publish machine-readable metadata descriptions:
-  VoID (Vocabulary of Interlinked Datasets)

    @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix foaf:    <http://xmlns.com/foaf/0.1/> .
    @prefix dcterms: <http://purl.org/dc/terms/> .
    @prefix void:    <http://rdfs.org/ns/void#> .
    @prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
    # empty prefix, needed so that the :Anotherdataset names below resolve
    @prefix :        <#> .

    ## your dataset
    <http://your.dataset.com/> rdf:type void:Dataset ;
        foaf:homepage <http://your.dataset.com/homepage> ;
        dcterms:title "Title of your dataset" ;
        dcterms:description "Description of your dataset." ;
        void:sparqlEndpoint <http://your.dataset.com/sparql> ;
        void:uriSpace "http://your.dataset.com/resource/" ;
        void:exampleResource <http://your.dataset.com/resource/URI/XXXX> ;
        dcterms:source "Description of the dataset source" ;
        dcterms:created "XXXX-XX-XX"^^xsd:date ;
        dcterms:license <http://creativecommons.org/licenses/by/3.0/> ;
        dcterms:subject <http://dbpedia.org/resource/Building> ;
        void:triples 150297 ;
        void:entities 18890 ;
        void:classes 65 ;
        void:properties 100 ;
        void:distinctSubjects 18962 ;
        void:distinctObjects 26097 .

    ## datasets you link to
    :Anotherdataset rdf:type void:Dataset ;
        foaf:homepage <http://another.dataset.com/homepage> ;
        dcterms:title "Another title" ;
        dcterms:description "Another description." ;
        void:exampleResource <http://another.dataset.com/resource/URI/XXXX> .

    :Yourdataset-Anotherdataset rdf:type void:Linkset ;
        void:linkPredicate <http://your.dataset.com/predicate-used-for-linking> ;
        void:target <http://your.dataset.com/> ;
        void:target :Anotherdataset .
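The statistical properties in a VoID description (`void:triples`, `void:distinctSubjects`, `void:properties`, `void:distinctObjects`) are simple counts over the dataset. The following sketch computes them for a tiny in-memory list of triples; the triples and the `ex:` names are hypothetical, and a real dataset would be counted with SPARQL aggregate queries instead.

```python
def void_statistics(triples):
    """Compute a few VoID statistics for (subject, predicate, object) triples."""
    unique = set(triples)  # an RDF graph is a set, so duplicates count once
    return {
        "void:triples": len(unique),
        "void:distinctSubjects": len({s for s, _, _ in unique}),
        "void:properties": len({p for _, p, _ in unique}),
        "void:distinctObjects": len({o for _, _, o in unique}),
    }


# Hypothetical triples written as prefixed names for readability.
triples = [
    ("ex:b1", "rdf:type", "ex:Building"),
    ("ex:b1", "ex:use", "School"),
    ("ex:b2", "rdf:type", "ex:Building"),
    ("ex:b2", "ex:use", "Office"),
]

stats = void_statistics(triples)
for name, value in stats.items():
    print(name, value)
```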

Guidelines for publication

Publish metadata and online documentation → Examples
Leeds City Council example
DCAT description: http://smartcity.linkeddata.es/lcc/dcat.ttl
Ontology: http://smartcity.linkeddata.es/lcc/ontology/EnergyConsumption/

BECA energy consumption example
DCAT description: http://smartcity.linkeddata.es/BECA/dcat.ttl
Ontology: http://smartcity.linkeddata.es/BECA/ontology/EnergyConsumption

 

Guidelines for publication

Enable dataset discovery
The goal of this step is to enable mechanisms that complement the efforts from the previous step and allow both humans and machines to discover and better use the dataset.
1.  To create a sitemap.
2.  To register the dataset in a dataset catalogue.
3.  To ensure the fulfillment of requirements for addition to the LOD cloud.

Guidelines for publication

Enable dataset discovery
1.  To create a sitemap.

A sitemap is a mechanism to inform search engines about the page structure of a certain web site in order to allow for more efficient crawling.
It is widely used and adopted by major search engines and it is therefore recommended for any type of web site, including datasets.

There are tools like sitemap4rdf that can generate a sitemap based on the contents of a SPARQL endpoint: http://lab.linkeddata.deri.ie/2010/sitemap4rdf/

   sitemap4rdf {your_sparql_endpoint} {prefix_of your_url}
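To show what such a sitemap contains, the sketch below writes a plain sitemap in the sitemaps.org format for a list of resource URLs. It is not sitemap4rdf and does not reproduce its output; the URLs are hypothetical, whereas sitemap4rdf would obtain them from the SPARQL endpoint.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical resource URLs.
resource_urls = [
    "http://example.org/resource/1",
    "http://example.org/resource/2",
]

sitemap_xml = build_sitemap(resource_urls)
print(sitemap_xml)
```

The resulting file is usually saved as sitemap.xml at the root of the web site so that crawlers can find it.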

 


LD4SC  Summer  School  7th  -­‐  12th  June,  Cercedilla,  Spain  

Guidelines  for  publicaMon  

Enable  dataset  discovery  2.  To  register  the  dataset  in  a  dataset  catalogue.  

There  are  available  several  online  data  catalogues  that  range  from  general  to  corporate  iniMaMves:  

-­‐Datahub:  this  data  management  pla�orm  covers  a  wide  range  of  topics.  It  offers  data  collecMons,  some  of  which  are  linked  and  open.    -­‐  Reegle:  the  gateway  has  already  established  itself  as  a  popular  informaMon  portal  in  the  fields  of  renewable  energy  and  energy  efficiency.  It  offers  all  of  its  data  under  W3C  standards,  i.e.,  it  is  open  and  Linked  Data  in  a  non-­‐proprietary  format  (RDF).    -­‐  OpenEI:  a  collaboraMve  knowledge-­‐sharing  pla�orm  with  free  and  open  access  to  energy-­‐  related  data,  models,  tools,  and  informaMon.  OpenEI  features  over  55,000  content  pages,  more  than  600  downloadable  datasets,  regional  gateways  on  a  variety  of  energy-­‐related  topics,  and  numerous  online  tools.    

LD4SC  Summer  School  7th  -­‐  12th  June,  Cercedilla,  Spain  

Guidelines  for  publicaMon  

Enable  dataset  discovery  2.  To  register  the  dataset  in  a  dataset  catalogue.  

There  are  available  several  online  data  catalogues  that  range  from  general  to  corporate  iniMaMves:  

-­‐DataCatalogs:  a  comprehensive  list  of  open  data  catalogues  in  the  world  including  representaMves  from  local,  regional  and  naMonal  governments,  internaMonal  organisaMons  and  numerous  NGOs.  NaMonal  energy  related  data  are  contained  in  the  listed  datasets.  -­‐  Google  Public  Data:  a  corporate  iniMaMve  for  large  datasets  publicaMon  enabling  exploraMon  easiness,  visualizaMon  and  communicaMon.  -­‐  READY4SmartCi2es:  One  of  the  direct  outcomes  of  the  project  is  a  web-­‐portal  providing  an  extended  list  of  ontologies  and  datasets  for  smart  ciMes  published  both  in  human-­‐readable  (HTML  web  site)  and  machine-­‐processable  (RDF  Format)  formats.  


Enable dataset discovery
3. Ensure that the dataset fulfils the requirements for addition to the LOD cloud.

The dataset is checked with the record validator (http://validator.lod-cloud.net) provided by the LOD cloud website.


Dataset promotion
Promotion is important to ensure that people are aware of the dataset's existence and that it is actually used. This way, third parties can use it by querying it, linking it to other datasets, and visualizing it.

The dataset can be promoted through different channels: Twitter, LinkedIn, mailing lists, workshops, conferences, VoCamps...


Dataset support
Dataset support is a continuous process in which the creators and publishers of the dataset:

- correct possible errors, both in the data themselves and technical ones;
- update the data when new data become available;
- provide technical support to solve any problem that can affect the accessibility of the dataset.

Dataset support is usually provided by the people who participated in data generation and publication.

Index

• Requirements
• Guidelines for the Publication of Linked Data
• Guidelines for the Exploitation of Linked Data
• Hands-on Session

Linked Data life cycle

[Figure: Linked Data life cycle with the stages Specification, Modelling, Generation, Publication, Exploitation and Linking.]

Guidelines for the Exploitation

Generic Linked Data use cases
Use Case 1 - SME use of public Linked Energy Data
Contributors: Filip Radulovic (UPM)

Use of different energy-related data sources, such as energy consumption, weather conditions, building and HVAC system features, and socio-economic indicators, to enable SMEs to develop new business models.

LD benefits: data integration/interlinking, data-driven decision-making.

Challenges:
• Manual access to data.
• Obtaining heterogeneous data from various sources.
• Different data formats and structures.
• Dynamically updated data.
• Instance-specific data with different detail levels.
• No open license associated with data.
• Data not available for reuse.
• Privacy protection.
• Partial or missing data.

EECITIES example

Data connected through the Semantic Energy Information Framework

[Figure: data sources (cadastre, GIS, energy performance, socioeconomic) and tools: energy assessment (SAP, UEP…), energy simulation (URSOS, …), energy analysis (data mining, ...) and GIS model (geometric data).]

Once a baseline reflecting the current state of the urban energy model has been created, different visualization tools can be used to identify problem areas.

[Screenshot labels: cluster view, table view, performance indicators filtering, multiple scale visualization.]

Information concerning the selected building, which has not yet been assessed:

• Building geometry obtained from the 3D model
• Street address obtained from Google Geolocation services
• Year of construction obtained from the cadastre
• Performance values to be calculated with the energy assessment tool

Interface of the URSOS tool. The input data is filled in automatically thanks to the semantic integration of different data sources. Users can modify the input data in case there are errors.

• Year of construction from the cadastre
• Geometry obtained from the 3D model
• Street address name and street view from Google Geolocation services
• Wall, ground and roof properties from the building typologies database
• Ventilation from the building typologies database

[Figure: integrated platform. Diagram labels: energy-related data (GIS data, census, cadastre, climate, typology, socio-economic); Semantic Energy Information Framework (ELITE federation engine; ontology: OWL-DL liteA); 3D maps; URSOS input form; URSOS energy calculation engine. The numbers 1 to 5 mark the steps of the workflow.]


1. The user selects a building.
2. The ID of the selected building is used to retrieve the building parameters from the data sources (cadastre, census, building typologies) using SPARQL:

    # Year of construction of the building with a given cadastral reference
    prefix sumo: <http://www.ontologyportal.org/SUMO.owl#>
    prefix semanco: <http://www.semanco-project.eu/2012/5/SEMANCO.owl#>
    SELECT DISTINCT ?year
    WHERE {
      ?b a sumo:Building ;
         semanco:hasAge [ semanco:year_Of_ContructionValue ?year ] ;
         semanco:hasBuilding_Cadastral_Data [ semanco:hasCadastral_Reference ?ref ] .
      ?ref semanco:cadref1Value "2402012" .
    }

    # Age classes whose range of years contains 1885
    prefix sumo: <http://www.ontologyportal.org/SUMO.owl#>
    prefix semanco: <http://www.semanco-project.eu/2012/5/SEMANCO.owl#>
    SELECT DISTINCT ?age ?to ?from
    WHERE {
      ?age a semanco:Age_Class .
      ?age semanco:hasTo_Year ?age_to_instance .
      ?age_to_instance semanco:toYearValue ?to .
      filter(?to >= '1885') .
      ?age semanco:hasFrom_Year ?age_from_instance2 .
      ?age_from_instance2 semanco:fromYearValue ?from .
      filter(?from <= '1885') .
    }

    # U-value of the bottom floor for buildings in age class 1
    prefix sumo: <http://www.ontologyportal.org/SUMO.owl#>
    prefix semanco: <http://www.semanco-project.eu/2012/5/SEMANCO.owl#>
    SELECT DISTINCT ?uvalue
    WHERE {
      ?b semanco:hasSpace [ semanco:hasCS_Envelope [ semanco:hasBottom_Floor ?bf ] ] ;
         semanco:hasAge <http://www.semanco-project.eu/manresa/age_class/1> .
      ?bf semanco:hasBottom_Floor_U-value [ semanco:bottom_Floor_U-valueValue ?uvalue ] .
      ?bf semanco:hasBottom_Floor_Type [ semanco:bottom_Floor_TypeValue "Bottom" ] .
    }
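The queries above use fixed values (the cadastral reference "2402012" and the year 1885). As a minimal sketch of how an application might fill in these values for the selected building, the following Python functions build the first two queries from parameters. The function names and the input checks are assumptions made for illustration; they are not taken from the SEMANCO platform.

```python
import re

PREFIXES = (
    "prefix sumo: <http://www.ontologyportal.org/SUMO.owl#>\n"
    "prefix semanco: <http://www.semanco-project.eu/2012/5/SEMANCO.owl#>\n"
)


def year_query(cadastral_ref: str) -> str:
    """Build the year-of-construction query for one cadastral reference."""
    # Accept only letters and digits so the value cannot break out of the literal.
    if not re.fullmatch(r"[A-Za-z0-9]+", cadastral_ref):
        raise ValueError(f"unexpected cadastral reference: {cadastral_ref!r}")
    return PREFIXES + (
        "SELECT DISTINCT ?year WHERE {\n"
        "  ?b a sumo:Building ;\n"
        "     semanco:hasAge [ semanco:year_Of_ContructionValue ?year ] ;\n"
        "     semanco:hasBuilding_Cadastral_Data [ semanco:hasCadastral_Reference ?ref ] .\n"
        f'  ?ref semanco:cadref1Value "{cadastral_ref}" .\n'
        "}"
    )


def age_class_query(year: int) -> str:
    """Build the query for the age classes whose range contains the given year."""
    # The original query compares years as strings, so only four-digit years are accepted.
    if not 1000 <= year <= 9999:
        raise ValueError(f"unexpected year: {year!r}")
    return PREFIXES + (
        "SELECT DISTINCT ?age ?to ?from WHERE {\n"
        "  ?age a semanco:Age_Class .\n"
        "  ?age semanco:hasTo_Year ?age_to_instance .\n"
        "  ?age_to_instance semanco:toYearValue ?to .\n"
        f"  filter(?to >= '{year}') .\n"
        "  ?age semanco:hasFrom_Year ?age_from_instance2 .\n"
        "  ?age_from_instance2 semanco:fromYearValue ?from .\n"
        f"  filter(?from <= '{year}') .\n"
        "}"
    )
```

Calling `year_query("2402012")` and `age_class_query(1885)` yields queries equivalent to the first two queries on the slide.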

Applying improvements, for example renovating the existing windows or replacing them with new ones.

Results after applying the improvement measures.

Projects can be compared with a multi-criteria decision tool included in the platform. Users can select the weight (importance) of the performance indicators.

In addition, other user-defined indicators can be included in the analysis, for example foreseen funding.

Services:
• Assessing the current energy performance of buildings in towns and cities.
• Identifying priority areas and buildings for energy efficiency interventions.
• Evaluating the impact of proposed new buildings at the urban level.
• Evaluating the impact of refurbishing buildings at the urban level.
• Evaluating the potential impact of local policies and interventions on Sustainable Energy Action Plans (SEAP).
• Generating missing data to enable the classification of buildings according to their energy performance.
• Integrating new data to be visualized and analysed.

Service platform to support planning of energy efficient cities: www.eecities.com

Use Case 2 - The energy data portal
Contributors: Filip Radulovic (UPM)

The energy portal integrates energy data and provides access to various energy indicators from different data sources (e.g., heating consumption, water consumption, and electricity consumption for different geographical regions).

LD benefits: data integration, data search, shareable data, reusable data, extensible data, multilingual support, data discovery, transparency, and a high degree of automation.

Challenges:
• Quality of data and metadata.
• Inconsistency between different sources.
• Wide variety of data formats.
• Data provenance.
• No open license associated with data.
• License complexity and diversity.
• Tracking changes in data.

SEÍS example

[Figure: the SEÍS system. Diagram labels: energy model; energy performance benchmarking; examples of energy efficient buildings; energy efficient design patterns; enter a building simulation; data portal; ontology repository; data domains: climate, geographic, monitoring, certification.]

Efficient buildings and their performance indicators, divided into two groups: Energy (heating and cooling demand, total primary energy, CO2 emissions) and Indoor Space (time above and below comfort).

The information to be uploaded is divided into five categories: Project Data, Building Properties, Outdoor Environment, Operation and Performance.

The data uploaded through this service are assigned to the terms of the ontology, which ensures that the new data are compatible with the existing data.

This service calculates the values of the indicators for energy efficient buildings and for other buildings. The charts display the minimum, maximum, and median values for each indicator and type of building. The indicator values of the selected buildings are shown in the orange circles.

Use Case 3 - Energy data infographic
Contributors: Filip Radulovic (UPM)

An energy data infographic allows the visualization of energy data, using a variety of visual analytics techniques, related to different aspects of these data, for example geographic regions or energy indicators.

LD benefits: data integration, multilingual support, data discovery, and transparency.

Challenges:
• Data provenance.
• No open license associated with data.
• Partial or missing data.
• Different data formats and structures.
• Quality of data and metadata.
• Inconsistency between different sources.
• Deprecated data.
• Data persistence.
• Data integration from several resources with diverse licenses and formats.

A script to create charts based on the results of a SPARQL query using the Google Visualization API.

[Screenshots: Example 1, Example 2 and Example 3.]
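The slides show only the resulting charts, not the script. As an illustrative sketch, assuming the endpoint returns results in the standard SPARQL JSON results format, the following Python function converts them into the JSON literal accepted by a Google Visualization DataTable. The function name and the `numeric_vars` parameter are hypothetical.

```python
def sparql_json_to_datatable(results: dict, numeric_vars=()) -> dict:
    """Convert SPARQL JSON results into a Google Visualization DataTable literal."""
    variables = results["head"]["vars"]
    # One column per query variable; variables listed in numeric_vars become numbers.
    cols = [
        {"id": v, "label": v, "type": "number" if v in numeric_vars else "string"}
        for v in variables
    ]
    rows = []
    for binding in results["results"]["bindings"]:
        cells = []
        for v in variables:
            value = binding.get(v, {}).get("value")  # unbound variables stay None
            if value is not None and v in numeric_vars:
                value = float(value)
            cells.append({"v": value})
        rows.append({"c": cells})
    return {"cols": cols, "rows": rows}
```

The returned dictionary can be serialised with `json.dumps` and handed to the chart code on the web page.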

Use Case 4 - Energy consumption map
Contributors: Filip Radulovic (UPM)

Energy consumption data from various sources and regions can be shown interactively on a geographic map.

LD benefits: data integration, data discovery, and transparency.

Challenges:
• Data provenance.
• No open license associated with data.
• Quality of data and metadata.
• Inconsistency between different sources.
• Deprecated data.
• Different data formats and structures.

Energy indicators shown on a geographical map. Left: SEMANCO platform with neighborhood boundaries. Right: SEÍS system with a heat map of energy efficient buildings.

Use Case 5 - Linked Energy Data quality improvement
Contributors: Filip Radulovic (UPM)

A large number of Linked Data datasets are being published on the web, and the published data can be expected to contain errors, which can originate either in the Linked Data generation process or in the original data source.

LD benefits: reusable data, extensible data, and transparency.

Challenges:
• Enabling user feedback and updates.
• Machine-readable description of data quality improvements.
• Partial or missing data.
• Tracking changes in data.
• Reincorporating such improvements.

Use Case 6 - Energy situation awareness
Contributors: Filip Radulovic (UPM)

The collection and analysis of Linked Energy Data may provide insight into situations that might lead to technical or environmental hazards and that would require human action.

LD benefits: shareable data, data discovery, and transparency.

Challenges:
• Quality of data and metadata.
• Inconsistency between different sources.
• Wide variety of data formats.
• Data provenance.
• Deprecated data.

Newcastle case study: the fuel poverty issue visualized and quantified through the SEMANCO platform.

Energy data analysis

[Figure: RDF dataset → SPARQL queries → machine learning methods → new insights → decision making.]

RapidMiner: a suite for implementing machine learning methods through a graphical interface. It includes the Weka repository (clustering, regression, classification…).

Operators for retrieving RDF data from an endpoint with SPARQL queries:

• Gas consumption 12/13
• Gas consumption 11/12
• Gas consumption 10/11
• Electricity consumption 12/13
• Electricity consumption 11/12
• Electricity consumption 10/11

For example, the query for gas consumption in 2012/2013:

    SELECT distinct ?uprn ?label ?suburb ?gas1213
    WHERE {
      ?feature <http://schema.org/containedIn> [rdf:label ?suburb].
      ?feature rdf:label ?label.
      ?feature lcc:uprn ?uprn.
      ?observation ssnx:featureOfInterest ?feature.
      ?observation ssnx:observedProperty [rdf:label "Gas"].
      ?observation ssnx:observationResult [ssnx:hasValue [lcc:hasQuantityValue ?gas1213]].
      ?observation ssnx:observationSamplingTime [rdf:label ?time].
      FILTER regex(?time, "2012/2013").
    }

Operators for joining the results of the queries.
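A join of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not the RapidMiner operator: it assumes that each query result has already been converted to a list of dictionaries and that the join key (for instance the `uprn` variable of the queries) is unique in the right-hand list. The function name is hypothetical.

```python
def join_on(key, left, right):
    """Inner-join two lists of result rows (dictionaries) on a shared key."""
    # Index the right-hand rows by key; assumes each key value occurs once there.
    index = {row[key]: row for row in right}
    # Keep only left-hand rows that have a partner, merging the two rows.
    return [{**row, **index[row[key]]} for row in left if row[key] in index]
```

Rows without a partner in the other list are dropped, which matches an inner join; a different join type would need a different rule for unmatched rows.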

Operators for cleaning the data items.

Operator for clustering examples based on their properties. Its parameters include:

• Minimum number of clusters
• Maximum number of clusters
• Method for calculating the distance between items (similarity)
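To make the clustering step concrete, here is a minimal sketch of plain k-means in Python. It is not the RapidMiner operator shown on the slide: it uses a fixed number of clusters `k` instead of a minimum and maximum, Euclidean distance as the distance measure, and a deterministic start from the first `k` points. All names are assumptions for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster points (tuples of floats) with plain k-means; returns one label per point."""
    centroids = [points[i] for i in range(k)]  # deterministic start: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its current members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels
```

Each point could hold, for example, the gas and electricity consumption of one building, so that buildings with similar consumption end up with the same label.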

Operator for preparing the final results.

Results of the process: 4 clusters

[Chart labels: Cluster 0, Cluster 1, Cluster 2.]

The buildings in each cluster have similar performance over the period 2010-2013. Based on the clustering results, the City Council can propose public funding for refurbishing the poorly performing buildings.

Index

• Requirements
• Guidelines for the Publication of Linked Data
• Guidelines for the Exploitation of Linked Data
• Hands-on Session

What are we going to do?

[Figure: Linked Data life cycle with the stages Specification, Modelling, Generation, Publication, Exploitation and Linking.]

Publication Index

1. Ensure legal compliance
2. Publish the dataset and the ontology
3. Publish metadata and online documentation
4. Enable dataset discovery
5. Dataset promotion
6. Dataset support

Ensure legal compliance

• If the source of your dataset permits the use of the data, the RDF dataset has to comply with the license and legal terms of the original data source.

• The main issue to take into account is data privacy: private entities described in the dataset (e.g. people) must not be identifiable.

• TASK 1: Check whether your dataset preserves privacy
No explicit identifiers (e.g. national ID numbers, credit card numbers...) are used as literals or in URIs.

If your dataset does not preserve privacy, go back to the RDF generation session and apply the following anonymization techniques:

- Suppression
- Generalization
- Anatomization
- Perturbation
- Aggregation
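As a rough aid for this check, the sketch below scans an N-Triples dump line by line for text that looks like an explicit identifier. The two patterns (a long digit run such as a card number, and eight digits followed by a letter) are hypothetical examples and must be adapted to the identifiers relevant to your data. A clean result does not prove that the dataset preserves privacy.

```python
import re

# Hypothetical patterns: long digit runs (card-like numbers) and an
# eight-digits-plus-letter shape. Adapt them to your own data.
SUSPICIOUS = [re.compile(r"\d{13,19}"), re.compile(r"\b\d{8}[A-Za-z]\b")]


def find_explicit_identifiers(ntriples_lines):
    """Return (line number, matched text) pairs for text that looks like an identifier."""
    hits = []
    for number, line in enumerate(ntriples_lines, start=1):
        # Literals and URIs are scanned alike, since the whole line is searched.
        for pattern in SUSPICIOUS:
            for match in pattern.finditer(line):
                hits.append((number, match.group()))
    return hits
```

The function accepts any iterable of lines, so an open file object can be passed directly.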

Publish the dataset and the ontology

• Make the ontology and the RDF dataset (including the links to other datasets) accessible through the Web, following Linked Data principles:

– Store the RDF data in a persistent repository
– Enable resolvable HTTP URIs and content negotiation
– Enable a SPARQL HTTP endpoint

• TASK 2: Store the RDF data in a persistent repository
Upload the RDF dataset and the ontology to the OpenLink Virtuoso server.

Deliverable: a report with the results of some SPARQL queries run against the Virtuoso server endpoint
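The queries for the report can be sent with any SPARQL client. As a minimal sketch using only the Python standard library, the functions below prepare and send a SPARQL Protocol GET request that asks for JSON results. The endpoint URL is an assumption (a Virtuoso server running locally with its usual `/sparql` path); replace it with the address of the server used in the session.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://localhost:8890/sparql"  # assumption: a local Virtuoso endpoint


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Prepare a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query: str, endpoint: str = ENDPOINT) -> list:
    """Send the query and return the result bindings (needs a running endpoint)."""
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        return json.load(response)["results"]["bindings"]
```

For example, `run_query("SELECT * WHERE { ?s ?p ?o } LIMIT 5")` would return up to five bindings once the dataset has been loaded.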

Option a) Using Conductor (web interface) for small datasets (< 10 MB)

• RDF file in RDF/XML format
• Name of the graph (e.g. the name of the dataset)

Option b) Using the Virtuoso console for big datasets (> 20 MB)

1. Store your RDF dump file, in RDF/XML format, in a folder on the server where Virtuoso is installed.
2. Go to the ISQL console.
3. Invoke this method:

    DB.DBA.RDF_LOAD_RDFXML_MT (file_to_string_output ('path_to/rdf_dump.rdf'), '', '{your_graph_name}');

4. If needed, clear the graph with this method:

    SPARQL CLEAR GRAPH <{your_graph_name}>;

• TASK 3 (optional): Enable resolvable HTTP URIs and content negotiation
Install Pubby to enable resolvable HTTP URIs.

Deliverable: a dataset with resolvable URIs

Pubby configuration (optional)
Go to: tomcat/webapps/pubby/WEB-INF/config.n3

Publish metadata and online documentation

• Create and publish the documentation of the RDF dataset and the ontology. This documentation is aimed at both human and machine users, and its purpose is to facilitate the use of the dataset that is being made available.

– Description of the datasets in the DCAT and VoID vocabularies
– Human-readable documentation of the dataset and the ontology

 

• TASK 4: Describe the dataset in the DCAT/VoID vocabularies

Deliverable: a DCAT or VoID file describing your dataset

DCAT example:

    @prefix os:   <http://a9.com/-/spec/opensearch/1.1/> .
    @prefix dct:  <http://purl.org/dc/terms/> .
    @prefix dcat: <http://www.w3.org/ns/dcat#> .
    @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
    @prefix api:  <http://purl.org/linked-data/api/vocab#> .
    @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix xhv:  <http://www.w3.org/1999/xhtml/vocab#> .

    <http://your.dataset.com/>
        a dcat:Dataset ;
        dct:license <http://purl.org/NET/rdflicense/ukogl1.0> ;
        dct:source "Description of the dataset source" ;
        <http://www.w3.org/2002/07/owl#sameAs> <http://datahub.io/dataset/XXX> ;
        dct:publisher "The publisher of the dataset" ;
        dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ;
        dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-W> .

• TASK 4: To describe the dataset in DCAT / VoID vocabularies, following the VoID example:

@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix void:    <http://rdfs.org/ns/void#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix :        <#> .

## your dataset
<http://your.dataset.com/> rdf:type void:Dataset ;
    foaf:homepage <http://your.dataset.com/homepage> ;
    dcterms:title "Title of your dataset" ;
    dcterms:description "Description of your dataset." ;
    void:sparqlEndpoint <http://your.dataset.com/sparql> ;
    void:uriSpace "http://your.dataset.com/resource/" ;
    void:exampleResource <http://your.dataset.com/resource/URI/XXXX> ;
    dcterms:source "Description of the dataset source" ;
    dcterms:created "XXXX-XX-XX"^^xsd:date ;
    dcterms:license <http://creativecommons.org/licenses/by/3.0/> ;
    dcterms:subject <http://dbpedia.org/resource/Building> ;
    void:triples 150297 ;
    void:entities 18890 ;
    void:classes 65 ;
    void:properties 100 ;
    void:distinctSubjects 18962 ;
    void:distinctObjects 26097 .

## datasets you link to
:Anotherdataset rdf:type void:Dataset ;
    foaf:homepage <http://another.dataset.com/homepage> ;
    dcterms:title "Another title" ;
    dcterms:description "Another description." ;
    void:exampleResource <http://another.dataset.com/resource/URI/XXXX> .

:Yourdataset-Anotherdataset rdf:type void:Linkset ;
    void:linkPredicate <http://your.dataset.com/predicate-used-for-linking> ;
    void:target <http://your.dataset.com/> ;
    void:target :Anotherdataset .

Resources for TASK 4:
- http://www.w3.org/TR/vocab-dcat/
- http://www.w3.org/TR/void/
- https://code.google.com/p/void-impl/wiki/SPARQLQueriesForStatistics
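The VoID statistics (void:triples, void:classes, and so on) can be obtained by running counting queries against your SPARQL endpoint. The sketch below is one possible way to do it in Python with the standard library only. It assumes the endpoint accepts GET requests with a `query` parameter and returns SPARQL JSON results; the function names and the endpoint URL are illustrative assumptions. void:entities is left out because its count depends on how the dataset defines its entities (for example via void:uriSpace).

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# SPARQL 1.1 counting queries, keyed by the VoID property they feed.
VOID_QUERIES = {
    "void:triples": "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }",
    "void:classes": "SELECT (COUNT(DISTINCT ?c) AS ?n) WHERE { ?s a ?c }",
    "void:properties": "SELECT (COUNT(DISTINCT ?p) AS ?n) WHERE { ?s ?p ?o }",
    "void:distinctSubjects": "SELECT (COUNT(DISTINCT ?s) AS ?n) WHERE { ?s ?p ?o }",
    "void:distinctObjects": "SELECT (COUNT(DISTINCT ?o) AS ?n) WHERE { ?s ?p ?o }",
}


def build_request_url(endpoint: str, query: str) -> str:
    """Build a SPARQL protocol GET URL for the given query."""
    return endpoint + "?" + urlencode({"query": query})


def parse_count(results_json: str) -> int:
    """Extract the single ?n value from a SPARQL JSON result document."""
    data = json.loads(results_json)
    return int(data["results"]["bindings"][0]["n"]["value"])


def fetch_statistics(endpoint: str) -> dict:
    """Run every counting query against the endpoint (needs network access)."""
    stats = {}
    for prop, query in VOID_QUERIES.items():
        request = Request(
            build_request_url(endpoint, query),
            headers={"Accept": "application/sparql-results+json"},
        )
        with urlopen(request) as response:
            stats[prop] = parse_count(response.read().decode("utf-8"))
    return stats


def to_turtle(stats: dict) -> str:
    """Format the counts as Turtle predicate-object pairs."""
    return " ;\n".join("    %s %d" % (prop, n) for prop, n in stats.items()) + " ."


if __name__ == "__main__":
    # Hypothetical endpoint; replace it with your own.
    print(to_turtle(fetch_statistics("http://your.dataset.com/sparql")))
```

On large datasets some endpoints time out on full counts, so the numbers may need to be computed directly in the triple store instead.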

• TASK 5 (optional): To create human-oriented documentation with Widoco

Deliverable: an HTML document describing your ontology

TASK 5 (optional), steps:

1. Set up the config file: config/config.properties

2. Run this command:

 java -jar widoco-0.0.1-jar-with-dependencies.jar -ontFile {your_ontology_file.owl}
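If the documentation has to be regenerated every time the ontology changes, the Widoco call can be wrapped in a small script. This is only a sketch: it uses the jar name and the `-ontFile` option exactly as given in this hands-on, assumes `java` is on the PATH, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path

# Jar name as given in the hands-on; adjust it to the release you downloaded.
WIDOCO_JAR = "widoco-0.0.1-jar-with-dependencies.jar"


def build_widoco_command(ontology_file: str, jar: str = WIDOCO_JAR) -> list:
    """Return the argument list for the Widoco call shown in the steps."""
    return ["java", "-jar", jar, "-ontFile", ontology_file]


def run_widoco(ontology_file: str, jar: str = WIDOCO_JAR) -> int:
    """Run Widoco on an existing ontology file and return its exit code."""
    if not Path(ontology_file).is_file():
        raise FileNotFoundError(ontology_file)
    completed = subprocess.run(build_widoco_command(ontology_file, jar))
    return completed.returncode
```

Passing the arguments as a list avoids shell quoting problems when the ontology path contains spaces.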

Enable dataset discovery

• To enable the mechanisms that allow both humans and machines to discover and better use the dataset.

– To create a sitemap to inform search engines about the page structure.

– To register the dataset in dataset catalogues (READY4SmartCities, Datahub, Reegle, OpenEI…)

– To ensure the fulfilment of requirements for addition to the LOD cloud.

 

• TASK 6 (optional): To create a sitemap with sitemap4rdf

Deliverable: an XML document with the sitemap of your dataset

TASK 6 (optional), steps:

1. Run this command:

 sitemap4rdf {your_sparql_endpoint} {prefix_of_your_url}
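To make the deliverable more concrete, the sketch below builds a plain sitemap in the standard sitemaps.org format from a list of resource URIs, keeping only those under a given URL prefix. It does not reproduce what sitemap4rdf actually generates, which may differ; the function name `build_sitemap` and the example URIs are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the standard sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(resource_uris, url_prefix):
    """Return sitemap XML listing the distinct URIs that start with url_prefix."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for uri in sorted(set(resource_uris)):
        if not uri.startswith(url_prefix):
            continue  # skip resources that live outside your dataset
        url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        loc = ET.SubElement(url, "{%s}loc" % SITEMAP_NS)
        loc.text = uri
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    uris = [
        "http://your.dataset.com/resource/URI/0001",
        "http://your.dataset.com/resource/URI/0002",
        "http://another.dataset.com/resource/URI/XXXX",
    ]
    print(build_sitemap(uris, "http://your.dataset.com/resource/"))
```

In practice the list of URIs would come from a query against your SPARQL endpoint rather than from a hard-coded list.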

• TASK 7 (optional): To register the dataset in dataset catalogues (READY4SmartCities, Datahub, Reegle, OpenEI)

Deliverable: a new record in a dataset catalogue for your dataset

TASK 7 (optional), steps for the READY4SmartCities catalogue:

1. Go to: http://smartcity.linkeddata.es/datasets/
2. Click on "through a detailed form" and fill in the form

• TASK 8 (optional): To ensure the fulfilment of requirements for addition to the LOD cloud, using Data Hub LOD Datasets

Deliverable: a report describing the level of fulfilment of the LOD requirements of your dataset

TASK 8 (optional), steps:

1. Go to: http://validator.lod-cloud.net/
2. Validate your dataset (previously uploaded to the Data Hub repository) using the name of the dataset

What are we going to do?

[Diagram of the process steps: Specification, Modelling, Generation, Publication, Exploitation, Linking]

Exploitation Index

1. Define a use case
2. Use your data

Define a use case

• To describe a use case for energy data exploitation.

– To define the use case for data exploitation and its requirements

 

• TASK 9: To select an existing use case or define a new use case for your dataset

Deliverable: a detailed description of a use case for your dataset.

TASK 9: fill in the template:

• Use case title:
• Reference:
• Use case description:
• Objectives:
• Domain:
• Stakeholders:
• Requirements:
• Linked Data benefits:
• Challenges:
• External sources:

Use your data

• To implement the use case.

– To provide a report/visualization for your data

• TASK 10: To provide a report/visualization for your data

Deliverable: a report/visualization in HTML/PDF explaining something about your use case using your dataset and the links to other datasets.
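One simple form of such a report is an HTML table built from the results of a SPARQL query. The sketch below assumes the results are already available in the standard SPARQL JSON results format (for example, saved from your endpoint); the function name `results_to_html` and the sample variable names are illustrative assumptions and are not taken from the example materials.

```python
import json
from html import escape


def results_to_html(results_json: str, title: str) -> str:
    """Render SPARQL JSON results as a standalone HTML page with one table."""
    data = json.loads(results_json)
    variables = data["head"]["vars"]
    header = "".join("<th>" + escape(v) + "</th>" for v in variables)
    rows = []
    for binding in data["results"]["bindings"]:
        cells = []
        for var in variables:
            # Unbound variables are simply absent from a binding.
            value = binding[var]["value"] if var in binding else ""
            cells.append("<td>" + escape(value) + "</td>")
        rows.append("<tr>" + "".join(cells) + "</tr>")
    return (
        "<!DOCTYPE html>\n"
        "<html><head><meta charset='utf-8'><title>" + escape(title) + "</title></head>\n"
        "<body><h1>" + escape(title) + "</h1>\n"
        "<table><tr>" + header + "</tr>\n"
        + "\n".join(rows)
        + "\n</table></body></html>\n"
    )


if __name__ == "__main__":
    sample = json.dumps({
        "head": {"vars": ["building", "use"]},
        "results": {"bindings": [
            {"building": {"type": "uri", "value": "http://your.dataset.com/resource/URI/XXXX"},
             "use": {"type": "literal", "value": "Residential"}},
        ]},
    })
    print(results_to_html(sample, "Report for your use case"))
```

Escaping every value keeps literals that contain characters such as & or < from breaking the page.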

TASK 10: example materials:

• Materials/sparqlhtml/Example 1.html
• Materials/sparqlhtml/Example 2.html
• Materials/sparqlhtml/Example 3.html
• Materials/LodLive
• Materials/rapidminer/rapidminer_clustering_leeds.xml

Summary of tasks

• TASK 1: To check if your dataset preserves privacy
• TASK 2: To store the RDF data into a persistent repository
• TASK 3: To enable resolvable HTTP URIs and content negotiation
• TASK 4: To describe the dataset in DCAT / VoID vocabularies
• TASK 5: To create human-oriented documentation
• TASK 6: To create a sitemap
• TASK 7: To register the dataset in dataset catalogues
• TASK 8: To ensure the fulfilment of requirements for addition to the LOD cloud
• TASK 9: To select an existing use case or define a new use case for your dataset
• TASK 10: To provide a report/visualization for your data

References / Bibliography

The contents of the presentation are based on:

• F. Radulovic, R. García-Castro, V. Suero, T. Tryferidis, K. Z. Tsagkari, D. Meskos. Deliverable D4.2: Requirements and guidelines for energy data publication. Technical report, READY4SmartCities Consortium. (January 2015)

• F. Radulovic, R. García-Castro, M. Poveda. Deliverable D4.3: Requirements and guidelines for energy data exploitation. Technical report, READY4SmartCities Consortium. (January 2015)

References

• EECITIES - http://www.eecities.com
• SEÍS system - http://www.seis-system.org

Links

• Virtuoso Server - https://github.com/openlink/virtuoso-opensource
• Pubby - http://wifo5-03.informatik.uni-mannheim.de/pubby/
• Elda - https://github.com/epimorphics/elda
• Puelia - https://code.google.com/p/puelia-php/
• LDP4j - http://www.ldp4j.org/
• Apache Marmotta - http://marmotta.apache.org/
• Sitemap4rdf - http://lab.linkeddata.deri.ie/2010/sitemap4rdf/
• LOD cloud validator - http://validator.lod-cloud.net
• SPARQL proxy - https://logd.tw.rpi.edu/ws/sparqlproxy.php
• Google Visualization API - https://developers.google.com/chart/
• Rapidminer - https://rapidminer.com/products/studio/
• Widoco - https://github.com/dgarijo/Widoco
• LOD live app - http://en.lodlive.it

Thank you for your attention!