
Copyright 2007 Digital Enterprise Research Institute. All rights reserved.

www.deri.org

SEMEDIA

PARENTAL ADVISORY: Neither formulas nor inference rules in this presentation


Who the FOAF knows Alice? Towards Semantic Web Pipes

Axel Polleres, DERI Galway

Christian Morbidoni, SEMEDIA – University of Ancona, Italy

Joint work with:

Giovanni Tummarello, DERI Galway


Outline

• Web Pipes, Web 2.0 + Semantic Web Pipes

• Revocations in RDF

• MSG Theory

• DBin 2.0: a Semantic Web Client implementing RDF revocations

• Conclusions


Semantic Web as a “quad space”

• A semantic model (RDF) can be published on the web, at a specific web location (URL): Resolve(http://polleres.net/foaf.rdf) → RDF/XML

• The collection of all the RDF graphs published on the web is today referred to by some as the “Semantic Web”

• The Semantic Web can therefore be seen as a huge quad store where:

– Any graph is readable (in general, but HTTP access control is possible)

– It is possible to write, but only in controlled web spaces (e.g. one’s own homepage)

A big plus: it ties to the URL/DNS name ownership mechanism, i.e. one could say: “I own my RDF statements” in some sense.
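The “quad space” view can be made concrete with a small sketch. This is only a toy model under stated assumptions: the class `QuadSpace`, its methods, the owner names and the URL `http://bob.example.org/foaf.rdf` are hypothetical and not part of any system mentioned in the talk. It shows each graph URL mapping to a set of triples, readable by anyone and writable only by whoever controls that web space.

```python
from collections import defaultdict


class QuadSpace:
    """Toy model of the web as a quad store: graph URL -> set of triples."""

    def __init__(self):
        self.graphs = defaultdict(set)  # graph URL -> triples published there
        self.owners = {}                # graph URL -> who controls that web space

    def write(self, who, url, triple):
        # Writing is only possible in a web space the writer controls.
        owner = self.owners.setdefault(url, who)
        if owner != who:
            raise PermissionError(f"{who} does not control {url}")
        self.graphs[url].add(triple)

    def resolve(self, url):
        # Any published graph is readable (HTTP access control is ignored here).
        return frozenset(self.graphs.get(url, ()))

    def quads(self):
        # The whole space as (graph, subject, predicate, object) quads.
        return {(url, *t) for url, ts in self.graphs.items() for t in ts}


KNOWS = "http://xmlns.com/foaf/0.1/knows"
BOB_FOAF = "http://bob.example.org/foaf.rdf"  # hypothetical URL
statement = ("http://ex.org/charles#me", KNOWS, "http://alice.example.org/i")

space = QuadSpace()
space.write("bob", BOB_FOAF, statement)
```

In this model, “I own my RDF statements” simply means that only the owner of a URL can add triples to the graph resolved from it.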


Additionally, plenty of RDF data on the Semantic Web already

• Personal FOAF, SIOC… etc.

• Databases: DBpedia, DBLP, Geonames, etc.

• Most of these use the “Linked Open Data” approach, e.g. http://dbpedia.org/resource/Berlin

“Building the Semantic Web is about integrating data, that is already out there.” (found on a blog)


Now how do I aggregate/process this RDF data?

Available “ingredients”

• Services: SWSE (for full queries), Sindice (find the sources, query yourself)

• Browsers: Tabulator, Disco, etc.

• Clients: Protégé (read/local edit), DBin 2.0 (read/write)

• Missing: Data processors for the Web a la Pipes!


An Example: Aggregating & Patching RDF data

Three FOAF files contain the personal information stored by Alice, Bob, and Charles.

• Charles doesn’t trust Alice’s information.

• He wants to provide a common view over his and Bob’s information.

• But: He also wants to counter some of Bob’s statements, i.e. Charles wants to say: I don’t know Alice!

• Additionally, he might want to do other stuff, like RDFS materialization, a SPARQL query over this joint view, etc.


Outline

• Web Pipes, Web 2.0 + Semantic Web Pipes

• Revocations in RDF

• MSG Theory

• DBin 2.0: a Semantic Web Client implementing RDF revocations (Demo)

• Conclusions & Outlook!


How to revoke statements?

• How can Charles state “I don’t know Alice!”

– Such that others know he revokes that statement?

– Without ending up in overall inconsistency (“explosive” semantics of negation in classical logic)?

– Let’s see what means current Semantic Web languages provide…


How to state “I don’t know Alice?” - Attempt 1:

• In RDF, using a new “counter”-property, extending the FOAF vocabulary

+ : Easy to write down, no overall inconsistency

-: No semantics:

– How would someone else know that doesntknow is the opposite of knows?

– For any revocation, one would need to extend the vocabulary.


How to state “I don’t know Alice?” - Attempt 2:

• In RDF, making revoked statements in a separate file, e.g., badstatements.rdf

+ : Easy to write down, no overall inconsistency

-: No semantics:

– How would a crawler be able to tell “good” and “bad” graphs apart?

– for actual revocation, one would need some pretty ugly SPARQL query:


How to state “I don’t know Alice?” - Attempt 3:

• In OWL/RDF, I can express “I don’t know Alice” !!

+ : clear semantics, W3C standard language

-: Not necessarily what was intended:

– Global inconsistency when combined with Bob’s file.

– We only wanted to “patch”, i.e. remove the single statement.

– BTW, the RDF triples for this in OWL/RDF are quite verbose.

• One could use paraconsistent reasoning with OWL though, see [Z. Huang, F. van Harmelen, and A. ten Teije, 2005]

– Still, this needs full OWL DL reasoning, with nominals!


How to state “I don’t know Alice?” - Attempt 4:

• In RDF, using reification and a special property a la N3:

+ : Easy to write down in N3

– Maybe still verbose though (due to the RDF reification vocabulary), but alternative forms of writing down reified statements, e.g. as XMLLiteral, are possible (which has other nasty side effects).

-: Again, undesired semantics:

– Overall inconsistency

– Semantics only defined in terms of cwm: non-standard, not really written down properly


Our solution: Describe revocations based on MSG theory!

• Practical approach

• Concise to write down

• Not (yet) standard semantics, admittedly

• Implemented in the DBin 2.0 system for collaborative management and patching of RDF graphs


KEY CONCEPT: Minimum Self-contained Graph (MSG)

MSG (Def). Given an RDF statement s and a graph G, the Minimum Self-contained Graph (MSG) containing that statement, written MSG(s,G), is the set of RDF statements comprised of the following:

– The statement in question

– Recursively, for all the blank nodes involved in statements included in the description so far, the MSG of all the statements involving such blank nodes

Important Properties:

– Each RDF graph can be decomposed into a canonical set of MSGs

– Each MSG has a unique (blank-node agnostic) hash sum

For a deeper discussion see: G. Tummarello, C. Morbidoni, R. Bachmann-Gmur, O. Erling, "RDFSync: efficient remote synchronization of RDF models", Proceedings of the 6th International Semantic Web Conference 2007, Busan, Korea

--> Idea: We use exactly this hash sum to revoke MSGs
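The recursive definition of MSG(s,G) can be sketched as a closure computation over blank nodes. This is a minimal illustration, not the RDFSync or DBin implementation: triples are plain Python tuples of strings, blank nodes are assumed to be the strings starting with `_:`, and the function names `msg` and `decompose` as well as the sample data are hypothetical.

```python
def is_blank(node):
    # Assumption of this sketch: blank nodes are written "_:label".
    return node.startswith("_:")


def msg(statement, graph):
    """Return MSG(statement, graph) as a frozenset of triples."""
    result = set()
    pending = [statement]
    while pending:
        s = pending.pop()
        if s in result:
            continue
        result.add(s)
        subj, _pred, obj = s
        blanks = {n for n in (subj, obj) if is_blank(n)}
        if blanks:
            # Pull in every statement that shares a blank node with s.
            for t in graph:
                if t not in result and (t[0] in blanks or t[2] in blanks):
                    pending.append(t)
    return frozenset(result)


def decompose(graph):
    """Split a graph into its set of MSGs."""
    remaining = set(graph)
    parts = set()
    while remaining:
        part = msg(next(iter(remaining)), graph)
        parts.add(part)
        remaining -= part
    return parts


ground = ("ex:charles", "foaf:knows", "ex:alice")
with_blank = ("ex:bob", "foaf:knows", "_:p")
G = {
    ground,
    with_blank,
    ("_:p", "foaf:name", "Dave"),
    ("_:p", "foaf:mbox", "mailto:dave@example.org"),
}
parts = decompose(G)
```

A ground statement is its own MSG, while the statement mentioning `_:p` drags along everything else said about `_:p`, so this sample graph decomposes into two MSGs.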


RDF graph decomposition and identifiers

MSG ID = Base64(MD5(Canonical(MSG))) = 45FA76B61FC0

Graph ID list = [MSG ID 1, MSG ID 2, ...]
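The identifier formula can be sketched with the Python standard library. The important caveat is `canonical`: here it is a deliberately simplified stand-in that maps every blank node to the same placeholder and sorts the serialized statements. That makes the hash independent of blank node labels, but it is not a full canonicalization (structurally different MSGs with several blank nodes could serialize identically). The function names are hypothetical.

```python
import base64
import hashlib


def canonical(msg_triples):
    # Simplified stand-in for Canonical(MSG): erase blank node labels,
    # serialize one statement per line, and sort the lines.
    def term(node):
        return "_:b" if node.startswith("_:") else node

    lines = sorted(f"{term(s)} {p} {term(o)} ." for s, p, o in msg_triples)
    return "\n".join(lines)


def msg_id(msg_triples):
    # MSG ID = Base64(MD5(Canonical(MSG)))
    digest = hashlib.md5(canonical(msg_triples).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def graph_id_list(msgs):
    # Graph ID list = [MSG ID 1, MSG ID 2, ...], sorted for a stable order.
    return sorted(msg_id(m) for m in msgs)


m1 = {("ex:bob", "foaf:knows", "_:p"), ("_:p", "foaf:name", "Dave")}
m2 = {("ex:bob", "foaf:knows", "_:q"), ("_:q", "foaf:name", "Dave")}
m3 = {("ex:charles", "foaf:knows", "ex:alice")}
ids = graph_id_list([m1, m3])
```

Since an MD5 digest is 16 bytes, each identifier produced this way is a 24-character Base64 string, and `m1` and `m2` receive the same identifier because they differ only in the blank node label.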


RDF modeling of MSG based revocations

• So, if Charles wants to revoke
<http://ex.org/charles#me> foaf:knows <http://alice.example.org/i>
(stated in Bob’s FOAF file)

• …he computes a base64 of the MSG hash “123JHG…”

• …then adds to his own FOAF a triple:
_:x pipes:revokesMSGHash "123JHG…"^^xsd:string .

• …plus optionally additional metadata about the revocation:
_:x pipes:statedBy :me .
_:x pipes:date "2007-11-11T16:20:00+09:00"^^xsd:dateTime .
_:x pipes:revocationDescription "Who the FOAF said that I know Alice? I don’t know her."^^xsd:string .
_:x pipes:involvedResource :me .

* In the paper, we called this slightly differently,
_:a <http://sw.deri.org/09/2007/states_not> "123JHG…"
but in the meantime, we have refined our revocation vocabulary under the namespace http://pipes.deri.org/2007/10/ns#.
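Generating these revocation statements is mechanical, as the following sketch suggests. The property names and the namespace are the ones shown on the slide; the helper functions `typed_literal`, `revocation_triples` and `to_ntriples` are hypothetical, and the hash value is the slide's elided placeholder, not a real MSG hash.

```python
PIPES = "http://pipes.deri.org/2007/10/ns#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def typed_literal(value, datatype):
    # Escape backslashes and double quotes before wrapping the value.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^<{XSD}{datatype}>'


def revocation_triples(msg_hash, stated_by=None, date=None,
                       description=None, involved=(), node="_:x"):
    # The only mandatory statement is the revoked MSG hash itself.
    triples = [(node, f"<{PIPES}revokesMSGHash>", typed_literal(msg_hash, "string"))]
    # Everything below is optional metadata about the revocation.
    if stated_by:
        triples.append((node, f"<{PIPES}statedBy>", f"<{stated_by}>"))
    if date:
        triples.append((node, f"<{PIPES}date>", typed_literal(date, "dateTime")))
    if description:
        triples.append((node, f"<{PIPES}revocationDescription>",
                        typed_literal(description, "string")))
    for resource in involved:
        triples.append((node, f"<{PIPES}involvedResource>", f"<{resource}>"))
    return triples


def to_ntriples(triples):
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


ME = "http://ex.org/charles#me"
doc = revocation_triples("123JHG…",  # placeholder hash, elided on the slide
                         stated_by=ME,
                         date="2007-11-11T16:20:00+09:00",
                         involved=[ME])
print(to_ntriples(doc))
```

The output is a handful of N-Triples-style lines that Charles could append to his own FOAF file.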


RDF revocation based on MSG hashes summary:

• In RDF, using our revocation vocabulary:
_:x pipes:revokesMSGHash "123JHG…"^^xsd:string .

+ :

– Concise to write down

– No overall inconsistency, specific revocations possible

– Revocations on the level of MSGs

– We have defined precise semantics for what “revocation” means

– Allows revocation as a modular operator in a pipe

-:

– It is not visible WHAT was revoked before revocations are applied, but this may be viewed as a feature

– MSG hashes need to be computed every time (in DBin we do this anyway, and in the paper we suggest avoiding unnecessary computations by using additional metadata)

– Hash collisions are theoretically possible (this has not been the case in current use case examples; the risk can again be minimized by additional metadata)


Revocation in DBin 2.0 “Pipes”

• In DBin, a pipe is currently a simple ordered sequence of RDF graphs, where the revocation operator is applied successively.

• Applying revocations in the order of the files allows “preferential views” on RDF.

• A non-naïve version additionally allows “revocations of revocations” and arbitrary pipes, see:

Christian Morbidoni, Axel Polleres, Giovanni Tummarello, Danh Le Phuoc. Semantic Web Pipes. Tech. Report DERI-TR-2007-11-07, 2007.
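The naïve pipe described above might look roughly like the sketch below. Several points are assumptions rather than the DBin semantics: every statement is ground, so each triple is its own MSG; `triple_hash` is a stand-in for the real MSG hash; a graph's revocations only affect statements collected from earlier graphs; and the revocation triples themselves stay in the resulting view. The graph contents are hypothetical sample data.

```python
import base64
import hashlib

REVOKES = "pipes:revokesMSGHash"


def triple_hash(triple):
    # Stand-in for the MSG hash of a ground triple (which is its own MSG).
    data = " ".join(triple).encode("utf-8")
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def run_pipe(graphs):
    """Merge graphs in order, applying each graph's revocations to the view so far."""
    view = set()
    for graph in graphs:
        revoked = {o for _s, p, o in graph if p == REVOKES}
        # Drop earlier statements whose hash this graph revokes.
        view = {t for t in view if triple_hash(t) not in revoked}
        view |= set(graph)
    return view


unwanted = ("ex:charles", "foaf:knows", "ex:alice")
bob_graph = {unwanted, ("ex:bob", "foaf:knows", "ex:charles")}
charles_graph = {
    ("ex:charles", "foaf:knows", "ex:bob"),
    ("_:x", REVOKES, triple_hash(unwanted)),
}

charles_view = run_pipe([bob_graph, charles_graph])
bob_view = run_pipe([charles_graph, bob_graph])
```

With Bob's file first and Charles's file second, the unwanted statement is removed; with the order reversed it survives, which is one way to read the “preferential views” mentioned above.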


DBin 2.0 overview

• Is a desktop client

• Provides a rich user interface which can be customized for specific domains (Brainlet model)

• As a basis, it reads from and writes to the Semantic Web

• Implements “Semantic Web Pipes”: workflows which combine Semantic Web sources in specific ways


DBin 2.0, the Pipe Engine and the User Interface


…Let’s show it as a Semantic Web Pipe

[Diagram: Source 1, Source 2 and Source 3, two revocation operators, RDFS inference and the user interface, connected by RDF data flows.]

RDF sources and operators piped together for processing RDF data in a specific way


Conclusions and Outlook:

• The novel concept of “revocations” in RDF takes into account that data on the web may be conflicting, but without introducing overall inconsistency

• Allows targeted application of revocations in any step of a pipe.

DBin 2.0 makes use of it for common views of collaboratively maintained RDF graphs.

pipes.deri.org is the next step, which will allow you to store, execute and save pipes involving operators like: RDF materialize, revoke, SPARQL, XSLT, merge, etc.

Discussion points specifically for this workshop:

• We believe that small, practical reasoning components, such as the one provided by the revocation operator, work for certain (not all) applications and can get the Semantic Web going more quickly. Do we really need full OWL DL machinery for the aggregation and patching task?

• BTW: In our experiments so far, we have found that a large part of the aggregation time for pipes is spent fetching data… distribution!