Joint work with Emilien Antoine, Gerome  Miklau , Julia  Stoyanovich and Vera  Zaychik...

49
Joint work with Emilien Antoine, Gerome Miklau, Julia Stoyanovich and Vera Zaychik Moffitt Mai 30, 2012 Introducing Access Control in Webdamlog Serge Abiteboul INRIA Saclay & ENS Cachan

description

Introducing Access Control in Webdamlog Serge Abiteboul INRIA Saclay & ENS Cachan. Joint work with Emilien Antoine, Gerome  Miklau , Julia  Stoyanovich and Vera  Zaychik Moffitt. Mai 30 , 2012. ICDE 2012. The Web as a distributed knowledge base - PowerPoint PPT Presentation

Transcript of Joint work with Emilien Antoine, Gerome  Miklau , Julia  Stoyanovich and Vera  Zaychik...

Page 1: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Joint work with Emilien Antoine, Gerome Miklau, Julia Stoyanovich and Vera Zaychik Moffitt

ICDE 2012Mai 30, 2012

Introducing Access Control in Webdamlog

Serge AbiteboulINRIA Saclay & ENS Cachan

Page 2: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 2

•The Web as a distributed knowledge base•Webdamlog: a rule-based language for

the Web•Access control in Webdamlog•The Webdamlog system•Conclusion

Page 3: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 3

A typical Web user’s data•What kinds of data?- data: photos, music, movies, reports, email

- metadata: photo taken by Alice in Paris on ...

- ontologies: Alice’s ontology and mapping with other ontologies

- localization: Alice’s pictures are on Picasa, back-ups are at INRIA

- security: Facebook credentials (Alice, 123456)

- annotations: Alice likes Elvis’ website

- beliefs: Alice believes Elvis is alive

- external knowledge: Bob keeps copies of Alice’s pictures

- time, provenance, ...

all kinds

Social data

Page 4: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 4

A typical Web user’s data•What kinds of data?•Where is the data?

- laptop, desktop, smartphone, tablet, car computer

- mail, address book, agenda

- Facebook, LinkedIn, Picasa, YouTube, Tweeter

- svn, Google docs

- also access to data / information of family, friends, companies associations

all kindseverywhere

Page 5: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 5

A typical Web user’s data•What kinds of data?•Where is the data?

all kinds everywhere•What kind of organization?

- terminology: different ontologies

- systems: personal machines, social networks

- distribution: different localization

- security: different protocols

- quality: incomplete / inconsistent information

heterogeneous

Page 6: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 6

Example of processingAlice and Bob are getting engaged. Their friends want to offer them an album of photos where they are together

To make such a photo album• Find friends of Alice & Bob (say with Facebook)• for each friend, find where she keeps her photos

(say, Picassa)• find the means to access her photos possibly

via friends• find the photos that feature Bob and Alice

together, e.g., using tags or face recognition software

• possibly ask someone to verify the resultsSome reasoning is needed to execute these tasks

automatically!

Page 7: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 7

A typical Web user

• Overwhelmed by the mass of information

• Cannot find the information needed

• Is not aware of important events

• Cannot manage/control how others access and use his/her own data

Page 8: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 8

YOU need help!

How can systems help?• We need to move from a Web of text

to a Web of knowledge

- In the spirit of semantic Web

• To better support user needs,

- Systems need to analyze what is happening and construct knowledge

- Systems should exchange knowledge

- Systems should reason and infer knowledge

Page 9: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 9

Thesis

All this forms a distributed knowledge basewith processing based on automated

reasoning

Page 10: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 1

0

Our topic

•Distributed reasoning Exchanging facts and rules Webdamlog

• Access control with access control

Page 11: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 1

1

•The Web as a distributed knowledge base•Webdamlog: a rule-based language for

the Web•Access control in Webdamlog•The Webdamlog system•Conclusion

Page 12: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 1

2

Webdamlog: a datalog-style language

Datalog A prehistoric language by Web time...

+ nice and compact syntax+ well-studied with many extensions+ recursion essential: network cycles

WebdamlogNot as simple/beautiful & proceduralNeeded for real Web applications!

Webdamlog is not datalog

Page 13: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 1

3

Webdamlog: an extension of datalog

Datalog program fof(x,y) :- friend(x,y)fof(x,y) :- friend(x,z), fof(z,y)

Extensional facts (stored in the database)friend(“peter”,”paul”) friend(“paul”, “mary”) friend(“mary”,”sue”)

Intentional facts (derived)fof(“peter”,”paul”) fof(“peter”,”mary”) fof(“peter”, “sue”)fof(“paul”, “mary”) fof(“paul”, “sue”) fof(“mary”,”sue”)

Page 14: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 1

4

Webdamlog: an extension of datalogExtends datalog

• negation, updates, distribution, delegation, time

For a world that is• distributed: autonomous and asynchronous

peers• dynamic: knowledge evolves; peers come and

goInfluenced by

• Active XML (INRIA) - for distribution & intentional data

• Dedalus (UC Berkeley) - for time & implementation

Page 15: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 1

5

FactsFacts are of the form m@p(a1, ..., an), where

m is a relation name & p is a peer namea1, ..., an are data values (n is the arity of m@p)the set of data values includes the relations and peer names

Examplesfriend@my-iphone(“peter”, “paul”) extensionalfof@my-iphone(“adam”, “paul”) intentional

Page 16: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 1

6

Examples of factsdata & metadata: pictures@alice-iphone(1771.jpg,

“Paris”, 11/11/2011)ontology: [email protected]("Elvis”, theKing)annotations: [email protected](“wikipedia.org”,

encyclopedia)localization: where@alice(pictures, picasa/alice)access rights: right@picasa(pictures, friends, read)security: secret@picasa/alice; public@picasa/alice

Page 17: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 1

7

RulesRules are of the form

$R@$P($U) :- (not) $R1@$P1($U1), ..., (not) $Rn@$Pn($Un)where

$R, $Ri are relation terms$P, $Pi are peer terms $U, $Ui are tuples of terms

Safety condition$R and $P must appear positively bound in the bodyeach variable in a negative literal must appear positively bound in the body

A term is a variable

or a constant

Examples coming up, stay tuned

Page 18: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 1

8

State transition

Choose some peer p randomly – asynchronouslyCompute the transition of p

the database updates at pthe messages sent to other peersthe delegations of rules to other peers

Keep going forever(I0, Γ0, ∅) ➝ (I1, Γ1, Γ1*) ➝... ➝ (In, Γn, Γn*) ➝...

Fair sequence: each peer is selected infinitely often

Page 19: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 1

9

The semantics of rulesClassification based on locality and nature of head predicates (intentional or extensional)• Local rule at my-laptop: all predicates in the

body of the rules are from my-laptop

Local with local intentional head classic datalogLocal with local extensional head database updateLocal with non-local extensional head messaging between peersLocal with non-local intentional head view delegationNon-local general delegation

Page 20: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 2

0

Local rules with local intentional head

Example: Rule at peer my-laptopfriend is extensional, fof is intentional

fof@my-iphone($x, $y) :- friend@my-iphone($x,$y)fof@my-iphone($x,$y) :- friend@my-iphone($x,$z), fof@my-iphone($z,$y)

fof is the transitive closure of friend

Datalog = Webdamlog with only local rules and local intentional head

Page 21: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 2

1

Local rules with local extensional head

A new fact is inserted into the local database

believe@my-iphone(“Alice”, $loc) :- tell@my-iphone($p,”Alice”, $loc),friend@my-iphone($p)

Page 22: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 2

2

Local rules with non-local extensional head

A new fact is sent to an external peer via a message$message@$peer($name, “Happy birthday!”) :-

today@my-iphone($date),birthday@my-iphone($name, $message, $peer, $date)

Extensional facts:today@my-iphone(March 6)birthday@my-iphone("Manon”, “sendmail”, “gmail.com”, March 6)

[email protected]("Manon”, “Happy birthday”)

Page 23: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 2

3

Local rules with non-local intentional head

View delegation!

boyMeetsGirl@gossip-site($girl, $boy) :- girls@my-iphone($girl, $loc),boys@my-iphone($boy, $loc)

Semantics of boyMeetGirl@gossip-site is a join of relations girls and boys from my-iphoneFormally, my-iphone delegates a rule boyMeetGirl@gossip-site(g,b) for each g, b, l, girls@my-iphone(g,l), boys@my-iphone(b,l)

Page 24: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 2

4

Non-local rules: general delegation

(at my-iphone): boyMeetsGirl@gossip-site($girl, $boy) :- girls@my-iphone($girl,

$loc), boys@alice-

iphone($boy, $loc)

Suppose that girls@my-iphone(“Alice”, “Julia's birthday”) holds.

Then my-iphone installs the following rule at alice-iphone(at alice-iphone): boyMeetsGirl@gossip-site(“Alice”,

$boy) :- boys@alice-iphone($boy, “Julia's

birthday”)

When girls@my-iphone(“Alice”, “Julia's birthday”) no longer holds, my-iphone uninstalls the rule

Page 25: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 2

6

Complexity of delegation: illustration

fof(x,y) :- friend(x,y)(at p) fof@p(x,y) :- peers@p($q), friend@$q(x,y)

If peers@p contains 100 000 tuples peers@p(q1), ...., peers@p(q100 000)

This rule will install 100 000 rules!for i=1 to 100 000 (at qi) fof@p(x,y) :- friend@qi(x,y)

Data complexity transformed into program complexity

Page 26: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 2

7

Summary of results [PODS 2011]

• Formal definition of the semantics of Webdamlog

• Results on expressivity- the model with delegation is more general,

unless all peers and programs are known in advance

• Convergence is very hard to achieve- positive Webdamlog- strongly stratified programs with negation

Page 27: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 2

8

•The Web as a distributed knowledge base•Webdamlog: a rule-based language for

the Web•Access control in Webdamlog•The Webdamlog system•Conclusion

Page 28: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013

RequirementsData access Users would like to control who can read and modify their information

Data dissemination Users would like to control how their data are transferred from one participant to another, and how they are combined, with the owner of each piece of data keeping some control over it

Application control Users would like to control which applications can run on their behalf, and what information these applications can access.

29

Page 29: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 3

0

The general picture

• The privileges we consider: read, write, grant

• For read:• Coarse grained access control: at the

relation level• Fine grain access control: at the tuple level 

Page 30: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 3

1

Insertion in extentional relations

Definition of intensional relations• Requires write privilege on the target relation

• [at Alice] alicePhotos@Bob($f) :- person@Alice($p,

“Friend”),personInPhoto@Alice($pid, $p),

photo@Alice($pid,−, $f)

• [at Alice] allPhotos@Alice($f) : alicePhotos@Alice($f)

• [at Bob] allPhotos@Alice($f) :- bobPhotos@Bob($f)

Page 31: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 3

2

Who can read a fact ? – default

• Extensional relations: if you have read privilege to the relation

• Intensional relations: if you have read privilege to the relation &if you can read all the tuples that have been used to create this fact – provenance of the fact

Page 32: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 3

3

Digression: provenance

• Provenance of a tuple•How it was constructed: conjunction• Alternatives: disjunction

Page 33: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 3

4

Digression: provenance graph

gossip@p(Jane, John)rule3

×

girls@p(Jane, Julia’s birthday) boys@p(John, Julia's birthday)rule1

× ×

boyMeetsGirl@p(Jane, John)

×+

(Also used for maintenance in case of update)

Page 34: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 3

5

Coarse grain access control• [at Alice] alicePhotos@Bob($f) :-

person@Alice($p, “Friend”),personInPhoto@Alice($pid, $p),

photo@Alice($pid,−, $f)

• alicePhotos@Bob is extensional•Whoever has read access to alicePhotos@Bob

sees all the relation

Page 35: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 3

6

Fine grain access control• [at Alice] allPhotos@Alice($f) :

alicePhotos@Alice($f)• [at Bob] allPhotos@Alice($f) :-

bobPhotos@Bob($f)

• allPhotos@Alice is intensional• Sue who has read privilege to

allPhotos@Alice and alicePhotos only, can see only the photos of Alice in allPhotos

• Lili who has read privilege to the three relations, sees everything

Page 36: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 3

7

Overwriting the default for intensional data

• Let us change the rule to:• [at Alice] allPhotos@$x($f) :-

alicePhotos@Alice($f), friends@Alice($x)

• Issue: you can read the photos only if you also have read privilege to friends@Alice

Page 37: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 3

8

Overwriting the default for intensional data

• [at Alice] allPhotos@$x($f) :-alicePhotos@Alice($f),

[hide friends@Alice($x)]

•Hide: block the provenance from friends@Alice

• Similar mechanism for extensional data – expose

Page 38: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 3

9

Issues with non local rules• [at Bob]

message@Sue(“I hate you”) :- date@Alice(d)aliceSecret@Bob(x) :- date@Alice(d), secret@Alice(x)

Ignoring access rights, by delegation, this results in running• [at Alice]

message@Sue(“I hate you”) :- date@Alice(d)

aliceSecret@Bob(x) :- date@Alice(d), secret@Alice(x)

Page 39: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 4

0

Default solution: sand boxWe run the rule at Alice in a Sandbox

•We use the access rights of BobSo the second rule does not succeed in sending secrets

• The message specifies that this is done at Bob’s requestSo requires authentication/signatures

• Alternative: delegation without sandbox. Possible if the peer that asks for the delegation is given the privilege to install rules at the other peer – Here if Alice gives Bob the right to install a rule in her environment

Page 40: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 4

1

Access control implementation • A program with access control is compiled

locally in a Webdamlog program without that is executed

• Access control data is managed like any other data

Relation acl (defines relation access)

Relation kind (ext or int)• Based on provenance implemented as a

distributed graph•On-going work on optimization

Page 41: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 4

2

•The Web as a distributed knowledge base•Webdamlog: a rule-based language for

the Web•Access control in Webdamlog•The Webdamlog system

Page 42: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 4

3

The Webdamlog engineBased on Bud

• developed at UC BerkeleyManages knowledge

- Stores facts and rules- exchanges knowledge

with other engines- performs reasoning

Page 43: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 4

4

The engine: beyond Bud• Compilation of

(Bud’s language)• Main Webdamlog features

not supported by Bud1. Variable relation and peer

names2. Delegations with dynamic

changes of the program

Webdamlog+AC ⇒ Webdamlog ⇒ Bloom

Page 44: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 4

5

The Webdamlog peer

Support communication with other peers and with usersSupport common security protocolsSupport wrappers to external systems such as FacebookProvides Web interfaces

Page 45: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 4

6

Provenance graphs

• Records the history of derivation• Provenance semiring semantics [Green et al.

07]• Used for performance optimization • Used for fine grain access control• Other possible uses such as explanation of

results

Page 46: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 4

7

•The Web as a distributed knowledge base•Webdamlog: a rule-based language for

the Web•Access control in Webdamlog•The Webdamlog system•Conclusion

Page 47: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013

ThesisLet us turn the Web into a distributed knowledge base

with billions of userssupported by billions of systems

analyzing informationextracting knowledge

exchanging knowledgeinferring knowledge

48

Page 48: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Abiteboul – DBPL - 2013 4

9

WebdamlogLanguage•A language for distributed data management

[PODS 2011]•Datalog with distribution, updates, messaging •Main novelty: delegation

Implementation •WebdamExchange peer in Java [demo ICDE

2011]•Webdamlog engine based on Bud [demo

Sigmod 2013]Access control: on-going work with Miklau-StoyanovichProbabilistic Webdamlog: on-going work with Deutch-Vianu

Page 49: Joint work with  Emilien Antoine,  Gerome  Miklau ,  Julia  Stoyanovich and  Vera  Zaychik Moffitt

Cambridge University Press, 2012

http://webdam.inria.fr/Jorge

Grazie !