Database Tuples Play Cooperative Games


Page 1: Database Tuples Play Cooperative Games

Database Tuples Play Cooperative Games

Ester Livshits

Joint work with:

Leopoldo Bertossi, Benny Kimelfeld, Alon Reshef, Moshe Sebag

Page 2: Database Tuples Play Cooperative Games

Ester Livshits Oxford Data and Knowledge Seminar 2

AUTHOR
Name    Affiliation
Alice   UCLA
Bob     NYU
Cathy   MIT
David   UCSD
Ellen   NYU

INSTITUTE
Name    State
UCLA    CA
UCSD    CA
NYU     NY
MIT     MA

PUBLICATION
Author  Paper
Alice   A
Alice   B
Bob     C
Cathy   C
Cathy   D
David   C

CITATIONS
Paper   Cits
A       18
B       2
C       8
D       12

q(z, w) :- AUTHOR(x, y), INSTITUTE(y, 'CA'), PUBLICATION(x, z), CITATIONS(z, w)

Query result:
Paper   Cits
A       18
B       2
C       8

Why did we obtain a particular answer?

Why did we not obtain some other answer?
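
As a minimal sketch (not taken from the talk), the query on this slide can be evaluated by a direct nested join over the four relations. The function name `evaluate_q` and the set-of-tuples encoding are assumptions made for illustration.

```python
# Relations from the slide, encoded as sets of tuples.
AUTHOR = {("Alice", "UCLA"), ("Bob", "NYU"), ("Cathy", "MIT"),
          ("David", "UCSD"), ("Ellen", "NYU")}
INSTITUTE = {("UCLA", "CA"), ("UCSD", "CA"), ("NYU", "NY"), ("MIT", "MA")}
PUBLICATION = {("Alice", "A"), ("Alice", "B"), ("Bob", "C"),
               ("Cathy", "C"), ("Cathy", "D"), ("David", "C")}
CITATIONS = {("A", 18), ("B", 2), ("C", 8), ("D", 12)}


def evaluate_q(author, institute, publication, citations):
    """q(z, w) :- AUTHOR(x, y), INSTITUTE(y, 'CA'), PUBLICATION(x, z), CITATIONS(z, w)."""
    return {
        (z, w)
        for (x, y) in author
        for (inst, state) in institute if inst == y and state == "CA"
        for (a, z) in publication if a == x
        for (p, w) in citations if p == z
    }


answers = evaluate_q(AUTHOR, INSTITUTE, PUBLICATION, CITATIONS)
print(sorted(answers))
```

Only Alice (UCLA) and David (UCSD) are affiliated with a California institute, so only their papers appear in the result.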

Page 4: Database Tuples Play Cooperative Games

Which tuples in the database explain the query result?

Page 5: Database Tuples Play Cooperative Games

Measuring Contribution

➢ Causal responsibility [Meliou et al. 2010]

■ $t$ is a counterfactual cause for $q$ if $D \models q$ and $D \setminus \{t\} \not\models q$

■ $t$ is an actual cause for $q$ if $D \setminus \Gamma \models q$ and $D \setminus (\Gamma \cup \{t\}) \not\models q$ for some $\Gamma \subseteq D \setminus \{t\}$ (a contingency set)

■ The responsibility of $t$ is $\frac{1}{1 + |\Gamma_{\min}|}$, where $\Gamma_{\min}$ is a contingency set of minimum size

➢ Not extendable to aggregate queries

➢ May be counterintuitive

[Figure: graph example for the query "Is there a path from a to b?", with a contingency set marked]
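
The definition of responsibility can be illustrated with a brute-force sketch that searches for a smallest contingency set. The graph `EDGES` is hypothetical (the figure on the slide is not recoverable), every edge is assumed to be a candidate cause, and the exhaustive search is exponential, so it is only meant for tiny examples.

```python
from fractions import Fraction
from itertools import combinations


def has_path(edges, source="a", target="b"):
    """Boolean query: is there a directed path from source to target?"""
    reached, frontier = {source}, [source]
    while frontier:
        node = frontier.pop()
        for (u, v) in edges:
            if u == node and v not in reached:
                reached.add(v)
                frontier.append(v)
    return target in reached


def responsibility(database, query, t):
    """Return 1 / (1 + |Gamma_min|), or 0 if t is not an actual cause."""
    others = [s for s in database if s != t]
    for size in range(len(others) + 1):      # try the smallest contingency sets first
        for gamma in combinations(others, size):
            rest = set(database) - set(gamma)
            if query(rest) and not query(rest - {t}):
                return Fraction(1, 1 + size)
    return Fraction(0)


# Hypothetical graph: two disjoint paths a->c->b and a->d->b, plus a dead end c->e.
EDGES = {("a", "c"), ("c", "b"), ("a", "d"), ("d", "b"), ("c", "e")}
print(responsibility(EDGES, has_path, ("a", "c")))  # 1/2
print(responsibility(EDGES, has_path, ("c", "e")))  # 0
```

The edge (a, c) becomes counterfactual once one edge of the other path is removed, so its minimum contingency set has size 1.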

Page 6: Database Tuples Play Cooperative Games

Measuring Contribution

➢ Causal effect [Salimi et al. 2016]

■ View the database as a probabilistic database

■ $\mathrm{CE}(t) = \mathbb{E}[q \mid t \in D] - \mathbb{E}[q \mid t \notin D]$

What makes the choice of a contribution score a good one?

Page 7: Database Tuples Play Cooperative Games

Shapley Value

➢ A widely known profit-sharing formula in cooperative game theory

➢ Introduced by Lloyd Shapley in 1953

➢ Applied in various areas beyond cooperative game theory:

■ Pollution responsibility in environmental management

■ Influence measurement in social network analysis

■ Identifying candidate autism genes

■ Bargaining foundations in economics

■ Takeover corporate rights in law

■ Explanations in machine learning

Page 8: Database Tuples Play Cooperative Games

Shapley Value

Set $A$ of players; wealth function $v : \mathcal{P}(A) \to \mathbb{R}$

[Figure: example wealth values 3, 7, and 42]

How to distribute the total wealth among the players?

Setting           Players    Wealth
Machine learning  Features   Prediction
Query answering   Tuples     Answer
Inconsistency     Tuples     Measure

References on the slide: [Lundberg, Lee 2017], [L, Kimelfeld 2021], [L et al. 2020]

Page 9: Database Tuples Play Cooperative Games

Shapley Value

$$\mathrm{Shapley}(A, v, a) = \sum_{B \subseteq A \setminus \{a\}} \frac{|B|! \, (|A| - |B| - 1)!}{|A|!} \bigl( v(B \cup \{a\}) - v(B) \bigr)$$

[Figure: random permutation of players; example values 72, 21, 25, and a delta of +4]

The Shapley value of a player is the expected change in wealth (the delta) caused by adding that player, taken over a uniformly random permutation of the players.
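
The subset formula above can be transcribed almost literally. The following is a minimal sketch with exact rational arithmetic; the three-player game `wealth` is a hypothetical example, not one from the talk.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley(players, v, a):
    """Exact Shapley value of player a via the subset formula."""
    others = [p for p in players if p != a]
    n = len(players)
    total = Fraction(0)
    for size in range(len(others) + 1):
        # Fraction of permutations in which exactly these `size` players precede a.
        weight = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
        for B in combinations(others, size):
            coalition = frozenset(B)
            total += weight * (v(coalition | {a}) - v(coalition))
    return total


# Hypothetical game: a coalition earns 12 only if it contains both "x" and "y".
PLAYERS = ["x", "y", "z"]


def wealth(coalition):
    return 12 if {"x", "y"} <= coalition else 0


values = {p: shapley(PLAYERS, wealth, p) for p in PLAYERS}
print(values)
```

In this game "x" and "y" split the wealth evenly and "z" receives nothing, and the values sum to the wealth of the full set of players. The enumeration is exponential in the number of players, which is why the complexity results later in the talk matter.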

Page 10: Database Tuples Play Cooperative Games

Shapley Value for Database Queries

➢ Which tuples in the database explain the query result?

(Relations AUTHOR, INSTITUTE, PUBLICATION, and CITATIONS as on Page 2.)

q(z, w) :- AUTHOR(x, y), PUBLICATION(x, z), CITATIONS(z, w)

Players: the AUTHOR tuples

Wealth function: $\mathrm{SUM}_w \langle q(z, w) \rangle$

SV(Alice) = 20
SV(Cathy) = 14.67
SV(Bob) = 2.67
SV(David) = 2.67
SV(Ellen) = 0
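
The values on this slide can be reproduced with a small sketch, assuming that the AUTHOR tuples are the players and that the PUBLICATION and CITATIONS tuples are always present. The helper names are illustrative, and the permutation enumeration is only feasible because there are five players.

```python
from fractions import Fraction
from itertools import permutations

AUTHORS = ["Alice", "Bob", "Cathy", "David", "Ellen"]
PUBLICATION = {("Alice", "A"), ("Alice", "B"), ("Bob", "C"),
               ("Cathy", "C"), ("Cathy", "D"), ("David", "C")}
CITATIONS = {"A": 18, "B": 2, "C": 8, "D": 12}


def wealth(authors_present):
    """SUM_w <q(z, w)>: total citations of papers with at least one present author."""
    papers = {paper for (author, paper) in PUBLICATION if author in authors_present}
    return sum(CITATIONS[paper] for paper in papers)


def shapley_by_permutations(players, v):
    """Average marginal contribution of each player over all orderings."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        seen = set()
        for p in order:
            before = v(seen)
            seen.add(p)
            totals[p] += v(seen) - before
    return {p: totals[p] / len(orderings) for p in players}


sv = shapley_by_permutations(AUTHORS, wealth)
for name in AUTHORS:
    print(name, round(float(sv[name]), 2))
```

Alice is the sole author of papers A and B, so she receives 18 + 2 = 20. The 8 citations of paper C are split evenly among Bob, Cathy, and David, and Cathy additionally receives the 12 citations of paper D.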

Page 12: Database Tuples Play Cooperative Games

(Relations AUTHOR, INSTITUTE, PUBLICATION, and CITATIONS as on Page 2.)

q(z, w) :- AUTHOR(x, y), INSTITUTE(y, 'CA'), PUBLICATION(x, z), CITATIONS(z, w)

Query result:
Paper   Cits
A       18
B       2
C       8

Boolean query for the answer (A, 18):

q() :- AUTHOR(x, y), INSTITUTE(y, 'CA'), PUBLICATION(x, 'A'), CITATIONS('A', 18)

Page 13: Database Tuples Play Cooperative Games

Outline

➢ Explaining Query Answers

➢ Computational Complexity

➢ Responsibility to Inconsistency

Page 14: Database Tuples Play Cooperative Games

Computational Complexity

➢ A CQ $q$ is hierarchical if for every two existential variables $x$ and $y$:

■ $\mathit{Atoms}(x) \subseteq \mathit{Atoms}(y)$, or

■ $\mathit{Atoms}(y) \subseteq \mathit{Atoms}(x)$, or

■ $\mathit{Atoms}(x) \cap \mathit{Atoms}(y) = \emptyset$

Example: q1() :- R(x, y), S(x, z)

[Figure: diagram over the variables x, y, and z]

Query                  Hierarchical   Non-hierarchical
SJFCQ                  PTIME          FP^{#P}-complete
SJFCQ with negation    PTIME          FP^{#P}-complete
sum / count            PTIME          FP^{#P}-complete

[L et al. ICDT 2020], [Reshef et al. PODS 2020]

q() :- AUTHOR(x, y), INSTITUTE(y, 'CA'), PUBLICATION(x, z)

Page 15: Database Tuples Play Cooperative Games

Computational Complexity

(Definition of hierarchical CQs and complexity table as on Page 14.)

Example: q2() :- R(x), S(x, y), T(y)

[Figure: diagram over the variables x and y]

[L et al. ICDT 2020], [Reshef et al. PODS 2020]

q() :- AUTHOR(x, y), INSTITUTE(y, 'CA'), PUBLICATION(x, z)
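
The hierarchical condition is easy to check mechanically. The sketch below assumes Boolean queries (so every variable is existential) and encodes a query as a list of (relation, variables) pairs with constants omitted; this encoding and the function names are illustrative assumptions.

```python
from itertools import combinations


def atoms_of(query):
    """Map each variable to the set of atom indices in which it occurs."""
    occurrences = {}
    for index, (_relation, variables) in enumerate(query):
        for var in variables:
            occurrences.setdefault(var, set()).add(index)
    return occurrences


def is_hierarchical(query):
    """Check containment or disjointness of Atoms(x), Atoms(y) for every pair of variables."""
    occ = atoms_of(query)
    for x, y in combinations(occ, 2):
        ax, ay = occ[x], occ[y]
        if not (ax <= ay or ay <= ax or ax.isdisjoint(ay)):
            return False
    return True


# Queries from the slides, with constants such as 'CA' left out.
Q1 = [("R", ("x", "y")), ("S", ("x", "z"))]
Q2 = [("R", ("x",)), ("S", ("x", "y")), ("T", ("y",))]
Q_AUTHOR = [("AUTHOR", ("x", "y")), ("INSTITUTE", ("y",)), ("PUBLICATION", ("x", "z"))]

print(is_hierarchical(Q1), is_hierarchical(Q2), is_hierarchical(Q_AUTHOR))
```

Under this definition q1 is hierarchical, while q2 and the AUTHOR/INSTITUTE/PUBLICATION query are not: in both, x and y share an atom without either set of atoms containing the other.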

Page 16: Database Tuples Play Cooperative Games

Conjunctive Queries

➢ To prove hardness, we consider the simplest non-hierarchical query:

q_RST() :- R(x), S(x, y), T(y)

➢ Reduction from counting independent sets in a bipartite graph

[Figure: database instance over the relations R, S, and T]

Page 17: Database Tuples Play Cooperative Games

Conjunctive Queries

➢ Each instance provides us with an equation over $|\mathrm{IS}(g, k)|$

➢ $|\mathrm{IS}(g, k)|$ is the number of independent sets of size $k$ in $g$

Page 18: Database Tuples Play Cooperative Games

Computational Complexity

(Definition of hierarchical CQs and complexity table as on Page 14.)

Example with negation:

q() :- AUTHOR(x, y), ¬INSTITUTE(y, 'CA'), PUBLICATION(x, z)

Page 19: Database Tuples Play Cooperative Games

Computational Complexity

(Definition of hierarchical CQs and complexity table as on Page 14.)

Example with aggregation:

q(z, w) :- AUTHOR(x, y), PUBLICATION(x, z), CITATIONS(z, w)

$\mathrm{SUM}_w \langle q(z, w) \rangle$

Page 20: Database Tuples Play Cooperative Games

Computational Complexity

(Definition of hierarchical CQs and complexity table as on Page 14.)

q(z, w) :- AUTHOR(x, y), PUBLICATION(x, z), CITATIONS(z, w)

$\mathrm{MAX}_w \langle q(z, w) \rangle$, $\mathrm{MIN}_w \langle q(z, w) \rangle$, $\mathrm{AVERAGE}_w \langle q(z, w) \rangle$

Hardness can be extended to general numerical queries

Page 21: Database Tuples Play Cooperative Games

Approximation Complexity

➢ Computing the Shapley value is often hard

➢ The picture is more positive when allowing approximation

➢ Generalizes to unions of CQs

Multiplicative approximation (FPRAS):

$$\Pr\left[ \frac{f(x)}{1 + \epsilon} \le A(x, \epsilon, \delta) \le (1 + \epsilon) f(x) \right] \ge 1 - \delta$$

Query          Hierarchical   Non-hierarchical
SJFCQ          PTIME          FPRAS
sum / count    PTIME          FPRAS

Page 22: Database Tuples Play Cooperative Games

Approximation Complexity

➢ Additive approximation via Monte Carlo sampling:

$$\Pr\left[ f(x) - \epsilon \le A(x, \epsilon, \delta) \le f(x) + \epsilon \right] \ge 1 - \delta$$

➢ Also a multiplicative approximation due to the "gap property": for every tuple $t$ in the database $D$,

$$\mathrm{Shapley}(t) = 0 \quad \text{or} \quad \mathrm{Shapley}(t) \ge \frac{1}{p(|D|)}$$

where $p$ is a polynomial

➢ Does not hold when allowing negation

➢ Negation fundamentally changes the complexity picture!
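
A minimal sketch of the Monte Carlo idea: sample random orderings of the players and average the marginal contribution of the tuple of interest. The instance `FACTS` is hypothetical, all facts are assumed to be players, and the sample count is chosen arbitrarily. For a Boolean query the marginals lie in [0, 1], so by Hoeffding's inequality about ln(2/δ) / (2ε²) samples suffice for an additive (ε, δ) guarantee.

```python
import random


def monte_carlo_shapley(players, v, a, samples, rng):
    """Estimate Shapley(a) as the mean marginal contribution over random orderings."""
    total = 0.0
    for _ in range(samples):
        order = list(players)
        rng.shuffle(order)
        before = set(order[: order.index(a)])   # players that arrive before a
        total += v(before | {a}) - v(before)
    return total / samples


# Boolean query q_RST() :- R(x), S(x, y), T(y) over a tiny hypothetical instance.
FACTS = [("R", 1), ("R", 2), ("S", 1, 1), ("S", 2, 1), ("T", 1)]


def q_rst(facts):
    """Return 1 if some S(x, y) fact is matched by R(x) and T(y), else 0."""
    return int(any(("R", f[1]) in facts and ("T", f[2]) in facts
                   for f in facts if f[0] == "S"))


estimate = monte_carlo_shapley(FACTS, q_rst, ("T", 1), samples=2000, rng=random.Random(0))
print(round(estimate, 3))
```

Passing an explicit `random.Random` instance keeps runs reproducible. The estimate is only additive; turning it into a multiplicative guarantee relies on the gap property stated on this slide.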

Page 23: Database Tuples Play Cooperative Games

Approximation Complexity

➢ With negation, the contribution can be negative

Register
Student  Course
Alice    OS
Alice    AI
Bob      OS
Cathy    DB
Cathy    IC

Student
Name
Alice
Bob
Cathy
David

TA
Name
Alice
Bob
David

q() :- Student(x), ¬TA(x), Register(x, y)

In some cases, deciding whether $\mathrm{Shapley}(t) \ne 0$ is hard
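
The sign behaviour can be checked on this example with an exact brute-force computation. The sketch assumes that every tuple of Register, Student, and TA is a player (the slide does not say which tuples are endogenous), so the concrete numbers are illustrative; only the signs are the point.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

REGISTER = [("Register", s, c) for s, c in
            [("Alice", "OS"), ("Alice", "AI"), ("Bob", "OS"), ("Cathy", "DB"), ("Cathy", "IC")]]
STUDENT = [("Student", n) for n in ["Alice", "Bob", "Cathy", "David"]]
TA = [("TA", n) for n in ["Alice", "Bob", "David"]]
FACTS = REGISTER + STUDENT + TA   # assumption: every tuple is a player


def q(facts):
    """q() :- Student(x), not TA(x), Register(x, y), returned as 0 or 1."""
    return int(any(("Student", f[1]) in facts and ("TA", f[1]) not in facts
                   for f in facts if f[0] == "Register"))


def shapley(players, v, a):
    """Exact Shapley value via the subset formula (exponential time)."""
    others = [p for p in players if p != a]
    n = len(players)
    total = Fraction(0)
    for size in range(n):
        weight = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
        for B in combinations(others, size):
            coalition = frozenset(B)
            total += weight * (v(coalition | {a}) - v(coalition))
    return total


for fact in [("TA", "Alice"), ("TA", "David"), ("Student", "Cathy")]:
    print(fact, shapley(FACTS, q, fact))
```

Adding TA(Alice) can only turn the query from true to false, so its value is negative. TA(David) never matters because David is not registered for any course, and Student(Cathy) can only help.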

Page 24: Database Tuples Play Cooperative Games

Banzhaf Power Index

➢ Causal effect [Salimi et al. 2016]

■ View the database as a probabilistic database

■ $\mathrm{CE}(t) = \mathbb{E}[q \mid t \in D] - \mathbb{E}[q \mid t \notin D]$

➢ Coincides with the Banzhaf Power Index [Banzhaf 1965]

➢ Our complexity results extend to this measure
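
The coincidence can be checked numerically on a small instance. The sketch assumes that each tuple is present independently with probability 1/2, so that all possible worlds are equally likely; the instance `FACTS` and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of items as a frozenset."""
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def banzhaf(players, v, a):
    """Mean marginal contribution of a over all coalitions of the other players."""
    others = [p for p in players if p != a]
    total = sum(Fraction(v(B | {a}) - v(B)) for B in subsets(others))
    return total / 2 ** len(others)


def causal_effect(players, v, a):
    """E[q | a in D] - E[q | a not in D] under uniformly random possible worlds."""
    worlds = list(subsets(players))
    with_a = [v(w) for w in worlds if a in w]
    without_a = [v(w) for w in worlds if a not in w]
    return Fraction(sum(with_a), len(with_a)) - Fraction(sum(without_a), len(without_a))


FACTS = [("R", 1), ("R", 2), ("S", 1, 1), ("S", 2, 1), ("T", 1)]


def q_rst(facts):
    """q_RST() :- R(x), S(x, y), T(y), returned as 0 or 1."""
    return int(any(("R", f[1]) in facts and ("T", f[2]) in facts
                   for f in facts if f[0] == "S"))


for fact in FACTS:
    print(fact, banzhaf(FACTS, q_rst, fact), causal_effect(FACTS, q_rst, fact))
```

Unlike the Shapley value, the Banzhaf index weights all coalitions equally rather than by permutation counts, so the two measures generally assign different numbers to the same tuple.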

Page 25: Database Tuples Play Cooperative Games

Outline

➢ Explaining Query Answers

➢ Computational Complexity

➢ Responsibility to Inconsistency

Page 26: Database Tuples Play Cooperative Games

Inconsistent Databases

➢ A database is inconsistent if it violates integrity constraints

Cullen Douglas
dbo:birthPlace
▪ dbr:California
▪ dbr:Florida

Marion Jones
dbo:height
▪ 1.524
▪ 1.778

Irene Tedrow
dbo:deathPlace
▪ dbr:California
▪ dbr:Hollywood,_Los_Angeles
▪ dbr:New_York_City

Page 27: Database Tuples Play Cooperative Games

Inconsistent Databases

➢ Imprecise data sources

■ Crowd, Web pages, social encyclopedias, sensors, …

➢ Imprecise data generation

■ Natural-language processing, sensor/signal processing, image recognition, …

➢ Conflicts in data integration

■ Crowd + enterprise data + KB + Web + ...

➢ Data staleness

■ Entities change address, status, ...

➢ And so on…

Page 28: Database Tuples Play Cooperative Games

Idea: Quantify the extent to which integrity constraints are violated

Reliability estimation: How reliable is a new data source?

Progress indication: Progress bar for data repairing

Action prioritization: Which tuples are most responsible for inconsistency?

Page 29: Database Tuples Play Cooperative Games

How can we quantify the responsibility of individual tuples to inconsistency?

Inconsistency measure

Responsibility sharing mechanism

Page 31: Database Tuples Play Cooperative Games

How to Measure Inconsistency?

➢ Several measures proposed by the KR and DB communities

■ The drastic measure: 1 if inconsistent, 0 otherwise [Thimm 2017]

■ #minimal inconsistent subsets [Hunter and Konieczny 2008]

■ #problematic tuples [Grant and Hunter 2011]

■ Minimal #tuples to remove to satisfy the constraints [Grant and Hunter 2013], [Bertossi 2018]

■ #maximal consistent subsets [Grant and Hunter 2011]

➢ What makes a measure a good one? [L et al. SIGMOD 2021]
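
To make the five measures concrete, here is a brute-force sketch that evaluates them on the train table and the two functional dependencies used later in the talk (Train Time → Departs, Train Time Duration → Arrives). The variable names are illustrative, and the subset enumeration is exponential, so it only suits toy instances.

```python
from itertools import combinations

# Train table from the slides: id -> (Train, Departs, Arrives, Time, Duration).
TRAINS = {
    "f1": (16, "NYP", "BBY", 1030, 315), "f2": (16, "NYP", "PVD", 1030, 250),
    "f3": (16, "PHL", "WIL", 1030, 20),  "f4": (16, "PHL", "BAL", 1030, 70),
    "f5": (16, "PHL", "WAS", 1030, 120), "f6": (16, "BBY", "PHL", 1030, 260),
    "f7": (16, "BBY", "NYP", 1030, 260), "f8": (16, "BBY", "WAS", 1030, 420),
    "f9": (16, "WAS", "PVD", 1030, 390),
}


def conflict(s, t):
    """True if tuples s and t jointly violate one of the two FDs."""
    (tr1, dep1, arr1, tm1, dur1), (tr2, dep2, arr2, tm2, dur2) = s, t
    fd1 = (tr1, tm1) == (tr2, tm2) and dep1 != dep2               # Train Time -> Departs
    fd2 = (tr1, tm1, dur1) == (tr2, tm2, dur2) and arr1 != arr2   # Train Time Duration -> Arrives
    return fd1 or fd2


def consistent(ids):
    return not any(conflict(TRAINS[a], TRAINS[b]) for a, b in combinations(ids, 2))


ids = sorted(TRAINS)
all_subsets = [frozenset(c) for k in range(len(ids) + 1) for c in combinations(ids, k)]
consistent_subsets = [s for s in all_subsets if consistent(s)]
pairs = [p for p in combinations(ids, 2) if conflict(TRAINS[p[0]], TRAINS[p[1]])]

drastic = 0 if consistent(ids) else 1
min_inconsistent = len(pairs)   # for FDs, minimal inconsistent subsets are conflicting pairs
problematic = len({f for p in pairs for f in p})
min_removals = len(ids) - max(len(s) for s in consistent_subsets)
maximal_consistent = sum(1 for s in consistent_subsets
                         if not any(s < other for other in consistent_subsets))

print(drastic, min_inconsistent, problematic, min_removals, maximal_consistent)
```

On this table every tuple takes part in some conflict, and the largest consistent subset is {f3, f4, f5}, so at least six tuples must be removed to restore consistency.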

Page 32: Database Tuples Play Cooperative Games

How can we quantify the responsibility of individual tuples to inconsistency?

Inconsistency measure

Responsibility sharing mechanism

Shapley Value

Page 33: Database Tuples Play Cooperative Games

Computational Complexity

Measure               lhs chain   No lhs chain, tractable c-repair   Other
drastic               PTIME       FP^{#P}-complete                   FP^{#P}-complete
#min-inconsistent     PTIME       PTIME                              PTIME
#problematic tuples   PTIME       PTIME                              PTIME
cardinality repair    PTIME       Open                               NP-hard
#repairs              PTIME       FP^{#P}-complete                   FP^{#P}-complete

FD: birthCity → birthState

Page 34: Database Tuples Play Cooperative Games

Tractable Measures

➢ $I_{\mathrm{MI}}$: number of minimal inconsistent subsets

      Train  Departs  Arrives  Time  Duration
f1    16     NYP      BBY      1030  315
f2    16     NYP      PVD      1030  250
f3    16     PHL      WIL      1030  20
f4    16     PHL      BAL      1030  70
f5    16     PHL      WAS      1030  120
f6    16     BBY      PHL      1030  260
f7    16     BBY      NYP      1030  260
f8    16     BBY      WAS      1030  420
f9    16     WAS      PVD      1030  390

FDs:
Train Time → Departs
Train Time Duration → Arrives

[Figure: the tuples in a random order (f7, f1, f3, f9, f2, f5, f8, f6), with f4 highlighted]

Page 35: Database Tuples Play Cooperative Games

Tractable Measures

➢ $I_{\mathrm{MI}}$: number of minimal inconsistent subsets

(Train table and FDs as on Page 34.)

[Figure: the tuples in a random order (f7, f1, f3, f9, f2, f5, f8, f6); adding f4 contributes +2]

$f$ increases the value of $I_{\mathrm{MI}}$ by $k$ if $k$ of the previous tuples conflict with it

Page 36: Database Tuples Play Cooperative Games

Tractable Measures

➢ $I_P$: number of problematic tuples

(Train table and FDs as on Page 34.)

[Figure: the tuples in a random order (f7, f1, f3, f9, f2, f5, f8, f6); adding f4 contributes +1]

$f$ increases the value of $I_P$ by $k$ if $(k - 1)$ of the previous tuples:
(1) conflict with $f$, and
(2) do not conflict with other tuples that occur before $f$.
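
The two marginal-contribution rules can be tried out on the train table. The sketch below assumes, as one reading of the figure, that f4 arrives right after f7, f1, and f3; it then computes exact Shapley values by brute force rather than by the polynomial-time algorithms the talk refers to. Train and Time are identical for all tuples, so only Departs, Arrives, and Duration are stored.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

# id -> (Departs, Arrives, Duration); Train = 16 and Time = 1030 for every tuple.
ROWS = {"f1": ("NYP", "BBY", 315), "f2": ("NYP", "PVD", 250), "f3": ("PHL", "WIL", 20),
        "f4": ("PHL", "BAL", 70), "f5": ("PHL", "WAS", 120), "f6": ("BBY", "PHL", 260),
        "f7": ("BBY", "NYP", 260), "f8": ("BBY", "WAS", 420), "f9": ("WAS", "PVD", 390)}


def conflict(a, b):
    (dep1, arr1, dur1), (dep2, arr2, dur2) = ROWS[a], ROWS[b]
    return dep1 != dep2 or (dur1 == dur2 and arr1 != arr2)


CONFLICTS = [p for p in combinations(ROWS, 2) if conflict(*p)]


def i_mi(tuples):
    """Number of conflicting pairs inside the coalition."""
    return sum(1 for a, b in CONFLICTS if a in tuples and b in tuples)


def i_p(tuples):
    """Number of tuples involved in some conflict inside the coalition."""
    return len({f for a, b in CONFLICTS if a in tuples and b in tuples for f in (a, b)})


def shapley(players, v, a):
    """Exact Shapley value via the subset formula (exponential time)."""
    others = [p for p in players if p != a]
    n = len(players)
    total = Fraction(0)
    for size in range(n):
        weight = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
        for B in combinations(others, size):
            coalition = frozenset(B)
            total += weight * (v(coalition | {a}) - v(coalition))
    return total


prefix = {"f7", "f1", "f3"}   # assumed arrival order before f4
print(i_mi(prefix | {"f4"}) - i_mi(prefix), i_p(prefix | {"f4"}) - i_p(prefix))
print(shapley(list(ROWS), i_mi, "f4"), shapley(list(ROWS), i_p, "f4"))
```

Under that assumed order, f4 conflicts with f7 and f1, which gives the +2 for $I_{\mathrm{MI}}$ and the +1 for $I_P$ shown on the slides. For $I_{\mathrm{MI}}$ each conflicting pair is shared equally by its two tuples, so the Shapley value of a tuple is half its number of conflicts.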

Page 37: Database Tuples Play Cooperative Games

Computational Complexity

(Complexity table as on Page 33.)

No lhs chain: $\{B \to A,\ BC \to D,\ BCG \to E,\ BCF \to H\}$

$\{B\} \subseteq \{B, C\} \subseteq \{B, C, F\}$, but $\{B, C, G\} \not\subseteq \{B, C, F\}$ and $\{B, C, F\} \not\subseteq \{B, C, G\}$

lhs chain: $\{B \to A,\ BC \to D,\ BCF \to E\}$

Page 38: Database Tuples Play Cooperative Games

Left-Hand Side Chain

(Train table as on Page 34.)

FDs:
Train Time → Departs
Train Time Duration → Arrives

[Figure: tree that groups the tuples by (Train, Time) = (16, 1030), then by Departs (NYP, PHL, BBY, WAS), then by Duration (315, 250, 20, 70, 120, 260, 420, 390), with Arrives values at the leaves; sibling branches are labeled as independent or conflicting]
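
A set of FDs has a left-hand side chain when every two left-hand sides are comparable by set inclusion. A minimal sketch of that check follows; FDs are encoded as (lhs, rhs) pairs, which is an illustrative choice, and single-letter attributes are written as strings.

```python
from itertools import combinations


def has_lhs_chain(fds):
    """True if every two left-hand sides are comparable by set inclusion."""
    lhs_sets = [frozenset(lhs) for lhs, _rhs in fds]
    return all(x <= y or y <= x for x, y in combinations(lhs_sets, 2))


# FD sets from the slides.
TRAIN_FDS = [({"Train", "Time"}, "Departs"), ({"Train", "Time", "Duration"}, "Arrives")]
NO_CHAIN = [("B", "A"), ("BC", "D"), ("BCG", "E"), ("BCF", "H")]
CHAIN = [("B", "A"), ("BC", "D"), ("BCF", "E")]

print(has_lhs_chain(TRAIN_FDS), has_lhs_chain(NO_CHAIN), has_lhs_chain(CHAIN))
```

The train FDs form a chain because {Train, Time} is contained in {Train, Time, Duration}. The four-FD set fails because {B, C, G} and {B, C, F} are incomparable.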

Page 39: Database Tuples Play Cooperative Games

Computational Complexity

(Complexity table as on Page 33; FD sets as on Page 37.)

Efficiency: $\sum_{a \in A} \mathrm{Shapley}(A, v, a) = v(A)$

Page 40: Database Tuples Play Cooperative Games

Approximation Complexity

Measure               lhs chain   No lhs chain, tractable c-repair   Other
drastic               PTIME       FPRAS                              FPRAS
#min-inconsistent     PTIME       PTIME                              PTIME
#problematic tuples   PTIME       PTIME                              PTIME
cardinality repair    PTIME       FPRAS                              No FPRAS
#repairs              PTIME       Open                               Open

Open case: an FPRAS would imply an FPRAS for #MIS in a bipartite graph, a long-standing open problem

Page 41: Database Tuples Play Cooperative Games

Concluding Remarks

➢ Two situations where we wish to quantify the responsibility of tuples:

■ Query answering

■ Database inconsistency

➢ We treat the contribution from the viewpoint of game theory

➢ We investigated the computational complexity

Page 42: Database Tuples Play Cooperative Games

Thank you for listening!

Questions?