Incentive compatible Assured Data Sharing & Mining
UT DALLAS Erik Jonsson School of Engineering & Computer Science
FEARLESS engineering
Incentive compatible Assured Data Sharing & Mining
Murat Kantarcioglu
Incentives and Trust in Assured Information Sharing
• Combining intelligence through a loose alliance
– Bridges gaps due to sovereign boundaries
– Maximizes yield of resources
– Discovery of new information through correlation, analysis of the ‘big picture’
– Information exchanged privately between two participants
• Drawbacks to sharing
– Misinformation
– Freeloading
Goal: Create a means of encouraging desirable behavior in an environment that lacks, or cannot support, a central governing agent
Possible Scenarios
• You may verify the shared data, and issue fines if the data is wrong
– This is easy
• You may verify the shared data but cannot issue fines
– A little bit harder
• You may only verify some aggregate result
– Hardest
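Game Matrix
[Payoff matrix: rows are Agent i's choices (Play: Truth, Play: Lie, Do Not Play); columns are Agent j's choices (Play: Truth, Play: Lie, Do Not Play). Every cell in which either agent chooses Do Not Play pays (0, 0). The cells in which both agents play are written in terms of the quantities listed below.]
Quantities used in the payoffs:
• Value of information (δ, bounded by δmin and δmax)
• Minimal verification probability
• Cost of verification (CV)
• Trust value
• Agent type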
Behaviors Analyzed in Data Sharing Simulations
Name | Strategy | Verification? | Punishment? | Comments
Honest | Truth | No | No | Optimistic, maximizes returns
Dishonest | Lie | No | No | Takes advantage of other players, trumps Honest in 1 on 1
Random | Truth, Lie | No | No | Chaotic, chooses either with equal probability
Tit-for-Tat | Truth, Lie | Always | Special | Mirrors other players’ actions, starts by selecting Truth
LivingAgent | Truth | Trust-based | No trading | Verifies activity according to trust ratings; ceases activity for a number of rounds with any player who is caught lying
Liar | Truth, Lie | Trust-based | No trading | Identical to LivingAgent but lies with small probability
SubtleLie | Truth, Lie | Trust-based | No trading | Identical to Liar, except lies whenever the information value reaches a certain threshold
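The three trust-based behaviors in the table (LivingAgent, Liar, SubtleLie) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the class name `TrustBasedAgent`, the linear rule tying verification probability to trust, the trust increments, and the ban length. The slides state only that verification follows trust ratings and that trading stops for a number of rounds with a player caught lying.

```python
import random


class TrustBasedAgent:
    """Hypothetical sketch of the LivingAgent, Liar, and SubtleLie rows."""

    def __init__(self, lie_prob=0.0, lie_threshold=None, ban_rounds=5,
                 min_verify_prob=0.1, rng=None):
        self.lie_prob = lie_prob                # Liar: small chance of lying
        self.lie_threshold = lie_threshold      # SubtleLie: lie at/above this
        self.ban_rounds = ban_rounds            # assumed "no trading" length
        self.min_verify_prob = min_verify_prob  # assumed verification floor
        self.rng = rng or random.Random()
        self.trust = {}                         # partner -> trust in [0, 1]
        self.banned_until = {}                  # partner -> first round allowed

    def will_trade(self, partner, round_no):
        return round_no >= self.banned_until.get(partner, 0)

    def tells_truth(self, info_value):
        if self.lie_threshold is not None and info_value >= self.lie_threshold:
            return False
        return self.rng.random() >= self.lie_prob

    def verify_prob(self, partner):
        # Assumed rule: verify less often as trust grows, never below the floor.
        trust = self.trust.get(partner, 0.5)
        return max(self.min_verify_prob, 1.0 - trust)

    def observe(self, partner, partner_was_truthful, round_no):
        """Verify with trust-dependent probability; punish a detected lie."""
        if self.rng.random() >= self.verify_prob(partner):
            return False                        # no verification this round
        trust = self.trust.get(partner, 0.5)
        if partner_was_truthful:
            self.trust[partner] = min(1.0, trust + 0.1)
        else:
            self.trust[partner] = 0.0
            self.banned_until[partner] = round_no + 1 + self.ban_rounds
        return True
```

Under these assumptions, `TrustBasedAgent()` plays the LivingAgent role, `TrustBasedAgent(lie_prob=0.05)` the Liar role, and `TrustBasedAgent(lie_threshold=6.9)` a SubtleLie-style role that lies only on high-value information.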
Simulation Results
We set δmin = 3, δmax = 7, CV = 2
The lie threshold is set to 6.9
Honest behavior wins 97% of the time if all behaviors exist.
Experiments show that without the LivingAgent behavior, Honest behavior cannot flourish.
Please see the following paper for more details:
“Incentive and Trust Issues in Assured Information Sharing,” Ryan Layfield, Murat Kantarcioglu, and Bhavani Thuraisingham, International Conference on Collaborative Computing, 2008
Verifying Final Result: Our Model
• Players P1, ..., Pn
• Each has some data (x1, ..., xn)
• Goal: compute a data mining function, D(x1, ..., xn), that maximizes the sum of the participants’ valuation functions
• Player Pt: mediator between parties, computes the function securely, and has test data xt
• Players value privacy, correctness, exclusivity
• Problem: How do we ensure that players share data truthfully?
Assumption
• The model that maximizes the sum of the valuation functions is the model built from the submitted input data.
• Formally: Given submitted valuation functions and submitted data
– $D(x) = \arg\max_{m \in M} \sum_{k} v_k(m)$ for any set of players
Mechanism
• Reservation utility normalized to 0
• $u_i(m) = v_i(m) - p_i(v_i, v_{-i})$
• [u = utility] [v = valuation] [p = payment]
• $p_i(v_i, v_{-i}) = \max_{m' \in M} \sum_{k \neq i} v_k(m') - \sum_{k \neq i} v_k(m)$
• $v_i(m) = \max\{0, acc(m) - acc(D(x_i))\} - c(D)$
– c is the cost of computation, acc is accuracy
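The payment and utility rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `vcg_outcome`, the finite candidate set `models`, and the dictionary layout of reported valuations are hypothetical, and the payment is taken as the best total the other players could reach minus what they receive under the chosen model.

```python
def vcg_outcome(valuations, models):
    """valuations: dict player -> dict model -> reported value v_k(m)."""

    def total(model, exclude=None):
        # Sum of reported valuations for one model, optionally without a player.
        return sum(v[model] for k, v in valuations.items() if k != exclude)

    # Chosen model m: maximizes the sum of all reported valuations.
    chosen = max(models, key=total)

    payments, utilities = {}, {}
    for i in valuations:
        # Best total the other players could obtain over all candidate models.
        best_without_i = max(total(m, exclude=i) for m in models)
        # p_i: others' best total minus their total under the chosen model.
        payments[i] = best_without_i - total(chosen, exclude=i)
        # u_i = v_i(m) - p_i, treating the reported valuation as the true one.
        utilities[i] = valuations[i][chosen] - payments[i]
    return chosen, payments, utilities


# Toy valuations (scaled to integers) for three players and two models.
reported = {
    "P1": {"A": 30, "B": 10},
    "P2": {"A": 0, "B": 25},
    "P3": {"A": 10, "B": 0},
}
chosen, payments, utilities = vcg_outcome(reported, ["A", "B"])
# chosen == "A"; payments == {"P1": 15, "P2": 0, "P3": 5}
```

In the toy data, P1 pays the most because its report is what tips the choice from model B to model A; P2 does not change the outcome and pays nothing.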
Mechanism
• We compute $p_i$ using the independent test set held by $P_t$
• Intuitively, mechanism rewards players based on their contribution to the overall model
• This is a VCG mechanism, proved incentive compatible, under our assumption
Experiments
• Does this assumption hold for normal data?
• Methodology
– 4 data sets from the UCI Repository
– 3-party vertical partitioning, naïve-Bayes classifiers
– Determine accuracy and payouts
   – Payouts estimated by acc(classifier) - acc(classifier without player i’s data) - constant cost
– Once with all players truthful
– Once for each player and for each amount of perturbation (1%, 2%, 4%, 8%, 16%, 32%, 64%, 100%)
– 50 runs on each
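The payout estimate and the perturbation step in the methodology can be sketched as follows. The slides do not say how data was perturbed, so the noise model here (replacing a fixed fraction of a player's values with a different value from the attribute's domain) is an assumption, as are the helper names `perturb` and `estimated_payout`.

```python
import random


def perturb(column, fraction, domain, rng):
    """Return a copy of column with round(fraction * n) entries replaced."""
    values = list(column)
    n_changed = round(fraction * len(values))
    for idx in rng.sample(range(len(values)), n_changed):
        # Assumed noise model: swap in a different value from the domain.
        alternatives = [d for d in domain if d != values[idx]]
        values[idx] = rng.choice(alternatives)
    return values


def estimated_payout(acc_all, acc_without_i, cost):
    """acc(classifier) - acc(classifier without player i's data) - cost."""
    return acc_all - acc_without_i - cost


# A player "lying" at the 16% level on one categorical attribute.
rng = random.Random(1)
truthful = ["yes"] * 100
lying = perturb(truthful, 0.16, ["yes", "no"], rng)
```

The accuracies passed to `estimated_payout` would come from classifiers trained with and without the player's attributes and scored on the mediator's test set; training is omitted here to keep the sketch self-contained.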
Census-Income (Adult)
[Figure: Overall Accuracy (y-axis, 0 to 1) at each perturbation level (x-axis: T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%)). Series: Player 1 Lying, Player 2 Lying, Player 3 Lying.]
Census-Income (Adult)
[Figure: Payouts based on Overall Accuracy (y-axis, -0.6 to 0) at each perturbation level (x-axis: T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%)). Series: Player 1 Lying, Player 2 Lying, Player 3 Lying.]
Census-Income (Adult)
[Figure: Payouts - Overall Accuracy - Player 1 Lying (y-axis, -0.6 to 0.6) at each perturbation level (x-axis: T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%)). Series: Player 1, Player 2, Player 3.]
Census-Income (Adult)
[Figure: Payouts - Overall Accuracy - Player 2 Lying (y-axis, -0.6 to 0.4) at each perturbation level (x-axis: T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%)). Series: Player 1, Player 2, Player 3.]
Census-Income (Adult)
[Figure: Payouts - Overall Accuracy - Player 3 Lying (y-axis, -0.6 to 0.4) at each perturbation level (x-axis: T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%)). Series: Player 1, Player 2, Player 3.]
Breast-Cancer-Wisconsin
[Figure: Overall Accuracy (y-axis, 0.91 to 0.97) at each perturbation level (x-axis: T, L(1%), L(2%), L(4%), L(8%), L(16%), L(32%), L(64%), L(100%)). Series: Player 1 Lying, Player 2 Lying, Player 3 Lying.]
Conclusions
• Does the assumption hold?
– Not always, but it is very close, and would work as a practical assumption
• If better model is found through lying, does this hurt or help?
– Consideration: change the goal; not to prevent lying but to build the most accurate classifier
– Finding the “right” lie may take too much computation for profitability