©Intelligent Agent Technology and Application, 2008, Ai Lab NJU Agent Technology Negotiation.
©Intelligent Agent Technology and Application, 2008, Ai Lab NJU
Agent Technology
Negotiation
Sept. 2008©Gao Yang, Ai Lab NJU2
Outline
1. Introduction
2. Vote
3. Bid
4. Bargain
5. Summary
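
Starting from Tian Ji's Horse Race

Tian Ji's Horse Race
The King of Qi races his horses in the order top, middle, bottom; the King's horses are better than Tian Ji's; in each race the winner gains 100 taels of gold and the loser loses 100 taels. In which order should Tian Ji field his horses?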
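Tian Ji's six possible orderings:
f1 = <top, middle, bottom>
f2 = <top, bottom, middle>
f3 = <middle, top, bottom>
f4 = <middle, bottom, top>
f5 = <bottom, middle, top>
f6 = <bottom, top, middle>
Tian Ji's payoff when the King plays f1 = <top, middle, bottom>:
f1     f2     f3     f4     f5     f6
-300   -100   -100   -100   -100   +100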
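
The payoff row above can be reproduced with a short sketch. The numeric speed ranks are an assumption (the slides give no numbers): each of the King's horses beats Tian Ji's horse of the same class but loses to Tian Ji's horse one class higher. The names `KING`, `TIAN` and `tian_payoff` are illustrative only.

```python
from itertools import permutations

# Assumed speed ranks (higher is faster); not given numerically in the slides.
KING = {"top": 6, "middle": 4, "bottom": 2}
TIAN = {"top": 5, "middle": 3, "bottom": 1}
STAKE = 100  # taels of gold per race


def tian_payoff(king_order, tian_order):
    """Tian Ji's total winnings over the three races."""
    return sum(
        STAKE if TIAN[t] > KING[k] else -STAKE
        for k, t in zip(king_order, tian_order)
    )


king_order = ("top", "middle", "bottom")
payoffs = {
    order: tian_payoff(king_order, order)
    for order in permutations(("top", "middle", "bottom"))
}
best_order = max(payoffs, key=payoffs.get)

for order, value in payoffs.items():
    print(order, value)
print("best reply:", best_order)
```

Under these assumed ranks, only the ordering <bottom, top, middle> gives Tian Ji a positive payoff.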
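
The Prisoner's Dilemma
What if the King's strategy also varies? What if we move from zero-sum to non-zero-sum games?
The prisoner's dilemma problem
– If each prisoner considers only his own interest, he will choose to confess;
– yet the best joint strategy in the prisoner's dilemma is for both to deny.
(In the table, Cooperate means confessing, i.e. cooperating with the police, and Defect means denying; this is the reading consistent with the payoffs.)
                        Prisoner 2
                        Cooperate   Defect
Prisoner 1  Cooperate   (-9,-9)     (0,-10)
            Defect      (-10,0)     (-1,-1)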
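
The essence of the prisoner's dilemma
In a multi-agent system, if every agent is self-interested (maximizing its own payoff), then the combination of the agents' individually optimal strategies is not necessarily the optimal strategy for the multi-agent system.
Multi-agent negotiation
– In a self-interested multi-agent system, the optimal strategy is obtained through negotiation.
– Unlike in a distributed system, the designer does not need to specify a strategy for each agent.
The underlying contradiction is the conflict between individual interest and collective interest.
What, then, is the optimal strategy?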

Three terms
– Coordination: a property of a system of agents performing some activity in a shared environment.
– Cooperation (collaboration): coordination among non-antagonistic agents.
– Negotiation: coordination among competitive or simply self-interested agents.
Distributed rational decision making

Goals of negotiation research
Classical DAI
– The system designer fixes an interaction protocol that is uniform for all agents. The designer also fixes a strategy for each agent.
MAS
– The interaction protocol is given. Each agent determines its own strategy (maximizing its own good, via a utility function, without looking at the global task).

Four different criteria
We need to compare negotiation protocols. Each such protocol leads to a solution, so we determine how good these solutions are.
Social welfare: maximize the total utility of all agents.
– It is the sum of all utilities.
– It requires inter-agent utility comparisons.

Pareto efficiency
A solution $x$ is Pareto-optimal (also called efficient) if and only if there is no solution $x'$ with
(1) $\exists\, ag: ut_{ag}(x') > ut_{ag}(x)$, and
(2) $\forall\, ag': ut_{ag'}(x') \ge ut_{ag'}(x)$.
– Pareto efficiency measures global good, and it does not require questionable inter-agent utility comparisons.
– Social welfare maximizing solutions are a subset of Pareto efficient ones. (Why?)
– Answer: Once the sum of the payoffs is maximized, an agent's payoff can increase only if another agent's payoff decreases.
Stability?
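
Four different criteria (cont.)
Nash equilibrium: an agent's strategy depends on the other agents. A strategy profile $S^*_A = \langle S^*_1, S^*_2, \ldots, S^*_{|A|} \rangle$ is a Nash equilibrium if and only if, for every agent $i$, $S^*_i$ is the best strategy for agent $i$ when the other agents choose $\langle S^*_1, S^*_2, \ldots, S^*_{i-1}, S^*_{i+1}, \ldots, S^*_{|A|} \rangle$.
– Two main problems in applying Nash equilibrium:
Some games have no Nash equilibrium in pure strategies.
Some games have multiple Nash equilibria; which one should the agents play? (See next page)
Dominant strategy: the agent's best strategy does not depend on the other agents. Such a strategy is called dominant.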

No pure-strategy Nash equilibrium

Multiple Nash equilibria
When $b > a > c > d$:
                    Agent 2
                    Action 1   Action 2
Agent 1  Action 1   (a,a)      (c,b)
         Action 2   (b,c)      (d,d)
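
Both problems with Nash equilibrium can be illustrated by enumerating pure-strategy equilibria of a two-player game. This is a minimal sketch: the sample values 4 > 3 > 2 > 1 for b > a > c > d are an assumption, and the matching-pennies game is a standard example chosen here, not one taken from the slides.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """payoffs maps (row_action, col_action) -> (row_payoff, col_payoff)."""
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        u_row, u_col = payoffs[(r, c)]
        # Neither player may gain by deviating unilaterally.
        row_ok = all(payoffs[(r2, c)][0] <= u_row for r2 in rows)
        col_ok = all(payoffs[(r, c2)][1] <= u_col for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


# Assumed sample values satisfying b > a > c > d.
va, vb, vc, vd = 3, 4, 2, 1
two_equilibria_game = {
    (1, 1): (va, va), (1, 2): (vc, vb),
    (2, 1): (vb, vc), (2, 2): (vd, vd),
}
# Matching pennies (illustrative): no pure-strategy equilibrium.
matching_pennies = {
    (1, 1): (1, -1), (1, 2): (-1, 1),
    (2, 1): (-1, 1), (2, 2): (1, -1),
}

print(pure_nash_equilibria(two_equilibria_game))
print(pure_nash_equilibria(matching_pennies))
```

With these values the first game has two equilibria, (Action 1, Action 2) and (Action 2, Action 1), so the agents still face a selection problem.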

Optimal strategies for the prisoner's dilemma under the different criteria
Social welfare: both defect.
Pareto efficiency: all outcomes are Pareto optimal, except when both cooperate.
Dominant strategy: both cooperate.
Nash equilibrium: both cooperate.
                        Prisoner 2
                        cooperate   defect
Prisoner 1  cooperate   (-9,-9)     (0,-10)
            defect      (-10,0)     (-1,-1)
How to escape from the PD?
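
The four answers above can be checked mechanically against the payoff table. The sketch below is one straightforward way to do so; the function names are illustrative and the action labels follow the table as given.

```python
from itertools import product

ACTIONS = ("cooperate", "defect")
PD = {
    ("cooperate", "cooperate"): (-9, -9),
    ("cooperate", "defect"): (0, -10),
    ("defect", "cooperate"): (-10, 0),
    ("defect", "defect"): (-1, -1),
}
OUTCOMES = list(product(ACTIONS, ACTIONS))


def social_welfare_optimum():
    """Outcome with the largest sum of utilities."""
    return max(OUTCOMES, key=lambda o: sum(PD[o]))


def pareto_optimal():
    """Outcomes not weakly dominated by a different payoff vector."""
    def dominated(o):
        return any(
            PD[p] != PD[o] and all(x >= y for x, y in zip(PD[p], PD[o]))
            for p in OUTCOMES
        )
    return [o for o in OUTCOMES if not dominated(o)]


def dominant_action(player):
    """Action that is a best reply to every action of the other player."""
    def payoff(mine, other):
        profile = (mine, other) if player == 0 else (other, mine)
        return PD[profile][player]
    for mine in ACTIONS:
        if all(payoff(mine, other) >= payoff(alt, other)
               for other in ACTIONS for alt in ACTIONS):
            return mine
    return None


def nash_equilibria():
    """Profiles where no player gains by a unilateral deviation."""
    result = []
    for a1, a2 in OUTCOMES:
        best1 = all(PD[(a1, a2)][0] >= PD[(alt, a2)][0] for alt in ACTIONS)
        best2 = all(PD[(a1, a2)][1] >= PD[(a1, alt)][1] for alt in ACTIONS)
        if best1 and best2:
            result.append((a1, a2))
    return result


print(social_welfare_optimum(), pareto_optimal())
print(dominant_action(0), dominant_action(1), nash_equilibria())
```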
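
Types of multi-agent systems
Cooperative multi-agent systems
– Typical example: cooperative box pushing; the agents share a common goal.
Semi-competitive multi-agent systems
– Typical example: robot navigation; the agents have different goals, but the goals do not conflict.
Competitive multi-agent systems
– Typical example: chess; the agents have conflicting goals.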

Three negotiation mechanisms
Voting
Auctions
Bargaining

Voting
Agents give input to a mechanism, and the outcome of the mechanism is taken as the solution for the agents.
Motivation: 3 candidates, 3 voters
Voter   1st   2nd   3rd
V1      A     B     C
V2      B     C     A
V3      C     A     B
Comparing A and B: majority for A. Comparing A and C: majority for C. Comparing B and C: majority for B.
Desired preference ordering: A > B > C > A?
Nonexistence of a desired preference ordering.
How to design this software?
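
The cycle in the three-voter example can be verified by computing every pairwise majority and then testing whether any strict ranking agrees with all of them. This is a small sketch with illustrative names, assuming an odd number of voters so that no pairwise vote is tied.

```python
from itertools import combinations, permutations

candidates = ("A", "B", "C")
ballots = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]  # best first


def majority_winner(x, y):
    """Winner of the pairwise vote between x and y (no ties with 3 voters)."""
    x_votes = sum(b.index(x) < b.index(y) for b in ballots)
    return x if x_votes > len(ballots) / 2 else y


pairwise = {
    frozenset((x, y)): majority_winner(x, y)
    for x, y in combinations(candidates, 2)
}


def consistent(order):
    """True if the ranking `order` agrees with every pairwise majority."""
    return all(
        pairwise[frozenset((order[i], order[j]))] == order[i]
        for i in range(len(order))
        for j in range(i + 1, len(order))
    )


consistent_orders = [o for o in permutations(candidates) if consistent(o)]
print(pairwise)
print(consistent_orders)
```

An empty `consistent_orders` list means that no social ranking respects all three majorities.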

Voting (cont.)
Let $A$ be the set of agents and $O$ the set of possible outcomes ($O$ could be equal to $A$, or a set of laws).
The vote of agent $i$ is described by a binary relation
$\succ_i \subseteq O \times O$,
which we assume to be asymmetric, strict and transitive. We denote by Ordering the set of all such binary relations.

Six criteria for voting mechanisms
Six properties of a social choice rule
– A social preference ordering $\succ^*$ should exist for all possible inputs (individual preferences).
– $\succ^*$ should be defined for every pair $o, o' \in O$.
– $\succ^*$ should be asymmetric and transitive over $O$.
– The outcome should be Pareto efficient: if $\forall i \in A: o \succ_i o'$, then $o \succ^* o'$.
– The scheme should be independent of irrelevant alternatives.
– No agent should be a dictator, in the sense that $o \succ_i o'$ implies $o \succ^* o'$ for all preferences of the other agents.

Arrow's theorem
Arrow's impossibility theorem
– No social choice rule satisfies all of these six conditions.
No voting mechanism is completely fair and democratic!
Nobel Prize in Economics, 1972

Bidding for the 2010 World Cup
Candidate countries
– Egypt
– Libya
– Morocco
– Nigeria
– South Africa
– Tunisia
Voting method
– ???

Plurality voting
Plurality protocol
– A majority voting protocol where all alternatives are compared simultaneously, and the one with the highest number of votes wins.
– It does not satisfy independence of irrelevant alternatives.
– For example:
60% of agents: $a \succ b$, and 40% of agents: $b \succ a$. The first social choice result is $a$.
Introduce alternative $c$: 30% of agents: $a \succ c \succ b$, 30% of agents: $c \succ a \succ b$, and 40% of agents: $b \succ a \succ c$. The second social choice result is $b$.
However, 60% of the agents still prefer $a$ to $b$.
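
The example can be replayed in a few lines. This sketch encodes each group as a weight (percentage of agents) plus a ranking, best first; the 40% group is taken to rank b first, which is the reading under which b wins the second vote. The helper names are illustrative.

```python
from collections import Counter


def plurality_winner(profile):
    """profile: list of (weight, ranking) pairs, ranking best-first."""
    tally = Counter()
    for weight, ranking in profile:
        tally[ranking[0]] += weight  # only the top choice counts
    return tally.most_common(1)[0][0]


def support(profile, x, y):
    """Total weight of agents ranking x above y."""
    return sum(w for w, r in profile if r.index(x) < r.index(y))


before = [(60, ("a", "b")), (40, ("b", "a"))]
after = [
    (30, ("a", "c", "b")),
    (30, ("c", "a", "b")),
    (40, ("b", "a", "c")),
]

print(plurality_winner(before))   # winner without c
print(plurality_winner(after))    # winner once c is introduced
print(support(after, "a", "b"))   # weight of agents preferring a to b
```

Adding c splits the first-place votes of the a-over-b majority, which is how an alternative that does not win still changes the outcome.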

Binary voting
Binary protocol
– The alternatives are voted on pairwise; the winner stays to challenge further alternatives while the loser is eliminated.
– As with the plurality protocol, it does not satisfy independence of irrelevant alternatives.
– Further, the agenda can change the socially chosen outcome.

Binary voting (cont.)
Binary protocol example
– 35% of agents have preferences $c \succ d \succ b \succ a$
– 33% of agents have preferences $a \succ c \succ d \succ b$
– 32% of agents have preferences $b \succ a \succ c \succ d$
[Figure: different agendas (orders of pairwise votes) over the alternatives a, b, c and d]
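
The effect of the agenda can be checked by simulation. In this sketch the three preference lists are read left to right as most to least preferred, and the four agendas are hypothetical ones picked for illustration (they are not necessarily those drawn in the slide's figure). An agenda such as "bdca" means: b against d, the winner against c, then the winner against a.

```python
# Weighted preference profile, most preferred alternative first.
profile = [(35, "cdba"), (33, "acdb"), (32, "bacd")]


def pairwise_winner(x, y):
    """Majority winner between x and y (weights sum to 100, no ties here)."""
    x_weight = sum(w for w, r in profile if r.index(x) < r.index(y))
    return x if x_weight > 50 else y


def run_agenda(agenda):
    """The winner of each pairwise vote stays to face the next alternative."""
    champion = agenda[0]
    for challenger in agenda[1:]:
        champion = pairwise_winner(champion, challenger)
    return champion


agendas = ["bdca", "cdab", "abdc", "cabd"]  # hypothetical agendas
winners = {agenda: run_agenda(agenda) for agenda in agendas}
print(winners)
```

Each of the four alternatives wins under one of these agendas, including d, although every agent ranks c above d.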

Borda voting
Borda protocol
– The Borda count assigns an alternative |O| points whenever it is highest in some agent's preference list, |O|-1 whenever it is second, and so on.
– The alternative with the highest count becomes the social choice.
– It can also lead to paradoxical results, for example via irrelevant alternatives.

Borda voting (cont.)
Borda protocol example
Agent   Preferences       Points:  a   b   c   d
1       a > b > c > d              4   3   2   1
2       b > c > d > a              1   4   3   2
3       c > d > a > b              2   1   4   3
4       a > b > c > d              4   3   2   1
5       b > c > d > a              1   4   3   2
6       c > d > a > b              2   1   4   3
7       a > b > c > d              4   3   2   1
Borda count: c wins with 20, b has 19, a has 18, d loses with 13.
Borda count with d removed: a wins with 15, b has 14, c loses with 13.
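
The two tallies can be recomputed directly from the seven preference lists. A minimal sketch, with each ranking written as a string from most to least preferred; `borda_scores` is an illustrative helper name.

```python
def borda_scores(rankings):
    """Top choice gets |O| points, second gets |O|-1, and so on."""
    n = len(rankings[0])
    scores = {alt: 0 for alt in rankings[0]}
    for ranking in rankings:
        for position, alt in enumerate(ranking):
            scores[alt] += n - position
    return scores


rankings = ["abcd", "bcda", "cdab", "abcd", "bcda", "cdab", "abcd"]

full = borda_scores(rankings)
without_d = borda_scores([r.replace("d", "") for r in rankings])

print(full)
print(without_d)
```

Removing the last-placed alternative d reverses the order of the remaining three, which is the paradox the slide points to.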

Insincere voting
How should a social choice mechanism be designed when voters may be insincere?
If an agent can benefit from insincerely declaring his preferences, he will do so.
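
Game-theoretic analysis of voting
Utility
– First construct each agent's preference ordering over the voters;
– use a function to assign a value (between 0 and 1) to the ordering;
– define the loss function d_j(o_i) = the Euclidean distance between outcome o_i and agent j's ideal allocation.
Comparing outcomes
– Centroid model
– Pareto model
Theoretical analysis based on game theory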

Auctions
In voting, the protocol designer is assumed to want to enhance the social good, while in auctions the auctioneer wants to maximize his own profit.
– The auctioneer wants to sell an item and get the highest possible payment for it.
– The bidders want to acquire the item at the lowest possible price.

Auction settings
– Private value
The value of the good depends only on the agent's own preferences.
The key point is that the winning bidder will not resell the item in order to get utility.
For example: auctioning off a cake.
In other words, the value depends only on the bidder.

Auction settings (cont.)
– Common value
An agent's value of an item depends entirely on other agents' values of it.
For example: auctioning treasury bills.
In other words, the value depends only on the other bidders.

Auction settings (cont.)
– Correlated value
An agent's value depends partly on its own preferences and partly on others' values.
For example: a negotiation within a contracting setting.
– An agent can decrease the cost of the task.
– An agent can recontract out the task.
In other words, the value depends partly on the agent's own value and partly on others' values.

English auction
English (first-price open-cry) auction
– Each bidder is free to raise his bid. When no bidder is willing to raise anymore, the auction ends, and the highest bidder wins the item.
– An agent's strategy is a series of bids as a function of his private value, his prior estimates of other bidders' valuations, and the past bids of others.

English auction (cont.)
– In private value English auctions, an agent's dominant strategy is to always bid a small amount more than the current highest bid, and to stop when his private value is reached.
– In correlated value auctions, the rules are often varied so that the auctioneer increases the price at a constant rate or at a rate he thinks appropriate.

Sealed-bid auction
First-price sealed-bid auction
– Each bidder submits one bid without knowing the others' bids. The highest bidder wins the item and pays the amount of his bid.
– An agent's strategy is a bid as a function of his private value and his prior estimates of other bidders' valuations.
– There is no dominant strategy for bidding in this auction.
– The best strategy is to bid less than his true valuation, but how much less depends on what the others bid.

Sealed-bid auction (cont.)
– In a private value auction, assume the probability distribution of the agents' values is common knowledge, for example a uniform distribution.
In that case the Nash equilibrium is for every agent $i$ to bid
$\frac{|A|-1}{|A|}\, v_i$.

Dutch auction
Dutch (descending) auction
– The seller continuously lowers the price until one of the bidders takes the item at the current price.
– An agent's strategy: the Dutch auction is strategically equivalent to the first-price sealed-bid auction.

Vickrey auction
Vickrey (second-price sealed-bid) auction
– Each bidder submits one bid without knowing the others' bids. The highest bidder wins, but at the price of the second highest bid.
– An agent's strategy is a bid as a function of his private value and his prior estimates of other bidders' valuations.
– Theorem
A bidder's dominant strategy in a private value Vickrey auction is to bid his true valuation.
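
The theorem can be sanity-checked by brute force on a small grid of values and bids. This is only a numerical illustration, not a proof; it assumes a single rival bid, that ties are lost, and a grid of multiples of 0.1. The function name `vickrey_utility` is illustrative.

```python
def vickrey_utility(my_value, my_bid, other_bids):
    """Utility in a second-price sealed-bid auction (ties count as losses)."""
    highest_other = max(other_bids)
    if my_bid > highest_other:
        return my_value - highest_other  # winner pays the second-highest bid
    return 0.0


grid = [i / 10 for i in range(11)]

# Truthful bidding is never worse than any other bid, whatever the rival bids.
truthful_is_dominant = all(
    vickrey_utility(v, v, [o]) >= vickrey_utility(v, b, [o])
    for v in grid
    for b in grid
    for o in grid
)
print(truthful_is_dominant)
```

The reason is visible in the code: the bid only decides whether the agent wins, while the price is set by the rival's bid.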

Vickrey auction (cont.)
– Vickrey auctions are used to
allocate computation resources in operating systems,
allocate bandwidth in computer networks,
control building heating.

Vickrey auction (cont.)
Are first-price auctions better for the auctioneer than second-price auctions?
Theorem: All 4 types of protocol produce the same expected revenue to the auctioneer (assuming (1) private value auctions, (2) values are independently distributed and (3) bidders are risk-neutral).
Why are second-price auctions not so popular among humans?
– Lying auctioneer.
– When the results are published, subcontractors know the true valuations and what they saved. So they might want to share the profit.
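
As a worked check of the revenue theorem under one specific assumption that the slide does not state, take $n$ bidders whose values are drawn independently and uniformly from $[0,1]$, and write $v_{(1)}$ and $v_{(2)}$ for the highest and second-highest value. With the first-price equilibrium bid $\frac{n-1}{n} v_i$ and truthful bidding in the Vickrey auction, the expected revenues coincide:

$$
E[R_{\text{first}}] = \frac{n-1}{n}\, E\big[v_{(1)}\big] = \frac{n-1}{n} \cdot \frac{n}{n+1} = \frac{n-1}{n+1},
\qquad
E[R_{\text{second}}] = E\big[v_{(2)}\big] = \frac{n-1}{n+1}.
$$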
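
Game-theoretic analysis of auctions
Game-theoretic assumptions, elements and process analysis of auctions
– Game-theoretic assumptions
An auction is a non-cooperative game with incomplete information.
Agent $i$'s subjective probability over the other agents' private values:
$P_i(v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_n \mid v_i) = \frac{P(v_1, \ldots, v_i, \ldots, v_n)}{P(v_i)}$
The agents make their decisions independently; there are no agreements among them.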
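
Game-theoretic analysis of auctions (cont.)
– Elements of an auction
– Utility function: let $v_i$ denote agent $i$'s private value for the auctioned item and $b_i$ its bid. The utility is
$u_i = 0$ if the agent does not win the item;
$u_i = v_i - b_i$ if it wins in a first-price, Dutch or English auction;
$u_i = v_i - \max\{b_1, \ldots, b_{i-1}, b_{i+1}, \ldots, b_n\}$ if it wins in a second-price auction.
– Expected payoff: $(v_i - b_i)\, P_i(b_i)$, where $P_i(b_i)$ is the probability of winning with bid $b_i$.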
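
Game-theoretic analysis of auctions (cont.)
– Process analysis of an auction
Two agents; each knows only its own private value $v_i$ ($0 \le v_i \le 1$), and the other agent's private value is uniformly distributed on $[0, 1]$.
$u_i(b_i, b_j, v_i) = v_i - b_i$ if $b_i > b_j$; $\frac{1}{2}(v_i - b_i)$ if $b_i = b_j$; $0$ if $b_i < b_j$
Expected payoff: $u_i = (v_i - b_i)\, P(b_j < b_i)$
Optimal strategy: $\max_{b_i} u_i$; solving the differential equation gives $b_i^* = v_i / 2$.
With $n$ agents the optimal bid is $(n-1)v/n$. Hence, from the auctioneer's point of view, the more bidders take part in a tender, the better.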
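
The equilibrium bid can be checked numerically as a best response. The sketch assumes that every rival draws its value uniformly from [0, 1] and bids the fraction (n-1)/n of it, and then searches a bid grid for the bid maximizing (v - b) times the probability of winning. Exact fractions avoid rounding ties; the function names and the grid size are illustrative choices.

```python
from fractions import Fraction


def expected_utility(v, b, n):
    """(v - b) * P(win), rivals assumed to bid (n-1)/n of uniform[0,1] values."""
    # A single rival is outbid when its value is below b * n / (n - 1).
    win_one = min(Fraction(1), b * Fraction(n, n - 1))
    return (v - b) * win_one ** (n - 1)


def best_response(v, n, steps=1000):
    """Grid search for the bid with the highest expected utility."""
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    return max(grid, key=lambda b: expected_utility(v, b, n))


v = Fraction(3, 5)
print(best_response(v, 2))  # two bidders
print(best_response(v, 3))  # three bidders
```

For v = 3/5 the search returns v/2 with two bidders and 2v/3 with three, in line with (n-1)v/n.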

Auctions of interrelated goods
Inefficient allocation and lying in interrelated auctions
– Extended auctioning:
auction of multiple items of a homogeneous good,
auction of heterogeneous unrelated goods,
auction of heterogeneous interrelated goods.

Auctions of interrelated goods (cont.)
Example (task allocation)
– Two delivery tasks t1, t2 and two agents 1, 2.
[Figure: positions of agents A1 and A2 and tasks t1 and t2; the labelled distances are 1.0, 0.5 and 0.5]
c1(t1)=2, c1(t2)=1, c1(t1,t2)=2
c2(t1)=1.5, c2(t2)=1.5, c2(t1,t2)=2.5

Auctions of interrelated goods (cont.)
The globally optimal solution is not reached by auctioning the tasks independently with truthful bidding.
– t1 goes to agent 2 (for a price of 1.5) and t2 goes to agent 1 (for a price of 1). The total cost is then 1.5 + 1 = 2.5, whereas agent 1 could handle both tasks for c1(t1,t2) = 2.
– Even if agent 2 considers, when bidding for t2, that he already has t1 (so he bids cost(t1,t2)-cost(t1)=2.5-1.5=1), he will get it only with a probability of 0.5.

Auctions of interrelated goods (cont.)
What about full lookahead?
Therefore:
– It pays off for agent 1 to bid more aggressively for t1 (by up to 1.5 relative to truthful bidding).
– It does not pay off for agent 2, because agent 2 does not make a profit at t2 anyway.
– Agent 1 bids 0.5 for t1 (instead of 2), agent 2 bids 1.5. Therefore agent 1 gets it for 1.5. Agent 1 also gets t2 for 1.5.

Auctions of interrelated goods (cont.)
If a1 has t1, its cost for t2 is c1(t1,t2)-c1(t1)=2-2=0; otherwise it is c1(t2)=1.
If a2 has t1, its cost for t2 is c2(t1,t2)-c2(t1)=2.5-1.5=1; otherwise it is c2(t2)=1.5.
So when a1 has t1, bidding for t2 yields an extra profit of 1.5-0=1.5;
when a2 has t1, bidding for t2 yields an extra profit of 1-1=0.
So when a1 bids for t1, it bids c1(t1) - extra profit = 2-1.5 = 0.5;
when a2 bids for t1, it bids c2(t1) - extra profit = 1.5-0 = 1.5.
A1 wins!
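
The lookahead arithmetic can be written out as a short computation over the cost table. This sketch follows the slide's reasoning: the extra profit from holding t1 is the rival's stand-alone bid for t2 minus the holder's own marginal cost for t2. The function names are illustrative.

```python
# Cost tables from the example: cost[agent][set of tasks].
cost = {
    1: {("t1",): 2.0, ("t2",): 1.0, ("t1", "t2"): 2.0},
    2: {("t1",): 1.5, ("t2",): 1.5, ("t1", "t2"): 2.5},
}


def marginal_t2_cost(agent, has_t1):
    """Cost of taking on t2, given whether the agent already holds t1."""
    c = cost[agent]
    return c[("t1", "t2")] - c[("t1",)] if has_t1 else c[("t2",)]


def lookahead_bid_t1(agent):
    """Bid for t1 that already counts the later profit on t2."""
    rival = 2 if agent == 1 else 1
    rival_bid = marginal_t2_cost(rival, has_t1=False)  # rival lacks t1
    own_cost = marginal_t2_cost(agent, has_t1=True)
    extra_profit = max(0.0, rival_bid - own_cost)
    return cost[agent][("t1",)] - extra_profit


bids = {agent: lookahead_bid_t1(agent) for agent in (1, 2)}
winner = min(bids, key=bids.get)  # the lowest cost bid wins the task
print(bids, winner)
```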

Investigation in auctions
Does it make sense to counterspeculate in private value Vickrey auctions?
Vickrey auctions were invented to avoid counterspeculation. But what if the private value for a bidder is uncertain? The bidder might be able to determine it, but he needs to invest c.
Example
– Suppose bidder 1 does not know the (private) value v1 of the item to be auctioned. To determine it, he needs to invest cost. We also assume that v1 is uniformly distributed, with 0 <= v1 <= 1.

Investigation in auctions (cont.)
– For bidder 2, the private value v2 of the item is fixed, with 0 <= v2 <= 1/2. So his dominant strategy is to bid v2.
Should bidder 1 invest cost to determine his private value? How does this depend on knowing v2?
Answer: Bidder 1 should invest cost if and only if
$v_2 \ge (2\,\mathit{cost})^{1/2}$.
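
Investigation in auctions (cont.)
Proof (writing $c$ for the investment cost):
$E_{\mathrm{noinfo}} = \int_0^1 (v_1 - v_2)\, dv_1 = \frac{1}{2} - v_2$
$E_{\mathrm{info}} = \int_0^{v_2} (-c)\, dv_1 + \int_{v_2}^1 (v_1 - v_2 - c)\, dv_1 = \frac{1}{2} v_2^2 - v_2 + \frac{1}{2} - c$
$E_{\mathrm{info}} \ge E_{\mathrm{noinfo}} \iff \frac{1}{2} v_2^2 - v_2 + \frac{1}{2} - c \ge \frac{1}{2} - v_2 \iff \frac{1}{2} v_2^2 \ge c \iff v_2 \ge (2c)^{1/2}$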
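
The threshold can also be checked numerically. The sketch below assumes, as in the proof, that an uninformed bidder 1 bids his expected value 1/2 (which is at least v2), always wins and pays v2, while an informed bidder 1 bids his true v1 and wins only when v1 > v2. The midpoint-rule integration and the function names are illustrative choices.

```python
def expected_profit_noinfo(v2):
    """Bid the expected value 1/2 >= v2, always win, pay v2."""
    return 0.5 - v2


def expected_profit_info(v2, cost, steps=100_000):
    """Pay `cost` to learn v1, then bid truthfully (midpoint rule on [0, 1])."""
    total = 0.0
    for k in range(steps):
        v1 = (k + 0.5) / steps
        gain = v1 - v2 if v1 > v2 else 0.0  # win only if v1 exceeds v2
        total += gain - cost
    return total / steps


def should_invest(v2, cost):
    return expected_profit_info(v2, cost) >= expected_profit_noinfo(v2)


v2 = 0.4  # threshold cost is v2**2 / 2 = 0.08
print(should_invest(v2, 0.05), should_invest(v2, 0.12))
```

For v2 = 0.4 investing pays off at cost 0.05 but not at 0.12, on either side of the threshold v2^2 / 2 = 0.08.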

Bargaining
Axiomatic Bargaining Theory
We assume two agents, 1 and 2, each with a utility function $u_i: O \to \mathbb{R}$. If the agents do not agree on a result $o$, the fallback outcome $o_{fallback}$ is taken.
Example (sharing 1 apple)
– How to share 1 apple?
– Agent 1 offers p (0 < p < 1), agent 2 agrees!
– Such deals are individually rational and each one is in Nash equilibrium!
Therefore we need axioms!

Axiomatic bargaining theory
Axioms on the global solution u*=<u1(o*), u2(o*)>.
Invariance: Absolute values of the utility functions do not matter, only relative values.
Symmetry: Exchanging the agents does not influence the solution o*.
Irrelevant Alternatives: If O is made smaller but o* still remains, then o* remains the solution.
Pareto: It is not feasible to give both players higher utility than u*=<u1(o*), u2(o*)>.

Illustration of feasible solutions
[Figure: feasible solutions plotted in the u1-u2 utility plane]

Solution of axiomatic bargaining theory
Theorem (Unique solution)
– The four axioms above uniquely determine a solution. This solution is given by
$o^* = \arg\max_o\, [u_1(o) - u_1(o_{fallback})]\,[u_2(o) - u_2(o_{fallback})]$
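
The solution formula can be tried out on the apple example. This sketch discretizes the possible splits into hundredths and assumes linear utilities (each agent's utility equals its share); both the grid and the second fallback point are assumptions made for illustration only.

```python
from fractions import Fraction


def nash_bargaining(outcomes, u1, u2, fallback):
    """Maximize the product of utility gains over the fallback outcome."""
    def nash_product(o):
        return (u1(o) - u1(fallback)) * (u2(o) - u2(fallback))

    # Only individually rational outcomes are candidates.
    rational = [
        o for o in outcomes
        if u1(o) >= u1(fallback) and u2(o) >= u2(fallback)
    ]
    return max(rational, key=nash_product)


# Outcomes are (share of agent 1, share of agent 2) of one apple.
splits = [(Fraction(k, 100), 1 - Fraction(k, 100)) for k in range(101)]


def u1(o):
    return o[0]


def u2(o):
    return o[1]


even = nash_bargaining(splits, u1, u2, fallback=(Fraction(0), Fraction(0)))
# Hypothetical fallback in which agent 1 would keep 1/5 without a deal.
skewed = nash_bargaining(splits, u1, u2, fallback=(Fraction(1, 5), Fraction(0)))
print(even, skewed)
```

With a symmetric fallback the apple is split evenly; a better fallback for agent 1 shifts the solution in its favour.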

Strategic bargaining theory
No axioms: view it as a game!
Example revisited: sharing 1 apple. In a protocol with finitely many steps, the last offerer just offers ε. This should be accepted, so the last offerer gets 1-ε.
This is unsatisfactory. Ways out:
– 1. Add a discount factor δ: in round n, only the δ^(n-1)-th part of the original value is available.
– 2. Bargaining costs: bargaining is not free; fees have to be paid.

Strategic bargaining theory (cont.)
Finite games: Suppose δ = 0.9. Then the outcome depends on the number of rounds.
Round   1's share   2's share   Total value   Offerer
...     ...         ...         ...           ...
n-3     0.819       0.181       0.9^(n-4)     2
n-2     0.91        0.09        0.9^(n-3)     1
n-1     0.9         0.1         0.9^(n-2)     2
n       1           0           0.9^(n-1)     1
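
The table can be regenerated by backward induction from the last round. In this sketch the offerer in each round gives the responder exactly the discounted share that the responder would obtain as offerer one round later; agent 1 is assumed to make the final offer, as in the table. Shares are fractions of the pie still available in that round.

```python
def finite_bargaining(rounds, delta):
    """Rows (rounds before the end, 1's share, 2's share, offerer), last round first."""
    table = []
    offerer = 1
    offerer_share = 1.0  # the last offerer takes everything
    for k in range(rounds):
        share1 = offerer_share if offerer == 1 else 1 - offerer_share
        table.append((k, round(share1, 6), round(1 - share1, 6), offerer))
        # One round earlier the other agent offers and must leave the
        # responder delta times what the responder would get next round.
        offerer_share = 1 - delta * offerer_share
        offerer = 2 if offerer == 1 else 1
    return table


rows = finite_bargaining(4, 0.9)
for k, share1, share2, offerer in rows:
    label = "n" if k == 0 else f"n-{k}"
    print(label, share1, share2, offerer)
```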

Strategic bargaining theory (cont.)
Infinite games: discount factor δ1 for agent 1, δ2 for agent 2.
Theorem (Unique solution for infinite games):
In a discounted infinite-round setting, there exists a unique subgame perfect Nash equilibrium: agent 1 gets (1-δ2)/(1-δ1δ2), and agent 2 gets the rest. Agreement is reached in the first round.
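
Strategic bargaining theory (cont.)
Proof:
Round   1's share             2's share    Offerer
...     ...                   ...          ...
t-2     1 - δ2(1 - δ1 π1)                  1
t-1                           1 - δ1 π1    2
t       π1                                 1
...     ...                   ...          ...
Since the game looks the same at rounds t-2 and t, agent 1's share must be the same in both:
$\pi_1 = 1 - \delta_2 (1 - \delta_1 \pi_1) \implies \pi_1 = \frac{1 - \delta_2}{1 - \delta_1 \delta_2}$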
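
The closed form can be compared with a direct iteration of the two-round recursion from the proof table. A small sketch: the starting value 1.0 and the iteration count are arbitrary choices, and the iteration converges because each step shrinks the error by the factor δ1·δ2 < 1.

```python
def rubinstein_share(delta1, delta2):
    """Agent 1's equilibrium share from the theorem."""
    return (1 - delta2) / (1 - delta1 * delta2)


def fixed_point_share(delta1, delta2, iterations=200):
    """Iterate pi1 -> 1 - delta2 * (1 - delta1 * pi1) until it settles."""
    pi1 = 1.0
    for _ in range(iterations):
        pi1 = 1 - delta2 * (1 - delta1 * pi1)
    return pi1


print(rubinstein_share(0.9, 0.9), fixed_point_share(0.9, 0.9))
print(rubinstein_share(0.9, 0.5), fixed_point_share(0.9, 0.5))
```

With equal discount factors of 0.9 the first mover gets slightly more than half; a more impatient agent 2 (smaller δ2) leaves agent 1 with a larger share.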

Strategic bargaining theory (cont.)
Bargaining Costs
Agent 1 pays c1, agent 2 pays c2.
Time t: 1 gets p, 2 gets 1-p;
Time t-1: 2 thinks: 1 gets p+c2, 2 gets 1-p-c2;
Time t-2: 1 thinks: 1 gets p+c2-c1, 2 gets 1-p-c2+c1;
Time t-2k: 1 thinks: 2 gets 1-p-k(c2-c1).

Strategic bargaining theory (cont.)
c1=c2: Any split is in Nash equilibrium.
c1<c2: Agent 1 gets all.
c1>c2: Agent 1 gets c2, agent 2 gets 1-c2.

Selected papers
Cooperative vs. Competitive Multi-Agent Negotiations in Retail Electronic Commerce by Guttman and Maes (PDF)
M. Beer, M. d'Inverno, M. Luck, N. R. Jennings, C. Preist and M. Schroeder (1999) "Negotiation in Multi-Agent Systems" Knowledge Engineering Review 14 (3) 285-289. (PS)
N. R. Jennings, S. Parsons, C. Sierra and P. Faratin (2000) "Automated Negotiation" Proc. 5th Int. Conf. on the Practical Application of Intelligent Agents and Multi- Agent Systems (PAAM-2000), Manchester, UK, 23-30. (PS)
Automated Negotiation: Prospects, Methods and Challenges by Jennings et al. (PS)