Some Models of Information Aggregation
and Consensus in Networks
John N. Tsitsiklis (MIT)
ACCESS, KTH, January 2011
Overview
• Information propagation/aggregation in networks
– Engineered versus social networks
– Bayesian versus naive updates
• Review a few key models and results
• Is this applied probability?
– Theoretical AP: random graphs
– Applied AP: ad campaigns, etc.
– Narrative AP: "explain" social phenomena
– Recreational AP: play with toy models
[Slide: screenshot of the Amazon.com product page for "The Wisdom of Crowds" (Paperback) by James Surowiecki]
Myopia and Herding
The Wisdom or Madness of Crowds
The Wisdom of Crowds
Wisdom versus Herding of Rational but Selfish Agents
Consensus and Averaging
Social sciences
• Merging of “expert” opinions
• Evolution of public opinion
• Reputation systems
• Modeling of jurors
• Language evolution
• Charles Mackay (London, 1841): Extraordinary Popular Delusions and the Madness of Crowds
– "Men, it has been well said, think in herds; it will be seen that they go mad in herds, . . ."
Sensor and other Engineered Networks
• Fusion of available information
– we get to design the nodes’ behavior
The Basic Setup
• Each node i endowed with private information Xi
– Nodes form opinions, make decisions
– Nodes observe opinions/decisions or receive messages
– Nodes update opinions/decisions
– Everyone wants a "good" decision; common objective
• Questions
– Convergence? To what? How fast? Quality of limit?
– Does the underlying graph matter? (Tree, acyclic, random, . . .)
• Variations
– Hypothesis testing (binary decisions): P(H0 | Xi)
– Parameter estimation: E[Y | Xi]
– Optimization: min_u E[c(u, Y) | Xi]
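As a minimal concrete instance of the hypothesis-testing variation, a single node's posterior P(H0 | Xi) is just Bayes' rule applied to its private signal. A small sketch (the prior and likelihood numbers are illustrative, not from the talk):

```python
def posterior_h0(prior_h0, lik_h0, lik_h1):
    """Bayes' rule for a binary hypothesis: returns P(H0 | X_i).

    prior_h0       : prior probability of H0
    lik_h0, lik_h1 : likelihoods P(X_i | H0) and P(X_i | H1)
                     of the observed private signal X_i
    """
    joint_h0 = prior_h0 * lik_h0
    joint_h1 = (1 - prior_h0) * lik_h1
    return joint_h0 / (joint_h0 + joint_h1)

# A signal twice as likely under H1 moves a uniform prior to P(H0 | X_i) = 1/3.
print(posterior_h0(0.5, 0.3, 0.6))
```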
• Social science: postulate update mechanism
• Engineering: design/optimize update mechanism
Bayesian Models
Myopia and Herding
Choosing the Best Tavern
Where do we eat tonight?
(Bikhchandani, Hirshleifer & Welch, 1992; Banerjee, 1992)
• Prior slightly in favor of A
• Private info points to the better one, with some probability of error
• Information is there, but is not used
[Figure: a sequence of agents choosing between taverns A and B; once early choices favor A, later agents copy and choose A regardless of their own signals]
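The point of the slide ("information is there, but is not used") can be reproduced with a toy sequential-choice simulation in the spirit of the Bikhchandani–Hirshleifer–Welch/Banerjee model. This is a sketch under simplifying assumptions that differ slightly from the slide: symmetric prior instead of a prior slightly favoring A, equally informative signals, and ties broken by one's own signal:

```python
def run_cascade(signals):
    """Sequential myopic-Bayes choices between taverns A and B.

    signals : list of private signals (True = 'A looks better'), each
              assumed correct with the same probability q > 1/2.
    Returns the list of public choices (True = chose A).

    Pre-cascade choices reveal the chooser's signal; once the revealed
    signals favor one tavern by 2 or more, every later agent copies
    regardless of its own signal (an information cascade).
    """
    choices = []
    revealed = []  # signals inferred from pre-cascade choices
    for s in signals:
        lead = sum(1 if r else -1 for r in revealed)
        if abs(lead) >= 2:                  # cascade: private info is ignored
            choices.append(lead > 0)
        else:
            score = lead + (1 if s else -1)
            choice = s if score == 0 else score > 0
            choices.append(choice)
            revealed.append(choice)         # this choice is informative
    return choices

# Two early misleading signals for B start a cascade on B; the four
# correct signals for A that follow are never used.
print(run_cascade([False, False, True, True, True, True]))
```

With more signal accuracy or richer messages the cascade can be delayed, but with binary actions it forms with positive probability no matter how many well-informed agents come later.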
Tandem networks
[Figure: message-flip diagram. Yn ∈ {0, 1}; under hypothesis Hj, the message flips 0 → 1 with probability pj(0 → 1) and 1 → 0 with probability pj(1 → 0).]
Tandem networks
Let !j > 0 be the prior probability of hypothesis Hj and Pt(n) = !0P0(Yn,n = 1)+!1P1(Yn,n =
0) be the probability of error at sensor n. The goal of a system designer is to design a strategy
so that the probability of error Pt(n) is minimized. Let P !t (n) = inf Pt(n), where the infimum is
taken over all possible strategies.
1 2 n
X1 X2 Xn
Figure 1: A tandem network.
The problem of finding optimal strategies has been studied in [1–3], while the asymptotic
performance of a long tandem network (i.e., n ! ") is considered in [2, 4–8] (some of these
works do not restrict the message sent by each sensor to be binary). In the case of binary
communications, [4, 8] find necessary and su!cient conditions under which the error probability
goes to zero in the limit of large n. To be specific, the error probability stays bounded away
from zero i" there exists a B < " such that | log dP1
dP0| # B. In the case when the log-likelihood
ratio is unbounded, numerical examples have indicated that the error probability goes to zero
much slower than an exponential rate. In a parallel configuration, in which all sensors send their
messages directly to a single fusion center (i.e., "i is now a function of only Xi), the rate of decay
of the error probability is known to be exponential [9]. This suggests that a tandem configuration
performs worse than a parallel configuration, when n is large. It has been conjectured in [2,8,10,11]
that indeed, the rate of decay of the error probability is sub-exponential. However, no rigorous
proof is known for this result. The goal of this paper is to prove this conjecture.
We first note that there is a caveat to the sub-exponential decay conjecture: the probability
measures P0 and P1 are assumed to be equivalent measures, i.e., they are absolutely continuous
w.r.t. each other. Indeed, if there exists a measurable set A such that P0(A) > 0 and P1(A) = 0,
then an exponential decay rate can be achieved as follows: each sensor always declares 1 till the
2
• Estimation with limited memory(Cover, 1969; Hellman & Cover, 1970; Koplowitz, 1975)
• en = optimal P(Yn is incorrect)
• en ! 0? How fast?
10
[Figure: a tandem of agents passing messages Y1, Y2, . . . , Yn−1]
• Xi are i.i.d.
• H0: Xi ~ P0, H1: Xi ~ P1
• γ: R → {0, 1}
• minimize P(error) over {γi}
• en = min P(Yn is incorrect)
• Private info: likelihood ratio Li = P1(Xi)/P0(Xi)
Tandem networks
• Xi are i.i.d.; H0: Xi ~ P0, H1: Xi ~ P1
• binary message/decision functions γi: Yi = γi(Xi, Yi−1)
• "Social network" view: each γi is myopic Bayes-optimal
• "Engineered system" view: {γi} designed for best end decision
– simple sensor network
– estimation with limited memory [Cover, 1969]
Tandem Network Results
• en = P(Yn is incorrect) → 0?
The Two Regimes
[Figure: densities P0(Xi) and P1(Xi) versus x, with the likelihood ratio L and the 0 → 1 / 1 → 0 switching thresholds]
• Bounded likelihood ratios (B-LR): 0 < a < dP1/dP0(Xi) < b < ∞
– never get compelling evidence
• Unbounded likelihood ratios (U-LR): 0 < dP1/dP0(Xi) < ∞
– arbitrarily compelling evidence is possible
Tandem Network Results
• en = P(Yn is incorrect)
• Private info: likelihood ratio Li = P1(Xi)/P0(Xi)
• Two regimes:
– range of Li: [a, b] with 0 < a < b < ∞: Engineered en ↛ 0 ⟹ Myopic en ↛ 0 (Cover, 1969)
– range of Li: (0, ∞): Myopic en → 0 (Papastavrou & Athans, 1992) ⟹ Engineered en → 0 (Cover, 1969), but slowly (Tay, JNT & Win, 2008)
The Two Regimes

                          Social (myopic)                   Engineered
U-LR:                     en → 0                       ⇒    en → 0, but slowly
                          (Papastavrou & Athans, 92)        (Cover, 69; Tay, Win & JNT, 08)
B-LR:                     en ↛ 0                       ⇐    en ↛ 0 (Koplowitz, 75)
B-LR, ternary messages:   en ↛ 0 (Dia & JNT, 09)            en → 0 (Koplowitz, 75)
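The B-LR row can be sanity-checked numerically. The sketch below is my own toy instance, not from the talk: Bernoulli(0.3) vs. Bernoulli(0.6) observations (so likelihood ratios are bounded), uniform priors, and a myopic rule where each node declares the MAP hypothesis given its own observation and its predecessor's bit. The error probabilities e_n are computed exactly by propagating the message distribution; they improve for a few nodes and then freeze at a positive value, consistent with en ↛ 0.

```python
# Illustrative sketch (not from the talk): exact myopic error probabilities
# in a tandem network with bounded likelihood ratios.
P = {0: {0: 0.7, 1: 0.3},   # P(X = x | H0): Bernoulli(0.3) under H0
     1: {0: 0.4, 1: 0.6}}   # P(X = x | H1): Bernoulli(0.6) under H1

def myopic_tandem_errors(n):
    """Return [e_1, ..., e_n] under uniform priors, computed exactly."""
    # q[h][y] = P(predecessor's bit = y | Hh); node 1's "predecessor"
    # is the degenerate, uninformative message Y0 = 0.
    q = [{0: 1.0, 1: 0.0}, {0: 1.0, 1: 0.0}]
    errors = []
    for _ in range(n):
        # MAP decision for each (own observation x, predecessor bit y)
        decide = {(x, y): int(P[1][x] * q[1][y] > P[0][x] * q[0][y])
                  for x in (0, 1) for y in (0, 1)}
        # push the message distribution through the decision rule
        q = [{b: sum(P[h][x] * q[h][y]
                     for x in (0, 1) for y in (0, 1) if decide[(x, y)] == b)
              for b in (0, 1)} for h in (0, 1)]
        errors.append(0.5 * q[0][1] + 0.5 * q[1][0])
    return errors

e = myopic_tandem_errors(20)
print(e[0], e[3], e[19])   # e_n stalls at a positive value after a few nodes
```

In this instance the myopic rule degenerates into "copy the predecessor" after four nodes, so e_n stops improving entirely.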
Engineered — Architectural Comparisons

(Architecture diagrams omitted; several pairwise comparisons remain unclear.)

• less information, but “same performance” (Tay, Win & JNT, 08)
• more information, but “same performance” (Zoumpoulis & JNT, 09; Kreidl, Zoumpoulis & JNT, 10)
• but same exponent (Neyman–Pearson version)
General Acyclic Networks — Engineered
(Tay, Win & JNT, 08; Zoumpoulis & JNT, 09)

(Architecture diagrams omitted; some comparisons remain unclear.)

• en ≈ exp{−λn}, and λ is known (JNT, 88)
Another example

(Figure: leaves v1, v2, . . . , vm, each with branching factor 2, feeding a fusion node f.)

• Leaves make Bernoulli observations Xi
• Easy to verify that λ*_{tree,NP} < λ*_{star,NP}
• Maybe low branching factor near the leaves is the culprit
General Acyclic Networks — Engineered
(Tay, Win & JNT, 08; Zoumpoulis & JNT, 09)

(Architecture diagrams omitted; some comparisons remain unclear.)

• information loss, but same exponent (Neyman–Pearson version)
• worse error exponent
• en ≈ exp{−λn}, and λ is known (JNT, 88)
A Simple Tree

(Figure: fusion node f with two subtrees v1, v2, each aggregating m leaves.)

• Each subtree: optimal NP test for P0(false alarm) = α/2
• P1(missed detection) ≈ e^{−m λ*_{star,NP}}
• Fusion rule: declare H1 iff [Yv1 = 1 or Yv2 = 1]
A Perspective on Bayesian Optimality
• Engineering
– Optimal rules are hard to find
– Can design “good” schemes (asymptotic, etc.)
• Social Networks
– Is Bayesian updating plausible?
• More realistic settings?
– Correlated private information
– Multiple hypotheses
– Complex topologies or interaction sequences
26
“Naive” Information Aggregation (Consensus and Averaging)
The Setting

• n agents
  – starting values xi(0)
• reach consensus on some x*, with either:
  – mini xi(0) ≤ x* ≤ maxi xi(0) (consensus)
  – x* = (x1(0) + · · · + xn(0))/n (averaging)
  – averaging when xi ∈ {−1, +1} (voting)
• interested in:
  – genuinely distributed algorithms
  – no synchronization
  – no “infrastructure” such as spanning trees
• simple updates, such as: xi := (xi + xj)/2
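The update xi := (xi + xj)/2 already does a lot of work on its own. A minimal sketch (my own toy instance: 10 agents, random communicating pairs) shows that repeated pairwise midpoints conserve the sum, so the agents converge to the exact average of the starting values:

```python
import random

# Toy sketch of the pairwise update xi := (xi + xj)/2 ("gossip" averaging);
# the pair-selection scheme and parameters are arbitrary choices.
random.seed(0)
n = 10
x = [float(i) for i in range(n)]      # starting values x_i(0)
target = sum(x) / n                    # the average the agents should reach

for _ in range(2000):
    i, j = random.sample(range(n), 2)  # a randomly chosen communicating pair
    x[i] = x[j] = (x[i] + x[j]) / 2    # both move to the midpoint

spread = max(x) - min(x)
print(spread, abs(sum(x) / n - target))
```

Each step is a local, unsynchronized exchange with no spanning tree or leader, which is exactly the regime the slide asks for.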
Social sciences
• Merging of “expert” opinions
• Evolution of public opinion
• Evolution of reputation
• Modeling of jurors
• Language evolution
• Preference for “simple” models
– behavior described by “rules of thumb”
– less complex than Bayesian updating
• interested in modeling, analysis (descriptive theory)
– . . . and narratives
Engineering

• Distributed computation and sensor networks
  – Fusion of individual estimates
  – Distributed Kalman filtering
  – Distributed optimization
  – Distributed reinforcement learning
• Networking
  – Load balancing and resource allocation
  – Clock synchronization
  – Reputation management in ad hoc networks
  – Network monitoring
• Multiagent coordination and control
  – Coverage control
  – Monitoring
  – Creating virtual coordinates for geographic routing
  – Decentralized task assignment
  – Flocking
Distributed optimization

minimize over x:  Σ_{i=1}^n fi(x)

• centralized gradient iteration: x := x − α Σ_{i=1}^n ∇fi(x)
• distributed gradient algorithm: xi := xi − α ∇fi(xi)
• reconciling distributed updates: xi := (xi + xj)/2
• converges (when the stepsize α is small enough) under minimal assumptions
  – time-varying interconnection topology
• an averaging algorithm is needed here

minimize over x:  f(x)

• centralized: x := x − α∇f(x) + noise
• distributed: xi := xi − α∇f(xi) + noisei
• reconcile: xi := (xi + 2xj)/3
• a consensus algorithm suffices here
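A minimal sketch of the first scheme, on a toy instance of my own choosing: quadratic objectives fi(x) = (x − ci)²/2, whose sum is minimized at mean(c), with local gradient steps reconciled by doubly stochastic averaging over a ring (weights 1/2 self, 1/4 per neighbor). With a small constant stepsize the agents agree near the true minimizer:

```python
# Sketch (toy instance, not from the talk): distributed gradient descent on
# sum_i f_i(x), f_i(x) = (x - c_i)^2 / 2, reconciled by ring averaging.
c = [0.0, 1.0, 2.0, 3.0]       # each agent's private minimizer; mean is 1.5
n = len(c)
alpha = 0.01                    # small constant stepsize
x = [0.0] * n

for _ in range(5000):
    # local (distributed) gradient step on each agent's own f_i
    x = [xi - alpha * (xi - ci) for xi, ci in zip(x, c)]
    # reconcile via a doubly stochastic ring-averaging matrix
    x = [0.5 * x[i] + 0.25 * x[(i - 1) % n] + 0.25 * x[(i + 1) % n]
         for i in range(n)]

print([round(v, 3) for v in x], sum(x) / n)
```

Because the averaging matrix is doubly stochastic, the network-wide average tracks the centralized iteration exactly, and the residual disagreement between agents is O(α).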
Distributed optimization

minimize Σ_{i=1}^n fi(x)
• agent i: update xi in the direction of improvement of fi; reconcile updates through a consensus or averaging algorithm

minimize f(x1, . . . , xn)
• agent i: update xi, based on (possibly outdated) values of xj
Asynchronous computation model
(JNT, Bertsekas, Athans, 1986)

xi(t + 1) = Σ_j aij(t) xj(t − dij(t))

• aij(t): nonzero whenever i receives a message from j
• dij(t): delay of that message
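A minimal sketch of the delayed update (my own two-agent toy instance, not from the talk): each agent averages its current value with the other agent's value from one step earlier, i.e. aii = aij = 1/2 and dij(t) = 1. Despite the stale information, the disagreement still contracts and both agents reach consensus:

```python
# Toy sketch of x_i(t+1) = sum_j a_ij(t) x_j(t - d_ij(t)) with a fixed
# one-step delay; weights and delays are illustrative choices.
hist = [[0.0, 0.0], [10.0, 10.0]]     # hist[i] = [x_i(t-1), x_i(t)]

for _ in range(300):
    x0 = 0.5 * hist[0][1] + 0.5 * hist[1][0]   # uses x_1 delayed one step
    x1 = 0.5 * hist[1][1] + 0.5 * hist[0][0]   # uses x_0 delayed one step
    hist[0] = [hist[0][1], x0]
    hist[1] = [hist[1][1], x1]

print(hist[0][1], hist[1][1])
```

The disagreement obeys d(t+1) = d(t)/2 − d(t−1)/2, whose characteristic roots have modulus √(1/2) < 1, so the delay slows convergence but does not break it.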
The DeGroot opinion pooling model (1974)

xi(t + 1) = Σ_j aij xj(t),   aij ≥ 0, Σ_j aij = 1
x(t + 1) = Ax(t),   A: stochastic matrix

• Markov chain theory + “mixing conditions”
  ⇒ convergence of A^t to a matrix with equal rows
  ⇒ convergence of each xi to Σ_j πj xj(0)
  ⇒ convergence rate estimates
• x(t + 1) = A(t)x(t): mixing conditions for nonstationary Markov chains (Chatterjee and Seneta, 1977)
• Θ(n²) for line graphs
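A minimal sketch of DeGroot pooling on a toy instance of my own choosing. The matrix A below is row-stochastic but not doubly stochastic, so opinions merge to the π-weighted combination Σ_j πj xj(0), which differs from the plain average:

```python
# Sketch of x(t+1) = A x(t) with a (row-)stochastic, non-doubly-stochastic A.
A = [[0.5, 0.25, 0.25],
     [0.5, 0.25, 0.25],
     [0.0, 0.5, 0.5]]
A = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]                # stationary distribution pi = (1/4, 1/2, 1/4)
x = [0.0, 1.0, 5.0]                   # plain average = 2.0; pi-weighted = 1.75

for _ in range(200):
    x = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

print(x)   # all opinions agree on the pi-weighted value, not the mean
```

The consensus value is 0.25·0 + 0.5·1 + 0.25·5 = 1.75: the central agent's opinion carries twice the weight of the peripheral ones, as π dictates.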
Averaging algorithms

• A doubly stochastic: 1ᵀAx = 1ᵀx, where 1ᵀ = [1 1 · · · 1]
• x1 + · · · + xn is conserved
• convergence to x* = (x1(0) + · · · + xn(0))/n
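The sum-conservation property is easy to verify directly. A small sketch (my own symmetric doubly stochastic choice of A): since the column sums of A are 1, each step preserves x1 + · · · + xn, so the common limit must be the exact average:

```python
# Sketch: with doubly stochastic A, the sum is conserved at every step,
# so consensus lands on the exact average of the initial values.
A = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25],
     [0.25, 0.25, 0.5]]
x = [0.0, 3.0, 9.0]                   # sum 12, average 4

for _ in range(100):
    x = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

print(x, sum(x))   # all entries near 4, sum still 12
```

Contrast with the DeGroot example above: a merely row-stochastic A also yields consensus, but on a π-weighted combination rather than the average.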
Convergence time (time to get close to “steady-state”)

• Equal weight to all neighbors
  – Directed graphs: exponential in n
  – Undirected graphs: O(n³), tight (Landau and Odlyzko, 1981)
• Better results for special graphs (Erdős–Rényi, geometric, small world)
A critique
• Social Networks
– “The process that it describes is intuitively appealing.”(DeGroot, 74)
– How plausible is this type of synchronism?
• Engineering
– If the graph is indeed fixed:elect a leader, form a spanning tree, accumulate on tree
– Want simplicity, and robustness w.r.t. changing topologies, failures, etc.
– Different models and specs?
45
Time-Varying/Chaotic Environments

Asynchronous computation model (JNT, Bertsekas, Athans, 1986):

xi(t + 1) = Σ_j aij(t) xj(t − dij(t))

• aij(t): nonzero whenever i receives a message from j
• dij(t): delay of that message
• zero-delay case: x(t + 1) = A(t)x(t) (inhomogeneous chain)
Variants

xi(t + 1) = Σ_j aij xj(t),   aij ≥ 0, Σ_j aij = 1
x(t + 1) = Ax(t),   A: stochastic matrix

• Fixed matrix A, subject to a given graph/zero pattern: optimize A via SDP (Boyd & Xiao, 2003)
• i.i.d. random graphs: same (in expectation) as fixed graphs; convergence rate ↔ “mixing times” (Boyd et al., 2005)
• Fairly arbitrary sequence of graphs/matrices A(t): worst-case analysis
• “equal-neighbor model”: xi := average of received messages and own value
• bidirectional model: ∀ t: i → j iff i ← j
• doubly stochastic A(t): sum & average preserving
Consensus convergence

xi(t + 1) = Σ_j aij(t) xj(t)

• aii(t) > 0;  aij(t) > 0 ⇒ aij(t) ≥ α > 0
• “strong connectivity in bounded time”: over B time steps the “communication graph” is strongly connected
• Convergence to consensus: ∀ i : xi(t) → x* = convex combination of initial values (JNT, Bertsekas, Athans, 86; Jadbabaie et al., 03)
• “convergence time”: exponential in n and B
  – even with: symmetric graph at each time, equal weight to each neighbor (Cao, Spielman, Morse, 05)
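A tiny sketch of consensus under a time-varying graph (my own three-agent toy instance): no single step uses a connected graph, but the union of the graphs over every two steps is connected, which is enough for consensus — to some convex combination of the initial values, not necessarily the average:

```python
# Sketch: the network alternates between "only agents 0,1 talk" and
# "only agents 1,2 talk"; connectivity in bounded time still gives consensus.
x = [0.0, 0.0, 1.0]

for t in range(200):
    if t % 2 == 0:
        x[0] = x[1] = (x[0] + x[1]) / 2    # graph 1: edge {0, 1} only
    else:
        x[1] = x[2] = (x[1] + x[2]) / 2    # graph 2: edge {1, 2} only

print(x, max(x) - min(x))
```

Each update is a convex combination, so all values stay inside [min xi(0), max xi(0)], exactly as the convergence result promises.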
Averaging in Time-Varying Setting

• x(t + 1) = A(t)x(t)
  – A(t) doubly stochastic, for all t
  – nonzero aij(t) ≥ α > 0
  – convergence time: O(n²/α)
• Improved convergence rate
  – exchange “load” with up to two neighbors at a time
  – can use α = O(1)
  – convergence time: O(n²)
• Is there an Ω(n²) lower bound to be discovered?
Averaging algorithms – analysis

• Suppose initially Σ_i xi(0) = 0 and Σ_i xi² = 1
  – some node has |xi| ≥ 1/√n, and some node has the opposite sign
  – total gap at least 1/√n ⇒ some gap at least 1/n^1.5
• Over B steps, communicate across the gap: V improves by ≥ α/n³
  – Convergence time: O(n³/α);  α ≈ 1/degree ⇒ O(n⁴)
  – Account for simultaneous contributions of all gaps: O(n²/α)
  – Keep degree bounded ⇒ 1/α bounded
• Averaging in time-varying bidirectional graphs: no harder than consensus on fixed graphs
• Various convergence proofs of optimization algorithms remain valid
  – Improves the convergence time estimate for subgradient methods [Nedic, Olshevsky, Ozdaglar, JNT, 09]
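The quantity V driving this analysis is the quadratic potential V = Σ_i xi². For a pairwise averaging step the improvement can be checked exactly (a small sketch on values of my own choosing): averaging xi and xj reduces V by precisely (xi − xj)²/2, which is why a gap of size 1/n^1.5 yields the per-round progress quoted above.

```python
# Sketch of the potential argument: one pairwise averaging step decreases
# V = sum_i x_i^2 by exactly (x_i - x_j)^2 / 2.
x = [3.0, -1.0, 2.0, -4.0]            # sum is 0, matching the slide's setup

def pair_average(x, i, j):
    x[i] = x[j] = (x[i] + x[j]) / 2

V_before = sum(v * v for v in x)
gap_sq = (x[0] - x[3]) ** 2           # the pair with the largest gap
pair_average(x, 0, 3)
V_after = sum(v * v for v in x)

print(V_before - V_after, gap_sq / 2)  # the two quantities coincide
```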
Closing Thoughts

• Engineering
  – Bayesian: near-optimal designs possible
  – Naive: interesting, manageable design questions
• Social Networks
  – What are the plausible models?

Thank you!