Source: people.csail.mit.edu/rivest/pubs/AAEGR17.pdf

Time-Space Trade-offs in Population Protocols

Dan Alistarh, ETH Zurich
[email protected]

James Aspnes, Yale
[email protected]

David Eisenstat, Google
[email protected]

Rati Gelashvili, MIT
[email protected]

Ronald L. Rivest, MIT
[email protected]

Abstract

Population protocols are a popular model of distributed computing, in which randomly-interacting agents with little computational power cooperate to jointly perform computational tasks. Inspired by developments in molecular computation, and in particular DNA computing, recent algorithmic work has focused on the complexity of solving simple yet fundamental tasks in the population model, such as leader election (which requires convergence to a single agent in a special “leader” state), and majority (in which agents must converge to a decision as to which of two possible initial states had the higher initial count). Known results point towards an inherent trade-off between the time complexity of such algorithms and the space complexity, i.e. the size of the memory available to each agent.

In this paper, we explore this trade-off and provide new upper and lower bounds for majority and leader election. First, we prove a unified lower bound, which relates the space available per node with the time complexity achievable by a protocol: for instance, our result implies that any protocol solving either of these tasks for n agents using O(log log n) states must take Ω(n/polylog n) expected time. This is the first result to characterize time complexity for protocols which employ a super-constant number of states per node, and proves that fast, poly-logarithmic running times require protocols to have relatively large space costs.

On the positive side, we give algorithms showing that fast, poly-logarithmic convergence time can be achieved using O(log^2 n) space per node, in the case of both tasks. Overall, our results highlight a time complexity separation between O(log log n) and Θ(log^2 n) state space size for both majority and leader election in population protocols, and introduce new techniques, which should be applicable more broadly.

1 Introduction

Population protocols [AAD+06] are a model of distributed computing in which agents with little computational power and no control over the interaction schedule cooperate to collectively perform computational tasks. While initially introduced to model animal populations [AAD+06], they have proved a useful abstraction for wireless sensor networks [PVV09, DV12], chemical reaction networks [CCDS14], and gene regulatory networks [BB04].

A parallel line of applied research has shown that population protocols can be implemented at the level of DNA molecules [CDS+13], and that they are equivalent to computational tasks solved by living cells in order to function correctly [CCN12].

Specifically, a population protocol consists of a set of n finite-state agents, interacting in pairs, where each interaction may update the local state of both participants. The protocol starts in a valid initial configuration, and defines the outcomes of pairwise interactions. The goal is to have all agents converge to some state, representing the output of the computation, which satisfies some predicate over the initial state of the system. For example, one fundamental task is majority (consensus) [AAE08b, PVV09, DV12], in which agents start in one of two input states A and B, and must converge on a decision as to which state has the higher initial count.¹ A complementary fundamental task is leader election [AAE08a, AG15, DS15], which requires the system to converge to final states in which a single agent is in a special leader state.

The set of interactions occurring in each step is usually assumed to be decided by a probabilistic scheduler, which picks the next pair to interact uniformly at random. One complexity measure is parallel time, defined as the number of pairwise interactions until convergence, divided by n, the number of agents. Another is space complexity, defined as the number of distinct states that an agent can represent internally.
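To make the scheduler and the parallel-time measure concrete, here is a minimal simulation sketch (our illustration, not code from the paper), using the naive rule, revisited later in the paper, in which every agent starts as a potential leader and one of two meeting leaders drops out:

```python
import random

def simulate_fratricide(n, seed=0):
    """Leader election by fratricide: every agent starts as a leader ('L');
    when two leaders meet, one demotes itself to follower ('F').
    Returns (number of leaders at convergence, parallel time)."""
    rng = random.Random(seed)
    states = ['L'] * n
    leaders, interactions = n, 0
    while leaders > 1:
        i, j = rng.sample(range(n), 2)   # the scheduler: a uniform random pair
        interactions += 1
        if states[i] == 'L' and states[j] == 'L':
            states[j] = 'F'              # responder drops out of contention
            leaders -= 1
    return leaders, interactions / n     # parallel time = interactions / n

leaders, t = simulate_fratricide(100)
```

This rule already exhibits the slowness discussed below: the last two remaining leaders need about n^2/2 scheduler steps, i.e. Θ(n) parallel time, just to meet.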

There has been considerable interest in the complexity of fundamental tasks such as leader election and consensus in the population model. In particular, a progression of deep technical results [Dot14, CCDS14] has culminated in showing that leader election in sublinear time is impossible for protocols which are restricted to a constant number of

¹ In this paper, we will focus on the exact majority task, in which the protocol must return the correct decision in all executions, as opposed to approximate majority, where a wrong decision might be returned with low probability [AAE08b].

2560 Copyright © by SIAM

Unauthorized reproduction of this article is prohibited


states per node [DS15]. At the same time, it is now known that leader election can be solved in O(log^3 n) time via a protocol requiring O(log^3 n) states per node [AG15]. For the majority task, the space-time complexity gap is much wider: sublinear time is impossible for protocols restricted to having at most four states per node [AGV15], but there exists a poly-logarithmic time protocol which requires a number of states per node that is linear in n [AGV15].

These results hint towards a trade-off between the running time of a population protocol and the space, or number of states, available at each agent. This relation is all the more important since time efficiency is critical in practical implementations, while technical constraints limit the number of states currently implementable in a molecule [CDS+13]. (One such technical constraint is the possibility of leaks, i.e. spurious creation of states following an interaction [TWS15]. In DNA implementations, the more states a protocol implements, the higher the likelihood of a leak, and the higher the probability of divergence.) However, the characteristics of the time-space trade-off in population protocols are currently an open question.

Contribution: In this paper, we take a step towards answering this question. First, we exhibit a general trade-off between the number of states available to a population protocol and its time complexity, which characterizes which deterministic predicates can be computed efficiently with limited space. Further, we give new and improved algorithms for majority and leader election, tight within poly-logarithmic factors. Our results, and their relation to previous work, are summarized in Figure 1.

Lower Bounds: When applied to majority, our lower bound proves that there exist constants c ∈ (0, 1) and K ≥ 1 such that any protocol using λn ≤ c log log n states must take Ω(n/(K^{λn} + εn)^2) time, where εn is the difference between the initial counts of the two competing states. For example, any protocol using constant states and supporting a constant initial difference necessarily takes linear time.

For leader election, we show that there exist constants c ∈ (0, 1) and K ≥ 1 such that any protocol using λn ≤ c log log n states and electing at most ℓ(n) leaders requires Ω(n/(K^{λn} · ℓ(n)^2)) expected time. Specifically, any protocol electing one leader using ≤ c log log n states requires Ω(n/polylog n) time.

Algorithms: On the positive side, we give new poly-logarithmic-time algorithms for majority and leader election which use O(log^2 n) space. Our majority algorithm, called Split-Join, runs in O(log^3 n) time both in expectation and with high probability, and uses O(log^2 n) states per node. The only previously known algorithms to achieve sublinear time required Θ(n) states per node [AGV15], or exponentially many states per node [BFK+16]. Further, we give a new leader election algorithm called Lottery-Election, which uses O(log^2 n) states, and converges in O(log^{5.3} n log log n)

parallel time in expectation and O(log^{6.3} n) parallel time with high probability. This reduces the state space size by a logarithmic factor over the best known algorithm [AG15], at the cost of a poly-logarithmic running time increase with respect to the O(log^3 n) bound of [AG15].

A key improvement with respect to previous work is that these time-space bounds hold independently of the initial configuration: for instance, the AVC algorithm [AGV15] could converge in poly-logarithmic time and using poly-logarithmic space under a restricted set of initial configurations, e.g. assuming that the discrepancy ε between the two initial states is large. Our Split-Join algorithm can achieve this for worst-case initial configurations, i.e. ε = 1/n.

Techniques: The core of the lower bound is a two-step argument. First, we prove that a hypothetical algorithm which would converge faster than allowed by the lower bound may reach “stable” configurations² in which certain low-count states can be “erased,” i.e., may disappear completely following a sequence of reactions. In the second step, we engineer examples where these low-count states are exactly the set of all possible leaders (in the case of leader election protocols), or a set of nodes whose state may sway the outcome of the majority computation (in the case of consensus). This implies a contradiction with the correctness of the computation. Technically, our argument employs the method of bounded differences to obtain a stronger version of the main density theorem of [Dot14], and develops a new technical characterization of the stable states which can be reached by a protocol, which does not require constant bounds on state space size, generalizing upon [DS15]. The argument provides a unified analysis: the bounds for each task in turn are corollaries of the main theorem characterizing the existence of certain stable configurations.

On the algorithmic side, we introduce a new synthetic coin technique, which allows nodes to generate almost-uniform local coins within a constant number of interactions, by exploiting the randomness in the scheduler, and in particular the properties of random walks on the hypercube. Synthetic coins are useful for instance by allowing nodes to estimate the total number of agents in the system, and may be of independent interest as a way of generating randomness in a constrained setting.
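The following toy experiment (our illustration; the paper's construction differs in its details) conveys the flavor of synthetic coins: each agent carries one extra bit that it flips at every interaction, so after a brief warm-up the bit an agent reads in its partner behaves like an almost-uniform coin, even though the agents themselves are deterministic.

```python
import random

def synthetic_coin_demo(n=1000, warmup=20_000, samples=20_000, seed=1):
    """Each agent keeps one extra bit and flips it at every interaction.
    Because the scheduler pairs agents at random, the vector of all bits
    performs a random walk on the hypercube, and the bit an initiator
    observes in its responder is close to a fair coin after warm-up."""
    rng = random.Random(seed)
    bits = [0] * n                       # deterministic start: all zeros

    def step():
        i, j = rng.sample(range(n), 2)
        observed = bits[j]               # initiator reads responder's bit
        bits[i] ^= 1                     # both participants flip their bit
        bits[j] ^= 1
        return observed

    for _ in range(warmup):
        step()
    heads = sum(step() for _ in range(samples))
    return heads / samples

freq = synthetic_coin_demo()
```

The returned frequency of observed 1-bits is empirically close to 1/2, the "almost-uniform" behavior the technique relies on.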

The Split-Join majority protocol starts from the following idea: nodes can encode their output opinions and their relative strength as integer values: the value is positive if the node supports a majority of A, and negative if the node supports a majority of B. The higher the absolute value, the higher the “confidence” in the corresponding outcome. Whenever two nodes meet, they average their values, round-

² Roughly, a configuration is stable if it may not generate any new types of states.



Problem                          | Type        | Expected Time Bound      | Number of States | Reference
Exact Majority (ε = 1/n)         | Algorithm   | O(n log n)               | 4                | [DV12, MNRS14]
Exact Majority (ε = 1/n)         | Algorithm   | O(log^2 n)               | Θ(n)             | [AGV15]
Exact Majority (ε = 1/n)         | Lower Bound | Ω(n)                     | ≤ 4              | [AGV15]
Exact Majority (ε = 1/n)         | Lower Bound | Ω(log n)                 | any              | [AGV15]
Leader Election                  | Algorithm   | O(log^3 n)               | O(log^3 n)       | [AG15]
Leader Election                  | Lower Bound | Ω(n)                     | O(1)             | [DS15]
Exact Majority / Leader Election | Lower Bound | Ω(n/polylog n)           | < 1/2 log log n  | This paper
Exact Majority                   | Algorithm   | O(log^3 n)               | O(log^2 n)       | This paper
Leader Election                  | Algorithm   | O(log^{5.3} n log log n) | O(log^2 n)       | This paper

Figure 1: Summary of results and relation to previous work.

ing to integers. This template has been used successfully in [AGV15] to achieve consensus in poly-logarithmic time, but doing so requires a state space of size at least linear in n, as all values between 1 and n may need to be represented. Here we introduce a new quantized averaging technique, by which nodes represent their output estimates by encoding them as powers of two. Again, opinions are averaged on each interaction. We prove that our quantization preserves correctness, and allows for fast (poly-logarithmic) convergence, while reducing the size of the state space almost exponentially.
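For intuition, the non-quantized averaging template described above can be sketched as follows (our simulation of the [AGV15]-style dynamics, not the paper's Split-Join protocol). Note the value range [−n, n]: this is exactly the Θ(n)-state cost that the quantization to powers of two removes.

```python
import random

def averaging_majority(a, b, rounds=300, seed=4):
    """Averaging template: supporters of A start at +n, supporters of B
    at -n. A meeting replaces values u, v by ceil((u+v)/2), floor((u+v)/2),
    so the total sum n*(a - b) is invariant and all values contract toward
    the mean a - b, whose sign is the correct majority decision."""
    n = a + b
    rng = random.Random(seed)
    vals = [n] * a + [-n] * b
    for _ in range(rounds * n):                 # `rounds` units of parallel time
        i, j = rng.sample(range(n), 2)
        s = vals[i] + vals[j]
        vals[i], vals[j] = -(-s // 2), s // 2   # ceil half and floor half
    return vals

vals = averaging_majority(60, 40)               # majority A, n = 100
```

After enough mixing, every value carries the majority sign, while the invariant sum certifies that no opinion mass was created or destroyed.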

The Lottery-Election protocol follows the basic and common convention that every agent starts as a potential leader, and whenever two leaders interact, one drops out of contention. However, once only a constant number of potential leaders remain, they take a long time to interact, implying super-linear convergence time. To overcome this problem, [AG15] introduced a propagation mechanism, by which contenders compete by comparing their seeding, and the nodes who drop out of contention assume the identity of their victor, causing nodes still in contention but with lower seeding to drop out. Here we employ synthetic coins to “seed” potential leaders randomly, which lets us reduce the number of leaders at an accelerated rate. This in turn reduces the maximum seeding that needs to be encoded, and hence the number of states required by the algorithm.

Implications: Our lower bound can be seen as bad news for algorithm designers, since it shows that convergence is slow even if the protocol implements a super-constant number of states per node. On the positive side, the achievable convergence time improves quickly as the size of the state space nears the logarithmic threshold: in particular, fast, poly-logarithmic time can be achieved using poly-logarithmic space.
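Returning to the seeding-and-propagation idea described above, here is a toy rendition (our illustration with hypothetical fixed-width random seeds, not the Lottery-Election protocol itself, which generates seeds via synthetic coins):

```python
import random

def lottery_election(n, seed=3):
    """Seeding with victor propagation: every agent starts as a contender
    holding a random lottery seed; whoever carries the larger seed wins an
    interaction, and the loser adopts the winner's seed, so it keeps
    knocking out lower-seeded contenders on the winner's behalf. A tie
    between two contenders demotes the responder.
    Returns (contenders left, parallel time)."""
    rng = random.Random(seed)
    agents = [[True, rng.getrandbits(8)] for _ in range(n)]  # [contender?, seed]
    steps = 0
    while sum(a[0] for a in agents) > 1:
        i, j = rng.sample(range(n), 2)
        steps += 1
        a, b = agents[i], agents[j]
        if a[1] > b[1] or (a[1] == b[1] and a[0] and b[0]):
            b[0], b[1] = False, a[1]     # responder loses, carries winner's seed
        elif b[1] > a[1]:
            a[0], a[1] = False, b[1]
    return sum(a[0] for a in agents), steps / n

contenders, t = lottery_election(100)
```

Because demoted agents keep fighting on their victor's behalf, low-seeded contenders are eliminated without ever meeting a high-seeded contender directly, which is what avoids the slow direct meetings of pure fratricide.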

It is interesting to note that previous work by Chatzigiannakis et al. [CMN+11] identified Θ(log log n) as a space complexity threshold in terms of the computational power of population protocols, i.e. the set of predicates that such algorithms can compute. In particular, their results show that variants of such systems in which nodes are limited to o(log log n) space per node are limited to only computing semilinear predicates, whereas O(log n) space is sufficient to compute general symmetric predicates. By contrast, we show a complexity separation between algorithms which use O(log log n) space per node, and algorithms employing Ω(log n) space per node.

2 Preliminaries

Population Protocols: A population protocol is a set of n ≥ 2 agents, or nodes, each executing as a deterministic state machine with states from a finite set Λn, whose size might depend on n. The agents do not have identifiers, so two agents in the same state are identical and interchangeable. Thus, we can represent any set of agents simply by the counts of agents in every state, which we call a configuration. Formally, a configuration c is a function c : Λn → N, where c(s) represents the number of agents in state s in configuration c. We let |c| stand for the sum, over all states s ∈ Λn, of c(s), which is the same as the total number of agents in configuration c. For instance, if c is a configuration of all agents in the system, then c describes the global state of the system, and |c| = n.

Further, we define the set In of all allowed initial configurations of the protocol for n agents, a finite set of output symbols O, a transition function δn : Λn × Λn → Λn × Λn, and an output function γn : Λn → O. The system starts in one of the initial configurations in ∈ In (clearly, |in| = n), and each agent keeps updating its local state following interactions with other agents, according to the transition function δn. The execution proceeds in steps, where in each step a new pair of agents is selected uniformly at random from the set of all pairs. Each of the two chosen agents updates its state according to the function δn.
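In code, one scheduler step on the count representation might look like this (a sketch under our own naming, not code from the paper): a configuration is a dict from states to counts, and an interaction draws an ordered pair of distinct agents uniformly at random.

```python
import random

def step(config, delta, rng):
    """One scheduler step on a configuration given as a dict state -> count:
    draw an ordered pair of two distinct agents uniformly, apply delta."""
    states = list(config)
    initiator = rng.choices(states, [config[s] for s in states])[0]
    config[initiator] -= 1               # remove initiator before drawing
    responder = rng.choices(states, [config[s] for s in states])[0]
    config[responder] -= 1
    p, q = delta(initiator, responder)
    config[p] = config.get(p, 0) + 1     # put both agents back, updated
    config[q] = config.get(q, 0) + 1

# demo: leader election by fratricide, run on the count representation
fratricide = lambda p, q: ('L', 'F') if (p, q) == ('L', 'L') else (p, q)
rng = random.Random(0)
config = {'L': 5}
while config['L'] > 1:
    step(config, fratricide, rng)
```

Working with counts rather than an agent array reflects the fact that agents are anonymous: two agents in the same state are interchangeable.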

A population protocol P will be a sequence of protocols P2, P3, . . . , where for each n ≥ 2 we have Pn = (Λn, In, δn, γn), defining the initial configurations, protocol states, transitions and output mapping, respectively, for n agents. We say that a protocol is monotonic if, for all i ≥ 1, (1) the number of states cannot decrease for higher node



count i, i.e., |Λi| ≤ |Λi+1|, and (2) if the state counts are the same, then the protocol is the same, i.e. if |Λi| = |Λj|, then Λi = Λj, Ii = Ij, δi = δj and γi = γj.

We say that a monotonic population protocol P is input-additive if, for all in ∈ In with |Λn| = |Λ2n| (which by monotonicity implies Pn = P2n), it holds that in + in ∈ I2n, i.e. doubling the number of agents in each state still leads to a valid initial configuration in a system of 2n agents.

A configuration c′ is reachable from a configuration c, denoted c =⇒ c′, if there exists a sequence of consecutive steps (interactions from δn between pairs of agents) leading from c to c′. If the transition sequence is p, we will also write c =⇒p c′. We call a configuration c the sum of configurations c1 and c2, and write c = c1 + c2, iff c(s) = c1(s) + c2(s) for all states s ∈ Λn.

Leader Election: Fix an n ≥ 2. In the leader election problem, all agents start in the same initial state An, i.e. In = {in} with in(An) = n. The output set is O = {Win, Lose}.

We say that a configuration c has a single leader if there exists some state s ∈ Λn with γn(s) = Win and c(s) = 1, such that for any other state s′ ≠ s, c(s′) > 0 implies γn(s′) = Lose. A population protocol Pn solves leader election within r steps with probability 1 − φ if, with probability 1 − φ, any configuration c reachable from in by the protocol within ≥ r steps has a single leader.

The Majority Problem: Fix an n ≥ 2. In the majority problem, we have two initial states An, Bn ∈ Λn, and In consists of all configurations where each of the n agents is either in An or Bn. The output set is O = {WinA, WinB}. Given an initial configuration in ∈ In, let a = in(An) and b = in(Bn) be the counts of the two initial states, and let εn = |a − b| denote the initial relative advantage of the majority state.

We say that a configuration c correctly outputs the majority decision for in when, for any state s ∈ Λn, if a > b then c(s) > 0 implies γn(s) = WinA, and if b > a then c(s) > 0 implies γn(s) = WinB. (The output in case of an initial tie is arbitrary.) A population protocol Pn solves the majority problem from in within ℓ steps with probability 1 − φ if, for any configuration c reachable from in by the protocol within ≥ ℓ steps, with probability 1 − φ, the protocol correctly outputs the majority for in. In this paper we consider the exact majority task, as opposed to approximate majority [AAE08b], which allows the wrong output with some probability.

Finally, a population protocol P solves a task if, for every n ≥ 2, the protocol instantiation Pn solves the task. The complexity measures for a protocol (defined below) are functions of n.

Stabilization: A configuration c of n agents has a stable leader if, for all c′ reachable from c, it holds that c′ has a single leader. Analogously, a configuration c has a stable correct majority decision for in if, for all c′ with c =⇒ c′, c′ correctly outputs the majority decision for in. We say that a protocol stabilizes when it reaches such a stable output configuration.

Complexity measures: The above setup considers sequential interactions; however, interactions between pairs of distinct agents are independent, and are usually considered as occurring in parallel. It is customary to define one unit of parallel time as n consecutive steps of the protocol. We are interested in the expected parallel time for a population protocol to stabilize, i.e. the total number of (sequential) interactions from the initial configuration required to reach a configuration with a stable leader or with a stable correct majority decision, divided by n. We also refer to this quantity as the convergence time. If the expected time is finite, then we say that the population protocol stably elects a leader (or stably computes the majority decision). The space complexity of a protocol is the number of states in the protocol with respect to n, i.e. |Λn|.

3 Lower Bound

Our definition of population protocols allows the protocol to arbitrarily change its structure as the number of states changes. This generality strengthens the lower bound, and to our knowledge, the definition covers all known population protocols.

Technical machinery: We now lay the groundwork for our main argument by proving a set of technical lemmas. In the following, we assume n to be fixed; Λn is the set of all states of the protocol. Let S be an arbitrary set of states. Abusing notation, we denote by δn(S, S) the set of all states which can be generated by transitions between states in S. Then, we let S0 be the set of states in the initial configurations of the protocol, and for all integers k ≥ 1, we define inductively the set of states Sk = Sk−1 ∪ δn(Sk−1, Sk−1). Simply put, Sk is the set of all states which can be generated by the protocol after k steps.

Assume without loss of generality that all states in Λn actually occur in some configurations reachable by the protocol Pn from some initial configuration. Then, it holds that S|Λn|−1 = S|Λn| = . . . , and S|Λn|−1 = Λn. We say that a configuration c is X-rich with respect to a set of states S if in the configuration c all states in S have count ≥ X. A configuration is dense with respect to a set of states S if it is n/M-rich with respect to S, for some constant M > 1. We call an initial configuration fully dense if it is dense with respect to the set of initial states S0: practically, such configurations contain at least n/M agents in each of the initial states. We now prove the following statement, which says that all protocols with a limited state count will end up in a configuration in which all reachable states are present in large count. This lemma generalizes the main result of [Dot14] to a super-constant state space, and the
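The inductive construction of the sets Sk can be sketched directly (our illustration; the toy transition function below is hypothetical):

```python
def state_closure(initial_states, delta):
    """Iterate S_k = S_{k-1} ∪ delta(S_{k-1}, S_{k-1}) until the chain
    stabilizes; as the text notes, this happens after at most |Λ| - 1
    rounds, at which point the closure is the whole reachable state set."""
    s = set(initial_states)
    while True:
        generated = {t for p in s for q in s for t in delta(p, q)}
        if generated <= s:
            return s
        s |= generated

# hypothetical toy transitions: two leaders demote one to a follower 'F';
# every other interaction is a no-op
toy_delta = lambda p, q: ('L', 'F') if (p, q) == ('L', 'L') else (p, q)
closure = state_closure({'L'}, toy_delta)
```

Here S0 = {'L'} and one round of transitions already generates the full state set {'L', 'F'}, after which the chain is stationary.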



proof is provided in the appendix.

LEMMA 3.1. (DENSITY LEMMA) For all population protocols A using |Λn| ≤ 1/2 log log n states, starting in a fully dense initial configuration, with probability ≥ 1 − (1/n)^{0.99}, there exists a step j such that the configuration reached after j steps is n^{0.99}-rich with respect to Λn.

Fix a function f : N → R+. Fix a configuration c, and states r1 and r2 in c. A transition α : (r1, r2) → (z1, z2) is an f-bottleneck for c if c(r1) · c(r2) ≤ f(|c|). This bottleneck transition implies that the probability of a transition (r1, r2) → (z1, z2) is bounded. Hence, proving that transition sequences from initial configurations to final configurations contain a bottleneck implies a lower bound on the convergence time. Conversely, the next lemma shows that, if a protocol converges fast, then it must be possible to converge using a transition sequence which does not contain any bottleneck. The proof can be found in the full version of the paper.³
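The step from bottlenecks to time lower bounds can be spelled out in one line (a sketch from the definitions above, for distinct states r1 ≠ r2; this is not a lemma of the paper): the scheduler picks among n(n−1)/2 unordered pairs uniformly, so in any configuration c the probability that the next interaction realizes the bottleneck transition is

```latex
\Pr\big[\text{next pair is } (r_1, r_2)\big]
  \;=\; \frac{c(r_1)\, c(r_2)}{\binom{n}{2}}
  \;\le\; \frac{2 f(n)}{n(n-1)},
```

so crossing an f-bottleneck takes Ω(n^2/f(n)) interactions in expectation, i.e. Ω(n/f(n)) parallel time.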

LEMMA 3.2. Let P be a population protocol with |Λn| ≤ 1/2 log log n states, and let Dn ⊆ In be a non-empty set of fully dense initial configurations. Fix a function f. Assume that for sufficiently large n, P stabilizes in expected time o(n / (f(n) · |Λn|^2)) from all in ∈ Dn. Then, for all sufficiently large m ∈ N there is a configuration xm with m agents, reachable from some i ∈ Dm, and a transition sequence pm, such that:

1. xm(s) ≥ m^{0.99} for all s ∈ Λm,

2. xm =⇒pm ym, where ym is a stable output configuration, and

3. pm has no f-bottleneck.

Hence, fast convergence from a sufficiently rich configuration requires the existence of a bottleneck-free transition sequence. The next transition ordering lemma, due to [CCDS14], proves a property of such a transition sequence: there exists an order over all states whose counts drop below some set threshold such that, for each of these states dj, the sequence contains at least a certain number of a specific transition that consumes dj, but does not consume or produce any states d1, . . . , dj−1 that are earlier in the ordering. The proof is identical to [CCDS14, DS15].

LEMMA 3.3. Fix b ∈ N, and let B = |Λn|^2 · b + |Λn| · b. Consider configurations x, y : Λn → N in a system of n agents, such that for all states s ∈ Λn we have x(s) ≥ B^2, and x =⇒q y via a transition sequence q without a B^2-bottleneck. Define

∆ = {d ∈ Λn | y(d) ≤ b}

to be the set of states whose count in configuration y is at most b. Then there is an order ∆ = {d1, d2, . . . , dk}, such that, for all j ∈ {1, . . . , k}, there is a transition αj of the form (dj, sj) → (oj, o′j) with sj, oj, o′j ∉ {d1, . . . , dj}, and αj occurs at least b times in q.

³ arxiv.org/abs/1602.08032

The Lower Bound Argument: Given a population protocol Pn, a configuration c : Λn → N and a function g : N → N+, we define the sets Γg(c) = {s ∈ Λn | c(s) > g(|c|)} and ∆g(c) = {s ∈ Λn | c(s) ≤ g(|c|)}. Intuitively, Γg(c) contains states above a certain count, while ∆g(c) contains states below that count. Notice that Γg(c) = Λn − ∆g(c).

The proof strategy is to first show that if a protocol converges “too fast,” then it can also reach configurations where all agents are in states in Γg(c). Recall that a configuration c is defined as a function Λn → N. Let S ⊆ Λn be some subset of states such that all agents in configuration c are in states from S; formally, {s ∈ Λn | c(s) > 0} ⊆ S. For notational convenience, we will write c>0 ⊆ S to mean the same.

THEOREM 3.1. Let P be a monotonic input-additive population protocol using |Λn| ≤ 1/2 log log n states. Let g : N → N+ be a function such that g(n) ≥ 2|Λn| for all n and 6^{|Λn|} · |Λn|^2 · g(n) = o(n^{0.99}).

Suppose P stabilizes in o(n / (6^{|Λn|} · |Λn|^3 · g(n))^2) time from any initial configuration, and that it has fully dense initial configurations for all sufficiently large n. Then, for infinitely many m with |Λm| = . . . = |Λ3m|, we can find an initial configuration of 2m agents i ∈ I2m and a stable output configuration y of m agents, such that for any configuration u that satisfies the requirement below, i + u =⇒ z holds, where z>0 ⊆ Γg(y).

The requirement on the configuration u is that it contains between 0 and m agents, u>0 ⊆ ∆g(y) (all agents in u are in states from ∆g(y)), and that y(s) + u(s) ≤ g(m) for all states s ∈ ∆g(y).

Proof. For simplicity, set b(n) = (6^|Λn| + 2^|Λn|) · g(n), b2(n) = |Λn|² · b(n) + |Λn| · b(n), and f(n) = (b2(n))². The theorem statement implies that the protocol stabilizes in o(n / (f(n) · |Λn|²)) time. By Lemma 3.2, for all sufficiently large m we can find configurations of m agents im, xm, y : Λm → N, such that:
• im ∈ Im is a fully dense initial configuration of m agents.
• im =⇒ xm =⇒pm y, where y is a stable final configuration, as desired, and the transition sequence pm does not contain an f-bottleneck (i.e. a (b2)²-bottleneck).
• ∀s ∈ Λm : xm(s) ≥ b2(m). (Here, we use the assumption on the function g.)
Moreover, because |Λn| ≤ 1/2 log log n for sufficiently large n, for infinitely many m it additionally holds that


2564 Copyright © by SIAM

Unauthorized reproduction of this article is prohibited


|Λm| = |Λm+1| = . . . = |Λ3m| (otherwise |Λn| would grow at least logarithmically in n). Due to monotonicity, this implies that the population protocols Pm, Pm+1, . . . , P3m are all the same.

Consider any such m. Then, we can invoke Lemma 3.3 with xm, y, transition sequence pm and parameter b = b(m). The definition of ∆ in the lemma statement matches ∆b(y), and the constant B matches b2(m). Thus, we get an ordering of states ∆b(y) = {d1, d2, . . . , dk} and a corresponding sequence of transitions α1, α2, . . . , αk, where each αj is of the form (dj, sj) → (oj, o′j) with sj, oj, o′j ∉ {d1, d2, . . . , dj}. Finally, each transition αj occurs at least b(m) = (6^|Λm| + 2^|Λm|) · g(m) times in pm.

We will now perform a set of transformations on the transition sequence pm, called surgeries, with the goal of converging to a desired type of configuration. The next two claims specify these transformations, which are similar to the surgeries used in [DS15], but with some key differences due to configuration u and the new general definitions of Γ and ∆. The proofs are provided in the appendix. Configuration u is defined as in the theorem statement. For brevity, we use Γg = Γg(y), ∆g = ∆g(y), Γb = Γb(y) and ∆b = ∆b(y).

LEMMA 3.4. There exist configurations e : Λm → N and z′ with z′>0 ⊆ Γg, such that e + u + xm =⇒ z′. Moreover, we have an upper bound on the counts of states in e: ∀s ∈ Λm : e(s) ≤ 2^|Λm| · g(m).

The configuration e + u + xm has at most 2^|Λm| · g(m) · |Λm| + g(m) + m agents, which is less than 3m for sufficiently large m. The state transitions used here and everywhere below are from δm = . . . = δ3m.

For any configuration e : Λm → N, let e∆ be its projection onto ∆, i.e. a configuration consisting of only the agents from e in states in ∆. We define eΓ analogously. By definition, eΓ = e − e∆.

LEMMA 3.5. Let e be any configuration satisfying ∀s ∈ Λm : e(s) ≤ 2^|Λm| · g(m). There exist configurations p and w, such that p>0 ⊆ ∆b, w>0 ⊆ Γg and p + xm =⇒ p + w + e∆g. Moreover, for the counts in p, we have ∀s ∈ Λm : p(s) ≤ b(m), and for the counts in wΓg, we have ∀s ∈ Γg : w(s) ≥ 2^|Λm| · g(m).

Let our initial configuration i be im + im, which, because im is fully dense and because of input-additivity, must also be a fully dense initial configuration from I2m. Trivially, i =⇒ xm + xm. Let us apply Lemma 3.5 with e as defined in Lemma 3.4, but use one xm instead of p. This is possible because ∀s ∈ Λm : xm(s) ≥ b2(m) ≥ b(m) ≥ p(s). Hence, we get xm + xm =⇒ xm + w + e∆g = xm + e + (w − eΓg). The configuration w − eΓg is well-defined because both w and eΓg contain agents only in states in Γg, with each count in w being greater than or equal to the respective count in eΓg, by the bounds from the claims.

Finally, by Lemma 3.4, we have u + xm + e + (w − eΓg) =⇒ z′ + (w − eΓg). We denote the resulting configuration (with all agents in states in Γg) by z, and have i + u =⇒ z, as desired.

The following lemma is a technical tool that critically relies on our definitions of Γ and ∆.

LEMMA 3.6. Consider a population protocol in a system with any fixed number of agents n, and an arbitrary fixed function h : N → N⁺ such that h(n) ≥ 2^|Λn|. Let ξ(n) = 2^|Λn|. For all configurations c, c′ : Λn → N, such that c>0 ⊆ Γh(c) ⊆ Γξ(c′), any state producible from c is also producible from c′. Formally, for any state s ∈ Λn, c =⇒ y with y(s) > 0 implies c′ =⇒ y′ with y′(s) > 0.

Proof. Since h(n) ≥ 2^|Λn|, for any state in Γh(c), its count in c is at least 2^|Λn|. As Γh(c) ⊆ Γξ(c′), the count of each of these states in c′ is also at least ξ(n) = 2^|Λn|. We say two agents have the same type if they are in the same state in c. We will prove by induction that any state that can be produced by some transition sequence from c can also be produced by a transition sequence in which at most 2^|Λn| agents of the same type participate (ever interact). Configuration c only has agents with types (states) in Γh(c), and configuration c′ also has at least 2^|Λn| agents of each of those types, so the same transition sequence can be performed from c′ to produce the same state as from c, proving the desired statement.

The inductive statement is the following. There is a k ≤ |Λn|, such that for each j = 0, 1, . . . , k we can find sets S0 ⊂ S1 ⊂ . . . ⊂ Sk where Sk contains all the states that are producible from c, and all sets Sj satisfy the following property. Let Aj be a set consisting of 2^j agents of each type in Γh(c), out of all the agents in configuration c (we could also use c′), for a total of 2^j · |Γh(c)| agents. There are enough agents of these types in c (and in c′) since j ≤ k ≤ |Λn|. Then, for each 0 ≤ j ≤ k and each state s ∈ Sj, there exists a transition sequence from c in which only the agents in Aj ever interact and, in the resulting configuration, one of these agents from Aj ends up in state s.

We do induction on j; for the base case j = 0 we take S0 = Γh(c). The set A0 as defined contains one (2⁰) agent of each type in Γh(c) = S0.⁴ All states in S0 are immediately producible by agents in A0 via an empty transition sequence (without any interactions).

Let us now assume the inductive hypothesis for some j ≥ 0. If Sj contains all the producible states from configuration c, then k = j and we are done. We will have k ≤ |Λn|, because S0 ≠ ∅ and S0 ⊂ S1 ⊂ . . . ⊂ Sk imply that Sk contains at

⁴In c, all the agents are in one of the states of Γh(c), so as long as n > 0 there must be at least one agent per state (type). So, if Γh(c) = ∅, then n must necessarily be 0; nothing is producible, A0 = ∅, k = 0, and we are done.


least k different states, and there are |Λn| total. Otherwise, there must be some state s ∉ Sj that can be produced after an interaction between two agents both in states in Sj, say by a transition α : (r1, r2) → (s, p) with r1, r2 ∈ Sj (or there is no state left that cannot already be produced). Also, as Sj contains at least j states out of |Λn| total, and there is the state s ∉ Sj, j < |Λn| holds and the set Aj+1 is well-defined. Let us partition Aj+1 into two disjoint sets B1 and B2, each containing 2^j agents from c of each type. Then, by the induction hypothesis, there exists a transition sequence where only the agents in B1 ever interact and, in the end, one of the agents b1 ∈ B1 ends up in the state r1. Analogously, there is a transition sequence for agents in B2, after which an agent b2 ∈ B2 ends up in state r2. Combining these two transition sequences and adding one instance of transition α at the end between agents b1 and b2 (in states r1 and r2 respectively) leads to a configuration where one of the agents from Aj+1 is in state s. Also, all the transitions are between agents in Aj+1. Hence, setting Sj+1 = Sj ∪ {s} completes the inductive step.

Now we can prove the lower bounds on majority and leader election.

COROLLARY 3.1. Any monotonic population protocol with |Λn| ≤ 1/2 log log n states for all sufficiently large numbers of agents n that stably elects at least one and at most ℓ(n) leaders must take Ω(n / (144^|Λn| · |Λn|⁶ · ℓ(n)²)) expected parallel time to convergence.

Proof. We set g(n) = 2^|Λn| · ℓ(n). In the initial configurations for leader election, all agents are in the same starting state. Hence, all initial configurations are fully dense, and all monotonic population protocols for leader election have to be input-additive.

Assume for contradiction that the protocol converges in parallel time o(n / (144^|Λn| · |Λn|⁶ · ℓ(n)²)). For all n, In contains only the initial fully dense configuration with n agents in the same initial state. Using Theorem 3.1 and setting u to be a configuration of zero agents, we can find infinitely many configurations i and z of 2m agents, such that (1) i =⇒ z, (2) i ∈ I2m, (3) |Λm| = |Λ2m| and, by monotonicity, the same protocol is used for all numbers of agents between m and 2m, (4) z>0 ⊆ Γg(y), i.e. all agents in z are in states that each have counts of at least 2^|Λm| · ℓ(m) in some stable output configuration y (of |y| = m agents).

Because y is a stable output configuration of a protocol that elects at most ℓ(m) leaders, none of the states in Γg(y), which are present in strictly larger counts (2^|Λm| · ℓ(m) > ℓ(m)) in y and z, can be leader states (i.e. γm = γ2m maps these states to output Lose). Therefore, the configuration z does not contain a leader. This is not sufficient for a contradiction, because a leader election protocol may well pass through a leaderless configuration before converging to a stable configuration with at most ℓ(m) leaders. We prove below that any configuration reachable from z must also have zero leaders. This implies an infinite time of convergence from a valid initial configuration i (as i =⇒ z) and completes the proof by contradiction.

If we could reach a configuration from z with an agent in a leader state, then by Lemma 3.6, from a configuration c′ that consists of 2^|Λm| agents in each of the states in Γg(y), it is also possible to reach a configuration with a leader, say through a transition sequence q. Recall that the configuration y contains at least 2^|Λm| · ℓ(m) agents in each of the states in Γg(y); hence there exist disjoint configurations c′1 ⊆ y, c′2 ⊆ y, . . . , c′ℓ(m) ⊆ y contained in y and corresponding transition sequences q1, q2, . . . , qℓ(m), such that qj only affects agents in c′j and leads one of the agents in c′j to become a leader. Configuration y is an output configuration, so it contains at least one leader agent already, which does not belong to any c′j because, as mentioned above, all agents in the c′j are in non-leader states. Therefore, it is possible to reach a configuration from y with ℓ(m) + 1 leaders via a transition sequence q1 on the c′1 component of y, followed by q2 on c′2, and so on up to qℓ(m) on c′ℓ(m), contradicting that y is a stable output configuration.

The proof of the majority lower bound follows similarly.

COROLLARY 3.2. Any monotonic population protocol with |Λn| ≤ 1/2 log log n states for all sufficiently large numbers of agents n that stably computes the correct majority decision for initial configurations with majority advantage εn must take Ω(n / (36^|Λn| · |Λn|⁶ · max(2^|Λn|, εn)²)) expected parallel time to convergence.

Proof. We set g(n) = max(2^(|Λn|+1), 4εn). For majority computation, initial configurations consist of agents in one of two states, with the majority state holding an εn advantage in the counts. Therefore, the sum of two initial configurations of the same protocol is also a valid initial configuration, and thus monotonic population protocols for majority computation must be input-additive. The bound is nontrivial only in the regime εn ∈ o(√n), which we will henceforth assume without loss of generality. The initial configurations we consider from In will all have advantage εn, and are all fully dense.

Let us prove that for all sufficiently large m, in any final stable configuration y, strictly fewer than 2^|Λm| ≤ g(m)/2 agents will be in the initial minority state Bm. The reason is that if c is the initial configuration with all m agents in state Bm, the protocol must converge from c to a final configuration where the states correspond to decision WinB. By Lemma 3.6, from any configuration that contains at least 2^|Λm| agents in Bm it would also be possible to reach a configuration where some agent supports decision WinB.


Therefore, every stable final configuration y has at most g(m)/2 − 1 agents in the initial minority state Bm. This allows us to let u be a configuration of g(m)/2 + 1 ≥ 2εm + 1 agents in state Bm.

Assume, to the contrary, that the protocol converges in parallel time o(n / (36^|Λn| · |Λn|⁶ · max(2^|Λn|, εn)²)). We only consider initial configurations that are fully dense and contain (1+ε)n/2 agents in state An and (1−ε)n/2 agents in state Bn, i.e. having majority state An with advantage εn. Let I′n ⊆ In contain only this fully dense initial configuration for each n. Using Theorem 3.1 with I′n instead of In, we can find infinitely many configurations i and z of at most 3m agents, such that (1) i + u =⇒ z, (2) i ∈ I′2m, i.e. it is an initial configuration of 2m agents with majority state A2m = Am and advantage 2εm, (3) |Λm| = |Λ2m| = |Λ2m+|u|| = |Λ3m| and, by monotonicity, the same protocol is used for all numbers of agents between m and 3m, (4) z>0 ⊆ Γg(y), i.e. all agents in z are in states that have counts at least g(m) in some stable output configuration y of m agents.

To get the desired contradiction we will prove two things: first, that z is actually a stable output configuration for decision WinA (the majority opinion in i), and second, that i + u is a valid initial configuration for the majority problem, but with majority state Bm = B2m+|u|. This will imply that the protocol converges to the wrong outcome, and complete the proof by contradiction.

If we could reach a configuration from z with any agent in a state s that maps to output WinB (γm(s) = WinB), then by Lemma 3.6, from configuration y (which contains at least 2^|Λm| agents in each of the states in Γg(y)) we could also reach a configuration with an agent in a state s that maps to output WinB. However, configuration y is a final stable configuration for an initial configuration in Im with majority Am, so this is impossible, and z is a stable output configuration for WinA.

Configuration i ∈ I2m contains 2εm more agents in state Am than in state Bm. Configuration u consists of at least 2εm + 1 agents, all in state Bm. Hence, i + u, which is a legal initial configuration from I2m+|u|, has a majority of agents in state Bm.

4 Synthetic Coin Flips

The state transition rules in population protocols are deterministic, i.e. the interacting nodes do not have access to random coin flips. In this section, we introduce a general technique that extracts randomness from the schedule and, after only constant parallel time, allows the interactions to rely on close-to-uniform synthetic coin flips. This turns out to be a useful gadget for designing efficient protocols.

Suppose that every node in the system has a boolean parameter coin, initialized with zero. This extra parameter can be maintained independently of the rest of the protocol, and only doubles the state space. When agents x and y interact, they both flip the values of their coins. Formally, x′.coin ← 1 − x.coin and y′.coin ← 1 − y.coin, and the update rule is fully symmetric.

The nodes can use the coin value of the interaction partner as a random bit in a randomized algorithm. Clearly, these bits are not independent or uniform. However, we prove that, with high probability, the distribution of coin quickly becomes close to uniform and remains that way. We use the concentration properties of random walks on the hypercube, analyzed previously in various other contexts, e.g. [AR15]. We also note that a similar algorithm is used by Laurenti et al. [LCK16] to generate randomness in chemical reaction networks, although they do not prove convergence bounds.
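The flip dynamics are easy to simulate. The sketch below (the helper name is our own, and it assumes the model's uniform random scheduler) tracks the number of 1-coins, which Theorem 4.1 shows concentrates around n/2:

```python
import random

def simulate_synthetic_coins(n, interactions, seed=0):
    """Every agent starts with coin = 0; on each interaction the scheduler
    picks two distinct agents uniformly at random and both flip their bit.
    Returns the number of agents whose coin is 1 at the end."""
    rng = random.Random(seed)
    coins = [0] * n
    for _ in range(interactions):
        i, j = rng.sample(range(n), 2)
        coins[i] ^= 1  # x'.coin <- 1 - x.coin
        coins[j] ^= 1  # y'.coin <- 1 - y.coin
    return sum(coins)

# After k >= 2n interactions (constant parallel time), the count of ones
# is close to n/2, so a partner's coin behaves like a near-uniform bit.
ones = simulate_synthetic_coins(n=1000, interactions=4000)
```

Note that since both participants flip, the number of ones changes by −2, 0, or +2 per interaction, so its parity is invariant; this is one reason the bits are only close to, rather than exactly, uniform.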

THEOREM 4.1. Suppose k ≥ αn for a fixed constant α ≥ 2. Let Xi be the number of ones in the system after i interactions. For all sufficiently large n, we have that Pr[|Xk − n/2| ≥ n/2^(4α)] ≤ 2 exp(−αn/8).

Proof. We label the nodes from 1 to n, and represent their coin values by a binary vector of size n. Suppose we knew a fixed vector v representing the coin values of the nodes after interaction k − αn. For example, if k = αn, we know v is the zero vector, because of the way the algorithm is initialized. For 1 ≤ t ≤ αn, denote by Yt the pair of nodes that are flipped during interaction k − αn + t. Then, Xk is a deterministic function of Y1, . . . , Yαn (whereby the function encodes the fixed starting vector v). Moreover, the Yt are independent random variables, and changing any one Yt only changes Xk by at most 4. Hence, we can apply McDiarmid's inequality with Xk = fv(Y1, . . . , Yαn).

LEMMA 4.1. (MCDIARMID'S INEQUALITY) Let Y1, . . . , Ym be independent random variables and let X be a function of them, X = f(Y1, . . . , Ym), such that changing the variable Yj alone changes the function value by at most cj. Then, we have that Pr[|X − E[X]| ≥ ε] ≤ 2 · exp(−2ε² / Σj c²j).

Setting ε = α√n we get Pr[|Xk − E[Xk]| ≥ α√n] ≤ 2 · exp(−αn/8), given that the coin values after interaction k − αn are fixed and represented by vector v. Fixing v also fixes the number of ones among the coin values in the system at that moment, which we will denote by x, i.e. x = v1 + . . . + vn = Xk−αn. We then notice that the following lemma holds, whose proof is provided in the full version.

LEMMA 4.2. E[Xi+m | Xi = x] = n/2 + (1 − 4/n)^m · (x − n/2).
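The one-step recurrence behind Lemma 4.2 can be checked directly (this derivation is ours, not reproduced from the full version): if the current configuration has x ones, the interacting pair is uniform among the binom(n,2) pairs, and the count of ones changes by +2, −2, or 0 according to whether both coins are 0, both are 1, or they are mixed. Hence

```latex
\mathbb{E}[X_{i+1} \mid X_i = x]
  = x + 2\cdot\frac{\binom{n-x}{2}}{\binom{n}{2}}
      - 2\cdot\frac{\binom{x}{2}}{\binom{n}{2}}
  = x + \frac{2(n-2x)}{n}
  = \frac{n}{2} + \Bigl(1 - \frac{4}{n}\Bigr)\Bigl(x - \frac{n}{2}\Bigr),
```

so the distance of the expectation from n/2 contracts by a factor (1 − 4/n) per interaction; iterating m times gives the lemma.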

By Lemma 4.2 we have E[Xk | Xk−αn = x] = n/2 + (1 − 4/n)^(αn) · (x − n/2), which, as 0 ≤ x ≤ n and (1 − 4/n)^(αn) ≤ exp(−4α), implies n/2 − n/2^(4α+1) ≤ E[Xk | Xk−αn = x] ≤ n/2 + n/2^(4α+1). For each fixed v, we can apply McDiarmid's inequality as above, and get


an upper bound on the probability that Xk (given the fixed v) diverges from its expectation by more than α√n. But as we just established, for any v, the expectation in the bound is at most n/2^(4α+1) away from n/2. Combining these and using n/2^(4α+1) ≥ α√n for all sufficiently large n gives the desired bound.

Approximate Counting: Synthetic coins can be used to estimate the number of agents in the system, as follows. Each node executes the coin-flipping protocol, and counts the number of consecutive 1 flips it observes until the first 0, recording this count as its estimate. The agents then exchange their estimates, always adopting the maximum. It is easy to prove that the nodes will eventually converge to a number which is a constant-factor approximation of log n, with high probability. This property is made precise in the proof of Lemma C.2.
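A compressed sketch of this estimator, idealizing the synthetic flips as fair independent coins and collapsing the max-propagation phase into a direct maximum (the function name is our own):

```python
import random

def approx_log_n(n, seed=0):
    """Each agent counts consecutive 1 flips before its first 0 (a
    geometric sample); propagating the maximum via pairwise interactions
    eventually gives every agent the same Theta(log n) estimate w.h.p."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n):
        run = 0
        while rng.random() < 0.5:  # count consecutive 1s until the first 0
            run += 1
        estimates.append(run)
    return max(estimates)  # stand-in for max-propagation via interactions
```

The maximum of n geometric(1/2) samples is log₂ n + O(1) in expectation, hence a constant-factor approximation of log n.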

5 The Lottery Leader Election Algorithm

Overview: We now present a leader election population protocol using O(log² n) states. The protocol is split conceptually into two stages: in the first, lottery stage, each agent generates an individual payoff by flipping synthetic coins. In the second, competition stage, each agent starts as a potential leader, and proceeds to compare a value associated with its local state against that of each interaction partner. Of any two interacting agents, the one with the "larger" value wins and remains in contention, while the other drops out and becomes a minion. Importantly, a minion agent can no longer be a leader, but carries the value of its victor into future interactions. Thus, if an agent still in contention meets a minion with a higher value, it will drop out and adopt the higher value. Similarly, a minion will always adopt the highest leader value it encounters. This minion-based propagation mechanism has been used previously, with the value corresponding to interaction counts [AG15]; here, we employ a more complex value function, which requires fewer states to store.

We now present the algorithm in detail. Initially, all nodes start in the same state, which is determined by seven parameters: coin, mode, payoff, level, counter, phase and ones. The coin parameter stores the synthetic coin value; it is binary, initially 0. The agent can be in one of four modes: seeding (preparing the random coin), lottery (generating the payoff value), tournament (competing), or minion (out of contention).

We fix a parameter m such that m ≥ (10 log n)². The protocol will use O(m) states per node.

Seeding Mode: All agents start in seeding mode, with payoff and level values 0, and counter value 4. The goal of the four-interaction seeding mode is for the synthetic coin implemented by the coin parameter to mix well, generating values which are close to uniformly random. In the first four interactions, each agent simply decreases its counter value. Once this counter reaches 0, the agent moves to lottery mode. By the properties of synthetic coins, when agents finish seeding, they hold 0 and 1 values in roughly equal proportion.

Lottery Mode: In lottery mode, an agent counts in its own payoff the number of consecutive interactions until it observes 0 as the coin value of an interaction partner, incrementing payoff upon observing 1 coins. When the agent first meets a 0, or if it reaches the maximum possible value that payoff can hold, set to √m, the agent finalizes its payoff and changes its mode to tournament.

Tournament Mode: The goal of tournament mode is twofold: to force agents to compete (by comparing states), and to generate additional tie-breaking random values (via the level variable).

Agents start with level = 0 and repeatedly attempt to increase their level. Each agent x keeps track of phases, consisting of Θ(log payoff) consecutive interactions. In each phase, if all coin values of interaction partners are 1, then x.level is incremented; otherwise, it stays the same. An agent which reaches the maximum possible level, set at √m / log m, remains in tournament mode, but stops increasing its level.

Phases can be implemented by the phase parameter used as a counter, and a boolean ones parameter. phase starts at 0, and is incremented on every interaction until it is reset to 0 when the phase ends. ones is set to true at the beginning of each phase and becomes false if coin value 0 is encountered. As mentioned above, a phase consists of Θ(log payoff) interactions; the exact function will be provided later.

When two agents x and y in tournament mode meet, they compare (x.payoff, x.level, x.coin) and (y.payoff, y.level, y.coin). If the former is larger, then agent x eliminates agent y from the tournament, and vice versa. Practically, agent y sets its mode to minion, and adopts the payoff and level values of the other agent. Note that agents with a higher lottery payoff always have priority; if both payoff and level are equal, the coin value is used as a tie-breaker.

Minion Mode: An agent in minion mode keeps a record of the maximum (payoff, level) pair it has ever encountered in any interaction, in its own payoff and level parameters. If x.mode = minion and y.mode = tournament, and (x.payoff, x.level) > (y.payoff, y.level), then the agent in state y will be eliminated from contention, and turned into a minion. Intuitively, minions help leaders with high payoffs and levels to eliminate other contenders by spreading information. Importantly, minions do not use the coin value as a tie-breaker (as this could lead to a leader eliminating itself via a cycle of interactions).
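The competition-stage rules above can be summarized in code. The encoding below (mutable dicts and the helper demote) is our own sketch, not the paper's pseudocode; the comparison logic follows the description: contenders compare (payoff, level, coin) lexicographically, while minions carry and spread the best (payoff, level) without ever using the coin.

```python
def demote(loser, winner):
    """Turn loser into a minion carrying the winner's values."""
    loser['mode'] = 'minion'
    loser['payoff'], loser['level'] = winner['payoff'], winner['level']

def interact(a, b):
    """One competition-stage interaction between agents a and b."""
    if a['mode'] == 'tournament' and b['mode'] == 'tournament':
        # Contenders use the coin as the final tie-breaker.
        ka = (a['payoff'], a['level'], a['coin'])
        kb = (b['payoff'], b['level'], b['coin'])
        if ka > kb:
            demote(b, a)
        elif kb > ka:
            demote(a, b)
    elif a['mode'] == 'minion' and b['mode'] == 'tournament':
        # A minion eliminates a strictly smaller contender (no coin!),
        # and otherwise adopts the contender's larger value.
        if (a['payoff'], a['level']) > (b['payoff'], b['level']):
            demote(b, a)
        else:
            a['payoff'], a['level'] = b['payoff'], b['level']
    elif b['mode'] == 'minion' and a['mode'] == 'tournament':
        if (b['payoff'], b['level']) > (a['payoff'], a['level']):
            demote(a, b)
        else:
            b['payoff'], b['level'] = a['payoff'], a['level']
    else:
        # Two minions keep the maximum value either has seen.
        top = max((a['payoff'], a['level']), (b['payoff'], b['level']))
        a['payoff'], a['level'] = top
        b['payoff'], b['level'] = top
```

Excluding the coin from minion comparisons matters: with it, a lone remaining contender could be demoted by a minion carrying that contender's own values.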


Analysis Overview: One agent with the highest lottery payoff eventually becomes the leader. This is an agent that manages to reach a high level and turns other competitors into its minions, which further propagate the information about the highest payoff and level through the system.

Only nodes with mode = minion are non-leaders, and once a node becomes a minion it remains a minion. Therefore, we first prove in Lemma C.1 that not all nodes can become minions, and that if there are n − 1 minions in the system, then there is a stable leader. The proofs are provided in the appendix.

The number of states of an agent can be determined by multiplying the numbers of different values the state parameters can take, giving O(1) · √m · (√m / log m) · O(log m) = O(m), as desired.

Next, we prove in Lemma C.2 that with probability at least 1 − O(1)/n³, after O(n log n) interactions, all agents will be in the competition stage, that is, either in tournament or minion mode, with maximum payoff at least log n/2 and at most 9 log n, and at most 5 log n agents will have this maximum payoff.

We set the phase size of an agent with payoff = p to 4.2(log p + 1). We then show in Lemma C.3 that with probability at least 1 − O(1)/n³, only one contender reaches level ℓ = 3 log n / log log n, and it takes at most O(n log^5.3 n · log log n) interactions for a new level to be reached, up to ℓ.

The above claims imply that with high probability (at least 1 − O(1)/n³), the protocol elects a single leader within O(n log n) + O(n log^5.3 n · log log n) · ℓ = O(n log^6.3 n) interactions, that is, O(log^6.3 n) parallel time. Finally, Lemma C.4 gives an O(log^5.3 n · log log n) bound on the expected parallel time until convergence.

6 The Split-Join Majority Algorithm

Overview: In this section, we present an algorithm for exact majority using O(log² n) states per node. The main idea behind the algorithm is that each node state corresponds to an integer value: the sign of the value encodes the node's opinion about the majority state; by convention, A is positive, and B is negative. To minimize the state space, we devise a special representation for the integer values, in which not all integers are representable. Whenever two nodes meet, they modify their respective values following a sequence of simple operations. The intuition is that, on each interaction, nodes average their corresponding values. Averaging ensures that the sign of the sum over all values in the system never changes, while, initially, this sign corresponds to the sign of the majority value. Thus, the crux of the analysis is to show that all nodes converge to values of the same sign, and do so quickly.

States: We now describe the algorithm in detail. Each node state corresponds to a pair of integers x and y, represented by ⟨x, y⟩, where each of x and y is zero or a power of two, x, y ∈ {0, 1, 2, 2², . . . , 2^⌈log n⌉}, and n is the number of nodes. The value of a state corresponding to a pair ⟨x, y⟩ is value(⟨x, y⟩) = x − y.

Nodes start in one of two special states. By convention, nodes starting in state A have the initial pair ⟨2^⌈log n⌉, 0⟩, and nodes starting in state B have the initial pair ⟨0, 2^⌈log n⌉⟩. (Here, we assume that nodes have an estimate of n.) We distinguish between two types of states: strong states represent non-zero values, and will always satisfy x ≠ y and 2 min(x, y) ≠ max(x, y). The weak states are represented as the pairs ⟨0, 0⟩⁺ and ⟨0, 0⟩⁻, corresponding to value 0, leaning towards A or B, respectively. We will refer to states and their corresponding value pairs interchangeably. The output function γ maps each state to the output based on the sign of its value (treating ⟨0, 0⟩⁺ as positive and ⟨0, 0⟩⁻ as negative).

Interactions: The algorithm, specified in Figure 2, consists of a set of simple deterministic update rules for the node state, applied on every interaction. In the pseudocode, we make the distinction between pairs ⟨x, y⟩, which correspond to states, and pairs [x, y] corresponding to tuples of integer values. The interaction rule between the states ⟨x1, y1⟩ and ⟨x2, y2⟩ of two interacting nodes is described by the function interact. The states after the interaction are ⟨x′1, y′1⟩ and ⟨x′2, y′2⟩. All nodes start in the designated initial states and continue to interact according to the interact rule. If both interacting states are weak, nothing changes (line 24). Otherwise, three elementary reactions, cancel, join, and split, are applied, in this order. Each reaction takes four values x1, y1, x2, y2 and returns (possibly updated) values x′1, y′1, x′2, y′2.

The cancel reaction matches positive and negative powers of 2 from the two interaction partners. The join operation matches values of the same sign, attempting to create higher powers of two. The split reaction does the opposite, breaking powers of two into smaller powers. Please see Figure 3 for an illustration. Before returning, the interact procedure normalizes the two states to satisfy some simple well-formedness conditions. Notice that all these operations preserve the sum of the values corresponding to their inputs.

Correctness and Convergence: The first observation is that the sum of values in the system is constant throughout the execution. By construction, the initial sum has the majority sign; since the sum stays constant, the algorithm may never reach a state in which all nodes have an opinion corresponding to the initial minority. This guarantees correctness. The convergence bound follows by carefully tracking the maximum value in the system, and showing that minority values get cancelled out and switch sign quickly.
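The exact update rules live in Figure 2, which is not reproduced in this excerpt; the sketch below is one plausible rendering of the three reactions, written only to illustrate the invariant that each reaction preserves the combined value (x1 − y1) + (x2 − y2):

```python
def cancel(x1, y1, x2, y2):
    # Match a positive power in one agent with the equal negative
    # power in the other; the matched pair nets to zero.
    if x1 != 0 and x1 == y2:
        x1, y2 = 0, 0
    if y1 != 0 and y1 == x2:
        y1, x2 = 0, 0
    return x1, y1, x2, y2

def join(x1, y1, x2, y2, max_val):
    # Merge equal powers of the same sign into the next power of two,
    # capped at the maximum representable value 2^ceil(log n).
    if x1 != 0 and x1 == x2 and 2 * x1 <= max_val:
        x1, x2 = 2 * x1, 0
    if y1 != 0 and y1 == y2 and 2 * y1 <= max_val:
        y1, y2 = 2 * y1, 0
    return x1, y1, x2, y2

def split(x1, y1, x2, y2):
    # Break a power of two into two halves, so that both agents carry
    # nonzero mass and averaging can continue.
    if x1 > 1 and x2 == 0 and y2 == 0:
        x1, x2 = x1 // 2, x1 // 2
    elif y1 > 1 and y2 == 0 and x2 == 0:
        y1, y2 = y1 // 2, y1 // 2
    return x1, y1, x2, y2
```

Each branch is sum-preserving: cancel removes equal and opposite contributions, join moves x1 + x2 into a single power, and split replaces a power with two halves. Checking this invariant mechanically is a good first test for any implementation of Figure 2.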

THEOREM 6.1. The Split-Join algorithm never converges to the minority decision, and is guaranteed to converge to the majority decision within O(log³ n) parallel time, both in expectation and w.h.p.


7 Conclusion

We have studied the trade-off between time and space complexity in population protocols, and showed that a super-constant state space is necessary to obtain fast, poly-logarithmic convergence time for both leader election and exact majority. On the positive side, we gave algorithms which achieve poly-logarithmic expected convergence time using O(log² n) states per node for both tasks. Our findings are not great news for practitioners, as even small constant state counts are currently difficult to implement [CDS+13]. It is interesting to note how nature appears to have overcome this impossibility [CCN12]: algorithms solving majority at the cell level do so approximately, allowing for a positive probability of error, using a small constant number of states per node and converging in poly-logarithmic time [AAE08b].

We open several avenues for future work. The first is to characterize the time-space trade-off between log log n and log² n states. This question will likely require the development of analytic techniques parametrized by the number of states. A second direction is exploring the space-time trade-offs for approximately correct algorithms.

8 Acknowledgments

Dan Alistarh was supported by a Swiss National Science Foundation Ambizione Fellowship. James Aspnes was supported by the National Science Foundation under grants CCF-1637385 and CCF-1650596. Rati Gelashvili was supported by the National Science Foundation under grants CCF-1217921, CCF-1301926, and IIS-1447786, the Department of Energy under grant ER26116/DE-SC0008923, and Oracle and Intel corporations.

The authors would like to thank David Doty, David Soloveichik, and Milan Vojnovic for insightful discussions and feedback during the development of this work.

References

[AAD+06] Dana Angluin, James Aspnes, Zoë Diamadi, Michael J. Fischer, and René Peralta. Computation in networks of passively mobile finite-state sensors. Distributed Computing, 18(4):235–253, 2006.

[AAE08a] Dana Angluin, James Aspnes, and David Eisenstat. Fast computation by population protocols with a leader. Distributed Computing, 21(3):183–199, September 2008.

[AAE08b] Dana Angluin, James Aspnes, and David Eisenstat. A simple population protocol for fast robust approximate majority. Distributed Computing, 21(2):87–102, July 2008.

[AG15] Dan Alistarh and Rati Gelashvili. Polylogarithmic-time leader election in population protocols. In Automata, Languages, and Programming, pages 479–491. Springer, 2015.

[AGV15] Dan Alistarh, Rati Gelashvili, and Milan Vojnovic. Fast and exact majority in population protocols. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, PODC '15, 2015.

[AR15] Alexandr Andoni and Ilya Razenshteyn. Tight lower bounds for data-dependent locality-sensitive hashing. arXiv preprint arXiv:1507.04299, 2015.

[BB04] James M. Bower and Hamid Bolouri. Computational Modeling of Genetic and Biochemical Networks. MIT Press, 2004.

[BFK+16] Petra Berenbrink, Tom Friedetzky, Peter Kling, Frederik Mallmann-Trenn, and Chris Wastell. Plurality consensus via shuffling: Lessons learned from load balancing. arXiv preprint arXiv:1602.01342, 2016.

[CCDS14] Ho-Lin Chen, Rachel Cummings, David Doty, and David Soloveichik. Speed faults in computation by chemical reaction networks. In Distributed Computing, pages 16–30. Springer, 2014.

[CCN12] Luca Cardelli and Attila Csikász-Nagy. The cell cycle switch computes approximate majority. Nature Scientific Reports, 2:656, 2012.

[CDS+13] Yuan-Jyue Chen, Neil Dalchau, Niranjan Srinivas, Andrew Phillips, Luca Cardelli, David Soloveichik, and Georg Seelig. Programmable chemical controllers made from DNA. Nature Nanotechnology, 8(10):755–762, 2013.

[CMN+11] Ioannis Chatzigiannakis, Othon Michail, Stavros Nikolaou, Andreas Pavlogiannis, and Paul G. Spirakis. Passively mobile communicating machines that use restricted space. In Proceedings of the 7th ACM SIGACT/SIGMOBILE International Workshop on Foundations of Mobile Computing, pages 6–15. ACM, 2011.

[Dot14] David Doty. Timing in chemical reaction networks. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '14, pages 772–784. SIAM, 2014.

[DS15] David Doty and David Soloveichik. Stable leader election in population protocols requires linear time. In Proceedings of the 2015 International Symposium on Distributed Computing, DISC '15, 2015.

[DV12] Moez Draief and Milan Vojnovic. Convergence speed of binary interval consensus. SIAM Journal on Control and Optimization, 50(3):1087–1109, 2012.

[LCK16] Luca Laurenti, Luca Cardelli, and Marta Kwiatkowska. Programming discrete distributions with chemical reaction networks. 2016. http://arxiv.org/abs/1601.02578.

[MNRS14] George B. Mertzios, Sotiris E. Nikoletseas, Christoforos Raptopoulos, and Paul G. Spirakis. Determining majority in networks with local interactions and very small local memory. In Automata, Languages, and Programming - 41st International Colloquium, ICALP '14, pages 871–882, 2014.

[PVV09] Etienne Perron, Dinkar Vasudevan, and Milan Vojnovic. Using three states for binary consensus on complete graphs. In INFOCOM 2009, pages 2527–2535. IEEE, 2009.

[TWS15] Chris Thachuk, Erik Winfree, and David Soloveichik. Leakless DNA strand displacement systems. In International Conference on DNA Computing and Molecular Programming, pages 133–153. Springer, 2015.



State Space:
StrongStates = {⟨x, y⟩ | x, y ∈ {0, 1, 2, 2², . . . , 2^⌈log n⌉}, x ≠ y, 2 · min(x, y) ≠ max(x, y)}
WeakStates = {⟨0, 0⟩⁺, ⟨0, 0⟩⁻}
Input: States of two nodes, ⟨x1, y1⟩ and ⟨x2, y2⟩
Output: Updated states ⟨x′1, y′1⟩ and ⟨x′2, y′2⟩

1  Reduce(u, v) = [0, 0] if u = v; [u − v, 0] if u = 2v; [0, v − u] if 2u = v; [u, v] otherwise.

2  procedure cancel(⟨x1, y1⟩, ⟨x2, y2⟩)
3    [x′1, y′2] ← Reduce(x1, y2)
4    [x′2, y′1] ← Reduce(x2, y1)

5  procedure join(⟨x1, y1⟩, ⟨x2, y2⟩)
6    if (x1 − y1 > 0 and x2 − y2 > 0 and y1 = y2) then y′1 ← y1 + y2 and y′2 ← 0
7    else y′1 ← y1 and y′2 ← y2
8    if (x1 − y1 < 0 and x2 − y2 < 0 and x1 = x2) then x′1 ← x1 + x2 and x′2 ← 0
9    else x′1 ← x1 and x′2 ← x2

10 procedure split(⟨x1, y1⟩, ⟨x2, y2⟩)
11   if (x1 − y1 > 0 or x2 − y2 > 0) and max(x1, x2) > 1 and min(x1, x2) = 0 then
12     x′1 ← max(x1, x2)/2 and x′2 ← max(x1, x2)/2
13   else x′1 ← x1 and x′2 ← x2
14   if (x1 − y1 < 0 or x2 − y2 < 0) and max(y1, y2) > 1 and min(y1, y2) = 0 then
15     y′1 ← max(y1, y2)/2 and y′2 ← max(y1, y2)/2
16   else y′1 ← y1 and y′2 ← y2

17 procedure normalize(x, y, v)
18   [x, y] ← Reduce(x, y)
19   if x = 0 and y = 0 then
20     if v ≥ 0 then ⟨x′, y′⟩ ← ⟨0, 0⟩⁺
21     else ⟨x′, y′⟩ ← ⟨0, 0⟩⁻
22   else ⟨x′, y′⟩ ← ⟨x, y⟩

23 procedure interact(⟨x1, y1⟩, ⟨x2, y2⟩)
24   if x1 = y1 = x2 = y2 = 0 then [⟨x′1, y′1⟩, ⟨x′2, y′2⟩] ← [⟨x1, y1⟩, ⟨x2, y2⟩]
25   else
26     [x1, y1, x2, y2] ← split(join(cancel(x1, y1, x2, y2)))
27     ⟨x′1, y′1⟩ ← normalize(x1, y1, x2 − y2)
28     ⟨x′2, y′2⟩ ← normalize(x2, y2, x1 − y1)

Figure 2: The state update rules for the Split-Join majority algorithm.

A Lower Bound

LEMMA A.1. (DENSITY LEMMA) For all population protocols A using |Λn| ≤ (1/2) log log n states and starting in a fully dense initial configuration, with probability ≥ 1 − (1/n)^0.99, there exists an integer j such that the configuration reached after j steps is n^0.99-rich with respect to Λn.

Proof. Recall that by definition, from a fully dense initial configuration every state in Λn is producible.

We begin by defining, for integers k ≥ 0, the function

f(k) = n · 51^(1 − 2^k).

Equivalently, f(k)² = 51 · n · f(k + 1). Let c = 1/2. Given the above, we notice that with this choice it holds, for sufficiently large n, that
• 3(c log log n)²/n ≤ (1/n)^0.99, and
• for 0 ≤ k ≤ c log log n, we have f(k) ≥ max(n^0.99, 50√(n log n)).
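The recurrence underlying this function can be sanity-checked numerically. Assuming the reconstruction f(k) = n · 51^(1 − 2^k), which matches the relation f(k)² = 51 · n · f(k + 1) used later in the proof, a quick exact-arithmetic check:

```python
# Numeric sanity check (not from the paper) of the density-lemma function
# f(k) = n * 51**(1 - 2**k): squaring f(k) should give 51 * n * f(k + 1).
from fractions import Fraction

def f(n, k):
    # exact rational arithmetic, since the exponent 1 - 2**k is negative
    return Fraction(n) * Fraction(51) ** (1 - 2 ** k)

n = 10 ** 6
assert f(n, 0) == n          # f(0) = n
for k in range(5):
    assert f(n, k) ** 2 == 51 * n * f(n, k + 1)
```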

We divide the execution into phases of index k ≥ 0, each containing n/2 consecutive interactions. For each 0 ≤ k ≤ |Λn| − 1, we denote by Ck the system configuration at the beginning of phase k.

Inductive Claim: We use probabilistic induction to prove the following claim: assuming that configuration Ck is f(k)-rich with respect to the set of states Sk, with probability 1 − 3|Λn|/n, the configuration Ck+1 is f(k + 1)-rich with respect to Sk+1.

For general k ≥ 0, let us fix the interactions up to the beginning of phase k, and assume that configuration Ck is f(k)-rich with respect to the set of states Sk. Further, consider a state q ∈ Sk+1. We will aim to prove that, with probability 1 − O(1/n), the configuration Ck+1 contains state q with count ≥ f(k + 1).

First, we define the following auxiliary notation. For any node r and set of nodes I, we count the number of interactions between r and nodes in the set I, i.e.

intcount(I, r) = |{interaction j in phase k : there exists i ∈ I such that ej = (i, r)}|.

Next, we define the set of nodes in a state s at the beginning of phase k as

W(s) = {v : v ∈ V and Ck(v) = s}.

Finally, we isolate the set of nodes in state s at the beginning of phase k which did not interact during phase k as

W′(s) = {v : v ∈ W(s) and intcount(V, v) = 0}.


Figure 3: Example of the outcome of the interaction function. We apply the cancel, join, and split steps in sequence. In this case, the join step and the normalize step (not shown) are trivial.

Returning to the proof, there are two possibilities for the state q. The first is when q ∈ Sk, that is, the state is already present at the beginning of phase k. But then, by assumption, state q has count ≥ f(k) at the beginning of phase k. To lower bound its count at the end of phase k, it is sufficient to examine the size of the set W′(q). For a node v ∈ W(q), the probability that v ∈ W′(q) is

(1 − 1/n)^(n/2) ≥ 1/2,

by Bernoulli's inequality. Therefore the expected size of W′(q) is at least |W(q)|/2. Changing any interaction during phase k may change |W′(q)| by at most 1, and therefore we can apply the method of bounded differences to obtain that

Pr[|W′(q)| < |W(q)|/2 − √(n log n)] ≤ exp(−(n log n)/n) = 1/n.

Since, by assumption, |W(q)| ≥ f(k) ≥ 10√(n log n), it follows that

Pr[|W′(q)| < (2/5)f(k)] ≤ 1/n.

Since (2/5)f(k) ≥ f(k + 1), we have that Pr[#q(Ck+1) ≥ f(k + 1)] ≥ 1 − 1/n, which concludes the proof of this case.
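The Bernoulli-inequality step used here, (1 − 1/n)^(n/2) ≥ 1 − (n/2)(1/n) = 1/2, is easy to confirm numerically (a quick check, not part of the paper):

```python
# Check (1 - 1/n)**(n/2) >= 1/2 for a range of n; Bernoulli's inequality
# gives (1 - 1/n)**(n/2) >= 1 - (n/2)*(1/n) = 1/2 for all n >= 1.
for n in [2, 4, 10, 100, 10 ** 4]:
    assert (1 - 1 / n) ** (n / 2) >= 0.5
```

For large n the quantity approaches e^(−1/2) ≈ 0.607, comfortably above the 1/2 bound.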

It remains to consider the case when q ∈ Sk+1 − Sk. Here, we know that there must exist states qi and qr in Sk such that δ(qi, qr) = q. We wish to lower bound the number of interactions between nodes in state qi and nodes in state qr throughout phase k. To this end, we isolate the set R of nodes which are in state qr at the beginning of phase k, and only interact once during the phase, i.e.

R = {v : v ∈ W(qr) and intcount(V, v) = 1},

and the set of nodes R′, which are in R, and whose single interaction during phase k was with a node in the set W′(qi), i.e.

R′ = {v : v ∈ R and intcount(W′(qi), v) = 1}.

Notice that any node in the set R′ is necessarily in state q at the end of phase k. In the following, we lower bound the size of this set.

First, a simple probabilistic argument yields that E[|R|] ≥ |W(qr)|/4. Since each interaction in this phase affects the size of R by at most 2 (since it changes the count of both interaction partners), we can again apply the method of bounded differences to obtain that

Pr[|R| < |W(qr)|/4 − 2√(n log n)] ≤ 1/n,

implying that

Pr[|R| < (1/20)f(k)] ≤ 1/n.

To lower bound the size of R′, we apply the method of bounded differences once more. Given that |W′(qi)| ≥ (2/5)f(k) and |R| ≥ (1/20)f(k), we have that

Pr[|R′| ≤ (1/50)(f(k)²/n) − √(n log n)] ≤ 1/n.

At the same time, we have that

(1/50)(f(k)²/n) − √(n log n) ≥ (51/50)f(k + 1) − (1/50)f(k + 1) = f(k + 1),

which concludes the claim in this case as well.

Final Argument: According to the lemma statement, we are considering an initial configuration in which all initial states have count ≥ n/M, for some constant M ≥ 1. Let k0 be the first positive integer such that n/M ≥ f(k0). We have that the initial configuration is f(k0)-rich with respect to the set of initial states S0. By a variant of the previous inductive claim, we obtain that, for any integer 0 ≤ ℓ ≤ |Λn| satisfying f(k0 + ℓ) ≥ max(n^0.99, 10√(n log n)), at the beginning of phase ℓ, configuration Cℓ is f(k0 + ℓ)-rich with respect to Sℓ.

It therefore follows that, with probability at least

(1 − 3|Λn|/n)^|Λn| ≥ 1 − 3(c log log n)²/n ≥ 1 − 1/n^0.99,



there exists an integer j such that the configuration reached after j steps is n^0.99-rich with respect to Λn.

Proof. [Lemma 3.4] The proof is analogous to [DS15], but we consider a subsequence of the ordered transitions ∆b = {d1, . . . , dk} obtained earlier by Lemma 3.3. Since b(m) ≥ g(m), we can represent ∆g = {dj1 , . . . , djl}, with j1 ≤ . . . ≤ jl. We iteratively add groups of transitions at the end of the transition sequence pm (pm is the transition sequence from xm to y), such that, after the first iteration, the resulting configuration does not contain any agent in state dj1. Next, we add a group of transitions such that the resulting configuration contains no agent in dj1 or dj2, and we repeat this l times. In the end, no agents will be in states from ∆g.

The transition ordering lemma provides us with the transitions to add. Initially, there are at most g(m) agents in state dj1 in the system (because of the requirement in Theorem 3.1 on counts in u + y). So, in the first iteration, we add the same amount (at most g(m)) of transitions dj1, sj1 → oj1, o′j1, after which, as sj1, oj1, o′j1 ∉ {d1, . . . , dj1}, the resulting configuration will not contain any agent in state dj1. If there are not enough agents in the system in state sj1 already to add all these transitions, then we add the remaining agents in state sj1 to e. For the first iteration, we may need to add at most g(m) agents.

For the second iteration, we add transitions of type dj2, sj2 → oj2, o′j2 to the resulting transition sequence. The number of agents in dj2 that we may need to consume is at most 3 · g(m): g(m) of them could have been there in y + u, and we may have added 2 · g(m) in the previous iteration, if for instance both oj1 and o′j1 were dj2. In the end, we may need to add 3 · g(m) extra agents to e.

If we repeat these iterations for all remaining r = 3, . . . , l, in the end we will reach a configuration z that contains all agents in states in Γg as desired, because of the property of the transition ordering lemma that sjr, ojr, o′jr ∉ {d1, . . . , djr}. For any r, the maximum total number of agents we may need to add to e at iteration r is (2^r − 1) · g(m). The worst case is when oj1 and o′j1 are both dj2, and oj2, o′j2 are both dj3, etc.

Finally, it must hold that l < |Λm|, because the final configuration contains m agents in states in Γg and none in {dj1 , . . . , djl}, so Γg cannot be empty. Therefore, the total number of agents added to e is g(m) · ∑_{r=1}^{l} (2^r − 1) < 2^{l+1} · g(m) ≤ 2^{|Λm|} · g(m). This completes the proof, because e(s) for any state s can be at most the number of agents in e, which is at most 2^{|Λm|} · g(m).

Proof. [Lemma 3.5] As in the proof of Lemma 3.4, we define a subsequence ∆g = {dj1 , . . . , djl} (with j1 ≤ . . . ≤ jl) of ∆b = {d1, . . . , dk} obtained using Lemma 3.3. We start with the transition sequence pm from configuration xm to y, and perform iterations for r = 1, . . . , k. At each iteration, we modify the transition sequence, possibly add some agents to a configuration p, which we will define shortly, and consider the counts of all agents not in p in the resulting configuration. Configuration p acts as a buffer of agents in certain states that we can temporarily borrow. For example, if we need 5 agents in a certain state with count 0 to complete some iteration r, we will temporarily let the count go to −5 (add 5 agents to p), and then we will fix the count of the state to its target value, which will also return the "borrowed" agents (so p will also appear in the resulting configuration). As in [DS15], this allows us to let the counts of certain states temporarily drop below 0.

We will maintain the following invariants on the counts of agents, excluding the agents in p, in the resulting configuration after iteration r:
1) The counts of all states (not in p) in ∆g ∩ {d1, . . . , dr} match the desired counts in e∆g.
2) The counts of all states in {d1, . . . , dr} − ∆g are at least 2^{|Λm|} · g(m).
3) The count of any state diverged by at most (3^r − 1) · 2^{|Λm|} · g(m) from the respective count in y.

These invariants guarantee that we get all the desired properties after the last iteration. Let us consider the final configuration after iteration k. Due to the first invariant, the set of all agents (not in p) in states ∆g is exactly e∆g. All the remaining agents (also excluding agents in p) are in w, and thus, by definition, the counts of states in ∆g in configuration w will be zero, as desired. The counts of agents in states ∆b − ∆g = {d1, . . . , dk} − ∆g that belong to w will be at least 2^{|Λm|} · g(m), due to the second invariant. The counts of agents in Γb that belong to w will also be at least b(m) − 3^{|Λm|} · 2^{|Λm|} · g(m) ≥ 2^{|Λm|} · g(m), due to the third invariant and the fact that the states in Γb had counts at least b(m) in y. Finally, the third invariant also implies the upper bound on the counts in p. The configuration p will only contain agents in states ∆b, because the agents in Γb have large enough starting counts in y that borrowing is never necessary.

In iteration r, we fix the count of state dr. Let us first consider the case when dr belongs to ∆g. Then the target count is the count of the state dr in e∆g, which we are given is at most 2^{|Λm|} · g(m). Combined with the third invariant, the maximum amount of fixing required is 3^{r−1} · 2^{|Λm|} · g(m). If we have to reduce the count of dr, then we add new transitions dr, sr → or, o′r, as in Lemma 3.4 (as discussed above, not worrying about the count of sr possibly turning negative). However, in the current case, we may instead want to increase the count of dr. In this case, we remove instances of the transition dr, sr → or, o′r from the transition sequence. The transition ordering lemma tells us that there are at least b(m) of these transitions to start with, so by the third invariant, we will always have enough transitions to remove. We matched the count of dr to the count in e∆g, so the first invariant still holds. The second invariant holds as we assumed dr ∈ ∆g and since


by Lemma 3.3, sr, or, o′r ∉ {d1, . . . , dr}. The third invariant also holds, because we performed at most 3^{r−1} · 2^{|Λm|} · g(m) transition additions or removals, each affecting the count of any other given state by at most 2, and hence the total counts differ by at most

(3^{r−1} − 1) · 2^{|Λm|} · g(m) + 2 · 3^{r−1} · 2^{|Λm|} · g(m) = (3^r − 1) · 2^{|Λm|} · g(m).

Now assume that dr belongs to ∆b − ∆g. If the count of dr is already larger than 2^{|Λm|} · g(m), then we do nothing and move to the next iteration, and all the invariants will hold. If the count is smaller than 2^{|Λm|} · g(m), then we set the target count to 2^{|Λm|} · g(m) and add or remove transitions as in the previous case, so the first two invariants will again hold after the iteration. We are fixing the count from less than 2^{|Λm|} · g(m) to exactly 2^{|Λm|} · g(m). Therefore, again considering the maximum effect on counts, the new difference can be at most (3^{r−1} − 1) · 2^{|Λm|} · g(m) + 2 · 2^{|Λm|} · g(m), which is at most (3^r − 1) · 2^{|Λm|} · g(m). As in the previous case, we have enough transitions to remove, and the third invariant also holds.

B Analysis of the Majority Algorithm

The update rules in Figure 2 are chained, i.e. a cancel is followed by a join and a split. This is an optimization, applying as many reactions as possible per interaction. However, for the analysis we consider a slight modification, where we apply split only if both join and cancel were unsuccessful.

For presentation purposes, we assume that n is a power of two, and when necessary, we assume that it is sufficiently large. Throughout this proof, we denote the set of nodes executing the protocol by V. We measure execution time in discrete steps (rounds), where each time step t corresponds to an interaction. The configuration at a given time t is a function c : V → Q, where c(v) is the state of the node v at time t. (We omit the explicit time t when clear from the context.)

Recall that the value of a state ⟨x, y⟩ is defined as x − y, and we will also refer to max(x, y) as the level of this node. We call ⟨x, y⟩ a mixed state if both x and y are non-zero, and a pure state otherwise. A mixed or pure node is a node in a mixed or a pure state, respectively.

The rest of this section is focused on the following.

THEOREM B.1. The Split-Join algorithm will never converge to the minority decision, and is guaranteed to converge to the majority decision within O(log³ n) parallel time, both in expectation and w.h.p.

Correctness: We first prove that nodes never converge to the sign of the initial minority (safety), and that they eventually converge to the sign of the initial majority (termination).

The first statement follows since, given the interaction rules of the algorithm, the sum of the encoded values stays constant as the algorithm progresses. The proof follows from the structure of the algorithm.

LEMMA B.1. The sum ∑_{v∈V} value(c(v)) never changes, for all reachable configurations c of the protocol.

This invariant implies that the algorithm may never converge to a wrong decision value. For instance, if the initial sum is positive, then positive values must always exist in the system. Therefore we only need to show that the algorithm converges to a state where all nodes have the same sign. We do this via a rough convergence bound, assuming an arbitrary starting configuration.

LEMMA B.2. There are at most 2n² split reactions in any execution.

Proof. The level of a node in state s = ⟨x, y⟩ is defined as level(s) = max(x, y). Consider a node in a state with level l. We say that the potential of the node is φ(l) = 2l for l > 0 and φ(0) = 1. In any configuration c, the potential of the system is Φ(c) = ∑_{i=1}^{n} φ(level(si)).

The potential of the system in the initial configuration is n · 2n = 2n², and it can never fall below n · 1 = n. By the interaction rules of the algorithm, the potential of the system never increases after an interaction, and it decreases by at least one after each successful split interaction. This implies the claim.

LEMMA B.3. Let c0 be an arbitrary starting configuration, and define S := ∑_{v∈V} value(c0(v)) ≠ 0. With probability 1, the algorithm will reach a configuration c such that sgn(value(c(v))) = sgn(S) for all nodes v ∈ V. Moreover, in all later configurations ce reachable from c, no node can ever have a different sign, i.e. ∀v ∈ V : sgn(value(ce(v))) = sgn(value(c(v))). For sufficiently large n, the convergence time to c is at most n⁵ expected communication rounds, i.e. parallel time n⁴.

Proof. Assume without loss of generality that the sum S is positive.

We estimate the expected convergence time by splitting the execution into three phases. The first phase starts at the beginning of the execution, and lasts until either (i) no node encodes a strictly negative value, or (ii) each node encodes a value in {−1, 0, 1}, i.e. all nodes are in states ⟨1, 0⟩, ⟨0, 0⟩⁺, ⟨0, 0⟩⁻ or ⟨0, 1⟩.

Due to Lemma B.1, at least one node encodes a strictly positive value. Also, by definition, during the first phase there is always a node encoding a strictly negative value. Moreover, there is a node in state ⟨x, y⟩ with max(x, y) > 1. Assume that x > y for this node. Then, if there is another node in state ⟨0, y2⟩ for any y2, with probability at least 1/n² these two nodes interact in the next round, resulting in a split reaction. Otherwise, every node ⟨x1, y1⟩ that encodes a strictly negative value must have min(x1, y1) > 0. At least one such node exists, and if there is another node in state ⟨x2, 0⟩ for any x2, then again with probability at least 1/n² a split reaction occurs in the next round. The case of x < y



is analogous, and we get that during the first phase, if there is no pair whose interaction would result in a split reaction, all nodes must be in states ⟨x, y⟩ with min(x, y) > 0, i.e. in mixed states. By Lemma B.4, with probability at least 1/(2(log n − 1)) a pure node appears after the next communication round, and by the above argument, if the first phase has not been completed, in the subsequent round a split reaction will occur with probability at least 1/n². Therefore, during the first phase, the expected number of rounds until the next split reaction is at most 4n²(log n − 1). By Lemma B.2, there can be at most 2n² split reactions in any execution, thus the expected number of communication rounds in the first phase is at most 8n⁴(log n − 1).

The second phase starts immediately after the first, and ends when no node encodes a strictly negative value. Note that if this was already true when the first phase ended, then the second phase is trivially empty. Consider the other case, when all nodes encode values −1, 0 and 1 at the beginning of the second phase. Under these circumstances, because of the update rules, no node will ever be in a state ⟨x, y⟩ with max(x, y) > 1 in any future configuration. Also, the number of nodes encoding non-zero values can only decrease. In each round, with probability at least 1/n², two nodes with values 1 and −1 interact, becoming ⟨0, 0⟩⁺ and ⟨0, 0⟩⁻. As this can only happen n/2 times, the expected number of communication rounds in the second phase is at most n³/2.

The third phase lasts until the system converges, that is, until all nodes with value 0 are in state ⟨0, 0⟩⁺. By Lemma B.1, S > 0 holds throughout the execution, so there is at least one node with a positive sign and non-zero value. There are also at most n − 1 conflicting nodes with negative sign, all in state ⟨0, 0⟩⁻. Thus, independently in each round, with probability at least 1/n², a conflicting node meets a node with strictly positive value and becomes ⟨0, 0⟩⁺, decreasing the number of conflicting nodes by one. The number of conflicting nodes can never increase, and when it becomes zero, the system has converged to the desired configuration c. Therefore, the expected number of rounds in the third phase is at most n³.

Combining the results and using linearity of expectation, the total expected number of communication rounds before reaching c is at most n³(8n(log n − 1) + 1/2 + 1) ≤ n⁵ for sufficiently large n. Finite expectation implies that the algorithm converges with probability 1. Finally, when two nodes with positive sign meet, they both remain positive, so any configuration ce reachable from c has the correct signs.

Convergence Time: Next, we bound the time until all nodesconverge to the correct sign.

LEMMA B.4. Consider a configuration where nδ out of the n nodes are in a mixed state, for δ ≥ 2(log n − 1)/n. In the next interaction round, the number of mixed nodes strictly decreases with probability at least δ²/(2(log n − 1)).

Proof. Consider log n − 1 buckets corresponding to values 1, 2, 4, . . . , n/4. Let us assign mixed nodes to these buckets according to their states, where a node in state ⟨x, y⟩ goes into bucket min(x, y). All nodes fall into one of the log n − 1 buckets because of the definition of (mixed) states.

If two nodes in the same bucket interact, either cancel or join will be successful (recall that we consider the algorithm where split is not applied in this case), and the number of mixed nodes will strictly decrease. Thus, if there are d1, d2, . . . , d_{log n − 1} nodes in the buckets, the number of possible interactions that decrease the number of mixed nodes is at least

∑_{i=1}^{log n − 1} di(di − 1)/2 = ((∑ di²) − nδ)/2.

By the Cauchy-Schwarz inequality, ∑ di² ≥ n²δ²/(log n − 1). Combining this with the above and using nδ ≥ 2(log n − 1), we get that there are at least n²δ²/(4(log n − 1)) pairs of nodes whose interactions decrease the number of mixed nodes. The total number of pairs is n(n − 1)/2, proving the desired probability bound.

LEMMA B.5. Suppose f is a function such that f(n) ∈ O(poly(n)). For all sufficiently large n, the probability of having fewer than n/(2^{11} log n) pure nodes in the system at any time during the first f(n) communication rounds is at most 1/n⁵.

Proof. Assume that this number became less than n/(2^{11} log n) for the first time at time T, after some number of communication rounds. Let t be the last time when the number of pure nodes was at least n/(2^9 log n) (such a time exists, since the initial number of pure nodes is n), and let α be the number of communication rounds between t and T. The number of mixed nodes increases by at most two in each round, so α ≥ n/(2^{11} log n).

By definition of t and T, at all times during the α communication rounds between t and T, at least n(2^9 log n − 1)/(2^9 log n) ≥ n/2 nodes are mixed. Thus, by Lemma B.4, in each of these communication rounds the number of mixed nodes decreases by at least one with probability at least 1/(8 log n). Let us describe by a random variable X ∼ Bin(α, 1/(8 log n)) (a lower bound on) how often the number of mixed nodes decreased. Each node is pure or mixed, and by the Chernoff bound, the probability that the number of pure nodes increased fewer than α/(16 log n) times is

Pr[X ≤ α/(16 log n)] = Pr[X ≤ (α/(8 log n))(1 − 1/2)] ≤ exp(−α/(8 log n · 2² · 2)) ≤ exp(−n/(2^{17} log² n)).

On the other hand, in each of these α rounds, the number of pure nodes can decrease only if one of the interacting nodes was in a pure state. By definition of t and T, the number of such pairs is at most n²/(2^{18} log² n) + 2n²(2^9 log n − 1)/(2^{18} log² n) ≤ 2n²/(2^9 log n). This implies that in each round the probability that the number of pure nodes will decrease is at most 1/(2^6 log n). Let us describe the (upper bound on the) number of such


rounds by a random variable Y ∼ Bin(α, 1/(2^6 log n)). Since in each such round the number of pure nodes can decrease by at most 2, using a Chernoff bound, the probability that the number of pure nodes decreases by more than α/(16 log n) during the α communication rounds is at most

Pr[Y ≥ α/(32 log n)] ≤ exp(−n/(2^19 log^2 n)).

In order for the number of pure nodes to have decreased from n/(2^9 log n) at time t to n/(2^11 log n) at time T, either the number of mixed nodes must have decreased at most α/(16 log n) times, or the number of pure nodes must have decreased by at least α/(16 log n) during the α communication rounds between t and T. Otherwise, the increase in the number of pure nodes would exceed the decrease. However, by a union bound, the probability of this is at most exp(−n/(2^17 log^2 n)) + exp(−n/(2^19 log^2 n)).

We can now take a union bound over the number of communication rounds until the number of pure nodes drops below n/(2^11 log n) (time T). For at most f(n) ∈ O(poly(n)) rounds, we get that the probability of the number of pure nodes ever being less than n/(2^11 log n) is at most 1/n^5 for all large enough n.
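Both tail estimates in the proof above are instances of the multiplicative Chernoff bound Pr[X ≤ (1 − δ)E[X]] ≤ exp(−E[X]δ^2/2). As a numeric sanity check, with illustrative parameters (not the proof's), one can compare the exact binomial lower tail against the bound:

```python
import math

def binom_lower_tail(m, p, k):
    """Exact Pr[X <= k] for X ~ Bin(m, p)."""
    return sum(math.comb(m, i) * p**i * (1 - p) ** (m - i) for i in range(k + 1))

# Stand-in for X ~ Bin(alpha, 1/(8 log n)) with delta = 1/2, as in the proof.
m, p = 2000, 1 / 8
mu = m * p                                   # expectation of X
exact = binom_lower_tail(m, p, int(mu / 2))  # Pr[X <= mu/2]
chernoff = math.exp(-mu * (1 / 2) ** 2 / 2)  # exp(-mu * delta^2 / 2)
assert exact <= chernoff                     # the bound holds (and is quite loose)
```

The same inequality, with δ = 1/2 and E[X] = α/(8 log n), gives the exp(−α/(8 log n · 2^2 · 2)) term above.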

Consider the high probability case of the above lemma, where a sufficient number of pure nodes is present in every configuration of the execution prefix. We call a round a negative-round if, in the configuration c at the beginning of the round, there are at least n/(2^11 log n) pure nodes and at least half of the pure nodes encode a non-positive value. Analogously, we call a round a positive-round if there are at least n/(2^11 log n) pure nodes, at least half of which encode a non-negative value. A round can be simultaneously negative and positive, for instance when all pure nodes encode value 0. The next lemma establishes the speed at which the maximum level in the system decreases. The proof follows by bounding the probability that a node with the maximum level meets a pure node with value 0 or a value of the opposite sign. This results in a split (or cancel) reaction decreasing the level of the node, and we use Chernoff and union bounds to bound the probability that the node avoids such a meeting for a significant time.

LEMMA B.6. Let w > 1 be the maximum level among the nodes with a negative (resp., positive) sign. There is a constant β, such that after βn log^2 n positive-rounds (resp., negative-rounds) the maximum level among the nodes with a negative (resp., positive) sign will be at most ⌊w/2⌋ with probability at least 1 − 1/n^5.

Proof. We will prove the lemma for nodes with negative values. (The positive case follows analogously.) Fix a round r, and recall that w > 1 is the maximum level of a node with a negative value at the beginning of the round. Let U be the set of all nodes with negative values and the same level w at the beginning of the round, and let u = |U|. We call these nodes target nodes.

By the structure of the algorithm, the number of target nodes never increases, and it decreases by one in every eliminating round, where a target node meets a pure node with a non-negative value and a split or cancel reaction occurs. Consider a set of αn log n consecutive positive-rounds after r, for some constant α > 2^12. In each round, if there are still at least ⌈u/2⌉ target nodes, then the probability of this round being eliminating is at least ⌈u/2⌉/(2^12 n log n) (since in a positive-round, at least half of the n/(2^11 log n) pure nodes have a non-negative value). Let us describe the process by a random variable Z ∼ Bin(αn log n, ⌈u/2⌉/(2^12 n log n)), where each success event corresponds to an eliminating round. By a Chernoff bound, the probability of having αn log n iterations with at most ⌈u/2⌉ eliminations is at most:

Pr[Z ≤ ⌈u/2⌉] = Pr[Z ≤ (α⌈u/2⌉/2^12)(1 − (α − 2^12)/α)] ≤ exp(−α⌈u/2⌉(α − 2^12)^2/(2^13 α^2)).

For sufficiently large α and u ≥ log n, the probability of this event is at most 1/n^6 for αn log n positive-rounds. Applying the same rationale iteratively as long as u ≥ log n, we obtain by a union bound that the number of target nodes will become less than log n within αn log n(log n − log log n) positive-rounds, with probability at least 1 − (log n − log log n)/n^6.

Finally, we wish to upper bound the remaining number of positive-rounds until no target node remains. Again for sufficiently large α, but now with u < log n, the same argument as above shows that the number of target nodes is reduced to ⌊u/2⌋ within αn log^2 n/u consecutive positive-rounds with probability at least 1 − 1/n^6. So we consider increasing numbers of consecutive positive-rounds, and obtain that no target nodes will be left after at most αn log n + 2αn log n + . . . + αn log^2 n ≤ 2αn log^2 n positive-rounds, with probability at least 1 − log log n/n^6, where we have taken the union bound over the log log n events. The lemma follows by setting β = 3α, taking a union bound over the above two events (u ≥ log n and u < log n), and using log n ≤ n.

We get a condition for halving the maximum level (among positive or negative values) in the system with high probability. The initial level in the system is n, and it can only be halved log n times for each sign. Combining everything results in the following lemma:

LEMMA B.7. There exists a constant β, such that if during the first 2βn log^3 n rounds the number of pure nodes is always at least n/(2^11 log n), then with probability at least 1 − 2 log n/n^5, one of the following three events occurs at some point during these rounds:

1. Nodes only encode values in {−1, 0, 1};


2. There are fewer than n/(2^12 log n) nodes with non-positive values, all encoding 0 or −1;

3. There are fewer than n/(2^12 log n) nodes with non-negative values, all encoding 0 or 1.

Proof. We take a constant β that works for Lemma B.6. Since there are at least n/(2^11 log n) pure nodes at all times during the first 2βn log^3 n rounds, each round during this interval is a negative-round, a positive-round, or both. We call the maximum level among all the nodes encoding non-negative (resp., non-positive) values the maximum positive (resp., negative) level. Unless the maximum positive level in the system is ≤ 1, by Lemma B.6, a stretch of βn log^2 n negative-rounds halves the maximum positive level, with probability at least 1 − 1/n^5. The same holds for stretches of βn log^2 n positive-rounds and the maximum negative level.

Assume that none of the three events hold at any time during the first 2βn log^3 n rounds. In that case, each round can be classified as either:
• a negative-round where the maximum positive level is strictly larger than 1, or
• a positive-round where the maximum negative level is strictly larger than 1.

To show this, without loss of generality consider any positive-round (we showed earlier that each round is a positive-round or a negative-round). If the maximum negative level is > 1, then the round can be classified as claimed; otherwise, all non-positive values in the system must be 0 or −1. Now if there are fewer than n/(2^12 log n) such nodes, then we have the second event, so there must be more than n/(2^12 log n) nodes encoding 0 or −1. However, all these nodes are pure, so the round is simultaneously a negative-round. Now if the maximum positive level is > 1, then the round can again be classified as claimed; and if the maximum positive level is at most 1, then all nodes in the system encode values −1, 0 or 1 and we have the first event.

Thus, each round contributes to at least one of the stretches of βn log^2 n rounds that halve the maximum (positive or negative) level, w.h.p. However, this may happen at most 2 log n times. By applying Lemma B.6 2 log n times and a union bound, we get that after the first 2βn log^3 n rounds, with probability at least 1 − 2 log n/n^5, only the values −1, 0 and 1 may remain. However, this is the same as the first event above. Hence, the probability that none of these events happens is at most 2 log n/n^5.

Final Argument: To see how to obtain the convergence upper bound, let us assume without loss of generality that the initial majority of nodes was in the A (positive) state, i.e. a > b.

Setting β as in Lemma B.7, by Lemma B.5, with high probability we have at least n/(2^11 log n) pure nodes during the first 2βn log^3 n rounds. Thus, w.h.p. during these rounds the execution reaches a configuration where one of the three events from Lemma B.7 holds. Consider this point T in the execution.

By our assumption about the initial majority and Lemma B.1, ∑_{v∈V} value(c(v)) = εn^2 holds in every reachable configuration c. The third event is impossible, because the total sum would be negative. In the first event, the total sum εn^2 ≥ n is achieved by n encoded values, each being −1, 0 or 1. Therefore, in this case, all nodes must be in state ⟨1, 0⟩ and we are done.
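The invariant ∑_{v∈V} value(c(v)) = εn^2 (with ε = (a − b)/n) can be illustrated with a toy simulation. The sketch below is not the paper's protocol: the update rules (a cancel that zeroes a matched ±v pair, and a split that moves half of an even value onto a 0-valued partner) are simplified stand-ins, chosen only because each reaction preserves the total sum:

```python
import random

# Toy sketch: agents start with value +n (majority A) or -n (minority B).
# "Cancel" zeroes a matched +v/-v pair; "split" moves half of an even value
# onto a 0-valued partner.  Both preserve sum(vals) == (a - b) * n.
def step(vals):
    i, j = random.sample(range(len(vals)), 2)
    vi, vj = vals[i], vals[j]
    if vi == -vj:                              # cancel reaction: v + (-v) -> 0 + 0
        vals[i] = vals[j] = 0
    elif vj == 0 and vi != 0 and vi % 2 == 0:  # split reaction: v + 0 -> v/2 + v/2
        vals[i] = vals[j] = vi // 2

n = 16
a, b = 10, 6                   # a > b: majority is A
vals = [n] * a + [-n] * b
for _ in range(10_000):
    step(vals)
assert sum(vals) == (a - b) * n   # the invariant survives every reaction
```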

The second event implies that there are at least n(2^12 log n − 1)/(2^12 log n) ≥ 2n/3 nodes encoding strictly positive values. Hence, at time T during the first 2βn log^3 n rounds, there are at least n/3 more strictly positive than strictly negative values. Moreover, −1's are the only strictly negative values of the nodes at point T, and this will remain the case for the rest of the execution because of the update rules. After time T, we have:

LEMMA B.8. Consider a configuration where all nodes with strictly negative values encode −1, while at least 2n/3 nodes encode strictly positive values. The number of rounds until convergence is O(n log n) in expectation and O(n log^2 n) with high probability.

Using this lemma, which is proven in the full version, and by a union bound over Lemma B.5 and Lemma B.7, with probability 1 − (log n + 1)/n^5 the number of communication rounds to convergence is thus 2βn log^3 n + O(n log^2 n) = O(n log^3 n).

In the remaining low probability event, which occurs with probability at most (log n + 1)/n^5, the remaining number of rounds is at most O(n^5) by Lemma B.3. Therefore, the same O(n log^3 n) bound also holds in expectation.

C Analysis of the Leader Election Algorithm

LEMMA C.1. All nodes can never be minions. A configuration with n − 1 minions must have a stable leader, meaning that the non-minion node will never become a minion, while minions will remain minions.

Proof. Assume for contradiction that all nodes are minions at some time T, and let u be a maximum (.payoff, .level) pair (lexicographically) among all the minions at this time. No node in the system could ever have had a larger pair, because no interaction can decrease a pair. The minions only record the values of such pairs they encounter, and never increase them, so there must have been a contender in the system with a payoff and level pair u that turned minion by time T. Among all such contenders, consider the one that turned minion last. It could not have interacted with a minion, because no minion (and no node) in the system ever held a larger pair. On the other hand, even if it interacted with another contender, the contender also could not have held a


larger pair. Thus, it could only have been an interaction with another contender that held the same pair and a larger coin value used as a tie-breaker. However, that interaction partner would remain a contender and must have turned minion later, contradicting our assumption that the interaction we considered was the last one where a contender with a pair u got eliminated.

By the structure of the algorithm, minions can never change their mode. In any configuration with n − 1 minions, the only non-minion must remain so forever, and thus be a stable leader, because otherwise we would get n minions and violate the above argument.

LEMMA C.2. With probability 1 − O(1)/n^3, after O(n log n) interactions, all agents will be in the competition stage, that is, either in tournament or minion mode, with maximum payoff at least log n/2 and at most 9 log n, and at most 5 log n agents will have this maximum payoff.

Proof. Let us call the first 2n interactions the irrelevant interactions, and all interactions after that the relevant interactions. By Theorem 4.1, during relevant interactions, with probability at least 1 − 2 exp(−n/4), there are at least n(1/2 − 1/2^8) and at most n(1/2 + 1/2^8) agents holding each possible coin value. Taking a union bound, this holds for all of the first n^4 relevant interactions with probability at least 1 − 2n^4 exp(−n/4) ≥ 1 − 1/n^3 for large enough n. From now on, we will consider this high probability event of the synthetic coin working properly, and implicitly take a union bound over it.

During any relevant interaction, any agent in lottery mode observes 0 with probability at least 1/2 − 1/2^8 and changes its mode to tournament. Hence, the probability that it increases its payoff more than 5 log n times is at most (1/2 + 1/2^8)^{5 log n} ≤ 1/n^4. Taking the union bound over all agents gives that the increases in payoffs during the relevant interactions are less than 5 log n for all agents, with probability at least 1 − 1/n^3. During the first 2n irrelevant interactions, by the Chernoff bound, the probability that a given agent interacts more than 4 log n times is at most 1/n^4. Taking a union bound, all agents interact at most 4 log n times with probability at least 1 − 1/n^3. Even if they increase their payoff each time, in total, with probability at least 1 − O(1)/n^3, all agents will have payoffs of at most 9 log n.
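The tail bound (1/2 + 1/2^8)^{5 log n} ≤ 1/n^4 used here holds because 5 · log_2(1/2 + 1/2^8) < −4; a quick numeric check over several population sizes:

```python
import math

# (1/2 + 1/2^8)^(5 log2 n) = n^(5 * log2(0.50390625)) ~= n^(-4.94) <= n^(-4).
for n in [2**8, 2**16, 2**32]:
    logn = math.log2(n)
    assert (0.5 + 2**-8) ** (5 * logn) <= n ** -4.0
```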

Since every agent stays in seeding mode for four interactions, we can find at least n/2 agents who move to lottery mode during a relevant interaction. Consider any one of these n/2 agents. By assumption, the agent has probability at least 1/2 − 1/2^8 of finalizing its payoff and moving to tournament mode. The probability that the payoff of this agent is larger than log n/2 is thus at least (1/2 − 1/2^8)^{log n/2} ≥ 1/n^{0.6}. If the payoff is indeed larger, we are done; otherwise, we can find another agent among the n/2 − log n/2 whose interactions we have not yet considered, and analogously get that with probability at least 1/n^{0.6}, it gets a payoff of at least log n/2. We can continue this process, and end up with about n/log n agents whose interactions were completely independent; because of the bias, each of them had probability at least 1/n^{0.6} of getting a payoff larger than log n/2. If we describe this process by a random variable Bin(n/log n, 1/n^{0.6}) with expectation n^{0.4}/log n, we get by the Chernoff bound that the probability of no node getting a payoff ≥ log n/2 is extremely low (in particular, less than 1/n^3).

Again, considering all the high probability events from above, we know that the maximum payoff in the system is between log n/2 and 9 log n. Consider any fixed payoff k in this interval, and suppose it is the maximum. Any agent that stops exactly at payoff k has to flip 0 there, but it might instead flip 1 with probability at least 1/2 − 1/2^8. Thus, the probability that at least 5 log n agents stop exactly at payoff k is at most (1/2 + 1/2^8)^{5 log n} ≤ 1/n^4. Taking the union bound over at most 9 log n < n payoffs, and the above high probability events, we get that with probability at least 1 − O(1)/n^3, at most 5 log n agents will have the maximum payoff, which will be between log n/2 and 9 log n.

It only remains to prove that after O(n log n) interactions, all agents will be in the tournament stage with probability 1 − O(1)/n^3, which is a standard high probability coupon collector argument. Recall that after the 2n irrelevant interactions, with probability at least 1 − 1/n^3, the synthetic coins work properly for the next n^4 interactions, and thus each agent that has not yet moved to the tournament stage and interacts has probability at least 1/2 − 1/2^8 of doing so. Thus, after O(n log n) interactions, by the Chernoff bound, a given agent will interact O(log n) times with very high probability, and will move to the tournament stage during at least one of these O(log n) interactions, also with very high probability. Taking a union bound over all agents and these events completes the proof.
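The coupon-collector step can be illustrated by a small simulation (the constant 4 and the population size are illustrative, not the proof's): after Θ(n log n) uniformly random pairwise interactions, every agent has interacted at least once, and Θ(log n) times on average:

```python
import math
import random

random.seed(0)                 # reproducible run
n = 1000
counts = [0] * n               # interactions per agent
for _ in range(int(4 * n * math.log(n))):
    i, j = random.sample(range(n), 2)   # scheduler picks a uniform random pair
    counts[i] += 1
    counts[j] += 1
assert min(counts) >= 1        # every agent has interacted at least once
```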

LEMMA C.3. With probability at least 1 − O(1)/n^3, only one contender reaches level ℓ = 3 log n/log log n, and it takes at most O(n log^{5.3} n log log n) interactions for a new level to be reached, up to ℓ.

Proof. By our assumption on m, it holds that m/log m ≥ 3 log n/log log n, so this level can always be reached. We consider the high probability case in Lemma C.2, which occurs with probability ≥ 1 − O(1)/n^3. Hence, we need to prove that the probability that more than one competitor reaches level ℓ is also at most O(1)/n^3.

Consider some competitor v which has just increased the maximum level among competitors in the system. Until some other competitor reaches the same level, v will turn every interaction partner into its minion. Furthermore, as in epidemic spreading, these minions will also turn their interaction partners into minions of the highest-level contender v.


Let the payoff and level pair of this competitor be u. We call a node whose pair is also at least u an up-to-date node; otherwise the node is out-of-date. Initially, only the contender v that reached the maximum level is up-to-date.

We will show that if x < n nodes are up-to-date in some configuration, then after a phase of 16n(n − 1) log n/(4x(n − x)) interactions, at least x + 1 nodes will be up-to-date with probability at least 1 − 1/n^5. Up-to-date nodes may never become out-of-date. On the other hand, an out-of-date node becomes up-to-date itself after an interaction with any up-to-date node. If we have x up-to-date nodes, then in each interaction, the probability that an out-of-date node interacts with an up-to-date node, increasing the number of up-to-date nodes to x + 1, is 2x(n − x)/(n(n − 1)), which we denote by 1/α. The probability that this never happens during 16n(n − 1) log n/(4x(n − x)) = 8α log n interactions is then (1 − 1/α)^{8α log n} ≤ 1/n^5.

A union bound over at most n phases gives that, with probability at least 1 − 1/n^4, after at most

∑_{x=1}^{n−1} 16n(n − 1) log n/(4x(n − x)) ≤ (16(n − 1) log n/4) ∑_{x=1}^{n−1} (1/x + 1/(n − x)) ≤ 16n log^2 n

interactions, all nodes will have a pair at least u. Taking a union bound over all possible levels ℓ < log n < n, and all previous events, we get that with probability at least 1 − O(1)/n^3, once a contender reaches some level, unless some other contender reaches the same level within the next 16n log^2 n interactions, the original node will turn every other node into a minion and become a stable leader.

Once some contender has increased the maximum level, it needs to observe between 4.2 log log n and 4.2(log log n + log 9 + 1) consecutive ones to increase a level, since with high probability the payoff is between log n/2 and 9 log n and the phase size is set to 4.2(log payoff + 1). Notice that all nodes that did not get a maximum payoff are not considered as contenders, because they will all be eliminated by maximum-payoff contenders within O(n log n) interactions.

A given node at each interaction observes coin value one with probability at least 1/2 − 1/2^8, so the probability that a stretch of 4.2 log log n + O(1) consecutive interactions results in a level increase is at least Θ(1/log^{4.3} n). If we consider an interval containing O(n log^{5.3} n log log n) interactions, then the agent increases the level in expectation O(log^{5.3} n/log^{4.3} n) = O(log n) times and, by the Chernoff bound, at least once with high probability.

For the rest of the argument, we will bound the probability that any of the other contenders, whose number is at most 5 log n by Lemma C.2, will be able to increment their level during an arbitrary stretch of 6 log log n + 12 consecutive interactions. This probability will be low, and therefore it is likely that the process will terminate after a level increase.

More precisely, once a new level is reached after a level increment, the remaining contenders have 16n log^2 n interactions to increment to the same level, or they will soon all become minions. To do so, they should each have at least one stretch of observing at least 4.2 log log n consecutive ones.

Hence, there can be at most 5 log n · Θ(log^2 n/log log n) such interaction intervals, and each interaction interval has probability at most (1/2 + 1/2^8)^{4.2 log log n} ≤ 1/log^4 n of success. Hence, by taking sufficiently large n, we can make the expectation of the number of successful intervals less than 1/log n.

Hence, the probability that there is a second survivor among the contenders at a given level (no second survivor would correspond to a stable leader being elected) is at most 1/log n every time the maximum level is incremented. The probability that this happens for all ℓ = 3 log n/log log n levels is then at most (1/log n)^ℓ ≤ 1/n^3.
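The last step uses (1/log n)^{3 log n/log log n} = n^{−3} (logarithms base 2), which can be verified directly:

```python
import math

# (1/log n)^(3 log n / log log n) = 2^(-3 log n) = n^(-3), logs base 2.
n = 2.0**64
logn = math.log2(n)                       # 64
ell = 3 * logn / math.log2(logn)          # 3*64/6 = 32 levels
assert math.isclose((1 / logn) ** ell, n ** -3.0, rel_tol=1e-9)
```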

LEMMA C.4. The expected parallel time until convergence is O(log^{5.3} n log log n).

Proof. To get the same bound on expected parallel time as in the high probability case, we must incorporate the expected time in the low probability event. The lottery stage takes O(n log n) interactions in expectation by a standard coupon collector argument (see the argument in Lemma C.2).

During competition, as we saw in Lemma C.3, with high probability a new level is reached within every O(n log^{5.3} n log log n) interactions, and with more than constant probability a single contender remains after each level. Therefore, in the high probability case, only constantly many levels will be used in expectation, and the expected parallel time will be O(log^{5.3} n log log n).

Otherwise, during competition, recall that non-minions can always eliminate each other in direct interactions by comparing their payoffs, levels, and the coin as a tie-breaker. So, for any two given non-minion nodes x and y, in every interaction there is a probability of Ω(1/n^2) that they meet, and one eliminates the other for certain if they had different coin values. If not, then with probability at least 1/n, one of the nodes, say x, interacts with some other node in this interaction, and then immediately afterwards interacts with y, this time with different coin values. Hence, in every two interactions, with probability Ω(1/n^3), the number of contenders decreases by at least one. The expected number of interactions until one contender remains is then O(n^4), thus parallel time O(n^3); since this case occurs with probability at most O(1/n^3), it contributes only a negligible O(1) to the expected parallel time.
