

David Doty¹
Department of Computer Science, University of California, Davis

Mahsa Eftekhari²
Department of Computer Science, University of California, Davis
[email protected]

arXiv:1808.08913v2 [cs.DC] 12 Sep 2018

Abstract We study uniform population protocols: networks of anonymous agents whose pairwise interactions are chosen at random, where each agent uses an identical transition algorithm that does not depend on the population size n. Many existing polylog(n) time protocols for leader election and majority computation are nonuniform: to operate correctly, they require all agents to be initialized with an approximate estimate of n (specifically, the exact value ⌊log n⌋). Our first main result is a uniform protocol for calculating log(n) ± O(1) with high probability in O(log² n) time and O(log⁷ n · log log n) states (O(log log n) bits of memory). The protocol is converging but not terminating: it does not signal when the estimate is close to the true value of log n. If it could be made terminating, this would allow composition with protocols, such as those for leader election or majority, that require a size estimate initially, making them uniform. However, our second main result implies that the protocol cannot be made terminating. We show that a uniform protocol for any task requiring more than constant time cannot be terminating, even with probability bounded above 0, if infinitely many initial configurations are dense: every state present initially is the state of Ω(n) agents. (In particular, no leader is allowed.) The result holds no matter the memory or time permitted. Finally, we show that with an initial leader, our size-estimation protocol can be made terminating with high probability, with the same asymptotic time and space bounds.

1

Introduction

Population protocols [5] are networks that consist of computational entities called agents with no control over the schedule of interactions with other agents. In a population of n agents, repeatedly a random pair of agents is chosen to interact, each observing the state of the other agent before updating its own state.³ They are an appropriate model for electronic computing scenarios such as sensor networks and for "fast-mixing" physical systems such as animal populations [33], gene regulatory networks [13], and chemical reactions [31], the latter increasingly regarded as an implementable "programming language" for molecular engineering, due to recent experimental breakthroughs in DNA nanotechnology [17, 32]. All problems computable with zero error probability by a constant-state population protocol are computable in O(n) time [6, 21]; the benchmark for "efficient" computation is thus sublinear time, ideally polylog(n). For example, the transition x, q → y, y (starting with at least as many q as the "input" state x) computes f(x) = 2x in expected time O(log n), whereas x, x → y, q computes f(x) = ⌊x/2⌋ exponentially slower: expected time O(n) [16].
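To make the time-complexity contrast concrete, here is a small Python simulation (our own sketch, not from the paper) of the two transition rules under the random pairwise scheduler, reporting parallel time as interactions divided by n:

```python
import random
from collections import Counter

def simulate(pop, rule, done):
    """Run random pairwise interactions until done(pop) holds.
    Returns (final population list, parallel time = interactions / n)."""
    pop = list(pop)
    n = len(pop)
    interactions = 0
    while not done(pop):
        i, j = random.sample(range(n), 2)  # random pair of distinct agents
        out = rule(pop[i], pop[j])
        if out is not None:                # None means a null interaction
            pop[i], pop[j] = out
        interactions += 1
    return pop, interactions / n

# x, q -> y, y computes f(x) = 2x quickly: while many x's and q's remain,
# a random pair fires with high probability.
double = lambda a, b: ('y', 'y') if {a, b} == {'x', 'q'} else None
pop_fast, t_fast = simulate(['x'] * 10 + ['q'] * 90, double,
                            done=lambda p: 'x' not in p)
print(Counter(pop_fast), t_fast)   # 20 y's: f(10) = 2*10

# x, x -> y, q computes f(x) = floor(x/2) slowly: the last two x's take
# expected Theta(n) time to find each other.
halve = lambda a, b: ('y', 'q') if a == b == 'x' else None
pop_slow, t_slow = simulate(['x'] * 20 + ['q'] * 80, halve,
                            done=lambda p: p.count('x') < 2)
print(Counter(pop_slow), t_slow)   # 10 y's: f(20) = floor(20/2)
```

Typically `t_slow` is far larger than `t_fast`, matching the O(n) versus O(log n) bounds cited above.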

¹ Supported by NSF grant CCF-1619343.
² Supported by NSF grant CCF-1619343.
³ Using message-passing terminology, each agent sends its entire state of memory as the message.

© David Doty and Mahsa Eftekhari; Leibniz International Proceedings in Informatics, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany

Efficient size estimation and impossibility of termination in uniform dense population protocols

Although the original model [5] assumed a set of states and transitions that is constant with respect to n, for important distributed computing problems such as leader election [22], majority computation [1], and computation of other functions and predicates [10], no constant-state protocol can stabilize in sublinear time with probability 1.⁴ This motivated the study of protocols in which the set of states and transitions grows with n (essentially adding a non-constant memory to each agent). Such protocols achieve leader election and majority computation in O(polylog(n)) time while keeping the number of states "small": typically O(polylog(n)) [1–3, 11, 12], although O(log log n) states suffice for leader election [24]. Unfortunately, most of these sublinear-time protocols [1–3, 11, 12] are nonuniform: the set of states and transitions is allowed to depend arbitrarily on n. This capability is used to initialize each agent with an approximate estimate of n (the exact value ⌊log n⌋); without this estimate, the protocols do not work at all. A representative example portion of such a protocol is shown in Figure 1: each agent has an internal "counter", which increments upon each encounter with an x. When the counter reaches log n, the protocol terminates (or moves to a different "stage"). In all of these cases, it is not obvious how to remove this encoding of log n while preserving correctness.⁵

[Figure 1 shows, as pseudocode and transitions, nonuniform algorithms Elect_Leader8 (counter terminates at 8, population size ≈ 2^8) and Elect_Leader20 (counter terminates at 20, population size ≈ 2^20), alongside the uniform case Elect_Leaderall (population size = anything), where the counter threshold is unknown ("???").]

Figure 1 Many population protocols with ω(1) states use nonuniform algorithms: the value log₂ n is "hardcoded" into the reactions. Above, "terminate" could mean terminate the whole algorithm, or it could mean "move to the next stage of the algorithm".

More desirable would be a uniform protocol in which each agent's local algorithm for computing the outputs, given the inputs, has no knowledge of n. Such an algorithm may produce outputs longer than its inputs, retaining the ability to use a number of states that grows with the population size. A uniform protocol can be deployed into any population without knowing in advance the size, or even a rough estimate thereof. Although the original model of population protocols, stipulating a constant set of states, is uniform, when the number of states varies with the population size in this manner, little is known about the computational abilities and limitations of uniform protocols (with a few interesting exceptions [14, 15, 27, 28]).

⁴ A protocol stabilizes when it becomes unable to change the output. A protocol converges in a given random execution when the output stops changing, though it could take longer to subsequently stabilize. Known time lower bounds [1, 10, 22] are on stabilization, not convergence. Recently Kosowski and Uznanski [26] achieved a breakthrough result, showing constant-state protocols for leader election and all decision problems computable by population protocols (the semilinear predicates) that converge with high probability in polylog(n) time, and, for any ε > 0, probability-1 protocols for the same problems that converge in O(n^ε) expected time. The latter protocols require Ω(n) time to stabilize, as would any constant-state protocol, due to the cited time lower bounds.
⁵ As we prove in Section 4, uniform protocols with sufficiently "dense" initial configurations cannot properly terminate.

D. Doty and M. Eftekhari

1.1

Contribution

Nonuniform protocols in the literature [1–3, 11, 12] initialize each agent with the value ⌊log n⌋. Hence we study the problem of computing an approximate estimate of log n. Our first main result, Theorem 3.1, is a uniform protocol, starting from a configuration in which all n agents are in an identical state, that with high probability computes log n ± O(1) (storing the value in every agent), using O(log² n) time and O(log⁷ n · log log n) states.⁶ One might hope to use this protocol as a subroutine to "uniformize" existing nonuniform protocols for leader election and majority [1–3, 11, 12].⁷ Suppose the size-estimating protocol could be made terminating, eventually producing a termination "signal" that with high probability does not appear until the size estimate has converged. This would allow composition with other protocols requiring the size estimate. It has been known since the very beginning of the population protocol model [5] that termination cannot be guaranteed with probability 1. However, some leader-driven protocols can be made terminating with high probability, including simulation of register machines [6] and exact (but slow) population size counting [27].

Our second main result, Theorem 4.1, shows that this is impossible for our leaderless size-estimation protocol and a very wide range of others. The production of such a termination signal cannot be delayed, even with probability bounded above 0, by more than O(1) time in any uniform protocol for which infinitely many valid initial configurations are α-dense for some α > 0, meaning that every state present is the state of at least αn agents. Since virtually all nontrivial computation with population protocols requires Ω(log n) time⁸ (including leader election, and computation of predicates and functions such as majority and f(x) = 2x), this implies that no uniform terminating protocol can solve these problems from dense initial configurations.

The hypothesis of density is crucial: with a leader, high-probability termination is possible in a uniform protocol [6]. The hypothesis of uniformity is also crucial: if each agent can initially store a value f(n) that grows unboundedly with n (e.g., f(n) = √n or n), then a termination signal can be delayed until some agent experiences f(n) interactions, an event whose expected time grows unboundedly with n if f grows sufficiently fast. The first main result answers affirmatively open question 5 of [20], and the second main result answers negatively open questions 1–3 of [20]. Finally, Theorem 3.11 shows that with an initial leader, our size estimation can be made terminating with high probability, with the same asymptotic time and space bounds.

1.2

Related work

The work of this paper was inspired by recent work on nonuniform polylog time leader election/majority [1–4, 11, 12, 24]; the fact that those protocols require an approximate size estimate is the direct motivation for seeking a protocol that can compute such an estimate (though unfortunately due to Theorem 4.1, composition of our protocol with these, if possible, will not be straightforward). Self-stabilizing leader election and exact size counting. Cai, Izumi, and Wada [14] (using different terminology) show an impossibility result for uniform population protocols,

⁶ It appears difficult to compute ⌊log n⌋ exactly, rather than within a positive additive constant, since for all k, such a protocol could distinguish between the exact population sizes 2^k − 1 and 2^k.
⁷ Some protocols for leader election [24, 25] are uniform, but other protocols [1, 3, 11, 12] have the benefit of simplicity and may possibly be easier to reason about and compose with other protocols.
⁸ One way to see that Ω(log n) is a lower bound on most interesting computation is that, by a standard coupon collector argument, this is the time required for each agent to have at least one interaction.


that no protocol electing a leader can be uniform if it is also required to be self-stabilizing: correct with probability 1 from any initial configuration. In fact, it must be nonuniform in a very strong way: the exact population size must be encoded into each agent. Self-stabilizing exact size counting has also been shown to be possible with a leader [9] in O(n log n) time, with O(n) states for the leader and 2 states for the other agents, all asymptotically optimal parameters in the self-stabilizing setting [7].

Exact size counting. In the less restrictive setting in which all agents start from a predetermined state, Michail [27] proposed a uniform terminating protocol (where the agents know when they have converged on the correct output) in which a pre-elected leader computes the exact population size n in O(n log n) time with high probability. Going from the terminating to the less restrictive converging criterion (where agents eventually converge on the correct size, but do not know when this occurs), exact size counting is possible in O(log n log log n) time and O(n^60) states [20], without a leader (all agents start in the same state).

Approximate size estimation. Alistarh, Aspnes, Eisenstat, Gelashvili, and Rivest [1] have shown a uniform protocol that in O(log n) expected time and states converges to an approximation of the true size n, computing an integer k such that with high probability 1/2 log n ≤ k ≤ 9 log n, i.e., √n ≤ 2^k ≤ n^9. We improve this from a constant multiplicative error in approximating log n to a constant additive error; in other words, we estimate the population size to within a constant multiplicative factor (instead of a polynomial factor as in [1]), but use O(log² n) time and O(log⁷ n · log log n) states.

2

Model

To formally define uniform computation in population protocols, the agents' transition algorithm is modeled as a multitape Turing machine. (Our protocol describes a constant number of integer fields comprising each agent's state, which could be implemented as one field per tape.) When two agents interact, each begins in an initial configuration of the Turing machine whose input is the tape content from the halting configuration of that agent's last interaction. (We assume the Turing machine state and tape head positions are not transmitted, since they can be assumed WLOG to be identical in every halting configuration.) The space usage (in bits) s is defined as usual for Turing machines: the maximum number of tape cells written during the computation. The number of states is then 2^s, where s is the maximum space usage of any agent during an execution of the protocol.

For ease of understanding, we use standard population protocol terminology and do not refer explicitly to details of the Turing machine definition except where needed. Thus a state s ∈ Λ always refers to the TM tape content of an agent (omitting the TM state and tape head positions, since these are identical in all initial configurations), where Λ is the set of all states, and a configuration ~c ∈ ℕ^Λ is a vector indexed by states, where ~c(s) is the count of state s in the population.

We define the output of our protocol to be the value stored in a special field labeled "output". This definition is not the only possible one, since the output of a protocol could instead be defined as a function of the fields stored in an agent's memory; under that definition, agents need not carry the output explicitly in their memory. Though the latter definition would not reduce our protocol's space usage, it might help other algorithms save memory. We leave the question of which definition is more suitable to the reader.

The traditional definition of population protocols assumes a deterministic transition function.
Several papers [1,11] indicate how to use the randomness built into the interaction scheduler to provide nearly uniform random bits to the agents, using various synthetic coin


techniques, showing that the deterministic model can effectively simulate a model with randomized transitions. For brevity and simplicity of presentation, we simply assume in the model that each agent has access to a source of uniformly random bits. So instead of a transition function, we have a transition relation δ ⊆ Λ⁴. We assume that each agent's random bits are independent and pre-written on a read-only tape, allowing the Turing machine to be deterministic even though it computes a nondeterministic relation.

Throughout this paper, n denotes the number of agents in the population. Repeatedly, a pair of agents is selected uniformly at random to interact; they run the transition algorithm on the pair of states they were in prior to the interaction, storing the output states for their next interactions. The time until some event is measured as the number of interactions until the event occurs, divided by n, also known as parallel time. This represents a natural model of time complexity in which we expect each agent to have O(1) interactions per unit of time, hence across the whole population, Θ(n) total interactions occur per unit of time. All references to "time" in this paper refer to parallel time. An execution is a sequence of configurations ~c₀, ~c₁, . . . such that for all i, applying a transition to ~cᵢ results in ~cᵢ₊₁. log n denotes the base-2 logarithm of n, and ln n the base-e logarithm of n.
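As a concrete (and entirely illustrative) rendering of this convention, the sketch below treats a state as a tuple of integer fields and implements one randomized transition that consumes coin flips from a read-only random source; the specific rule shown (lazy generation plus max-propagation) is our own toy example, not part of the paper's formalism:

```python
import random

def transition(rec, sen, rng):
    """One randomized transition from a relation delta subset of Lambda^4
    (our illustration): each state is a tuple of integer fields.  Here the
    single field holds a 1/2-geometric random variable; 0 means "not yet
    generated".  The receiver lazily generates its value by consuming fair
    coin flips from rng, then adopts the maximum of the two values."""
    g_rec, = rec
    g_sen, = sen
    if g_rec == 0:
        g_rec = 1
        while rng.random() < 0.5:   # consume fair coin flips
            g_rec += 1
    return (max(g_rec, g_sen),), (g_sen,)

# Parallel time: number of interactions divided by n.
parallel_time = lambda interactions, n: interactions / n

rng = random.Random(0)
print(transition((3,), (7,), rng))    # ((7,), (7,))
print(parallel_time(5000, 1000))      # 5.0
```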

3

Fast protocol for estimating log n within O(1) additive error

In this section we describe a uniform protocol for computing the value of log n with an additive error of O(1), i.e., estimating the population size to within a constant multiplicative factor. We say a population protocol is leaderless if all agents start in the same state, and we say a protocol has an initial leader if one agent starts in a special state ℓ and all other n − 1 agents start in a different state f ≠ ℓ.

▶ Theorem 3.1. There is a uniform leaderless population protocol that, with probability at least 1 − 12/n, computes and stores in each agent an integer k such that |k − log n| ≤ 4.7, taking O(log² n) time and O(log⁷ n · log log n) states.

Interpreted as an estimate of the population size n, the protocol estimates n to within multiplicative factor 2^4.7 < 26. We note that the protocol is not stabilizing: it has a positive probability of error. It is an open question to create a protocol using expected polylog(n) time and states that computes log(n) ± O(1) with probability 1. The protocol is described, and its time and state complexity analyzed, in Subsection 3.2. Most of the analysis of the approximation closeness amounts to proving a bound on the moment-generating function of a maximum of geometric random variables, so that the Chernoff technique can be applied to sums of such variables; this is shown in Section C.

3.1

Intuition

Alistarh et al. [1] describe a protocol for estimating log n within a constant multiplicative factor. A 1/2-geometric random variable is the number of flips needed to get one head when flipping a fair coin. In their protocol, each agent generates an independent geometric random variable Gᵢ, then propagates the maximum M = max₁≤ᵢ≤ₙ Gᵢ by epidemic: transitions of the form i, j → j, j for i ≤ j, which in O(log n) time "infect" all agents with the maximum. It is known that E[M] ≈ log n [23], and 1/2 log n < M < 9 log n with probability ≥ 1 − O(1)/n³ [1]. We take the obvious extension of this approach: do this K times in parallel and take an average. The estimated average is within O(1) of log n so long as K = Ω(log n) (Corollary C.10). One problem to solve first is how to calculate K; after all, K = Θ(log n) scales with n, so with a uniform protocol it cannot be encoded into the agents at the start. The agents estimate it using the protocol of [1]. Since that protocol is converging but not terminating (provably it cannot be made terminating, by Theorem 4.1), each time an agent updates its value of K, it reinitializes the remainder of its state.

However, a trickier problem remains: a naïve approach to implementing "averaging of K numbers" requires storing K = Θ(log n) numbers in each agent, each of value Θ(log n), implying the number of states is Θ((log n)^log n) = Θ(n^log log n). This is even more than the O(n^60) sufficient to quickly compute n exactly [20]. We require a subtler approach. Each agent will eventually generate K different geometrically distributed random numbers (in addition to the first one, used to calculate K). However, only two of these numbers are ever stored in the agent at a time. Each agent randomly guesses an index 1 ≤ i ≤ K; we say that the agent is responsible for index i. It then generates a geometric random variable, propagating the maximum among all those agents also responsible for i. Furthermore, an agent responsible for i iterates through each index j ∈ {1, . . . , K} \ {i}. It waits to encounter an agent responsible for j, generates another geometric random variable G, and propagates G to that agent, before moving on to index j + 1. Therefore, the K maxima have the same distribution as in the above memory-intensive protocol: each agent generates exactly K independent geometric random variables, with the maximum of each being propagated among the agents responsible for its index.

After generating K different geometric random variables, to compute the average of the K population-wide maxima, each agent repeatedly cycles through the indices j ∈ {1, . . . , K}, at each index j waiting to encounter an agent responsible for j and adding that agent's stored maximum to a running sum. Upon adding the K'th number, the agent divides the sum by K and stores the result in its output field. It then resets j to 1 and repeats this cycle forever. Once all K maxima have been propagated to the appropriate agents, each agent will complete a full cycle and calculate the correct average, finally converging on an output.

The time is O(log² n) by the following rough analysis (details follow). Each index i ∈ {1, . . . , K} has approximately n/K = Θ(n/log n) agents responsible for it. Therefore, an agent responsible for i waits O(log n) interactions on average between interactions with other agents responsible for i. This slows down the standard O(log n)-time epidemic by a factor of O(log n), so each epidemic completes in time O(log² n). By the union bound it is unlikely for any of the K epidemics to take longer to finish. Also, each agent must cycle through the K indices, but moving from index j to j + 1 also requires Θ(log n) interactions for each agent, so it takes Θ(log² n) time for an agent to complete each cycle. Again the union bound shows low probability for any agent to take too long to complete a cycle.
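As a sanity check on this intuition (our own illustration, not the distributed protocol itself), the following Python sketch draws the K population-wide maxima with the distribution argued above, each being the maximum of n independent 1/2-geometric random variables, and averages them; the function names and the choice K = 64 are ours:

```python
import random

def geometric(rng):
    """1/2-geometric random variable: fair-coin flips up to and including
    the first heads."""
    g = 1
    while rng.random() < 0.5:
        g += 1
    return g

def estimate_log_n(n, K, rng):
    # Each of the K indices ends up holding the maximum of n independent
    # geometric random variables (one per agent), as argued in the text;
    # the protocol then averages the K maxima.
    maxima = [max(geometric(rng) for _ in range(n)) for _ in range(K)]
    return sum(maxima) / K

rng = random.Random(1)
n = 1 << 10           # population size 1024, so log n = 10
K = 64                # K = Theta(log n) suffices for O(1) additive error
est = estimate_log_n(n, K, rng)
print(round(est, 2))  # typically within ~1.5 of log2(n) = 10
```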

3.2

Formal specification of protocol

In this protocol, agents start by generating one geometric random variable (called G0) and continue by propagating the maximum among the population. By Lemma C.4 and Corollary C.7, the propagated maximum lies in the interval [log n − log ln n, 2 log n + 3/2] with probability ≥ 1 − O(1)/n. Thus it can be used to compute K, the number of additional independent geometric random variables each agent will generate, with the K maxima propagated across the population. Each time an agent discovers that it was not storing the maximum value of G0, it restarts its PropagateMaxGRV protocol. Agents generate a new random number from 1 to K as their ResponsibleIndex and propagate the maximum for the newly generated index. This process of restarting and updating the ResponsibleIndex continues until each


Protocol 1 LogSizeEstimation(rec, sen)

Initial state of agent:
  G0 ← 1/2-geometric random variable   ▷ used for estimating K
  K ← smallest power of two ≥ c_init · (1/2 · log G0 + G0)   ▷ c_init = 4 for our main analysis
  ResponsibleIndex ← number i selected uniformly at random from {1, . . . , K}   ▷ agent is responsible for propagating the maximum of the i'th of the K values of GeometricRV across the population
  GeometricRV ← 1/2-geometric random variable
  GeneratedIndex ← 1
  TemporaryGeometricRV ← 0
  AveragingIndex ← 0
  sum ← 0   ▷ sum of the K maxima of GeometricRV
  ave ← 0   ▷ stores sum/K; this is the output field

PropagateMaxInitialEstimate(rec, sen)
PropagateMaxGRV(rec, sen)
if rec.GeneratedIndex > K then
  Averaging(rec, sen)

Subprotocol 2 PropagateMaxInitialEstimate(rec, sen)
  ▷ the maximum generated geometric variable G0 is propagated
  if rec.G0 < sen.G0 then
    rec.G0 ← sen.G0
    Restart(rec)

agent stores the same value in their G0. In the Restart subprotocol, all random variables are regenerated from the beginning; we can therefore ignore whatever was stored or propagated before the final restart. Although propagating the random variables is distributed among the population, each agent performs the averaging locally. Agents have a field sum for accumulating the maximum variable associated with each index in {1, . . . , K}.

▶ Lemma 3.2. The value K in LogSizeEstimation is in the interval [4 log n, 24 log n] with probability ≥ 1 − 1.32/n.

Proof. Corollary C.7 gives a very tight bound on the maximum of n independent geometric random variables with high probability. We use this corollary to bound E[G0]. By Lemma C.4, log n + 1 < E[G0] < log n + 3/2. Then we can write:

  Pr[G0 ≤ E[G0] − 1/2 log G0]
    < Pr[G0 ≤ E[G0] − (1 + 1/2 log log n)]
    < Pr[G0 ≤ E[G0] − (1 + log ln n)]
    ≤ n⁻¹   (by Corollary C.7)

Therefore, setting K to be the smallest power of two ≥ 4 · (1/2 log G0 + G0) guarantees that K ≥ 4 log n with probability ≥ 1 − 1/n. Also, by Lemma C.4 and Corollary C.7, K ≤ 2 · 4 · (2 log n + 1/2 log G0 + 3/2) ≤ 8 · (3 log n) for large n, with probability ≥ 1 − 0.32/n. ◀

▶ Lemma 3.3. LogSizeEstimation uses O(log⁷ n · log log n) states with probability ≥ 1 − 2.32/n.


Subprotocol 3 Restart(rec)
  ▷ reset state:
  K ← smallest power of two ≥ c_init · (1/2 · log G0 + G0)
  ResponsibleIndex ← number selected uniformly at random from {1, . . . , K}
  GeometricRV ← 1/2-geometric random variable
  GeneratedIndex ← 1
  TemporaryGeometricRV ← 0
  sum ← 0
  ave ← 0

Subprotocol 4 PropagateMaxGRV(rec, sen)
  if rec.ResponsibleIndex = sen.ResponsibleIndex and rec.GeometricRV < sen.GeometricRV then
    ▷ propagate maximum r.v. between agents responsible for the same index
    rec.GeometricRV ← sen.GeometricRV
  if rec.GeneratedIndex = rec.ResponsibleIndex then
    ▷ geometric r.v. for agent's responsible index was already generated, so skip
    rec.GeneratedIndex ← rec.GeneratedIndex + 1
  else if rec.GeneratedIndex = sen.ResponsibleIndex then
    ▷ generate geometric r.v. for this index, propagate it, then forget it
    rec.GeneratedIndex ← rec.GeneratedIndex + 1
    rec.TemporaryGeometricRV ← 1/2-geometric random variable
    if sen.GeometricRV < rec.TemporaryGeometricRV then
      sen.GeometricRV ← rec.TemporaryGeometricRV

Proof. By setting the number of i.i.d. geometric random variables to nK and λ = 2 ln(nK) in Corollary C.6, we bound all geometric random variables to be ≤ 3 log(nK) ≤ 15 + 3 log n + 3 log log n ≤ 4 log n with probability ≥ 1 − 3.5/(nK) ≥ 1 − 1/n for sufficiently large n. The fields carried by each agent in protocol LogSizeEstimation are listed here, with the range each occupies with high probability:

  G0                    [log n − log ln n − 1/2, 2 log n + 3/2]   by Corollary C.7
  K                     [4 log n, 24 log n]                       by Lemma 3.2
  ResponsibleIndex      [4 log n, 24 log n]                       by Lemma 3.2
  GeneratedIndex        [4 log n, 24 log n]                       by Lemma 3.2
  GeometricRV           [1, 4 log n]                              by Lemma 3.3
  TemporaryGeometricRV  [1, 4 log n]                              by Lemma 3.3
  sum                   [1, 96 log² n]                            by Lemmas 3.2, 3.3
  AveragingIndex        [4 log n, 12 log n]                       by Lemma 3.2
  ave                   [1, 4 log n]                              by Lemma 3.3

Note that agents never use GeneratedIndex and AveragingIndex at the same time in the LogSizeEstimation protocol, so the same space can be shared between the two variables. Also, once G0 is generated, it is only needed to compute K, so the space for K subsumes that for G0, which is not needed after K is generated. Since K is a power of two, storing only the exponent suffices. Thus we have one variable of size O(log² n), one using O(log log n) space, and, after the described optimizations, 5 additional variables of size O(log n). Therefore, each agent needs O(log⁷ n · log log n) states with probability ≥ 1 − (1.32 + 1)/n. ◀
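To see Lemma 3.2's bounds in action, here is a small Monte Carlo sketch (our own, with hypothetical helper names) of the rule that sets K from the population-wide maximum G0; for n = 2^12 it should land in [4 log n, 24 log n] = [48, 288] with high probability:

```python
import math
import random

def geometric(rng):
    """1/2-geometric random variable: fair-coin flips until the first heads."""
    g = 1
    while rng.random() < 0.5:
        g += 1
    return g

def compute_K(n, rng, c_init=4):
    # G0 plays the role of the propagated population-wide maximum of the
    # agents' initial geometric random variables.
    G0 = max(geometric(rng) for _ in range(n))
    # K = smallest power of two >= c_init * (1/2 * log G0 + G0).
    target = c_init * (0.5 * math.log2(G0) + G0)
    K = 1
    while K < target:
        K *= 2
    return K

rng = random.Random(0)
n = 1 << 12                   # log n = 12
K = compute_K(n, rng)
print(K)                      # a power of two, w.h.p. in [48, 288]
```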


Subprotocol 5 Averaging(rec, sen)
  if rec.AveragingIndex = rec.ResponsibleIndex then
    rec.AveragingIndex ← rec.AveragingIndex + 1
    rec.sum ← rec.sum + rec.GeometricRV
  else if rec.AveragingIndex = sen.ResponsibleIndex then
    rec.AveragingIndex ← rec.AveragingIndex + 1
    rec.sum ← rec.sum + sen.GeometricRV
  if rec.AveragingIndex > K then
    rec.ave ← ⌊rec.sum/K⌋
    rec.AveragingIndex ← 1
    rec.sum ← 0

In this protocol, each agent chooses its ResponsibleIndex uniformly at random from 1 to K, which has the same distribution as throwing n balls into K bins. For fast convergence of the protocol, there should be enough agents (balls) assigned to each index (bin). We want to show that the time for each epidemic is O(log² n) with high probability. The next lemma, proven in Section A, bounds the number of agents selecting each index.

▶ Lemma 3.4. With probability at least 1 − 2e^(−√n), for each index i ∈ {1, . . . , K}, at least n/(2K) and at most 2n/K agents are responsible for i.

Each agent repeatedly cycles a counter from 1 to K, incrementing it when encountering an agent responsible for the index equal to its counter. (This is used both to generate the geometric random variables for each index, and later to repeatedly compute the sum of their maxima.) The following lemma, proven in Section A, bounds the time required to complete one of these cycles.

▶ Lemma 3.5. With probability ≥ 1 − 1/n, it takes O(log² n) time for all agents, starting with a counter at index 1, to have all their counters reach K and reset back to 1.

▶ Corollary 3.6. The Averaging protocol takes ≤ 2304 log² n time with probability ≥ 1 − 1/n.

▶ Corollary 3.7. Generating all K · n random variables takes ≤ 2304 log² n time with probability at least 1 − 1/n.

To analyze the time complexity of our protocol, we require the time bounds for completing an epidemic from [6]; the form below is taken from [20].

▶ Lemma 3.8 ([6]). Let T denote the time to complete an epidemic. Then E[T] = (n/(n−1)) · H_(n−1), Pr[T < (1/4) ln n] < 2e^(−√n), and for any α_u > 0, Pr[T > α_u ln n] < 4n^(−α_u/4 + 1).

▶ Lemma 3.9. With probability at least 1 − exp(−n/(24 log n)), it takes ≤ 3 · 2048 log² n time for all agents to propagate all K independent maximum geometric random variables.

Proof. Fix the ith maximum geometric random variable; we bound the time to complete an epidemic propagating it within the subpopulation responsible for the ith maximum. Lemma 3.4 says that w.h.p. at least n/(2K) and at most 2n/K agents participate in each individual epidemic. So the probability that the next interaction is among the subpopulation responsible for the ith maximum is ≥ (n/(2K) choose 2) / (n choose 2) ≥ 1/(8K²). Also, by setting α_u = 8 in Lemma 3.8, the probability that an epidemic takes longer than 8 ln n₀ time is less than 4/n₀ for any population size n₀. Thus, we can model the completion of an epidemic in the


responsible subpopulation by counting the number of trials of a biased coin until 8n₀ ln n₀ heads appear (i.e., 8 ln n₀ time's worth of interactions within a subpopulation of size n₀), where the probability of a head equals p = 1/(8K²). With m = 128K² n₀ ln n₀ trials, success probability p = 1/(8K²), and 8n₀ ln n₀ required successes, we have Pr[B(m, p) ≤ 8n₀ ln n₀] = Pr[NB(8n₀ ln n₀, p) ≥ m]. We use a Chernoff bound for binomial random variables, which states that, for 0 < δ ≤ 1, setting µ = E[B(m, p)] = mp, we have Pr[B(m, p) ≤ (1 − δ)µ] ≤ e^(−µδ²/2). Setting δ = 1/2 and µ = (128K² n₀ ln n₀)p = 16n₀ ln n₀,

  Pr[NB(8n₀ ln n₀, p) ≥ 128K² n₀ ln n₀]
    = Pr[B(128K² n₀ ln n₀, p) ≤ 8n₀ ln n₀]
    = Pr[B(128K² n₀ ln n₀, p) ≤ (1 − δ)µ]
    ≤ e^(−µδ²/2)
    = e^(−2n₀ ln n₀)
    < e^(−n/K)            (by Lemma 3.4)
    < e^(−n/(24 log n))   (by Lemma 3.2)

The above inequality shows that an epidemic takes more than 3 · 2048n log² n ≥ 128K² n₀ ln n₀ interactions, or equivalently 3 · 2048 log² n time, with probability at most exp(−n/(24 log n)). ◀

▶ Lemma 3.10. The LogSizeEstimation protocol takes O(log² n) time, with probability at least 1 − 7/n, for all agents to store the average of the K independent maxima.

Proof. We assume in the worst case that the individual subprotocols run consecutively rather than in parallel, and add their required times to upper-bound the parallel time. By Lemmas 3.8 and 3.9 and Corollaries 3.6 and 3.7 we have 8 ln n + 3 · 2048 log² n + 2304 log² n + 2304 log² n = O(log² n). By the union bound, the probability that LogSizeEstimation takes longer is ≤ 4/n + 1/n + 1/n + 1/n = 7/n for sufficiently large n. ◀

Finally, we combine these results to prove the main result of this section.

Proof of Theorem 3.1. By Lemma 3.3, the LogSizeEstimation protocol uses O(log⁷ n · log log n) states with probability at least 1 − 2.32/n. Lemma 3.10 proves that this protocol takes O(log² n) time with probability at least 1 − 7/n. Finally, Corollary C.10 guarantees that the average of the K obtained maxima is within additive error O(1) of log n with probability at least 1 − 2/n. By the union bound, after O(log² n) time, the agents store an estimate of log n with additive error O(1), with probability at least 1 − 12/n. ◀
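For intuition about the epidemic bound of Lemma 3.8 used throughout this section, this standalone Python sketch (ours, not from the paper) measures the parallel time for the epidemic i, j → j, j to infect the whole population:

```python
import random

def epidemic_time(n, rng):
    """Parallel time until all n agents are infected, starting from one
    infected agent; an interaction spreads the infection to an uninfected
    partner (the epidemic i, j -> j, j with j the infected state)."""
    infected = [False] * n
    infected[0] = True
    count, interactions = 1, 0
    while count < n:
        i, j = rng.sample(range(n), 2)
        interactions += 1
        if infected[i] != infected[j]:
            infected[i] = infected[j] = True
            count += 1
    return interactions / n   # parallel time

rng = random.Random(0)
t = epidemic_time(1000, rng)
print(t)   # concentrates around H_{n-1}, roughly ln(1000) ~ 7
```

Averaging over many runs recovers E[T] = (n/(n−1)) · H_(n−1) from Lemma 3.8.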

3.3

Terminating size estimation with a leader

It is possible to make the size-estimation protocol terminating if we start with an initial leader.

Theorem 3.11. There is a uniform terminating population protocol with an initial leader that, with probability ≥ 1 − O(1)/n, computes and stores in each agent an integer k such that |k − log n| ≤ 4.7, taking O(log² n) time and O(log⁷ n log log n) states.

Proof. By Theorem 4.1, a leader (or an o(n)-size junta of leaders) is required for termination to work with positive probability. In the presence of a leader, the population can simulate a phase clock. By [6, Corollary 1], there is a constant k₁ = max(8c, 8d/c) such that it takes at least d ln n time to reach phase k₁ with probability at least 1 − 1/n^c.

We change the algorithm to have two main stages. In stage one, the maximum G₀ is propagated. If we set the number of phases in the phase clock to be greater than 64, then reaching the maximum phase takes at least 8 ln n time with probability at least 1 − 1/n. By Lemma 3.8, 8 ln n time is sufficient to propagate the maximum of G₀ among the population with high probability. By Lemma C.4 and Corollary C.7, w.h.p. the maximum of G₀ among n agents lies in the interval [log n − log ln n, 2 log n + 3/2]. So, after the phase clock reaches k₁, the leader can terminate stage one and move to stage two, in which the agents compute the sum of K maxima of n geometric random variables. By setting the number of phases equal to k₂ · 2G₀, we can set a timer that counts up to (k₂/8) ln n log n time with probability at least 1 − 1/n for a sufficiently large constant k₂ [6, Corollary 1]. When the phase clock reaches phase k₂ · 2G₀, the leader terminates stage two and reports the average value it computed. ∎
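Stage one relies on the fact that a value spreads by epidemic to all n agents in O(log n) parallel time, well within the 8 ln n budget above. A minimal random-scheduler simulation (a sketch; the pair-selection model is the one from the introduction):

```python
import math
import random

def epidemic_parallel_time(n, rng):
    # One agent initially holds the maximum G0; in each interaction a random
    # pair meets and each member observes the other, so the value spreads in
    # both directions. Returns interactions/n once every agent holds it.
    has_max = [False] * n
    has_max[0] = True
    infected = 1
    interactions = 0
    while infected < n:
        a, b = rng.sample(range(n), 2)
        interactions += 1
        if has_max[a] != has_max[b]:
            has_max[a] = has_max[b] = True
            infected += 1
    return interactions / n

rng = random.Random(2)
n = 1000
t = epidemic_parallel_time(n, rng)   # typically around ln(n), i.e. ~7 for n = 1000
```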

4

Termination

Though the concept of termination has been referenced and studied in population protocols [8, 27, 28], to our knowledge no formal definition exists. We give an abstract definition that captures the behavior of most existing protocols that "perform a computational task". Let P be a population protocol with a set I of "valid" initial configurations for P, where each agent's memory has a Boolean field terminated that is False in every configuration in I.9 We say that a configuration ~c of P is terminated if at least one agent in ~c has terminated = True. Let κ > 0 and t : N → N. We say that P is κ-t-terminating if, for all ~i ∈ I, with probability ≥ κ, P reaches from ~i a terminated configuration ~c, but takes time ≥ t(n) to do so.

This definition leaves totally abstract which particular task (e.g., leader election) is assumed to have terminated. The idea is that if the task will not be complete before time t(n) with high probability, then no agent should set terminated = True until time ≥ t(n), with high probability. Proving an upper bound on t(n) in the definition of terminating thus implies that no protocol can be terminating if it requires time > t(n) to converge. The definition is applicable beyond the narrow goal of terminating a population protocol: it says more generally that a "signal" is produced after some amount of time. This signal may be used to terminate a protocol, to move it from one "stage" to another, or it may be some specific Boolean value relevant to a particular protocol; in any case the value starts False for all agents and is eventually set to True for at least one agent.

Let α > 0. We say a configuration ~c is α-dense if, for all s ∈ Λ, ~c(s) > 0 =⇒ ~c(s) ≥ αn. (Recall n = ||~c||.) In other words, every state present occupies at least fraction α of the population. We say protocol P with valid initial configuration set I is i.o.-dense if there exists α > 0 such that infinitely many ~i ∈ I are α-dense.
In particular, an i.o.-dense protocol cannot have an initial leader (a state present with count 1) in every ~i ∈ I. The next theorem, our second main result, shows that termination is impossible for uniform i.o.-dense protocols that require more than constant time.

Theorem 4.1. Let κ > 0 and t : N → N, and let P be a uniform i.o.-dense population protocol. If P is κ-t-terminating, then t(n) = O(1).

9

In the language of states, we partition the state set Λ into disjoint subsets ΛT and ΛN such that ΛT ∪ ΛN = Λ and ΛT are precisely the states with terminated = True.
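The α-dense condition is purely a property of state counts, so it is easy to operationalize. A small helper sketch over configurations represented as multisets (the function name is ours, for illustration):

```python
from collections import Counter

def is_alpha_dense(config, alpha):
    # alpha-dense: every state present in the configuration has count >= alpha * n.
    n = sum(config.values())
    return all(count >= alpha * n for count in config.values())

uniform = Counter({"x": 100})             # leaderless start: alpha-dense
with_leader = Counter({"L": 1, "x": 99})  # an initial leader breaks density
```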


Efficient size estimation and impossibility of termination in uniform dense population protocols

Let Λ be the (possibly infinite) set of all states of a population protocol. Since the protocol may be randomized, we consider a transition relation ∆ ⊆ Λ⁴, writing a, b → c, d to denote that (a, b, c, d) ∈ ∆ (i.e., if agents in states a and b interact, then one of the possible random outcomes is to change to states c and d). For ρ ∈ (0, 1], we write a, b →ρ c, d to denote that when states a and b interact, with probability ρ they transition to c and d. Say that ρ is the rate constant of the transition a, b →ρ c, d. If there exists ρ′ ≥ ρ such that a, b →ρ′ c, d, we write c ∈ PROD_ρ(a, b) and d ∈ PROD_ρ(a, b). (In other words, c ∈ PROD_ρ(a, b) if c is produced with probability at least ρ whenever a and b interact.) For any Γ ⊆ Λ and ρ ∈ [0, 1], define PROD_ρ(Γ) = {s ∈ Λ | (∃a, b ∈ Γ) s ∈ PROD_ρ(a, b)}.10

Let Λ⁰_ρ = Λ⁰ ⊆ Λ. For all i ∈ N⁺, define Λ^i_ρ = Λ^{i−1}_ρ ∪ PROD_ρ(Λ^{i−1}_ρ). For any m ∈ N, if s ∈ Λ^m_ρ, we say s is m-ρ-producible from Λ⁰.11 For a configuration ~c, we say s is m-ρ-producible from ~c if s is m-ρ-producible from Λ⁰ = {s ∈ Λ | ~c(s) > 0}, the set of states present in ~c.12

Our main technical tool is the following lemma, a variant of the "timer/density lemma" of [19] (see also [1]). The original lemma states that in a protocol with O(1) states, from any sufficiently large α-dense configuration, in O(1) time all states appear with δ-density (for some 0 < δ < α). The proof is similar to that of [19], but is re-tooled to apply to protocols with a non-constant set of states (and to use the discrete-time model of population protocols instead of the continuous-time model of chemical reaction networks).13 The key new idea is that, even if a protocol has infinitely many states (of which only finitely many can be produced in finite time), for the subset Λ^m_ρ of states producible via only m transitions, each having rate constant at least ρ, all states in Λ^m_ρ are produced in constant time with high probability from sufficiently large configurations.

Lemma 4.2. Let α > 0, m ∈ N⁺, ρ ∈ (0, 1], and let P be a population protocol. Then there are constants ε, δ, n₀ > 0 such that, for all n ≥ n₀ and all α-dense configurations ~c of P with n = ||~c||, the following holds. Let Λ^m_ρ be the set of states m-ρ-producible from ~c. For s ∈ Λ and t > 0, let C_{t,s} be the random variable denoting the count of s at time t, assuming at time 0 the configuration is ~c. Then Pr[(∀s ∈ Λ^m_ρ) C_{1,s} ≥ δn] ≥ 1 − 2^{−εn}.

A self-contained proof is in Section D. We now use Lemma 4.2 to prove Theorem 4.1.

Proof of Theorem 4.1.
Assume P is κ-t-terminating; we will show t(n) = O(1). Let (~c_i)_{i=0}^∞ be an infinite sequence of α-dense initial configurations in I. Dickson's Lemma [18] states that every infinite sequence in N^k has an infinite nondecreasing subsequence, so assume without loss of generality that ~c_i ≤ ~c_{i+1} for all i ∈ N. Let Λ⁰ = {s ∈ Λ | ~c₀(s) > 0} be the set of states present in ~c₀.

10

In other words, PROD_ρ(Γ) is the set of states producible by a single transition, assuming that only states in Γ are present and that the only transitions used are those that have probability at least ρ of occurring when their input states interact. 11 If s is m-ρ-producible from Λ⁰, then in other words, s is producible from any sufficiently large configuration that contains only states in Λ⁰, using at most m different types of transitions, each of which has probability at least ρ. More than one instance of each transition, however, may be necessary. For instance, with transitions xᵢ, xᵢ →ρ x_{i+1}, q for all i ∈ N⁺, x_m is m-ρ-producible from Λ⁰ = {x₁}, but 2^m transitions of type x₁, x₁ → x₂, q must be executed, followed by 2^{m−1} of type x₂, x₂ → x₃, q, etc. 12 Note that s may be m-ρ-producible from ~c, but not actually producible from ~c, if the counts in ~c are too small for the requisite transitions to produce s. 13 Alistarh et al. [1] also prove a variant applying to protocols with ω(1) states, but for a different purpose: to show that all states in Λ appear as long as |Λ| ≤ (1/2) log log n. However, beyond that bound, the lemma does not hold [24]. In our case, we are not trying to show that all states in Λ appear, only those in some constant-size subset of states, all of which are m-ρ-producible from the initial configuration.

D. Doty and M. Eftekhari

By hypothesis, Pr[P terminates from ~c₀] ≥ κ > 0. Thus there is at least one finite execution E starting with ~c₀ and ending in a terminated configuration. Let m = |E| be the length of this execution. Let the i'th transition in E be denoted a, b →ρᵢ c, d, and let ρ = min_{1≤i≤m} ρᵢ be the minimum rate constant of any transition in E. Then every state appearing in configurations in E is m-ρ-producible from ~c₀, i.e., is in Λ^m_ρ where Λ⁰ = {s ∈ Λ | ~c₀(s) > 0} is the set of states present in ~c₀.

For any ℓ ≥ 1, since ~c₀ ≤ ~c_ℓ, all states in Λ^m_ρ are m-ρ-producible from ~c_ℓ as well. By Lemma 4.2, there are constants ε, δ, n₀ > 0 such that, for all ℓ ∈ N with n = ||~c_ℓ|| ≥ n₀, letting C_{1,s} be the random variable denoting the count of s at time 1, assuming at time 0 the configuration is ~c_ℓ, Pr[(∀s ∈ Λ^m_ρ) C_{1,s} ≥ δn] ≥ 1 − 2^{−εn}. However, Λ^m_ρ contains terminated states, so for all ~c_ℓ with ||~c_ℓ|| ≥ n₀, with probability ≥ 1 − 2^{−εn}, P terminates within time 1. Since 2^{−εn} < κ for sufficiently large n, this implies that if P is κ-t-terminating, then t(n) ≤ 1 for sufficiently large n. Thus t(n) = O(1). ∎

Observe how the assumption of uniformity is used in the proof: we take a set of transitions used in the population ~c₀ and apply it to a larger population ~c_ℓ. In a nonuniform protocol, the transitions may not be legal to apply to ~c_ℓ. As a concrete example, in a nonuniform protocol such as that sketched in Fig. 1, an agent increments a counter using transitions such as c₇, x → c₈, y until the counter exceeds log n, then produces a termination signal t via a transition c₈, x → t, y. The transition c₈, x → t, y producing this signal is not legal in a population more than twice as large as n, since the value log n is at least 1 larger in such a population. In this example, the corresponding transition of the larger protocol with the same input states simply increments the counter without producing a termination signal: c₈, x → c₉, y.
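The sets Λ^m_ρ used in the proof can be computed by a direct fixed-point iteration. A sketch, with transitions encoded as ((a, b), (c, d), rate) triples (an illustrative encoding of ours, reusing the counter transitions from the example above):

```python
def producible(initial_states, transitions, m, rho):
    # Lambda^0 = initial_states; Lambda^i = Lambda^{i-1} union PROD_rho(Lambda^{i-1}),
    # where PROD_rho only considers transitions with rate constant >= rho.
    level = set(initial_states)
    for _ in range(m):
        new = set(level)
        for (a, b), (c, d), rate in transitions:
            if rate >= rho and a in level and b in level:
                new.update((c, d))
        if new == level:   # fixed point: no further states producible
            return level
        level = new
    return level

ts = [(("c7", "x"), ("c8", "y"), 1.0),
      (("c8", "x"), ("c9", "y"), 1.0),
      (("c8", "x"), ("t", "y"), 1.0)]   # termination signal of the small protocol
reach = producible({"c7", "x"}, ts, m=3, rho=0.5)
# reach == {"c7", "c8", "c9", "t", "x", "y"}
```

From a population in which only c₇ and x are present, both the incremented counters and the termination signal t are 3-ρ-producible, which is exactly the situation the proof exploits.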

Acknowledgements. The authors are grateful to Eric Severson for helpful comments on earlier drafts.

References

1. Dan Alistarh, James Aspnes, David Eisenstat, Rati Gelashvili, and Ronald L. Rivest. Time-space trade-offs in population protocols. In SODA 2017: Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2560–2579. SIAM, 2017.
2. Dan Alistarh, James Aspnes, and Rati Gelashvili. Space-optimal majority in population protocols. In SODA 2018: Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2221–2239. SIAM, 2018.
3. Dan Alistarh and Rati Gelashvili. Polylogarithmic-time leader election in population protocols. In 42nd International Colloquium on Automata, Languages, and Programming (ICALP), volume 9135 of Lecture Notes in Computer Science, pages 479–491. Springer, Berlin, Heidelberg, 2015.
4. Dan Alistarh, Rati Gelashvili, and Milan Vojnović. Fast and exact majority in population protocols. In PODC 2015: Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, pages 47–56. ACM, 2015.
5. Dana Angluin, James Aspnes, Zoë Diamadi, Michael J. Fischer, and René Peralta. Computation in networks of passively mobile finite-state sensors. Distributed Computing, 18(4):235–253, March 2006.
6. Dana Angluin, James Aspnes, and David Eisenstat. Fast computation by population protocols with a leader. Distributed Computing, 21(3):183–199, September 2008.
7. James Aspnes, Joffroy Beauquier, Janna Burman, and Devan Sohier. Time and space optimal counting in population protocols. In 20th International Conference on Principles of Distributed Systems (OPODIS 2016), volume 70, pages 13:1–13:17, 2017.
8. James Aspnes and Eric Ruppert. An introduction to population protocols. Bulletin of the European Association for Theoretical Computer Science, 93:98–117, October 2007.
9. Joffroy Beauquier, Janna Burman, Simon Claviere, and Devan Sohier. Space-optimal counting in population protocols. In DISC 2015: International Symposium on Distributed Computing, pages 631–646. Springer, 2015.
10. Amanda Belleville, David Doty, and David Soloveichik. Hardness of computing and approximating predicates and functions with leaderless population protocols. In ICALP 2017: 44th International Colloquium on Automata, Languages, and Programming, volume 80 of LIPIcs, pages 141:1–141:14, 2017.
11. Petra Berenbrink, Dominik Kaaser, Peter Kling, and Lena Otterbach. Simple and efficient leader election. In 1st Symposium on Simplicity in Algorithms (SOSA 2018), volume 61, pages 9:1–9:11, 2018.
12. Andreas Bilke, Colin Cooper, Robert Elsässer, and Tomasz Radzik. Brief announcement: Population protocols for leader election and exact majority with O(log² n) states and O(log² n) convergence time. In PODC 2017: Proceedings of the ACM Symposium on Principles of Distributed Computing, pages 451–453. ACM, 2017.
13. James M. Bower and Hamid Bolouri. Computational Modeling of Genetic and Biochemical Networks. MIT Press, 2004.
14. Shukai Cai, Taisuke Izumi, and Koichi Wada. Space complexity of self-stabilizing leader election in passively-mobile anonymous agents. In Shay Kutten and Janez Žerovnik, editors, Structural Information and Communication Complexity, pages 113–125, Berlin, Heidelberg, 2010. Springer Berlin Heidelberg.
15. Ioannis Chatzigiannakis, Othon Michail, Stavros Nikolaou, Andreas Pavlogiannis, and Paul G. Spirakis. Passively mobile communicating machines that use restricted space. Theoretical Computer Science, 412(46):6469–6483, October 2011.
16. Ho-Lin Chen, David Doty, and David Soloveichik. Deterministic function computation with chemical reaction networks. Natural Computing, 13(4):517–534, 2013. Special issue of invited papers from DNA 2012.
17. Yuan-Jyue Chen, Neil Dalchau, Niranjan Srinivas, Andrew Phillips, Luca Cardelli, David Soloveichik, and Georg Seelig. Programmable chemical controllers made from DNA. Nature Nanotechnology, 8(10):755–762, 2013.
18. Leonard E. Dickson. Finiteness of the odd perfect and primitive abundant numbers with n distinct prime factors. American Journal of Mathematics, 35:413–422, 1913.
19. David Doty. Timing in chemical reaction networks. In SODA 2014: Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 772–784, 2014.
20. David Doty, Mahsa Eftekhari, Othon Michail, Paul G. Spirakis, and Michail Theofilatos. Brief announcement: Exact size counting in uniform population protocols in nearly logarithmic time. In DISC 2018: 32nd International Symposium on Distributed Computing, 2018.
21. David Doty and Monir Hajiaghayi. Leaderless deterministic chemical reaction networks. Natural Computing, 14(2):213–223, 2015. Preliminary version appeared in DNA 2013.
22. David Doty and David Soloveichik. Stable leader election in population protocols requires linear time. Distributed Computing, 31(4):257–271, 2018. Special issue of invited papers from DISC 2015.
23. Bennett Eisenberg. On the expectation of the maximum of iid geometric random variables. Statistics & Probability Letters, 78(2):135–143, 2008.
24. Leszek Gąsieniec and Grzegorz Stachowiak. Fast space optimal leader election in population protocols. In SODA 2018: ACM-SIAM Symposium on Discrete Algorithms, 2018. To appear.
25. Leszek Gąsieniec, Grzegorz Stachowiak, and Przemysław Uznański. Almost logarithmic-time space optimal leader election in population protocols. Technical Report 1802.06867, arXiv, 2018.
26. Adrian Kosowski and Przemysław Uznański. Population protocols are fast. CoRR, abs/1802.06872, 2018.
27. Othon Michail. Terminating distributed construction of shapes and patterns in a fair solution of automata. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, pages 37–46, 2015. Also in Distributed Computing, 2017.
28. Othon Michail, Ioannis Chatzigiannakis, and Paul G. Spirakis. Terminating population protocols via some minimal global knowledge assumptions. In Stabilization, Safety, and Security of Distributed Systems (SSS), pages 77–89. Springer, 2012.
29. Michael Mitzenmacher and Eli Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
30. Philippe Rigollet. Lecture notes for MIT course 18.S997: High dimensional statistics, 2015. URL: https://ocw.mit.edu/courses/mathematics/18-s997-high-dimensional-statistics-spring-2015/lecture-notes/.
31. David Soloveichik, Matthew Cook, Erik Winfree, and Jehoshua Bruck. Computation with finite stochastic chemical reaction networks. Natural Computing, 7(4):615–633, 2008.
32. Niranjan Srinivas, James Parkin, Georg Seelig, Erik Winfree, and David Soloveichik. Enzyme-free nucleic acid dynamical systems. Science, 358(6369):eaal2052, 2017.
33. Vito Volterra. Variazioni e fluttuazioni del numero d'individui in specie animali conviventi. Mem. Acad. Lincei Roma, 2:31–113, 1926.


A

Proofs for correctness of size estimation protocol

This section contains proofs of lemmas required to analyze the correctness and time/space complexity of the size estimation protocol of Theorem 3.1. We use a Chernoff bound for Poisson random variables [29].

Theorem A.1 ([29]). Let P(λ) be a Poisson random variable with rate λ. For all l ∈ N:
If λ > l, then Pr[P(λ) ≤ l] ≤ e^{−λ}(eλ/l)^l.
If λ < l, then Pr[P(λ) ≥ l] ≤ e^{−λ}(eλ/l)^l.

We now prove some lemmas used in the main text.

Lemma 3.4. With probability at least 1 − 4(2/e)^{√n}, for each index i ∈ {1, . . . , K}, at least n/(2K) and at most 2n/K agents are responsible for i.

Proof. Agents choose a random index from the set {1, . . . , K} and stay responsible for the chosen index. We can model this event as the problem of throwing n balls into K bins. It is well known [29] that if we approximate the distribution of tossing m balls into n bins by a distribution in which each bin receives a number of balls given by a Poisson distribution with rate m/n (the so-called "Poisson heuristic"), then any event whose probability is monotonically increasing or decreasing in m, and which occurs in the Poisson case with probability at most p, occurs in the exact case with probability at most 2p.

Setting l = λ/2 in Theorem A.1 gives

Pr[P(λ) ≤ λ/2] ≤ e^{−λ}(eλ/(λ/2))^{λ/2} = e^{−λ}(2e)^{λ/2} = (2/e)^{λ/2} ≤ (2/e)^{n/(48 log n)} < (2/e)^{√n}.

The second-to-last inequality holds because we set λ = n/K and, by Lemma 3.2, 4 log n ≤ K ≤ 24 log n. Similarly, setting l = 2λ in Theorem A.1 gives

Pr[P(λ) ≥ 2λ] ≤ e^{−λ}(eλ/(2λ))^{2λ} = e^{−λ}(e/2)^{2λ} = (e/4)^{λ} ≤ (e/4)^{n/(24 log n)} < (2/e)^{√n}.

The lemma follows by applying the Poisson approximation bound and the union bound over both tails. ∎
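The load bounds of Lemma 3.4 can be checked empirically with a direct balls-in-bins simulation (a sketch; n is arbitrary and K is set within the range Lemma 3.2 guarantees):

```python
import math
import random

rng = random.Random(3)
n = 10_000
K = 4 * round(math.log2(n))     # K lies in [4 log n, 24 log n]
loads = [0] * K
for _ in range(n):              # each agent picks a uniformly random index
    loads[rng.randrange(K)] += 1
# Lemma 3.4: every index is chosen by between n/(2K) and 2n/K agents w.h.p.
low, high = n / (2 * K), 2 * n / K
```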


Lemma 3.5. With probability ≥ 1 − 1/n, it takes O(log² n) time for all agents, starting with a counter at index 1, to have their counters reach K and reset back to 1.

Proof. Fix a single agent. Let Xᵢ be the random variable representing the number of interactions it takes for this agent to increase its counter from i − 1 to i. Once all agents' GeneratedIndex values reach K, by Lemma 3.4 at least n/(2K) agents are responsible for each index with high probability. Therefore, the number of interactions until the current agent is selected as the sender and an agent responsible for index i as the receiver is stochastically dominated by a geometric random variable with success probability p = (1/n) · (1/(2K)) = 1/(2nK). Thus E[Xᵢ] = 2nK, where K, the total number of indices, lies in the interval [4 log n, 24 log n] with high probability (Lemma 3.2).

We wish to bound the sum of K geometric random variables, each with success probability p, otherwise known as a negative binomial random variable NB(K, p). With m trials, success probability p, and number of successes K, we have that Pr[B(m, p) ≤ K] = Pr[NB(K, p) ≥ m]. We use a Chernoff bound for binomial random variables, which states that, for 0 < δ ≤ 1, setting μ = E[B(m, p)] = mp, Pr[B(m, p) ≤ (1 − δ)μ] ≤ e^{−μδ²/2}. Setting δ = 1/2 and μ = 8nK²p = 4K,

Pr[NB(K, p) ≥ 8nK²] = Pr[B(8nK², p) ≤ K]
≤ Pr[B(8nK², p) ≤ (1 − δ)μ]
≤ e^{−μδ²/2} = e^{−4K(1/4)/2} = e^{−K/2}
< e^{−2 log n}    (by Lemma 3.2)
< 1/n².

By the union bound over all n agents, the probability that any agent takes more than 8nK² ≤ 2304n log² n (since K ≤ 24 log n by Lemma 3.2) interactions (equivalently, 2304 log² n time) is less than 1/n. ∎
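The binomial/negative-binomial duality used in the proof can be checked by simulation in its exact form Pr[B(m, p) ≥ K] = Pr[NB(K, p) ≤ m] — the K-th success occurs within m trials iff m trials contain at least K successes. The parameters below are arbitrary:

```python
import random

def binomial_sample(m, p, rng):
    return sum(rng.random() < p for _ in range(m))

def negative_binomial_sample(k, p, rng):
    # Trial index of the k-th success.
    trials = successes = 0
    while successes < k:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials

rng = random.Random(4)
m, K, p, N = 100, 10, 0.05, 10_000
lhs = sum(binomial_sample(m, p, rng) >= K for _ in range(N)) / N
rhs = sum(negative_binomial_sample(K, p, rng) <= m for _ in range(N)) / N
# lhs and rhs estimate the same probability (about 0.03 here).
```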


B

Simulation

Simulation results are shown in Fig. 2.

[Plot omitted: convergence time (parallel time = interactions/population size, 0–25000) versus population size (1 to 10⁶, logarithmic axis); see the caption of Fig. 2.]
Figure 2 Simulated convergence time of the protocol. Although the proofs give only that the estimate of log n is likely to get within additive error of 5, in practice the estimate is always within 2, so this is how we define convergence in the experiment. The dots indicate the convergence time of individual experiments. The population size axis is logarithmic (i.e., exactly O(c log10 n) time complexity would correspond to a straight line with slope c). Recall setting cinit lower increases the probability that the initial estimate G0 is too low, i.e., cinit · G0 ≤ log n. The circular dots in the plot (larger values) are 10 experiments at each value of n ∈ {103 , 104 , 105 , 106 } with cinit = 4, the value used in our main analysis. The line fitted to them goes through the value 50 log n at each value of n. The crosses in the plot (smaller values) are 10 experiments at each value of n ∈ {103 , 104 , 105 , 106 } with cinit = 1, which appears to work in practice (maintaining that the approximation error remains within 2 of log n) and faster by a significant constant. The line fitted to them goes through the value 5 log n at each value of n. For n ≤ 100 the protocol has large probability of failure to converge due to some index i ∈ {1, . . . , K} having none of the agents choose to be responsible for i.


C


Chernoff bound on sums of maxima of geometric random variables

C.1

Sub-exponential random variables

Definition C.1. Let α, β > 0 and let X be a random variable. We say X is α-β-sub-exponential if, for all λ > 0, Pr[|X − E[X]| ≥ λ] ≤ αe^{−λ/β}.

The following lemma is well known; we prove it explicitly since the exact form is convenient for our purposes but more general than typically expressed. It shows that exponential tail bounds on Pr[|X − E[X]| > λ] give bounds on the moment-generating functions of the random variables X − E[X] and E[X] − X. The proof is modeled on Rigollet's proof of the analogous lemma for sub-gaussian random variables [30].

Lemma C.2 ([30]). Let X be an α-β-sub-exponential random variable. Then for all s ∈ [−1/(2β), 1/(2β)], we have E[e^{s(X−E[X])}], E[e^{s(E[X]−X)}] ≤ 1 + 2αβ²s².

Proof. Let k ∈ N⁺. Then

E[|X − E[X]|^k] = ∫₀^∞ Pr[|X − E[X]|^k ≥ λ] dλ = ∫₀^∞ Pr[|X − E[X]| ≥ λ^{1/k}] dλ
≤ ∫₀^∞ αe^{−λ^{1/k}/β} dλ = αβ^k k ∫₀^∞ e^{−u} u^{k−1} du    (substituting u = λ^{1/k}/β)
= αβ^k kΓ(k) = αβ^k k!,

where Γ(k) = ∫₀^∞ e^{−u} u^{k−1} du is the gamma function, known to equal (k − 1)! for k ∈ N⁺. Then for all s ∈ [−1/(2β), 1/(2β)],

E[e^{s(X−E[X])}]
= E[Σ_{k=0}^∞ (s(X − E[X]))^k / k!]    (Taylor expansion of the exponential function)
= Σ_{k=0}^∞ s^k E[(X − E[X])^k] / k!    (dominated convergence theorem)
= 1 + s E[X − E[X]] + Σ_{k=2}^∞ s^k E[(X − E[X])^k] / k!    (and E[X − E[X]] = 0)
≤ 1 + Σ_{k=2}^∞ |s|^k E[|X − E[X]|^k] / k!    (odd terms can only get larger)
≤ 1 + Σ_{k=2}^∞ |s|^k αβ^k k! / k! = 1 + α Σ_{k=2}^∞ (|s|β)^k = 1 + αs²β² Σ_{k=0}^∞ (|s|β)^k
≤ 1 + αβ²s² Σ_{k=0}^∞ 2^{−k}    (since |s| ≤ 1/(2β))
= 1 + 2αβ²s².

The bound for E[e^{s(E[X]−X)}] is derived by a similar argument. ∎

The following Chernoff bound is well-known, but stated in a more convenient form for our purposes.


Lemma C.3. Let α, β > 0 and K ∈ N⁺. Let X₁, . . . , X_K be i.i.d. α-β-sub-exponential random variables, and define S = Σ_{i=1}^K Xᵢ. Then for all t ≥ 0,

Pr[|S − E[S]| ≥ t] ≤ 2 (1 + α/2)^K / e^{t/(2β)}.

Proof. For all s, t > 0,

Pr[S − E[S] > t] = Pr[e^{s(S−E[S])} > e^{st}]
≤ e^{−st} E[e^{s(S−E[S])}]    (Markov's inequality)
= e^{−st} E[e^{s Σ_{i=1}^K (Xᵢ − E[Xᵢ])}]    (linearity of expectation)
= e^{−st} E[∏_{i=1}^K e^{s(Xᵢ−E[Xᵢ])}]
= e^{−st} ∏_{i=1}^K E[e^{s(Xᵢ−E[Xᵢ])}]    (independence of the Xᵢ's).

By Lemma C.2, for all |s| ≤ 1/(2β), E[e^{s(Xᵢ−E[Xᵢ])}] ≤ 1 + 2αβ²s², so letting s = 1/(2β),

Pr[S − E[S] > t] ≤ e^{−st} (1 + 2αβ²s²)^K = e^{−t/(2β)} (1 + α/2)^K.

The proof that Pr[E[S] − S ≥ t] < e^{−t/(2β)}(1 + α/2)^K is symmetric. By the union bound, Pr[|S − E[S]| ≥ t] < 2 · e^{−t/(2β)}(1 + α/2)^K. ∎

C.2

Geometric random variables and their maximum

We say G is a p-geometric random variable if it is the number of consecutive flips until the first H (including the H), when flipping a coin with Pr[H] = p. Thus E[G] = 1/p; in particular E[G] = 2 if p = 1/2. Defining M = max_{1≤i≤n} Gᵢ, where each Gᵢ is an i.i.d. 1/2-geometric random variable, it is known [23] that E[M] ≈ log n. Lemma C.5 shows a tail bound on M for general p-geometric random variables, which we will later apply to the case p = 1/2.

We first require a technical lemma relating E[M] and log n more precisely, and more generally for p-geometric random variables with p ≠ 1/2. Let Hₙ = Σ_{i=1}^n 1/i be the n'th harmonic number. Let γ = lim_{n→∞} (Hₙ − ln n) ≈ 0.577 be the Euler–Mascheroni constant; for all n ≥ 50 we have Hₙ − ln n − γ ≤ 0.01.

Lemma C.4. Let G₁, . . . , Gₙ be i.i.d. p-geometric random variables with q = 1 − p ≥ 1/e, n ≥ 50, and let M = max_{1≤i≤n} Gᵢ. Let ε₁ = 0.01 and ε₂ = 0.0006. Then

(ln n + γ)/ln(1/q) + 1/2 − ε₂ < E[M] < (ln n + γ + ε₁)/ln(1/q) + 1/2 + ε₂;

in particular, for q = p = 1/2 we have log n + 1 < E[M] < log n + 3/2.

Proof. Eisenberg [23] showed that if q ≥ 1/e, then (1/λ)Hₙ − 0.0006 ≤ E[M] − 1/2 < (1/λ)Hₙ + 0.0006, where q = e^{−λ}, i.e., λ = ln(1/q). Thus (1/λ)Hₙ + 1/2 − ε₂ ≤ E[M] < (1/λ)Hₙ + 1/2 + ε₂, and since ln n + γ ≤ Hₙ ≤ ln n + γ + ε₁ for n ≥ 50, we conclude (ln n + γ)/ln(1/q) + 1/2 − ε₂ < E[M] < (ln n + γ + ε₁)/ln(1/q) + 1/2 + ε₂. ∎

D. Doty and M. Eftekhari I Lemma C.5. Let G1 , . . . , Gn be i.i.d. p-geometric random variables with q = 1 − p ≥ 1e , n ≥ 50, and let M = max Gi . Let 1 = 0.01 and 2 = 0.0006. Then for all 1≤i≤n λ > 0, Pr [E [M] − M ≥ λ] ≤ exp −q 1/2+2 −(γ+1 ) ln q−λ and Pr [M − E [M] ≥ λ] ≤ q λ−1/2−2 −γ ln q + q 2λ−1−22 −2γ ln q . Proof. For each t ∈ N, Pr [Gi ≥ t] = q t−1 , so Pr [Gi ≤ t] = 1 − Pr [Gi ≥ t + 1] = 1 − q t . Qn n Since the Gi ’s are independent, Pr [M ≤ t] = i=1(1 − q t )= (1 − q t ) . n 2 ≤ ex for n > Below we use Lemma C.4 and the inequalities ex 1 − xn ≤ 1 + nx 1, |x| < n. Setting t = E [M] − λ, we have n 1 − qt n = 1 − q (E[M]−λ) n < 1 − q log1/q n−(γ+1 ) ln q+1/2+2 −λ n = 1 − q log1/q n q 1/2+2 −(γ+1 ) ln q−λ n q 1/2+2 −(γ+1 ) ln q−λ = 1− n x n < exp −q 1/2+2 −(γ+1 ) ln q−λ since 1 + ≤ ex n Similarly, letting t = E [M] + λ − 1, we have Pr [M ≤ E [M] − λ]

=

Pr [M ≥ E [M] + λ] = = = < = = < = ≤ ≤ ≤

1 − Pr [M ≤ E [M] + λ − 1] n 1 − 1 − qt n 1 − 1 − q E[M]+λ−1 n 1 − 1 − q log1/q n+1/2−2 −γ ln q+λ−1 n 1 − 1 − q log1/q n q λ−1/2−2 −γ ln q n q λ−1/2−2 −γ ln q 1− 1− n q 2(λ−1/2−2 −γ ln q) x2 x n 1 − exp −q λ−1/2−2 −γ ln q 1− since ex 1 − ≤ 1+ n n n 2λ−1−22 −2γ ln q q 1 − exp −q λ−1/2−2 −γ ln q + ln 1 − n 2λ−1−22 −2γ ln q q q λ−1/2−2 −γ ln q − ln 1 − since 1 − ex ≤ −x n 2q 2λ−1−22 −2γ ln q since ln(1 − x) ≥ −2x if x < 0.7 n q λ−1/2−2 −γ ln q + q 2λ−1−22 −2γ ln q since n ≥ 2. J

q λ−1/2−2 −γ ln q +

The following corollary for the special case of p = 21 is used for our main result, showing that a maximum of 12 -geometric random variables is α-β-sub-exponential for α = 3.31, β = 2. I Corollary C.6. Let G1 , . . . , Gn be i.i.d. 21 -geometric random variables, n ≥ 50, and let M = max Gi . Then for all λ > 0, Pr [|M − E [M] | ≥ λ] < 3.31e−λ/2 . 1≤i≤n

XX:21

XX:22

Efficient size estimation and impossibility of termination in uniform dense population protocols

Proof. By Lemma C.5 and the union bound, Pr [|M − E [M] | ≥ λ] < exp −q 1/2+2 −(γ+1 ) ln q−λ + q λ−1/2−2 −γ ln q + q 2λ−1−22 −2γ ln q = exp −2λ−(γ+1 ) ln 2−1/2−2 + 21/2+2 −λ−γ ln 2 + 21+22 −2λ−2γ ln 2 3.31e−λ/2 .

<

justified below

To see the final inequality, note that

< exp −2λ−(γ+1 ) ln 2−1/2−2

exp −2λ−1

=

exp −2λ /2

≤

exp (−λ/2)

21/2+2 −λ−γ ln 2

=

21/2+2 −γ ln 2 · 2−λ

=

21/2+2 −γ ln 2 · 4−λ/2

<

21/2+2 −γ ln 2 · e−λ/2

<

1.1 · e−λ/2 .

21+22 −2λ−2γ ln 2

=

21+22 −2γ ln 2 · 2−2λ

=

21+22 −2γ ln 2 · 16−λ/2

<

(1.1)2 · e−λ/2 .

So, their sum is less than 3.31e−λ/2 .

J
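Corollary C.6's tail bound leaves plenty of slack, which a quick simulation shows (a sketch; sample sizes are arbitrary and the empirical mean stands in for E[M]):

```python
import math
import random

def max_of_geometrics(n, rng):
    # Maximum of n i.i.d. 1/2-geometric samples.
    best = 0
    for _ in range(n):
        g = 1
        while rng.random() < 0.5:
            g += 1
        best = max(best, g)
    return best

rng = random.Random(5)
n, trials = 200, 2000
samples = [max_of_geometrics(n, rng) for _ in range(trials)]
mean = sum(samples) / trials       # ~ log2(n) + 1.33 by Lemma C.4
tails_ok = all(
    sum(abs(s - mean) >= lam for s in samples) / trials <= 3.31 * math.exp(-lam / 2)
    for lam in (2, 4, 6)
)
```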

The following is a stronger corollary of Lemma C.5 for p = 1/2, which does not fit the definition of α-β-sub-exponential but is useful elsewhere in our analysis.

Corollary C.7. Let G₁, . . . , Gₙ be i.i.d. 1/2-geometric random variables, n ≥ 50, and M = max_{1≤i≤n} Gᵢ. Then Pr[M ≥ E[M] + log n] < 0.32 · n^{−1} and Pr[M ≤ E[M] − (1 + log ln n)] < n^{−1}.

Proof. Note that for all λ > 0, setting q = 1/2,

Pr[M − E[M] ≥ λ] < q^{λ−1/2−ε₂−γ ln q} + q^{2λ−1−2ε₂−2γ ln q} = 2^{1/2+ε₂−λ−γ ln 2} + 2^{1+2ε₂−2λ−2γ ln 2}.

Setting λ = log n,

2^{1/2+ε₂−λ−γ ln 2} = 2^{1/2+ε₂−γ ln 2} · 2^{−log n} < 0.11 · n^{−1}

and

2^{1+2ε₂−2λ−2γ ln 2} = 2^{1+2ε₂−2γ ln 2} · 2^{−2 log n} < 0.21 · n^{−2} ≤ 0.21 · n^{−1}.

Combining these, we obtain the first inequality Pr[M − E[M] ≥ log n] < 0.32 · n^{−1}. To see the second, setting q = 1/2 and λ = 1 + log ln n,

Pr[E[M] − M ≥ λ] < exp(−q^{1/2+ε₂−(γ+ε₁) ln q−λ}) = exp(−2^{λ−(γ+ε₁) ln 2−1/2−ε₂})
< exp(−2^{λ−1}) = exp(−2^{1+log ln n−1}) = exp(−ln n) = n^{−1}. ∎

I Lemma C.8. Let n, K ∈ N+ , n ≥ 50. Let M1 , . . . , MK be i.i.d. random variables, each PK of which is the maximum of n i.i.d. 12 -geometric random variables. Define S = i=1 Mi . Then for all t ≥ 0, Pr [|S − E [S] | ≥ t] ≤ 2 · eK−t/4 . Proof. By Corollary C.6 and Lemma C.3, for α = 3.31 < 2e − 2 and β = 2, we have K K 1 + 2e−2 (1 + α/2) 2 <2 = 2 · (e)K · e−t/4 = 2 · eK−t/4 . J Pr [|S − E [S] | ≥ t] < 2 et/(2β) et/4 I Corollary C.9. Let a > 4, n ∈ N+ , n > 50, K ≥

ln n

a 4 −1

, and δ0 = 1/2 + γ/ ln 2 − 2 .

Let M1 , . . . , MK be i.i.d. random variables, each of which is the maximum of n i.i.d. PK geometric random variables. Define S = i=1 Mi . Then S 2 Pr − log n − δ0 ≥ a ≤ . K n

1 2-

Proof. We first manipulate the expression in the conclusion of the corollary to put it in a form where we can apply Lemma C.8.

S Pr − log n − δ0 ≥ a K = Pr [S − K (log n + 1/2 + γ/ ln 2 − 2 ) ≥ aK] <

Pr [S − E [S] ≥ aK] .

Since K (log n + 1/2 + γ/ ln 2 − 2 ) ≤ E[S]

S Pr log n + δ0 − ≥a K = Pr [K (log n + 1/2 + γ/ ln 2 − 2 ) − S ≥ aK] Pr [K (log n + 1/2 + γ/ ln 2 + 1 / ln 2 + 2 ) − S ≥ aK + (1 / ln 2 + 22 ) K] γ + 1 < Pr [E [S] − S ≥ (a + 1 / ln 2 + 22 )K] Since E[S] < K log n + 1/2 + + 2 ln 2 < Pr [E [S] − S ≥ aK] . =

XX:24

Efficient size estimation and impossibility of termination in uniform dense population protocols

Since the events $\left\{\frac{S}{K}-\log n-\delta_0\ge a\right\}$ and $\left\{\log n+\delta_0-\frac{S}{K}\ge a\right\}$ are disjoint, and the events $\{S-\mathbb{E}[S]\ge aK\}$ and $\{\mathbb{E}[S]-S\ge aK\}$ are disjoint, the union bound holds with equality, so
\begin{align*}
\Pr\left[\left|\frac{S}{K}-\log n-\delta_0\right|\ge a\right]
&= \Pr\left[\frac{S}{K}-\log n-\delta_0\ge a\right]+\Pr\left[\log n+\delta_0-\frac{S}{K}\ge a\right] \\
&< \Pr\left[S-\mathbb{E}[S]\ge aK\right]+\Pr\left[\mathbb{E}[S]-S\ge aK\right] \\
&= \Pr\left[|S-\mathbb{E}[S]|\ge aK\right].
\end{align*}

Let $t=aK$. Applying Lemma C.8 with these values of $K$ and $t$,
\begin{align*}
\Pr\left[\left|\frac{S}{K}-\log n-\delta_0\right|\ge a\right]
&< \Pr\left[|S-\mathbb{E}[S]|\ge aK\right]
= \Pr\left[|S-\mathbb{E}[S]|\ge t\right]
\le 2\cdot e^{K-t/4} \\
&= 2\cdot e^{K\left(1-\frac a4\right)}
= 2\cdot e^{-K\left(\frac a4-1\right)}
\le 2\cdot e^{-\frac{\ln n}{a/4-1}\left(\frac a4-1\right)}
= 2\cdot e^{-\ln n}
= \frac2n. \qquad\blacktriangleleft
\end{align*}

For example, choosing $a=\ln 2+4<4.7$ means we can choose
$$K \ge \frac{\ln n}{a/4-1} = \frac{\ln n}{(\ln(2)+4)/4-1} = \frac{\ln n}{\ln(2)/4} = \frac{4\ln n}{\ln 2} = 4\log_2 n = 4\log n:$$

▶ Corollary C.10. Let $n\in\mathbb{N}^+$, $n\ge 50$, $K\ge 4\log n$. Let $M_1,\dots,M_K$ be i.i.d. random variables, each of which is the maximum of $n$ i.i.d. $\frac12$-geometric random variables. Define $S=\sum_{i=1}^K M_i$. Then
$$\Pr\left[\left|\frac SK-\log n\right|\ge 4.7\right]\le\frac2n.$$
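Corollary C.10 underlies the size-estimation idea: averaging $K\ge 4\log n$ maxima concentrates within a constant of $\log n$. The following simulation sketch is illustrative only — the population size and seed are arbitrary, and the maxima are drawn directly by inverse-CDF sampling, whereas in the protocol itself agents accumulate them through interactions:

```python
import math
import random

def max_half_geometric(n, rng):
    # inverse-CDF sampling of the max of n i.i.d. 1/2-geometric r.v.s:
    # Pr[M <= m] = (1 - 2**-m)**n
    u = rng.random()
    m = 1
    while (1.0 - 2.0 ** -m) ** n < u:
        m += 1
    return m

rng = random.Random(7)
n = 1 << 16                    # true population size (unknown to the protocol)
K = 4 * int(math.log2(n))      # K >= 4 log n repetitions
S = sum(max_half_geometric(n, rng) for _ in range(K))
estimate = S / K               # concentrates near log n + delta_0, delta_0 ~ 1.33
print(estimate, math.log2(n))
```

The averaged estimate lands within the constant 4.7 of $\log_2 n$, as the corollary predicts (in fact within roughly $\delta_0\approx 1.33$ plus sampling noise).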

D. Doty and M. Eftekhari

D  Timer lemma

▶ Lemma D.1. Let $0<\delta\le\frac12$. Let $0<k\le n$ and $m$ be positive integers. Suppose we have $n$ bins, of which $k$ are initially empty, and we throw $m$ additional balls randomly into the $n$ bins. Then
$$\Pr\left[\text{at most }\delta k\text{ bins remain empty}\right] \le \left(2\delta e^{m/n}\right)^{\delta k}.$$

Proof. The number of balls needed until at most $\delta k$ bins are empty is a sum $S=\sum_{i=\delta k+1}^{k} G_i$ of independent geometric random variables $G_{\delta k+1},\dots,G_k$, where $G_i$ has $p_i=\Pr[\text{success}]=\frac in$, with "success" representing the event of throwing a ball into one of the $i$ currently empty bins. The moment-generating function of a geometric random variable $G$ with $\Pr[\text{success}]=p$, defined whenever $\theta<-\ln(1-p)$ [29], is
$$\mathbb{E}\left[e^{\theta G}\right] = \frac{pe^\theta}{1-(1-p)e^\theta} = \frac{p}{e^{-\theta}-1+p} \le \frac{p}{p-\theta},$$

where the last inequality follows from $e^x-1\ge x$ for all $x\in\mathbb{R}$. Thus for each $i\in\{\delta k+1,\dots,k\}$,
$$\mathbb{E}\left[e^{\theta G_i}\right] \le \frac{i/n}{i/n-\theta} = \frac{i}{i-\theta n}.$$

By independence of the $G_i$'s, the moment-generating function of the sum $S$ is
$$\mathbb{E}\left[e^{\theta S}\right] = \mathbb{E}\left[e^{\theta\sum_{i=\delta k+1}^k G_i}\right] = \prod_{i=\delta k+1}^{k}\mathbb{E}\left[e^{\theta G_i}\right] \le \prod_{i=\delta k+1}^{k}\frac{i}{i-\theta n}.$$

Setting $\theta=-\frac{\delta k}{n}$, and using the fact that $\delta\le\frac12$ to cancel terms, we have
\begin{align*}
\mathbb{E}\left[e^{\theta S}\right]
&\le \prod_{i=\delta k+1}^{k}\frac{i}{i+\delta k}
= \frac{\delta k+1}{\delta k+1+\delta k}\cdot\frac{\delta k+2}{\delta k+2+\delta k}\cdots\frac{k}{k+\delta k} \\
&= \frac{(\delta k+1)\cdots(\delta k+\delta k)}{(\delta k+1+\delta k)\cdots k}\cdot\frac{(\delta k+1+\delta k)\cdots k}{(k+1)\cdots(k+\delta k)}
= \frac{(\delta k+1)\cdots(\delta k+\delta k)}{(k+1)\cdots(k+\delta k)}
< \frac{(2\delta k)^{\delta k}}{k^{\delta k}} = (2\delta)^{\delta k}.
\end{align*}

The event that throwing $m$ balls results in at most $\delta k$ empty bins is equivalent to the event that $S\le m$. By Markov's inequality, since $\theta=-\frac{\delta k}{n}<0$,
$$\Pr[S\le m] = \Pr\left[e^{\theta S}\ge e^{\theta m}\right] \le \frac{\mathbb{E}\left[e^{\theta S}\right]}{e^{\theta m}} < (2\delta)^{\delta k}\,e^{\frac{\delta k}{n}m} = \left(2\delta e^{m/n}\right)^{\delta k}. \qquad\blacktriangleleft$$

We say a transition consumes a state $s$ if executing the transition strictly reduces the count of $s$, and that the transition produces $s$ if it strictly increases the count of $s$. The next lemma bounds the rate of consumption of $s$, showing that the count of $s$ cannot decrease too quickly. It also makes the observation that, since we are reasoning about $s$ assuming that it is only consumed, we can upper-bound the probability of the count of $s$ dropping below $\delta k$ at any time $t\in[0,T]$, not just at time $t=T$.

▶ Lemma D.2. Let $s$ be a state in a population protocol, let $0<\delta\le\frac12$, and let $k$ be the count of $s$ at time 0. Let $C_{t,s}$ denote the count of $s$ at time $t$. Then for all $T>0$,
$$\Pr\left[(\exists t\in[0,T])\; C_{t,s}\le\delta k\right] \le \left(2\delta e^{3T}\right)^{\delta k}.$$


Proof. $s$ may be both produced and consumed. To establish that the count of $s$ remains large for a constant amount of time, in the worst case we assume that $s$ is only consumed. We also make the worst-case assumption that each time an agent in state $s$ is picked for a transition, it changes state, consuming that copy of $s$, and further that if both agents are in state $s$, both change to a different state.

The following almost works: model each transition as throwing two balls into bins, where each agent is a bin, considered "empty" if it is in state $s$. Each time a transition picks an agent, this puts a ball into the bin that agent represents; thus the number of balls in a bin is the total number of times that the agent has interacted. However, these are not identically distributed processes, since a bin may be picked twice consecutively in the balls-and-bins distribution, whereas when agents are picked two at a time, the two agents are guaranteed to be distinct. Thus the actual distribution has slightly higher probability of fewer empty bins than the simplified "throw-two-balls-every-transition" approximation.

So instead, consider the distribution of empty bins given by throwing three balls for every transition. Suppose $p\in\{\delta k+1,\dots,k\}$ bins out of $n$ are currently empty. After the next three balls, the number of empty bins $E_3$ will be $p$, $p-1$, $p-2$, or $p-3$. (Once a ball lands in an empty bin, that bin ceases to be empty, so each subsequent ball finds one more nonempty bin.) We have that
\begin{align*}
\Pr[E_3=p] &= \left(\frac{n-p}{n}\right)^3, \\
\Pr[E_3=p-1] &= \left(\frac{n-p}{n}\right)^2\cdot\frac pn
 + \frac{n-p}{n}\cdot\frac pn\cdot\frac{n-p+1}{n}
 + \frac pn\cdot\left(\frac{n-p+1}{n}\right)^2, \\
\Pr[E_3=p-2] &= \frac{n-p}{n}\cdot\frac pn\cdot\frac{p-1}{n}
 + \frac pn\cdot\frac{n-p+1}{n}\cdot\frac{p-1}{n}
 + \frac pn\cdot\frac{p-1}{n}\cdot\frac{n-p+2}{n}, \\
\Pr[E_3=p-3] &= \frac pn\cdot\frac{p-1}{n}\cdot\frac{p-2}{n}.
\end{align*}
Compare this to the true distribution $E_2$ of the number of empty bins after one interaction, where two distinct bins are picked at random, each to get a ball. Then
\begin{align*}
\Pr[E_2=p] &= \binom{n-p}{2}\Big/\binom n2 = \frac{(n-p)(n-p-1)}{n(n-1)}, \\
\Pr[E_2=p-1] &= p(n-p)\Big/\binom n2 = \frac{2p(n-p)}{n(n-1)}, \\
\Pr[E_2=p-2] &= \binom p2\Big/\binom n2 = \frac{p(p-1)}{n(n-1)}, \\
\Pr[E_2=p-3] &= 0.
\end{align*}
It can be verified by inspection that for each $\ell\in\{p-3,p-2,p-1,p\}$,
$$\Pr[E_2\le\ell] = \sum_{\ell'\in\{p-3,\dots,\ell\}}\Pr[E_2=\ell'] \;\le\; \sum_{\ell'\in\{p-3,\dots,\ell\}}\Pr[E_3=\ell'] = \Pr[E_3\le\ell].$$
Thus, the distribution of empty bins given by throwing three balls independently at random stochastically dominates the true distribution of empty bins after one interaction.
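The "verified by inspection" step can be checked mechanically with exact rational arithmetic. In this sketch (the choice $n=50$, $p=10$ is arbitrary), `dist_E3` implements the sequential three-ball process described above and `dist_E2` the true one-interaction distribution:

```python
from fractions import Fraction as F

def dist_E3(n, p):
    # empty-bin distribution after throwing 3 balls sequentially;
    # a ball landing in one of the currently empty bins fills it
    probs = {p: F(0), p - 1: F(0), p - 2: F(0), p - 3: F(0)}
    def throw(balls, empty, acc):
        if balls == 0:
            probs[empty] += acc
            return
        throw(balls - 1, empty,     acc * F(n - empty, n))  # hits a nonempty bin
        throw(balls - 1, empty - 1, acc * F(empty, n))      # fills an empty bin
    throw(3, p, F(1))
    return probs

def dist_E2(n, p):
    # one interaction: two distinct agents (bins) chosen uniformly at random
    pairs = F(n * (n - 1), 2)
    return {p:     F((n - p) * (n - p - 1), 2) / pairs,
            p - 1: F(p * (n - p)) / pairs,
            p - 2: F(p * (p - 1), 2) / pairs,
            p - 3: F(0)}

n, p = 50, 10
e3, e2 = dist_E3(n, p), dist_E2(n, p)
assert sum(e3.values()) == 1 and sum(e2.values()) == 1  # both are distributions
cdf3 = cdf2 = F(0)
for level in (p - 3, p - 2, p - 1, p):
    cdf3 += e3[level]
    cdf2 += e2[level]
    assert cdf2 <= cdf3  # E3 stochastically dominates: fewer empty bins
print("domination verified for n =", n, "p =", p)
```

Exact fractions avoid any floating-point doubt about the CDF comparison; other small $(n,p)$ pairs can be checked the same way.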


The number of interactions in time $T$ is $Tn$. Using the stochastically dominating distribution above, we model this as throwing $m=3Tn$ balls independently. By Lemma D.1,
$$\Pr\left[(\exists t\in[0,T])\; C_{t,s}\le\delta k\right] \le \left(2\delta e^{m/n}\right)^{\delta k} = \left(2\delta e^{3T}\right)^{\delta k}. \qquad\blacktriangleleft$$

The following corollary, with $\delta=\frac1{81}$ and $T=1$, states that within time 1, it is unlikely for the count of any state to decrease by more than a factor of 81, from $k$ to $k/81$.

▶ Corollary D.3. Let $s$ be a state in a population protocol, and let $k$ be the count of $s$ at time 0. Let $C_{t,s}$ denote the count of $s$ at time $t$. Then $\Pr\left[(\exists t\in[0,1])\; C_{t,s}\le k/81\right]\le 2^{-k/81}$.

Proof. Note that $2e^3<40.2$, so setting $\delta=\frac1{81}$ and $T=1$ implies that $2\delta e^{3T}<\frac12$. Applying Lemma D.2 with $\delta=\frac1{81}$ and $T=1$, we have
$$\Pr\left[(\exists t\in[0,1])\; C_{t,s}\le k/81\right] \le \left(2\delta e^{3T}\right)^{\delta k} < 2^{-k/81}. \qquad\blacktriangleleft$$
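A direct worst-case simulation sketch of Corollary D.3 (the parameters $n=1000$, $k=500$, and the seed are arbitrary choices): every interaction that touches an agent in state $s$ removes it from $s$, and the count nonetheless stays well above $k/81$ through one unit of time:

```python
import random

def min_count_over_time_1(n, k, rng):
    """Worst case for a state s: k agents start in s, and any agent in s
    that is picked for an interaction leaves s. One unit of time = n
    interactions, each between two distinct uniformly random agents."""
    in_s = [True] * k + [False] * (n - k)
    count = min_seen = k
    for _ in range(n):  # n interactions ~ time 1
        i, j = rng.sample(range(n), 2)  # two distinct agents
        for a in (i, j):
            if in_s[a]:
                in_s[a] = False  # worst case: s is only consumed
                count -= 1
        min_seen = min(min_seen, count)
    return min_seen

rng = random.Random(3)
n, k = 1000, 500
m = min_count_over_time_1(n, k, rng)
# Corollary D.3: count stays above k/81 except with probability 2^(-k/81)
print(m, k / 81)
```

In one unit of time each agent interacts about twice in expectation, so roughly $k\,e^{-2}\approx 68$ copies of $s$ survive here — far above the $k/81\approx 6$ threshold of the corollary.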

Recall the timer lemma used in Section 4. The proof follows the same structure as the main theorem of [19], but uses the discrete-time model of population protocols rather than the continuous-time model of chemical reaction networks. Additionally, care must be taken to show that although the number of states is infinite (so clearly only a finite number can appear in finite time), those states producible via a constant number of transitions, whose probabilities are bounded below by a positive constant, in sufficiently large dense configurations are all produced in large quantity in constant time.

▶ Lemma 4.2. Let $\alpha>0$, $m\in\mathbb{N}^+$, $\rho\in(0,1]$, and let $P$ be a population protocol. Then there are constants $\epsilon,\delta,n_0>0$ such that, for all $n\ge n_0$ and for all $\alpha$-dense configurations $\vec c$ of $P$ with $n=\|\vec c\|$, the following holds. Let $\Lambda^m_\rho$ be the set of states $m$-$\rho$-producible from $\vec c$. For $s\in\Lambda^m_\rho$ and $t>0$, let $C_{t,s}$ be the random variable denoting the count of $s$ at time $t$, assuming at time 0 the configuration is $\vec c$. Then $\Pr\left[(\forall s\in\Lambda^m_\rho)\; C_{1,s}\ge\delta n\right]\ge 1-2^{-\epsilon n}$.

Proof. We first need that $|\Lambda^m_\rho|<\infty$. To see that this holds, note that if every pair of states $a,b$ has only a finite number of transitions of the form $a,b\to\dots$, then by induction each $\Lambda^m_\rho$ is finite. So assume there are $a,b\in\Lambda$ and an infinite set of transitions $a,b\xrightarrow{\rho_i}\dots$ for $i\in\mathbb{N}$. Because these are probabilities, $\sum_{i=0}^\infty\rho_i\le 1$, so $\lim_{i\to\infty}\rho_i=0$, and for all but finitely many $i$ we have $\rho_i<\rho$. Transitions with $\rho_i<\rho$ cannot be used to produce states in $\Lambda^m_\rho$, as their rate constants are smaller than the definition of $\Lambda^m_\rho$ allows. This shows that $|\Lambda^m_\rho|<\infty$.

By hypothesis all $s\in\Lambda^0$ satisfy $\vec c(s)\ge\alpha n$. Fix a particular $s\in\Lambda^0$. Letting $k=\alpha n$ in Corollary D.3,
$$\Pr\left[(\exists t\in[0,1])\; C_{t,s}<\frac{\alpha n}{81}\right]\le 2^{-\alpha n/81}.$$
By the union bound,
$$\Pr\left[(\exists s\in\Lambda^0)(\exists t\in[0,1])\; C_{t,s}<\frac{\alpha n}{81}\right]\le |\Lambda^0|\,2^{-\alpha n/81}. \tag{1}$$
That is, with high probability, all states in $\Lambda^0$ have "large" count (at least $\frac{\alpha n}{81}$) for the entire first unit of time. Call this event $H(\Lambda^0)$ (i.e., the complement of the event in (1)).

We complete the proof by a "probabilistic induction" on $i\in\{0,1,\dots,m\}$ as follows. We exhibit a sequence $\delta_0>\delta_1>\dots>\delta_m>0$ such that the following holds. Inductively assume


that for all $s\in\Lambda^i_\rho$ and all $t\in\left[\frac{i}{m+1},1\right]$, $C_{t,s}\ge\delta_i n$. Call this event $H(\Lambda^i_\rho)$. Then we will show that, assuming $H(\Lambda^i_\rho)$ holds, with high probability $H(\Lambda^{i+1}_\rho)$ holds, i.e., for all $s\in\Lambda^{i+1}_\rho$ and all $t\in\left[\frac{i+1}{m+1},1\right]$, $C_{t,s}\ge\delta_{i+1}n$. The base case is established by (1) for $\delta_0=\frac{\alpha}{81}$.

We use Chernoff bounds for binomial random variables, which state that, for $1\le i\le k$, if each $X_i$ is an independent 0/1-valued random variable with $\Pr[X_i=1]=p$, defining $X=\sum_{i=1}^k X_i$ and $\mu=\mathbb{E}[X]=kp$, then for $0<\beta\le1$, $\Pr[X\le(1-\beta)\mu]\le e^{-\mu\beta^2/2}$ and $\Pr[X\ge(1+\beta)\mu]\le e^{-\mu\beta^2/3}$.

To see that the inductive case holds, fix a particular state $s\in\Lambda^{i+1}_\rho\setminus\Lambda^i_\rho$; then $s\in\mathrm{PROD}(\Lambda^i_\rho)$. By the definition of $\mathrm{PROD}(\Lambda^i_\rho)$, $s$ is produced by a transition of the form

$x,y\xrightarrow{\rho'}s,s'$, where $x,y\in\Lambda^i_\rho$ and $\rho'\ge\rho$. At time $t$, the given transition has probability $\rho'\cdot C_{t,x}\cdot C_{t,y}\big/\binom n2$ (if $x\ne y$) or $\rho'\cdot\left(C_{t,x}(C_{t,x}-1)/2\right)\big/\binom n2$ (if $x=y$) of occurring in the next interaction. We make the worst-case assumption that the probability is the latter probability, which is smaller when we also make the worst-case assumption $C_{t,x}=C_{t,y}=\delta_i n$, and substitute $\rho$ for $\rho'$ since $\rho'\ge\rho$.

By the induction hypothesis $H(\Lambda^i_\rho)$, for all $t\in\left[\frac{i}{m+1},1\right]$, $C_{t,x}\ge\delta_i n$ and $C_{t,y}\ge\delta_i n$. So for each interaction between time $\frac{i}{m+1}$ and $\frac{i+1}{m+1}$, the probability that it executes transition $x,y\xrightarrow{\rho'}s,s'$ is at least
$$\frac{\rho\,\delta_i n(\delta_i n-1)/2}{\binom n2} = \frac{\rho\,\delta_i n(\delta_i n-1)/2}{n(n-1)/2} = \frac{\rho\,\delta_i(\delta_i n-1)}{n-1} > \frac{\rho\,\delta_i\cdot\delta_i n}{n} = \rho\delta_i^2.$$

There are $\frac{n}{m+1}$ interactions in that time interval. Thus the number of times $x,y\xrightarrow{\rho'}s,s'$ executes in that interval stochastically dominates a binomial random variable $X^+$ with $k=\frac{n}{m+1}$ trials and probability of success $p=\rho\delta_i^2$, and $\mu=\mathbb{E}[X^+]=kp=\frac{n\rho\delta_i^2}{m+1}$. By the Chernoff bound, setting $\beta=\frac12$,
$$\Pr\left[X^+\le\frac{n\rho\delta_i^2}{2(m+1)}\right] = \Pr\left[X^+\le(1-\beta)\mu\right] \le e^{-\mu\beta^2/2} = \exp\left(-\frac{n\rho\delta_i^2}{8(m+1)}\right).$$

The above analysis lower-bounds how many times transition $x,y\xrightarrow{\rho'}s,s'$ executes, producing $s$ each time. We now upper-bound how many times $s$ is consumed in this same interval. We are trying to show that $s$ gets to a large count, so we make the worst-case assumption that its count starts at 0. At any time $t$, out of $\binom n2$ pairs of agents, at most $C_{t,s}(n-1)$ of those pairs have at least one agent in state $s$. So at time $t$, the probability that the next interaction consumes $s$ is at most $\frac{C_{t,s}(n-1)}{\binom n2}=\frac{2C_{t,s}}{n}$.

Prior to $s$ reaching count $n\rho\delta_i^2/32$, we can make the worst-case assumption that the probability of each transition consuming $s$ is exactly $\frac{2\cdot n\rho\delta_i^2/32}{n}=\rho\delta_i^2/16$. In this worst case, the number of transitions consuming $s$ in the $k=\frac{n}{m+1}$ interactions in the time interval $\left[\frac{i}{m+1},\frac{i+1}{m+1}\right]$ is stochastically dominated by $2X^-$, where $X^-$ is a binomial random variable with $k=\frac{n}{m+1}$ trials and probability of success $p=\rho\delta_i^2/16$. (We consider $2X^-$ instead of $X^-$ to account for the fact that each transition in the worst case could consume 2 copies of $s$.) Applying the Chernoff bound with $\mu=\mathbb{E}[X^-]=kp=\frac{n\rho\delta_i^2}{16(m+1)}$ and $\beta=1$ gives
$$\Pr\left[X^-\ge\frac{n\rho\delta_i^2}{8(m+1)}\right] = \Pr\left[X^-\ge(1+\beta)\mu\right] \le e^{-\mu\beta^2/3} = \exp\left(-\frac{n\rho\delta_i^2}{48(m+1)}\right).$$
Thus
$$\Pr\left[2X^-\ge\frac{n\rho\delta_i^2}{4(m+1)}\right] \le \exp\left(-\frac{n\rho\delta_i^2}{48(m+1)}\right).$$


Note that this count threshold is half the count threshold we derived for the lower bound on the number of executions of the transition $x,y\xrightarrow{\rho}s,s'$ producing $s$. Thus, applying the union bound to these two events to bound $X^+-2X^-$, the net production of $s$ (number produced minus number consumed), we have that
\begin{align*}
\Pr\left[X^+-2X^-\le\frac{n\rho\delta_i^2}{4(m+1)}\right]
&\le \Pr\left[X^+\le\frac{n\rho\delta_i^2}{2(m+1)}\;\text{ or }\;2X^-\ge\frac{n\rho\delta_i^2}{4(m+1)}\right] \\
&\le \exp\left(-\frac{n\rho\delta_i^2}{8(m+1)}\right)+\exp\left(-\frac{n\rho\delta_i^2}{48(m+1)}\right) \\
&< 2\cdot\exp\left(-\frac{n\rho\delta_i^2}{48(m+1)}\right).
\end{align*}
So with probability at least $1-2\cdot\exp\left(-\frac{n\rho\delta_i^2}{48(m+1)}\right)$, at least $\frac{n\rho\delta_i^2}{4(m+1)}$ net copies of $s$ are produced at some point in the time interval $\left[\frac{i}{m+1},\frac{i+1}{m+1}\right]$.

Letting $k=\frac{n\rho\delta_i^2}{4(m+1)}$ in Corollary D.3, with probability at least $1-2^{-k/81}=1-2^{-n\rho\delta_i^2/(324(m+1))}$, we have that $C_{t,s}\ge k/81=\delta_{i+1}n$ for all times $t\in\left[\frac{i+1}{m+1},1\right]$. Setting $\delta_{i+1}=(k/n)/81=\frac{\rho\delta_i^2}{324(m+1)}$ proves the inductive case with probability of failure at most
$$2\cdot\exp\left(-\frac{n\rho\delta_i^2}{48(m+1)}\right)+2^{-\delta_{i+1}n}.$$

By the union bound over all $|\Lambda^m_\rho|$ states in all levels of the induction, setting $\delta=\delta_m=\rho^m(\alpha/2)^{2^m}\big/(324(m+1))^m$ and noting that $\delta\le\delta_i$ for all $0\le i\le m$, with probability at most
$$|\Lambda^m_\rho|\left(2\cdot\exp\left(-\frac{n\rho\delta^2}{48(m+1)}\right)+2^{-\delta n}\right),$$
the count of some state in $\Lambda^m_\rho$ fails to reach at least $\delta n$ by time $t=1$. By setting $n_0$ sufficiently large, the above probability is less than 1 for $n=n_0$ (and therefore for all greater $n$ as well). By setting $\epsilon>0$ sufficiently small, this probability is at most $2^{-\epsilon n}$. ◀
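To illustrate the lemma's conclusion on a concrete example, the sketch below invents a one-transition protocol $x,y\to s,s'$ (with $\rho=1$) started from an $\alpha$-dense configuration, half $x$ and half $y$; the population size and seed are arbitrary. After one unit of time, the produced state $s$ reaches count $\Omega(n)$:

```python
import random

def simulate(n, rng):
    # alpha-dense start: half the agents in state 'x', half in 'y'
    agents = ['x'] * (n // 2) + ['y'] * (n - n // 2)
    for _ in range(n):  # n interactions ~ one unit of time
        i, j = rng.sample(range(n), 2)  # two distinct agents
        if {agents[i], agents[j]} == {'x', 'y'}:  # transition x,y -> s,s'
            agents[i], agents[j] = 's', "s'"
    return agents.count('s')

rng = random.Random(5)
n = 2000
c = simulate(n, rng)
print(c, n)  # count of s reaches a constant fraction of n within time 1
```

A mean-field estimate (the fraction $u$ of $x$-agents follows roughly $\dot u=-2u^2$, so $u$ drops from $1/2$ to $1/4$ by time 1) predicts about $n/4$ copies of $s$ here, comfortably a constant fraction of $n$, in line with the lemma's $\delta n$ guarantee for this dense configuration.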
