Electronic Colloquium on Computational Complexity, Report No. 110 (2005)

The Round Complexity of Two-Party Random Selection

We study the round complexity of two-party protocols for generating a random $n$-bit string such that the output is guaranteed to have bounded bias (according to some measure) even if one of the two parties deviates from the protocol (even using unlimited computational resources). Specifically, we require that the output's statistical difference from the uniform distribution on $\{0,1\}^n$ is bounded by a constant less than 1.

We present a protocol for the above problem that has $2\log^* n + O(1)$ rounds, improving a $2n$-round protocol that follows from the work of Goldreich, Goldwasser, and Linial (FOCS '91). Like the GGL protocol, our protocol actually provides a stronger guarantee, ensuring that the output lands in any set $T \subseteq \{0,1\}^n$ of density $\mu$ with probability at most $O(\sqrt{\mu+\delta})$, where $\delta$ is an arbitrarily small constant.

We then prove a matching lower bound, showing that any protocol guaranteeing bounded statistical difference requires at least $\log^* n - \log^* \log^* n - O(1)$ rounds. As far as we know, this is the first nontrivial lower bound on the round complexity of random selection protocols (of any type) that does not impose additional constraints (e.g. on communication or "simulatability").

We also state several results for the case when the output's bias is measured by the maximum multiplicative factor by which a party can increase the probability of a set $T \subseteq \{0,1\}^n$.


Introduction
One of the most basic protocol problems in cryptography and distributed computing is that of random selection, in which several mutually distrusting parties aim to generate an n-bit random string jointly. The goal is to design a protocol so that even if a party cheats, the outcome will still not be too "biased". (There are many different choices for how to measure the "bias" of the output; the one we use will be specified later.) Random selection protocols can dramatically simplify the design of protocols for other tasks via the following common methodology: first design a protocol in a model where truly random strings are provided by a trusted third party (generally a much easier task), and then use the random selection protocol to eliminate the trusted third party. For this reason, there is a wide literature on random selection protocols, both in the computational setting, where cheating parties are restricted to polynomial time (starting with Blum's "coin flipping by telephone" [Blu82]), and the information-theoretic setting, where security is provided even against computationally unbounded adversaries.
We will focus on two-party protocols in the information-theoretic setting (also known as the "full information model"). In addition to its stronger security guarantees, the information-theoretic setting has the advantage that protocols typically do not require complexity-theoretic assumptions (such as the existence of one-way functions). Various such random selection protocols have been used to construct perfectly hiding bit-commitment schemes [NOVY98], to convert honest-verifier zero-knowledge proofs into general zero-knowledge proofs [Dam94,DGW94,GSV98], to construct oblivious transfer protocols in the bounded storage model [CCM98,DHRS04], and to perform general fault-tolerant computation [GGL98]. There has also been substantial work in the $k$-party case for $k \ge 3$, where the goal is to tolerate coalitions of a minority of cheating players. This body of work includes the well-studied "collective coin-flipping" problem, e.g., [BL89,Sak89,AN90,ORV94,RZ98,Fei99] (closely related to the "leader election" problem), and again the use of random selection as a tool for general fault-tolerant computation [GGL98].
In most of the lines of work mentioned above (computational and information-theoretic, two-party and $k$-party), the round complexity has been a major parameter of interest. For some forms of random selection and their applications, constant-round protocols have been found (e.g. [DGW94,GSV98] improving [Dam94], [DHRS04] improving [CCM98], and [Lin01,KO04] improving [Blu82,Yao86]), but for others the best known protocols have a nonconstant number of rounds, e.g. [NOVY98,GGL98,RZ98]. Lower bounds on round complexity, however, have proven much more difficult to obtain, and we only know of examples that impose additional constraints on the protocol (beyond the basic security guarantee of bounded bias). For example, in the computational setting, it has recently been shown that 5 rounds are necessary and sufficient for random selection protocols satisfying a certain "black-box simulation" condition [KO04]. In the information-theoretic setting, a long line of work on the collective coin-flipping problem has culminated in the $(\log^* n + O(1))$-round protocol$^1$ of Russell and Zuckerman [RZ98] (see also Feige [Fei99]), but the only known lower bound (of $\Omega(\log^* n)$ rounds), due to Russell, Saks, and Zuckerman [RSZ99], is restricted to protocols where each party can only communicate a small number of bits per round. Without this restriction, it is not even known how to prove that 1 round is impossible.

$^1$ As in other work [RZ98], for the purposes of this paper we define $\log_b^{(0)} n = n$ and $\log_b^{(k)} n = \log_b \log_b^{(k-1)} n$. Moreover, for $n \ge 1$, we define $\log_b^* n$ to be the least natural number $k$ such that $\log_b^{(k)} n \le 1$.
The problem and main results. As mentioned above, previous works on random selection have considered a number of different measures of the bias of the output, typically motivated by particular applications. Here we focus on what we consider to be the most natural measure: the statistical difference (i.e., variation distance) of the output from the uniform distribution.$^2$ Specifically, we seek a two-party protocol $(A, B)$ that produces an output in $\{0,1\}^n$, such that even if one party deviates arbitrarily from the specified protocol, the statistical difference of the output from uniform is bounded by a constant less than 1. Equivalently, we want to satisfy the following criterion.

Statistical Criterion: There are fixed constants $\mu > 0$ and $\varepsilon > 0$ such that for every $n$ and every subset $T \subseteq \{0,1\}^n$ of density at most $\mu$, the probability that the output lands in $T$ is at most $1 - \varepsilon$, even if one party deviates arbitrarily from the specified protocol.
In addition to being a natural choice, this criterion is closely related to others considered in the literature. In particular, the standard criterion for the "collective coin-flipping" problem is that the output bit $B \in \{0,1\}$ satisfies $\max\{\Pr[B = 0], \Pr[B = 1]\} < p$, where $p$ is a constant less than 1; this is equivalent to $B$'s statistical difference from uniform being bounded away from 1. (Here we see that the problem we consider is in some sense "dual" to collective coin-flipping: we restrict to two players but the output comes from a large set, whereas in collective coin-flipping there are many players but the output has only two possibilities.)

Of course, the first question is whether or not the Statistical Criterion can be met at all, regardless of round complexity. Indeed, being able to tolerate computationally unbounded cheating strategies is a strong requirement. In fact, when $n = 1$ (i.e. the output is a single bit), it turns out that one of the two parties can always force the outcome to be constant. This implies that the Statistical Criterion is impossible to meet for $\mu = 1/2$. Surprisingly, the criterion is achievable, however, for some smaller constant $\mu > 0$. This is implied by the following result of Goldreich, Goldwasser, and Linial [GGL98].
Theorem 1.1 ([GGL98]) For every $n$, there is a two-party protocol producing output in $\{0,1\}^n$ such that, as long as one party plays honestly, the probability that the output lands in any set $T \subseteq \{0,1\}^n$ of density $\mu$ is at most $p = O(\sqrt{\mu})$. The protocol has $2n$ rounds.
Notice that for sufficiently small µ, the probability p is indeed a constant less than 1. This implies that the Statistical Criterion is achievable with a linear number of rounds. Our goal is to determine the minimal round complexity of this problem. First, we give a protocol achieving the Statistical Criterion with substantially fewer rounds than the above.
Theorem 1.2 For every constant $\delta > 0$, there is a two-party protocol producing output in $\{0,1\}^n$ with $2\log^* n + O(1)$ rounds such that, as long as one party plays honestly, the probability that the output lands in any set $T$ of density $\mu$ is at most $O(\sqrt{\mu+\delta})$.

Our protocol is inspired by the $\log^* n$-round protocols for leader election [RZ98,Fei99] and Lautemann's proof that BPP is contained in the polynomial hierarchy [Lau83]. Specifically, we exhibit a 2-round protocol that reduces the universe of size $N = 2^n$ to a universe of size $\mathrm{polylog}(N)$, while approximately preserving the density of the set $T$ with high probability. Repeating this protocol $\log^* n$ times reduces the universe size to a constant, after which point we apply the GGL protocol.

$^2$ The statistical difference between two random variables $X$ and $Y$ taking values in a universe $U$ is defined to be $\Delta(X, Y) = \max_{S \subseteq U} |\Pr[X \in S] - \Pr[Y \in S]|$.
Second, we prove a lower bound that matches the above up to a factor of 2 + o(1).
Theorem 1.3 Any two-party protocol producing output in $\{0,1\}^n$ that satisfies the Statistical Criterion must have at least $\log^* n - \log^* \log^* n - O(1)$ rounds.
Our proof of this theorem is a technically intricate induction on the game tree of the protocol. Roughly speaking, we associate to each node $z$ of the game tree a collection $\mathcal{S}_z$ of very small sets such that if the protocol is started at $z$ and $R$ is a random subset of the universe of density $o(1)$, one of the players $X$ can force the outcome of the protocol to land in $R \cup S$ with probability $1 - o(1)$, for any $S \in \mathcal{S}_z$. The challenge is to keep the size of the sets in the collections $\mathcal{S}_z$ small as we induct up the game tree (so that they remain of density $o(1)$ when $z$ is the root, which yields the desired lower bound). In particular, a node can have an arbitrary number of children, so we cannot afford to take unions of sets $S$ occurring across all children. The key idea that allows us to keep the sets small is the following. We consider two cases: If we have a collection of sets that contains a large disjoint subcollection, then the random set $R$ will contain one of the sets with high probability and so we do not need to carry the set through the recursion. On the other hand, if the collection of sets has no large disjoint subcollection, then we show how we can use this fact to construct a successful strategy for the other player (based on how we inductively construct the collections $\mathcal{S}_z$).
We stress that our lower bound does not impose any additional constraint on the protocol, such as the number of bits sent per round. Thus, we hope that our techniques can help in establishing unrestricted lower bounds on round complexity for other problems, in particular for the collective coin-flipping (and leader election) problem.
Results on multiplicative guarantees. A different measure of the quality of a random selection protocol is a multiplicative guarantee, whereby we require that, even if one player cheats, the probability that the outcome lands in any set $T$ of density $\mu$ is at most $\rho \cdot \mu$, for some parameter $\rho \ge 1$. The goal, naturally, is for $\rho$ to be as small as possible (ideally a constant independent of $n$). Previous protocols, e.g. [DGW94], have given a multiplicative guarantee to one player while the other has a statistical guarantee (i.e. a bound on the output's statistical difference from uniform if the other cheats). Our observations and results on multiplicative guarantees are the following:
• If both parties have multiplicative guarantees $\rho_A$ and $\rho_B$, then an argument of [GGL98] implies $\rho_A \cdot \rho_B \ge 2^n$, regardless of the number of rounds.
• There is a simple two-round protocol achieving $\rho_A \cdot \rho_B \le 2^n$, for any desired $\rho_A$.
• If one party has a multiplicative guarantee $\rho$ and the other has a statistical guarantee $\varepsilon$, then $\varepsilon \ge 1/\rho - 1/2^n$. This explains inverse relationships in existing protocols of [DGW94] (where $\varepsilon = 1/\mathrm{poly}(n)$ and $\rho = \mathrm{poly}(n)$) and [GSV98] (where $\varepsilon = \mathrm{poly}(n) \cdot 2^{-k}$ and $\rho = 2^k$ for any $k$).
• There is a protocol with $2\log^* n + O(1)$ rounds that provides a constant statistical guarantee to one player and an (arbitrarily small) constant multiplicative guarantee to the other. Theorem 1.3 implies that this round complexity is tight up to a constant factor, because a constant multiplicative guarantee implies a constant statistical guarantee.

Defining Random Selection Protocols
We can formally characterize a random selection protocol as follows:

Definition 2.1 A random selection protocol $\Pi = (A, B, f)$ over a universe $U$ consists of a pair of functions $A$ and $B$ and a function $f$ such that:
• Both $A$ (Alice) and $B$ (Bob) alternately output strings ("messages") $m_i$ of arbitrary length that are a function of the conversation thus far and their sequences of random coin tosses $r_A$ and $r_B$, respectively. That is, $m_i = A(m_1, \ldots, m_{i-1}; r_A)$ when it is Alice's turn to speak and $m_i = B(m_1, \ldots, m_{i-1}; r_B)$ when it is Bob's.
• The conversation between Alice and Bob is the transcript $(A, B) = m_1 m_2 \ldots m_r$, where $r$ is a parameter defining the number of messages of the protocol.
• The output of the protocol is $f(m_1 m_2 \ldots m_r)$, which is some element of $U$.
We are interested in the behavior of the protocol when one of these programs is replaced with an arbitrary "cheating" program $A^*$ or $B^*$, which may send its messages as an arbitrary function of the conversation and input length.
Although the formulation we have provided assumes a protocol operates over a single fixed universe, in general we will be interested in studying asymptotic behavior of protocols as the universe size increases. Thus, we define a random selection protocol ensemble to be a sequence $(\Pi^{(1)}, \Pi^{(2)}, \ldots)$ where each $\Pi^{(N)}$ is a protocol over $U = \{1, \ldots, N\}$.
From now on, we blur the distinction between random selection protocols over a fixed universe and random selection protocol ensembles. Results depending on asymptotics will hold for random selection protocol ensembles, and other results will hold for any fixed-universe random selection protocol, in particular, every protocol in the ensemble.
Two desirable properties of random selection protocols are (a) the output is uniformly distributed in U assuming honest players, and (b) in a protocol ensemble, honest strategies can be computed in time polynomial in the output length, log N . Our protocols will have these properties, but our lower bounds will apply even to protocols without them.
We now introduce a formalism that will be invaluable in the proofs of this paper.
Definition 2.2 Given a protocol $\Pi$ over universe $U$, define the game tree $T$ as follows:
• A set of nodes $V$, each representing a partial transcript of messages, $(m_1, \ldots, m_i)$.
• A set of edges $E$, defined by $(u, v) \in E$ if and only if $u = (m_1, \ldots, m_i)$ and (abusing notation) $v = (u, m_{i+1})$, for any message $m_{i+1}$. That is, $u$ connects to $v$ if $v$ is a potential protocol state after one message from $u$.
• For each node $z$, a distribution $D_z$ over the children $z_i$ whereby $A$ or $B$ chooses the next message.
• For every leaf $z = (m_1, \ldots, m_r)$, a label equal to $f((m_1, \ldots, m_r))$, the output of the protocol ending at node $z$.
One can verify that this formalism yields a specification equivalent to Definition 2.1 of a random selection protocol. Just as any node of a tree is the root of another tree, any node of a protocol's game tree induces its own random selection protocol starting from that state. We simply fix the messages leading to that node, and have the players choose the remaining messages as in the original protocol. This observation is one of the main reasons that the abstraction of a random selection protocol as a tree will prove useful.
Evaluating a Random Selection Protocol. We evaluate random selection protocols with metrics measuring how "close" the output is to the uniform distribution on U. The primary metric we use is the following.

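Definition 2.3 The statistical difference (a.k.a. variation distance) of a distribution $X$ over universe $U$ from uniform is defined to be $\max_{T \subseteq U} \left| \Pr_{x \leftarrow X}[x \in T] - \mu(T) \right|$, where $T$ ranges over subsets of $U$ and $\mu(T)$ is the density of $T$ in $U$ (i.e., $|T|/|U|$).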
Statistical difference finds the subset of the universe that is hit with probability most different from uniform. It can be verified that this distance is in the interval [0, 1 − 1/N ], where N is the size of the universe U. A statistical difference of 0 implies that X is uniform, and 1 − 1/N implies X is concentrated on a single point.
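The quantity in Definition 2.3 can be computed directly for small universes. The following is a minimal Python sketch, assuming the distribution is given as an explicit list of point probabilities (the function name and input format are illustrative, not from the paper). It uses the fact that the maximizing set $T$ consists of exactly the points with above-uniform probability.

```python
from fractions import Fraction

def statistical_difference_from_uniform(probs):
    """Statistical difference of a distribution from uniform.

    `probs[x]` is the probability of point x; the universe has
    N = len(probs) points. The maximum of Pr[X in T] - mu(T) over
    subsets T is attained by T = {x : probs[x] > 1/N}, so we sum the
    excess probability over those points.
    """
    N = len(probs)
    u = Fraction(1, N)  # probability of each point under uniform
    return sum(Fraction(p) - u for p in probs if Fraction(p) > u)
```

For example, the uniform distribution on 4 points gives 0, and a point mass on 4 points gives 3/4, matching the endpoints of the interval $[0, 1 - 1/N]$.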
We will want to avoid distributions X whose statistical difference from uniform is very close to 1. The following lemma demonstrates this (undesirable) property is equivalent to X landing in a small set with high probability.
Lemma 2.4 If $X$ has statistical difference at least $1 - \varepsilon$ from uniform, then there exists a set $T$ such that $\mu(T) \le \varepsilon$ and $\Pr_{x \leftarrow X}[x \in T] \ge 1 - \varepsilon$. Conversely, if there exists such a set $T$, then $X$ has statistical difference at least $1 - 2\varepsilon$ from uniform.
The reverse direction follows directly from the definition of statistical difference.
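We also consider multiplicative difference: the multiplicative difference of $X$ from uniform is $\max_{T} \Pr_{x \leftarrow X}[x \in T] / \mu(T)$, where $T$ ranges over all nonempty subsets of $U$.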
We defer all of our results regarding multiplicative difference to Section 5. Given these metrics, we can define:

Definition 2.6 The statistical guarantee for Alice playing honest strategy $A$ in a protocol $\Pi$ (denoted $\varepsilon_A$) is the maximum over all $B^*$ of the statistical difference between the distribution of $f((A, B^*))$ and the uniform distribution over $U$. The guarantees for Bob are defined analogously.
Intuitively, the guarantee of a protocol for a player bounds the damage that the opponent can effect on the distribution by deviating from the protocol. Unfortunately, the terminology here is a bit counterintuitive: the lower the number, the better the guarantee. We will try to avoid confusion by saying a guarantee is "at best $x$", rather than "at least $x$." Armed with this notion of a guarantee, we can state the following important equivalence, following directly from Lemma 2.4:

Proposition 2.7 The Statistical Criterion is equivalent to both of the statistical guarantees of a protocol being bounded away from 1.

The Iterated Random Shift Protocol
In this section we describe the main protocol of this paper, the Iterated Random Shift Protocol, and prove its main properties. That is, we show that for any constant $\delta$, Iterated Random Shift is a $(2\log^* N + O(1))$-round protocol where the probability the output falls in a set of density $\mu$ is at most $O(\sqrt{\mu+\delta})$. It follows that the protocol satisfies the Statistical Criterion given above.
Our protocol is inspired by the $\log^* n$-round protocols for leader election [RZ98, Fei99] and Lautemann's proof that BPP is contained in the polynomial hierarchy [Lau83]. It is built by iteration of the following 2-round protocol, which we will call the Random Shift Protocol:

The Random Shift Protocol $\Pi(A, B)$: Given a universe $U$ of size $N$ and $m \in \mathbb{N}$,
1. Alice randomly selects a sequence of strings $a_1, \ldots, a_m \in U$.
2. Bob randomly selects a sequence of strings $b_1, \ldots, b_m \in U$.
3. Output the sequence $(a_i + b_j : 1 \le i, j \le m)$, where $+$ is a group operation over $U$.
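One round of the Random Shift Protocol can be simulated on a small universe. The sketch below is illustrative only and makes several assumptions not in the paper: the group is taken to be addition mod $N$, a cheating Bob is modeled by a hypothetical `greedy_bob` that sees Alice's strings and repeats the single shift landing most of them in $T$, and all function names are made up for this example.

```python
import random

def random_shift(N, m, rng, bob=None):
    """Run the protocol over Z_N; return the m-by-m grid of a_i + b_j.

    Alice is honest. If `bob` is given, it is a (possibly cheating)
    strategy mapping Alice's list to Bob's list of m strings.
    """
    a = [rng.randrange(N) for _ in range(m)]
    b = bob(a) if bob is not None else [rng.randrange(N) for _ in range(m)]
    return [[(ai + bj) % N for bj in b] for ai in a]

def density_in_output(grid, T):
    """Fraction of the output strings that land in the set T."""
    hits = sum(1 for row in grid for x in row if x in T)
    return hits / (len(grid) * len(grid[0]))

def greedy_bob(a, N, T):
    """Cheating Bob (assumed strategy): brute-force the best single shift."""
    best = max(range(N), key=lambda b: sum((ai + b) % N in T for ai in a))
    return [best] * len(a)
```

With, say, `N = 1000`, `m = 200`, and `T` of density 0.1, the honest run gives an output density close to 0.1, and even the greedy cheater can push it only modestly above 0.1, which is the behavior the lemma below quantifies.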
Note that the Random Shift Protocol is not, strictly speaking, a random selection protocol over $U$: its output is a sequence of strings from the universe. In using it, we will typically choose the parameter $m$ so that the number of output strings, $m^2$, is much smaller than $N$ (e.g. $m = \mathrm{polylog}(N)$), and recursively use our random selection protocol to select one of the $m^2$ output strings. To show that this approach yields a protocol with bounded statistical guarantees, we argue that even if one of the players cheats, any subset $T$ of the universe is unlikely to appear in much more than a $\mu(T)$ fraction of the outputs of the Random Shift Protocol. This is formalized by the following lemma.

Lemma 3.1 Let $T$ be an arbitrary subset of $U$ and let $\varepsilon, \delta > 0$. Then as long as one player plays honestly (i.e., chooses strings uniformly at random) and $m \ge (1/(2\varepsilon^2)) \cdot \log(N/\delta)$, we have
$$\Pr\left[ \frac{|\{(i,j) : a_i + b_j \in T\}|}{m^2} \ge \mu(T) + \varepsilon \right] \le \delta.$$
That is, when one player is honest, the sequence $a_i + b_j$ will be sufficiently random so that it is very unlikely that the density of $T$ in the output sequence will increase substantially.
Proof: Suppose Alice plays honestly and chooses her strings a 1 , . . . a m at random from U. The lemma certainly holds a fortiori for an honest Bob, as a cheating Bob can see what strings Alice has selected.
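Fix an arbitrary string $b \in U$. Define random variables $X_1, \ldots, X_m$ by setting $X_i = 1$ if $a_i + b \in T$ and $X_i = 0$ otherwise. Note that $E[X_i] = \mu(T)$ and that these random variables are independent.

By a Chernoff bound, we may conclude the following for any $\varepsilon > 0$:
$$\Pr\left[ \frac{1}{m} \sum_{i=1}^{m} X_i \ge \mu(T) + \varepsilon \right] \le e^{-2\varepsilon^2 m} \le \frac{\delta}{N},$$
where the last inequality uses the assumed lower bound on $m$. Using a union bound over all $b \in U$, we conclude:
$$\Pr\left[ \exists\, b \in U : \frac{|\{i : a_i + b \in T\}|}{m} \ge \mu(T) + \varepsilon \right] \le \delta.$$
Since this bound holds simultaneously for every shift $b$, it holds in particular for each of Bob's strings $b_1, \ldots, b_m$, however they are chosen, and averaging over $j$ gives the lemma.

Remark 3.2 We note that the number of strings sent by Bob need only be $(1/(2\varepsilon^2)) \cdot \log(1/\delta)$ (i.e. the $\log N$ factor can be eliminated), since there is no need to do a union bound as in the above proof when proving Bob's guarantee. However, the symmetry of the protocol as presented above has the advantage that it can actually be implemented in one round in a model of simultaneous communication (where honest parties can send messages at the same time, but a cheating party may wait to see the other party's message before sending its own message), as is typically used in many-party protocols (e.g. leader election and collective coin-flipping). This reduces the round complexity of our Iterated Random Shift Protocol below to $\log^* N + O(1)$ in the simultaneous communication model. It would be interesting to know whether our lower bound of $\log^* N - \log^* \log^* N - O(1)$ rounds (in Section 4.2) can be extended to the simultaneous communication model (without paying the factor of 2 in the trivial reduction to our non-simultaneous model), since we would then have a lower bound in that model that is tight up to a factor of $1 + o(1)$.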

Figure 1: The Iterated Random Shift Protocol

Let $M \in \mathbb{N}$ be a "cutoff" parameter that is a power of 2, and let $U$ be a universe of size $N$. We assume that $N \ge M^2$; otherwise we use the protocol below to select from $U' = U \times [M^2]$ and take the first component of the output.

Recursive Protocol: If $N > M^2$, let $m = \max\{M, \log^3 N\}$, and let $+$ be a fixed group operation over universe $U$ (e.g., addition mod $N$).

We now describe our Iterated Random Shift Protocol satisfying Theorem 1.2, which consists of recursively applying the Random Shift Protocol until the universe size is small (say, less than a fixed constant), after which we apply the Goldreich-Goldwasser-Linial [GGL98] Protocol. We define the Iterated Random Shift Protocol in Figure 1.

Observe that Theorem 3.3 is much stronger than what we need to show Corollary 3.4. Using Theorem 3.3, we know that when one player is honest, for any "small" set $T$, the probability the output falls in $T$ is close to zero. The Statistical Criterion requires only that this probability is not arbitrarily close to 1.

Proof of Theorem 3.3:
The key idea is that in the $i$th application of the Random Shift Protocol, we can bound the increase in density of any particular set $T$ by at most some small $\varepsilon_i$ (with high probability) and these $\varepsilon_i$'s can be chosen so that $\sum_i \varepsilon_i \le \varepsilon$. The Iterated Random Shift Protocol concludes by applying the GGL Protocol to this small universe, and then Theorem 1.1 gives us the result.
We first note that the modification of the protocol in case $N < M^2$, of selecting from $U \times [M^2]$ and taking the first component, does not affect the property claimed in the theorem (because the density of $T \times [M^2]$ in $U \times [M^2]$ equals the density of $T$ in $U$). Thus we assume that $N \ge M^2$, and let $N_0, N_1, \ldots, N_{k^*}$ be the universe sizes in an execution of the Iterated Random Shift Protocol, where $k^*$ is the first value of $k$ such that $N_k = M^2$. That is, $N_0 = N$ and $N_k = m_k^2$ for $k \ge 1$, where $m_k = \max\{M, \log^3 N_{k-1}\}$. Note that for sufficiently large $M$, the sequence of $N_i$'s is strictly decreasing and there exists a finite $k^*$ such that $N_{k^*} = M^2$.

Now, given a subset $T \subseteq U$, we track how $T$ evolves through an execution of the Iterated Random Shift protocol by the following for $k = 0, \ldots, k^*$: $U_0 = U$ and $T_0 = T$, and for $k \ge 1$,
$$U_k = [m_k] \times [m_k], \qquad T_k = \{(i, j) \in U_k : a_i + b_j \in T_{k-1}\},$$
where in the definition of $T_k$, $(a_i)$ and $(b_j)$ are the sequences of elements of $U_{k-1}$ chosen by Alice and Bob in the $k$th application of the Random Shift Protocol, and $+$ is the group operation over $U_{k-1}$ used in the protocol.
Intuitively, $U_k$ is the remaining universe (of size $N_k$) after $k$ iterations and $T_k$ represents the portion of the remaining universe such that choosing $(i, j) \in T_k$ will lead to an element of $T$ being the output of the whole protocol. We call $\mu(T_k)$ the "effective density" of $T$ in the $k$th iteration.

provided at least one party plays honestly.
Proof of Claim: Recall that in the $k$th iteration, we are applying the Random Shift Protocol. Inducting on Lemma 3.1 and using a union bound, we have that for any $k$, Since the $N_k$'s are decreasing extremely fast, we have This completes the proof.
Applying Claim 3.5 and using Theorem 1.1, we deduce that the probability that the output lands in T is at most It finally remains to verify the round complexity of the Iterated Random Shift Protocol:

Proof:
Each application of the Random Shift Protocol (except for the last) reduces the universe size from $N$ to $(\log^3 N)^2 < \log^7 N$ for sufficiently large $N$, and takes two rounds. A lemma proven in [RZ98] states that if $f(n) \le \log^a n$ for some constant $a$, then $f^{(\log^* n)}(n) \le b$ for some constant $b$, where $f^{(k)}$ represents $k = \log^* n$ repeated applications of $f$. This implies that, if $M$ is sufficiently large and the initial universe size is $N \ge M^2$, the Random Shift Protocol is applied at most $\log^* N$ times. (If $N < M^2$, then we apply the Random Shift Protocol at most $\log^*(N M^2) = \log^* N + O(\log M)$ times.) By Theorem 1.1, the GGL protocol on a universe of size at most $M^2$ takes at most $4 \log M$ rounds.
Thus, taking $M$ to be a sufficiently large constant, we obtain a protocol with $2\log^* N + O(1)$ rounds satisfying the Statistical Criterion, thereby proving Theorem 1.2. More generally, we obtain a protocol of $2\log^* N + O(\log(1/\varepsilon))$ rounds such that the output lands in a set of density $\mu$ with probability at most $O(\sqrt{\mu+\varepsilon})$. Note that we can take $\varepsilon$ to be a slowly vanishing function of $N$ and still have $O(\log^* N)$ rounds.
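The round count can be illustrated numerically. The sketch below is a rough illustration under stated assumptions, not the paper's analysis: logarithms are taken base 2, universe sizes are tracked through $\log_2 N$ (so that astronomically large $N$ can be handled), $M$ is required to be a power of 2 with $M \ge 2^{15}$ so that the recursion $N \mapsto \max\{M, \log^3 N\}^2$ is guaranteed to shrink, and the function names are hypothetical.

```python
import math

def shift_iterations(log2_N, M):
    """Count applications of the Random Shift Protocol until N = M^2.

    `log2_N` is log2 of the initial universe size. Each application
    replaces N by m^2 with m = max(M, (log2 N)^3).
    """
    if M < 2**15 or M & (M - 1):
        raise ValueError("sketch assumes M is a power of 2 with M >= 2**15")
    log2_M = M.bit_length() - 1
    target = 2 * log2_M          # log2 of the final universe size M^2
    L = log2_N                   # log2 of the current universe size
    k = 0
    while L > target:
        log2_m = max(log2_M, 3 * math.log2(L))  # log2 of max(M, log^3 N)
        L = 2 * log2_m                          # new universe has size m^2
        k += 1
    return k

def total_rounds(log2_N, M):
    """Two rounds per shift, plus at most 4*log2(M) rounds for GGL."""
    return 2 * shift_iterations(log2_N, M) + 4 * (M.bit_length() - 1)
```

For instance, with `M = 2**32`, a universe of size $2^{10^6}$ needs 2 applications and a universe of size $2^{2^{2000}}$ needs only 3, reflecting the $\log^*$-type growth; the GGL term depends only on the constant $M$.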
In the next section, we will prove that the Iterated Random Shift Protocol has optimal round complexity, up to a factor of 2 + o(1), among protocols achieving the Statistical Criterion.

Tradeoffs between Statistical Guarantees
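As a warmup to our main lower bound, in this section, we present a tradeoff between the statistical guarantees $\varepsilon_A$ and $\varepsilon_B$ of Alice and Bob, respectively:

Proposition 4.1 In any random selection protocol $\Pi$ over universe $U$ achieving statistical guarantees $\varepsilon_A$ and $\varepsilon_B$, we have $\varepsilon_A + \varepsilon_B \ge 1 - 1/N$.

Corollary 4.2 In any random selection protocol $\Pi$, $\max\{\varepsilon_A, \varepsilon_B\} \ge 1/2 - 1/(2N)$.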

Proof:
Suppose we have a protocol where $\varepsilon_A + \varepsilon_B < 1 - 1/N$. Then we can partition the universe into two sets, $S$ and $U - S$, with $\mu(S) > \varepsilon_A$ and $\mu(U - S) > \varepsilon_B$. (Such a set exists because the open interval $(\varepsilon_A, 1 - \varepsilon_B)$ has length greater than $1/N$ and therefore contains a multiple of $1/N$.) View the protocol as a game where Alice wins if the output lands in $S$ and Bob wins if the output lands in $U - S$. A well-known result in game theory is Zermelo's theorem: that, in such a game, one of the players will have a winning strategy (one that wins regardless of how the other player plays). The basic reasoning is backwards induction on the game tree: every leaf node can be labelled a-win or b-win, and then we inductively label the remaining nodes depending on whether there exists a winning child for the current player to select. If there is, the current player will choose that child and will thus have a winning strategy from the current node. If there is not, then the opposing player will certainly win from the current node, as all children of the node lead to nodes from which he or she has a winning strategy.
This result implies one of the following: either Alice can force the output into $S$ with probability 1, in which case $\varepsilon_B \ge 1 - \mu(S) = \mu(U - S)$, or Bob can force the output into $U - S$ with probability 1, in which case $\varepsilon_A \ge 1 - \mu(U - S) = \mu(S)$. Either case contradicts the choice of $S$.

The main intuition behind the proof is that, at every stage, either there exists a move that is good for the current player or all moves are good for the other player. In either case, the result is good for one of the two players. All that is needed is a way to make sure that every node on the bottom level can be defined as "winning" for someone, and that this notion can propagate up the tree. As we will see, this type of reasoning will figure strongly in the proof of our main lower bound. There, the primary challenge will be to handle the cases when some nodes do not appear to be "winning" for either player.
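The backwards-induction labelling in this argument can be sketched in a few lines of Python. The tree encoding is an assumption made for this example: a leaf is the pair `("leaf", output)` and an internal node is the pair `(mover, children)` with `mover` equal to `"A"` or `"B"`; the function name is likewise illustrative.

```python
def forced_winner(node, S):
    """Return "A" if Alice can force the output into S, else "B".

    Leaves are labelled by membership of their output in S. An internal
    node is a win for the player to move if some child is a win for that
    player; otherwise every child is a win for the opponent.
    """
    if node[0] == "leaf":
        return "A" if node[1] in S else "B"
    mover, children = node
    labels = [forced_winner(child, S) for child in children]
    if mover in labels:
        return mover
    return "B" if mover == "A" else "A"
```

For example, in a tree where Alice chooses between a Bob node with leaves 1 and 3 and a Bob node with leaves 1 and 2, Alice wins for `S = {1, 2}` (she picks the second child), while Bob wins for `S = {1}`.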

The Main Lower Bound
In this section, we prove Theorem 1.3, giving a lower bound on round complexity matching the Iterated Random Shift Protocol up to a factor of 2 + o(1).
Theorem 4.3 (Thm. 1.3, restated) For any $\varepsilon, \mu > 0$, there exists a constant $c$ such that any random selection protocol on a universe of size $N$ satisfying the Statistical Criterion with parameters $\varepsilon$ and $\mu$ requires at least $\log^* N - \log^* \log^* N - c$ rounds.
To prove this theorem, we must show that in a protocol with "few" rounds, one of the two players will be able to find a set of small size that will contain the output with high probability. We will refer to such a set (that the cheating player is trying to make the output fall in) as the cheating set. The proof will rely to some degree on the probabilistic method: we will show the existence of such a cheating set by assuming it is chosen, at least in part, randomly. Specifically, we will prove:

Theorem 4.5 There exists a function $f$ such that for any $\mu, \varepsilon > 0$, $r \in \mathbb{N}$ and protocol $\Pi$ with $r$ rounds, one of the following three cases holds:
1. When $R$ is a randomly chosen set of density $\mu$, and Alice plays a strategy maximizing the probability that the output of the protocol falls in $R$ assuming that Bob plays honestly, she will succeed with probability $1 - \varepsilon$, on average over all possible $R$.
2. The same holds with the roles of Alice and Bob exchanged.
3. When $R$ is a randomly chosen set of density $\mu$, both Alice and Bob can force the output into $R$ plus an additional $o(N)$ elements with high probability.

Putting the three conditions together, this theorem says that either one player can make the output fall into a random set of certain density with high probability, or both players can make the output fall into a set consisting of a randomly chosen set of certain density and a certain bounded number of (non-random) elements. We call a protocol in Case 1 a win protocol for Alice, Case 2 a win protocol for Bob, Case 3 a tie protocol for both.
To prove Theorem 4.3 using Theorem 4.5, suppose the protocol satisfied the Statistical Criterion with parameters $\mu'$ and $\varepsilon'$. Then we can set $\mu$ slightly less than $\mu'$, $\varepsilon$ slightly less than $\varepsilon'$, and Cases 1 and 2 would violate the Statistical Criterion. (By averaging, there exists a fixed set $R$ of density $\mu$ such that one of the players can force the output into $R$ with probability at least $1 - \varepsilon$.) Case 3 would also violate it for sufficiently large $N$.

Proving Theorem 4.5 will require an intricate analysis of the game tree using backwards induction. Like the proof of Proposition 4.1, we will show how to "label" the nodes of the game tree, where each label corresponds to a power of a player to force a particular outcome. To build intuition for the full result, we begin by proving why the Statistical Criterion cannot be achieved by any protocol of at most $r$ rounds for $r = 1, 2, 3$, in the process sketching the key ideas of Theorem 4.5.

Proof Ideas
We stress that the informal discussion in this section is only meant to convey the main ideas, and the reader who prefers a more precise treatment right away can skip to Section 4.2.2. For a visual depiction of the ideas presented in this section, the reader is directed to the conference talk [San05].

$r = 1$. In a 1-round protocol, Alice sends a message that determines the output of the protocol. Certainly the Statistical Criterion cannot be achieved here: Alice can fix the output and so the output will fall with probability 1 in a set of density $1/N$, which will be less than any $\mu$ for sufficiently large $N$. (Note that this is not sufficient to establish Theorem 4.5 for the case $r = 1$, but this will be done by our proof below that 2-round protocols cannot meet the Statistical Criterion.)

$r = 2$. In a 2-round protocol, the output is a function of an initial message $\beta$ from Bob and then a message $\alpha$ from Alice. Suppose such a protocol satisfies the Statistical Criterion with parameters $\mu, \varepsilon$.
Note that Bob's message β defines a distribution D_β according to which Alice chooses the output. We divide the analysis into cases depending on the size of the support S_β = Support(D_β):

r = 2, Case I. There exists a Bob message β such that |S_β| ≤ s(µ, ε), where s(µ, ε) is a sufficiently large constant to be defined later. In this case, Bob can force the output into the set S_β with probability 1 by sending β as his message. This certainly violates the Statistical Criterion, since s(µ, ε)/N < µ for sufficiently large N.

r = 2, Case II. For every Bob message β, |S_β| > s(µ, ε). Then the key observation is that, for an appropriate choice of the function s(µ, ε), if Alice chooses a set R of density µ at random, then R ∩ S_β ≠ ∅ with probability 1 − ε over her choice of R and Bob's choice of β, in which case Alice will be able to select an output in R. This corresponds to a win for Alice in Theorem 4.5.
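The Case II observation can be checked numerically. The sketch below is an illustration only: it assumes R is a uniformly random subset of exactly round(µN) elements, and the helper names `miss_probability` and `support_threshold` are hypothetical. The threshold ln(ε)/ln(1 − µ) is one sufficient choice for s(µ, ε) obtained from the elementary bound Pr[R ∩ S = ∅] ≤ (1 − µ)^|S|; it is not claimed to be the constant used in the formal proof.

```python
from math import ceil, comb, log

def miss_probability(n, mu, support_size):
    """Exact Pr[R and S are disjoint] for a uniform R of size round(mu*n)
    and a fixed set S with |S| = support_size, over a universe of size n."""
    k = round(mu * n)
    return comb(n - support_size, k) / comb(n, k)

def support_threshold(mu, eps):
    """A support size that suffices for Pr[miss] <= eps, via (1 - mu)^s <= eps.
    Illustrative stand-in for s(mu, eps), not the paper's constant."""
    return ceil(log(eps) / log(1 - mu))

# Example: universe of 1000 elements, density 0.1, target failure 0.01.
s = support_threshold(0.1, 0.01)
print(s, miss_probability(1000, 0.1, s))
```

Note that the threshold depends only on µ and ε, not on N, which is why a support larger than a constant already makes a random R useful to Alice.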
In effect, we have proven a simple case of Theorem 4.5 for the 1-round protocol induced by Bob's message β, where Case I corresponds to Case (3b) of the Theorem (taking S = S_β), and Case II corresponds to Case (1) of the Theorem (win for Alice).

r = 3. Assume Alice goes first, sending a message γ, after which Bob sends a message β, and Alice sends a message α. We will again denote by S_{γ,β} the set of possible outputs when the messages γ and β are fixed and α varies. Fix µ and ε that purportedly satisfy the Statistical Criterion.
First, we observe inductively that no "child" (i.e., 2-round protocol induced by Alice's first message γ) can be a win for Alice (i.e., such that for all β, |S_{γ,β}| > s(µ, ε)), because then by choosing this child Alice can contradict the Statistical Criterion by the analysis in Case II of the proof for r = 2. It follows that for every child γ, Bob can choose a message β such that |S_{γ,β}| ≤ s(µ, ε). The basic issue now is that although Bob knows he will be able to choose a small support, he does not know which small support it will be, as this is a function of Alice's first message.

r = 3, Case I. For every Alice message γ, there exists a collection of s′(µ, ε, s(µ, ε)) choices for β that yield disjoint small sets S_{γ,β}. Then, generalizing the probabilistic argument from above, we observe that for an appropriate choice of the function s′, if Bob chooses a cheating set R of density µ at random, then with probability greater than 1 − ε (over the choice of Alice's message γ and Bob's choice of R), there will exist a β such that S_{γ,β} ⊆ R. Bob can subsequently send the message β, forcing the output to fall in R. But the output falling into R, where µ(R) ≤ µ, with probability greater than 1 − ε contradicts the Statistical Criterion. This protocol is a win for Bob in Theorem 4.5.

r = 3, Case II. There exists an Alice message γ such that there do not exist s′(µ, ε, s(µ, ε)) disjoint small sets S_{γ,β}. The key here is the following fact (proven in Lemma 4.14): If a collection S of nonempty sets, each of size at most s, does not have a disjoint subcollection of size t, then there exists a set X of size at most t · s such that X intersects every set in S.
The reason: let S′ ⊆ S be a largest subcollection of pairwise disjoint sets, and let X be the union of all sets in S′. Then certainly |X| ≤ t · s, and moreover X intersects every set in S, because otherwise we could add the missed set to S′, contradicting the assumption that S′ is a largest disjoint subcollection.
We conclude that when fewer than s′(µ, ε, s(µ, ε)) disjoint small supports exist, we can produce a set X, |X| ≤ s′(µ, ε, s(µ, ε)) · s(µ, ε), intersecting every small support. Now, Alice can set her cheating set to be X ∪ R, where R is randomly chosen of density µ′ < µ. She can send γ as her first message. Then, if Bob chooses a message β leading to a small support S_{γ,β}, by design X ∩ S_{γ,β} ≠ ∅ and Alice can make the output fall in her set X ∪ R. Otherwise, Bob will choose a large support S_{γ,β}, and so R ∩ S_{γ,β} ≠ ∅ with probability greater than 1 − ε. Either way, since µ(X ∪ R) < µ for sufficiently large N, we will contradict the Statistical Criterion.

r = 4. We do not present this case in detail, but rather just outline its high-level structure, which reflects the structure of the full induction required to prove Theorem 4.5. Suppose Bob goes first, sending a message δ, and assume the Statistical Criterion holds for ε and µ.
Certainly, if there exists a choice of δ producing an induced subprotocol that is a win for Bob (i.e., a choice whereby for every Alice message γ, there are more than s′(µ, ε, s(µ, ε)) disjoint sets S_{δ,γ,β}), then this protocol is a win for him and he can violate the Statistical Criterion by choosing that message δ and then applying the strategy from the r = 3 analysis. Similarly, if for all messages δ Bob can send, the induced subprotocol is a win for Alice, then this protocol is a win for Alice too (this would correspond to the case where, for every message Bob can send, there exists a message Alice can send wherein Bob would be forced to send a large support to Alice).
Otherwise, some messages δ lead to a protocol corresponding to r = 3, Case II (where the induced subprotocol is a tie, as in Case 3a of Thm. 4.5): Alice can pick a message γ so that there is a set X of size at most s″ = s′(µ, ε, s(µ, ε)) · s(µ, ε) intersecting every small set S_{δ,γ,β}. Just as before, the problem for Alice is that the set X to use depends on Bob's first message δ. As above, we partition the analysis into two cases, depending on whether or not there are many disjoint possibilities for the set X. If yes, then a random set will encompass such a set X with high probability, and it is a win for Alice. If not, there is a small set Y intersecting all these choices for X. Here, however, the use of this fact to construct an effective strategy for Bob is more subtle than in the r = 3 case.
Many of the technical ideas used in the full proof of Theorem 4.5 already occur in the cases above. However, setting up a claim suitable for proof by induction is somewhat delicate, and is done via the lengthy statements of Definition 4.8 and Lemma 4.10 in the next section. Jumping ahead, the reason why the induction will stop at log* n rounds is that the sizes of the "small" sets (e.g., the functions s, s′, s″ in the intuition above) grow like a tower with the number of rounds.

Proof of Theorem 4.5
We proceed by backwards induction on the game tree of the protocol.
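Definition 4.6 Given a protocol Π with r rounds and constants ε and µ, let f(r, ε, µ) = g(r, r, ε, µ), where g(0, r, ε, µ) = 1 and, for k ≥ 1, g(k, r, ε, µ) = (r/ε) · (re/µ)^{g(k−1, r, ε, µ)} · g(k−1, r, ε, µ). For clarity we write s_k for g(k, r, ε, µ), as r, ε, and µ will remain fixed throughout the proof. In particular, s_0 = 1 and s_k/s_{k−1} = (r/ε)(re/µ)^{s_{k−1}}.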
A concept that will prove helpful is that of a maximal disjoint subcollection.
Definition 4.7 Let S be a collection of sets over a given universe U. Then a maximal disjoint subcollection P of S is a collection P ⊆ S satisfying S ∩ T = ∅ for all distinct S, T ∈ P, and such that for every T ∈ S \ P there exists S ∈ P with S ∩ T ≠ ∅.

Such a subcollection always exists, so we take the canonical maximal disjoint subcollection to be one chosen by some fixed but arbitrary method. Now, fix a protocol Π with r rounds, and consider the game tree T it induces (see Definition 2.2). With each node of this tree, we will associate a certain collection of sets (subsets of the universe U). These sets will correspond to the sets S and T of case (3) of Theorem 4.5. This association will be defined inductively on the game tree.
Specifically, we inductively label the nodes of the tree as either a-win, a-lose, a-tie, b-win, b-lose, or b-tie. With each of the tie nodes, we also associate a collection S_z of subsets of U, as defined below. The 'a' or 'b' just tells us whose turn it is, and as we will see, win, lose, and tie say something about the power of the player whose turn it is at that point.

Definition 4.8 Let z be a node at level k of the game tree (leaves being at level 0) at which it is Alice's turn.

If k = 0 (i.e., z is a leaf of the tree) then label z as a-tie. Moreover, let S_z = {{x}}, where x is the output of the protocol ending at node z.

If k > 0, consider the children z_1, . . . , z_ℓ of z. Use the following rules to label the nodes:

1. If there exists 1 ≤ i ≤ ℓ such that z_i is in case b-lose, then label z as a-win.

2. If, for all 1 ≤ i ≤ ℓ, z_i is in case b-win, then label z as a-lose.

3. Otherwise, denote T_z = {S : z_i is b-tie and S ∈ S_{z_i}}. That is, T_z is the union of the collections of sets associated with all children of z that are labelled b-tie. Now, let P denote the canonical maximal disjoint subcollection of T_z, as defined above, and let s_k, s_{k−1} be defined as in Definition 4.6. There are two cases:

(a) If |P| ≥ s_k/s_{k−1}, label z as a-win.

(b) If |P| < s_k/s_{k−1}, label z as a-tie, and define S_z to be {S ⊆ U : |S| ≤ s_k and S ∩ T ≠ ∅ for all T ∈ T_z}. That is, the sets associated with z consist of all sets of size at most s_k that intersect all of the sets associated with the children z_i (which will be in case b-tie, since those are the only nodes with which we associate sets).
Likewise, label all nodes at which it is Bob's turn, by swapping a with b in the above specification.
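The labeling rules can be exercised on toy game trees. The following is a minimal sketch under several assumptions that are not part of the paper: a leaf is encoded as an integer output and an internal node as a list of children, all leaves below a node sit at the same depth, the thresholds `sizes[k]` are small illustrative stand-ins for the (astronomically large) constants s_k, and the "canonical" maximal disjoint subcollection is taken greedily in a fixed sorted order. Labels are returned relative to the player to move, so a child's "lose" plays the role of b-lose at an Alice node and vice versa.

```python
from itertools import combinations

def maximal_disjoint(sets):
    """Greedy maximal pairwise-disjoint subcollection, in a fixed order."""
    chosen = []
    for s in sorted(sets, key=lambda t: (len(t), sorted(t))):
        if all(s.isdisjoint(c) for c in chosen):
            chosen.append(s)
    return chosen

def label(node, universe, sizes):
    """Return (level, status, collection) for `node`.

    status is 'win', 'lose' or 'tie' for the player to move at the node;
    collection is S_z for tie nodes and None otherwise.
    """
    if not isinstance(node, list):              # leaf: level 0, tie, S_z = {{x}}
        return 0, "tie", {frozenset([node])}
    results = [label(child, universe, sizes) for child in node]
    k = results[0][0] + 1                       # assumes children share a level
    statuses = [status for _, status, _ in results]
    if "lose" in statuses:                      # Rule 1
        return k, "win", None
    if all(status == "win" for status in statuses):   # Rule 2
        return k, "lose", None
    # Rule 3: T_z is the union of the collections at the tie children.
    t_z = set().union(*(coll for _, status, coll in results if status == "tie"))
    if len(maximal_disjoint(t_z)) >= sizes[k] / sizes[k - 1]:   # Rule 3a
        return k, "win", None
    s_z = {frozenset(c)                         # Rule 3b: small hitting sets
           for size in range(1, min(sizes[k], len(universe)) + 1)
           for c in combinations(sorted(universe), size)
           if all(set(c) & t for t in t_z)}
    return k, "tie", s_z

# A level-1 node with three distinct outputs over the universe {0, 1, 2, 3}.
print(label([0, 1, 2], {0, 1, 2, 3}, [1, 2]))   # many disjoint singletons
print(label([0, 1, 2], {0, 1, 2, 3}, [1, 5]))   # too few: a tie with hitting sets
```

Enumerating S_z explicitly is only feasible for tiny universes; the proof never needs to do so and works with the existence of such sets alone.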
Intuitively, this structure defines the power of the players at various stages of the protocol. The win, lose, and tie nodes refer to cases (1), (2), and (3) of Theorem 4.5. Moreover, the collections S z correspond to S and T in Case (3) of Theorem 4.5.
We codify this power in Lemma 4.10. Before stating it, we need the notation Π_z (Definition 4.9): intuitively, Π_z is the protocol induced by starting the protocol at node z (i.e., assuming all messages leading to z are fixed in advance).
Lemma 4.10 Fix ε and µ, and suppose the protocol has r turns. Let z be some node on the tree at level k, at which it is Alice's turn to play. Throughout, let R be a uniformly random subset of U of density kµ/r, and let Π_z be the protocol induced by beginning at node z, as defined in Definition 4.9.

1. If z is in case a-win, then E_R[max_{A*} Pr_B[Π_z(A*, B) ∈ R]] ≥ 1 − kε/r. (We say Alice can "win" from node z.)

2. If z is in case a-lose, then similarly, E_R[max_{B*} Pr_A[Π_z(A, B*) ∈ R]] ≥ 1 − kε/r. (We say Bob can "win" from node z.)

3. If z is in case a-tie, then:

(a) S_z and, if k > 0, T_z, are nonempty.

(b) When k > 0, then for any T ∈ T_z, E_R[max_{A*} Pr_B[Π_z(A*, B) ∈ R ∪ T]] ≥ 1 − kε/r.

(c) For any S ∈ S_z, E_R[max_{B*} Pr_A[Π_z(A, B*) ∈ R ∪ S]] ≥ 1 − kε/r.

(We say both Alice and Bob "win" from node z, with "helper sets" T and S, respectively.)

Moreover, the same (with "Alice" exchanged for "Bob", and "a" exchanged for "b") holds for all nodes at which it is Bob's turn.
Lemma 4.10 asserts a more precise form of Theorem 4.5 at each level of the game tree. To use this lemma to prove Theorem 4.5, we simply need to apply it with k = r and z being the root of the game tree. Certainly, if the root is in Case (1) or (2) of Lemma 4.10, it is in Case (1) or (2) of Theorem 4.5, respectively. If the root is in Case (3) of Lemma 4.10, then subcases (3.b) and (3.c) of the lemma directly prove subcases (3.a) and (3.b) of the theorem, respectively, where the sets in T_z and S_z correspond precisely to the sets T and S we need in those subcases of the theorem. The existence of such sets is guaranteed by subcase (3.a) of the lemma.
We prove Lemma 4.10 by induction on the levels of the tree.
Base Cases:

k = 0: Here z is a leaf node, and the output of Π_z is deterministically fixed at, say, x. According to Definition 4.8, S_z = {{x}}, and we are in case a-tie. Since the density of R must be zero (it is kµ/r), R = ∅ and so we need to show that Pr[Π_z(A, B) ∈ {x}] = 1. This of course holds because the output is fixed at x.

k = 1: We assume without loss of generality that it is Alice's turn at node z. Notice first that all of the children of z are in case b-tie, by the reasoning in the k = 0 case. Consequently, z must be labelled a-win or a-tie. Which case we are in depends directly on the size of the canonical maximal disjoint subcollection P of T_z (T_z, recall, is {S : z_i is b-tie and S ∈ S_{z_i}}). Since k = 1, all of the children of z are labelled b-tie and T_z = {{x} : ∃i such that x is the output at z_i}. It follows that |P| = |{x : ∃i such that x is the output at z_i}|. That is, |P| is precisely the size of the support of the distribution by which Alice chooses the output of the protocol.

k = 1, Case 1: Suppose |P| ≥ s_1/s_0 = s_1. Then by Rule 3a of Definition 4.8, z is in case a-win. Thus, we must verify that E_R[max_{A*} Pr_B[Π_z(A*, B) ∈ R]] ≥ 1 − ε/r, where R is a random subset of the universe of density µ/r. That is, if at node z Alice plays to maximize the probability that the output falls in a set R of density µ/r, her average probability of success over choices of R will be at least 1 − ε/r. We use the following lemma, which intuitively says that if we have a large number of disjoint small sets, then a randomly chosen set of constant density will contain one of them with high probability.
Lemma 4.11 Suppose we have a collection of disjoint sets S_1, . . . , S_m over a fixed universe U, |U| = n, where for all i, |S_i| ≤ s. Choose a set R randomly of density µ (i.e., n · µ distinct elements). With probability ≥ 1 − (1/m)(e/µ)^s, there will exist S_i such that S_i ⊆ R.
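As a sanity check on the statement of Lemma 4.11 (not a proof), the bound can be compared against a seeded Monte Carlo estimate. In this sketch the function names, the choice of disjoint pairs as the sets S_i, and the parameters n = 200, µ = 0.5, m = 50 are all illustrative assumptions.

```python
import math
import random

def lemma_bound(m, s, mu):
    """Lower bound 1 - (1/m)(e/mu)^s from Lemma 4.11."""
    return 1 - (1 / m) * (math.e / mu) ** s

def estimate_containment(n, mu, sets, trials=2000, seed=0):
    """Estimate Pr[some S_i is contained in R] for a uniform R of size round(mu*n)."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    k = round(mu * n)
    hits = 0
    for _ in range(trials):
        r = set(rng.sample(range(n), k))
        if any(s <= r for s in sets):  # s <= r tests whether s is a subset of r
            hits += 1
    return hits / trials

# 50 disjoint sets of size 2 over a universe of 200 elements.
pairs = [{2 * i, 2 * i + 1} for i in range(50)]
bound = lemma_bound(m=len(pairs), s=2, mu=0.5)
estimate = estimate_containment(200, 0.5, pairs)
print(bound, estimate)
```

In this regime the empirical frequency sits far above the bound, which is consistent with the lemma being a convenient rather than tight estimate.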
The proof is by Chebyshev's inequality, and is deferred to the appendix. We know P consists of m ≥ s_1 disjoint sets. So, by Lemma 4.11, a randomly chosen set of density µ/r will contain a set in P with probability ≥ 1 − (1/m)(re/µ) (recall the sets have size 1), where m ≥ s_1 = (r/ε)(re/µ). Thus, it will contain a set in P with probability at least 1 − ε/r. (Notice that Lemma 4.11 explains the way we defined the constants s_k in Definition 4.6.) In such an event, when the random set R contains an S ∈ P, we claim there exists a strategy A* for Alice whereby Π_z(A*, B) ∈ R. By definition, S = {x}, where x is the output at some leaf z_i that is a child of z. To force the output into R, Alice can play the strategy A* which selects z_i on her turn at node z. Whenever R contains S ∈ P this strategy succeeds with probability 1, and since this event occurs for at least a 1 − ε/r fraction of the choices of R, it follows that E_R[max_{A*} Pr_B[Π_z(A*, B) ∈ R]] ≥ 1 − ε/r.

k = 1, Case 2: Suppose P consists of m < s_1 elements, and thus z is in case a-tie. To prove z satisfies Lemma 4.10 in this case, we must verify the following: (a) S_z and T_z are nonempty; (b) for any T ∈ T_z, E_R[max_{A*} Pr_B[Π_z(A*, B) ∈ R ∪ T]] ≥ 1 − ε/r; and (c) for any S ∈ S_z, E_R[max_{B*} Pr_A[Π_z(A, B*) ∈ R ∪ S]] ≥ 1 − ε/r.

T_z is certainly nonempty: each child z_i of z is a b-tie node, and T_z consists of the singletons of their outputs.
Since |P| < s_1, and since all of the elements of T_z are singletons, it follows that fewer than s_1 distinct elements appear in the sets of T_z. A set S consisting of precisely these elements will be a set of size less than s_1 intersecting every set in T_z; thus S ∈ S_z by definition, and S_z is nonempty. This verifies condition (a) above.

To verify condition (b), notice that node z is Alice's turn, and so Bob has no influence on the output of the protocol. Moreover, any T ∈ T_z consists of a single element x that is the output of the protocol at a child z_i. Thus, for any such T, Alice can play the strategy A* which selects the corresponding child z_i. Thus, Pr_B[Π_z(A*, B) ∈ T] = 1, and a fortiori E_R[max_{A*} Pr_B[Π_z(A*, B) ∈ R ∪ T]] ≥ 1 − ε/r. In fact, this strategy succeeds with probability 1 for every choice of R.
To verify condition (c), recall that any set S ∈ S z must intersect every set in T z , which, when k = 1, implies that it contains the entire support by which Alice will choose the output. Thus, Pr A [Π z (A, B * ) ∈ S] = 1 by definition, which is again stronger than what we need.

Inductive Step. Suppose Lemma 4.10 holds for nodes on all levels up to level k − 1. We will show that it holds for an arbitrary node z on level k. Assume it is Alice's turn at z. There are several possibilities:

Claim 4.12 If z is in case a-lose, then E_R[max_{B*} Pr_A[Π_z(A, B*) ∈ R]] ≥ 1 − kε/r, where R is a random subset of density kµ/r.

Proof: We will use Definition 4.8 and the inductive hypothesis to show that every child node z_i is "good" for Bob; that is, on average over R, B* can make the outcome land in R with probability 1 − (k − 1)ε/r. Then certainly the same holds for node z, since Alice cannot help but move to such a node. Formally, we first notice that it suffices to show:

E_R[max_{B*} Pr_A[Π_z(A, B*) ∈ R]] ≥ 1 − (k − 1)ε/r,

where R ranges over all sets of density (k − 1)µ/r. Now, for z to be labelled a-lose, we must have used Rule 2 of Definition 4.8. Thus, all of the children of z are in case b-win. By the inductive hypothesis:

E_R[max_{B*} Pr_A[Π_{z_i}(A, B*) ∈ R]] ≥ 1 − (k − 1)ε/r    (1)

for each child z_i of z, where R ranges over all sets of density (k − 1)µ/r. Since at node z it is Alice's turn, we have

E_R[max_{B*} Pr_A[Π_z(A, B*) ∈ R]] = E_R[E_{z_i ← D_z}[max_{B*} Pr_A[Π_{z_i}(A, B*) ∈ R]]] ≥ 1 − (k − 1)ε/r,

where D_z is the distribution according to which Alice chooses child z_i of z when playing honestly, and the last inequality is by (1).

Claim 4.13 If z is in case a-win, then E_R[max_{A*} Pr_B[Π_z(A*, B) ∈ R]] ≥ 1 − kε/r, where R is a random subset of density kµ/r.
Proof: By Definition 4.8, z could have been labelled a-win either by Rule 1 or Rule 3a.
In Rule 1, z has a child z j that is in case b-lose. Since it is Alice's turn at node z, if she can choose a node z j "good" for her then node z will be "good" for her too.
Formally, the inductive hypothesis applies to z_j, and for every set R we have that max_{A*} Pr_B[Π_z(A*, B) ∈ R] ≥ max_{A*} Pr_B[Π_{z_j}(A*, B) ∈ R], since node z is Alice's turn and she can always at least choose z_j. Taking expectations of both sides, the claim is proven for this case.

The alternative possibility is that z is in a-win because of Rule 3a. So among the sets T_z (for all children z_i in b-tie), we can find a disjoint subcollection P with |P| ≥ s_k/s_{k−1}.

Intuitively, what is going on here? Since no b-lose nodes are available among the children of z, Alice cannot simply choose such a branch as above. However, we know that from the b-tie nodes, for a high proportion of sets R, Alice can ensure the output lands in S ∪ R where S ∈ S_{z_i}, with high probability. But this is true for many possible sets S, not only at a given child, but also across all the potential children that are in case b-tie (i.e., any S ∈ T_z). Thus, we can expect that with enough disjoint sets in T_z, the random set R will encompass some S ∈ T_z with high probability. The inductive hypothesis will then give the desired result.
Using Lemma 4.11, since P ⊆ T_z consists of at least s_k/s_{k−1} = (r/ε)(re/µ)^{s_{k−1}} (disjoint) sets of size at most s_{k−1}, we can conclude:

Pr_{R_1}[∃ S ∈ P such that S ⊆ R_1] ≥ 1 − ε/r    (2)

where R_1 is a random subset of density µ/r. For any S ∈ T_z, we can then assert:

E_{R_2}[max_{A*} Pr_B[Π_z(A*, B) ∈ R_2 ∪ S]] ≥ 1 − (k − 1)ε/r    (3)

where R_2 is a random subset of density (k − 1)µ/r. This comes from applying the inductive hypothesis to the child z_j such that S ∈ S_{z_j}, and since max_{A*} Pr_B[Π_z(A*, B) ∈ R_2 ∪ S] is always at least max_{A*} Pr_B[Π_{z_j}(A*, B) ∈ R_2 ∪ S] (because at node z it is Alice's turn). Now, considering the selection of a random subset R of density kµ/r to be the random and independent choices of subsets R_1 and R_2 of densities µ/r and (k − 1)µ/r respectively (compensating for any overlap by adding random elements), we can combine (2) and (3) to derive

E_R[max_{A*} Pr_B[Π_z(A*, B) ∈ R]] ≥ 1 − ε/r − (k − 1)ε/r = 1 − kε/r.

The claim follows.
The final possibility is that z is in case a-tie. Since z is not a leaf, this can only come about by Rule 3b from Definition 4.8. That is, no children of z are in case b-lose, and at least some are in b-tie. Moreover, among T z the canonical maximal disjoint subcollection P has less than s k /s k−1 elements.
We must prove the following: S z is nonempty, T z is nonempty, Alice can win from this node with a helper set from T z , and Bob can win from this node with a helper set from S z (see Lemma 4.10).
We will require the following combinatorial lemma:

Lemma 4.14 Let S be a collection of nonempty sets S_1, . . . , S_m over a finite universe U, |U| = N, where for all i, |S_i| ≤ s. Suppose that S has a maximal disjoint subcollection of size t. Then there exists a set X with |X| ≤ t · s that intersects every set in S (i.e., X ∩ S_i ≠ ∅ for all i).
That is, either a collection of small sets has many disjoint members or it has a small "intersect-set"-a set intersecting each member of the collection. Intuitively, the union of a maximal disjoint subcollection must intersect every set, for otherwise one could add the disjoint set to form a larger disjoint subcollection.
Proof: Given a maximal disjoint subcollection P = {S_1, . . . , S_t}, we can define X = ⋃_{1≤i≤t} S_i. That |X| ≤ t · s follows from the assumption that every set in S has size at most s. Since every set in S is nonempty, X ∩ S ≠ ∅ for any S ∈ P. Now, suppose there exists a set S ∉ P in the collection such that X ∩ S = ∅. But then, by the definition of X, P ∪ {S} consists of t + 1 disjoint sets, contradicting the assumption that P was a maximal disjoint subcollection. So X intersects every set in the collection.
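The construction in this proof is directly executable. The sketch below builds a maximal disjoint subcollection greedily (one possible "fixed but arbitrary method"; the paper does not prescribe one) and takes its union as the intersecting set X. The function names and the example collection are illustrative.

```python
def maximal_disjoint(collection):
    """Greedily pick pairwise-disjoint sets, scanning the collection in order."""
    chosen = []
    for s in collection:
        if all(s.isdisjoint(c) for c in chosen):
            chosen.append(s)
    return chosen

def hitting_set(collection):
    """Return (X, t): the union of a maximal disjoint subcollection, and its size t."""
    picked = maximal_disjoint(collection)
    return set().union(*picked), len(picked)

# Five nonempty sets, each of size at most s = 2.
family = [frozenset(p) for p in [(1, 2), (2, 3), (4, 5), (5, 6), (1, 6)]]
x, t = hitting_set(family)
print(x, t)
```

Any set skipped by the greedy scan was skipped precisely because it meets an already chosen set, which is the maximality argument of the proof in algorithmic form.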
We first show that S_z is nonempty.

Proof: By Lemma 4.14, since the canonical maximal disjoint subcollection P of T_z has size < s_k/s_{k−1} and since all S ∈ P have size ≤ s_{k−1}, there exists a set X of size at most (s_k/s_{k−1}) · s_{k−1} = s_k intersecting every set in T_z. By Definition 4.8, X ∈ S_z.

We have already established that z has children in case b-tie (this follows from Definition 4.8 and from our assumption that z is in case a-tie). By the inductive hypothesis on such a child z_i, S_{z_i} is nonempty, and thus T_z is nonempty. The final claim required to prove Lemma 4.10 is the following:

Claim 4.17 If z is in case a-tie, then E_R[max_{B*} Pr_A[Π_z(A, B*) ∈ R ∪ S]] ≥ 1 − kε/r, where R is a random subset of density kµ/r, and S is any element of S_z.
This claim is the heart of the entire proof. All we know now is that there is at least one b-tie node that is a child of the current node z, and that among the corresponding sets in T z , the canonical maximal disjoint subcollection P ⊆ T z contains fewer than s k /s k−1 sets. That P is so small is a limitation on the power of Alice, who would like there to be enough such disjoint sets in P that she could choose randomly and encompass a set in P with high probability. The key to this proof is converting this limitation on Alice into an ability for Bob to cheat.
Proof: Fix a set S ∈ S_z. Since an honest Alice will choose a child z_i at random, it suffices to prove the following for each child z_i:

E_R[max_{B*} Pr_A[Π_{z_i}(A, B*) ∈ R ∪ S]] ≥ 1 − kε/r    (4)

where R is a random subset of density kµ/r. So fix an arbitrary child z_i. By Definition 4.8, the only way z could have been labelled a-tie is if all children z_i are either in case b-win or case b-tie. So z_i is in one of these two cases.

If z_i is in case b-win, then we are done by the inductive hypothesis. So suppose z_i is in case b-tie. Applying the inductive hypothesis to z_i, we know that T_{z_i} is nonempty. Moreover, for any T ∈ T_{z_i}:

E_{R_1}[max_{B*} Pr_A[Π_{z_i}(A, B*) ∈ R_1 ∪ T]] ≥ 1 − (k − 1)ε/r    (5)

where R_1 is a random subset of density (k − 1)µ/r. We divide the proof into two cases. First, suppose there exists T ∈ T_{z_i} such that T ⊆ S. Then (4) follows immediately from (5). Otherwise, consider the collection of sets T′ = {T − S : T ∈ T_{z_i}}. By assumption, ∅ ∉ T′.

Claim 4.18 T′ contains at least s_{k−1}/s_{k−2} disjoint sets.

Informally, there aren't many disjoint sets in T_{z_i}; if there were, we would have labelled z_i as a case b-win node for Bob. That said, by intersecting every (small) set that intersects every set in T_{z_i}, S captures the lack of disjointness of T_{z_i} in the first place. This claim states that once the elements of S are removed from consideration, the result has a large number of disjoint sets.

Proof: Suppose for the sake of contradiction that T′ contains fewer than s_{k−1}/s_{k−2} disjoint sets. Recalling that these sets all have size at most s_{k−2}, and since ∅ ∉ T′, by Lemma 4.14 we can produce a set I of size at most (s_{k−1}/s_{k−2}) · s_{k−2} = s_{k−1} intersecting every element of T′. Without loss of generality, we can assume I ∩ S = ∅ (since for every T ∈ T′, S ∩ T = ∅). Since I intersects every set in T′, it follows that I intersects every set of T_{z_i}, and since we know I has size at most s_{k−1}, we conclude that, by definition, I ∈ S_{z_i}. But S was an arbitrary element of S_z, which means it intersects all elements of S_{z_i}, including I. Contradiction. Thus, T′ contains at least s_{k−1}/s_{k−2} disjoint sets.
Returning to the proof of Claim 4.17, by Lemma 4.11 we may conclude the following:

Pr_{R_2}[∃ Y ∈ T′ such that Y ⊆ R_2] ≥ 1 − ε/r,

where R_2 is a random subset of density µ/r. By the definition of T′, this in turn implies:

Pr_{R_2}[∃ T ∈ T_{z_i} such that T ⊆ R_2 ∪ S] ≥ 1 − ε/r.

Using (5) and choosing R through independent choices of R_1 and R_2 as in the proof of Claim 4.13, we are done.
To conclude Theorem 4.5, it remains to prove that the function f defining the set sizes s_k does not grow too fast in the number of rounds. Intuitively, the reason the lower bound only holds for protocols with fewer than log* n − log* log* n − O(1) rounds is that these "helper sets" must have no more than o(N) elements to be useful, but this function f grows as a tower, where both the base and the height of the tower grow with the number of rounds. Our challenge is to lower bound the number of rounds that keep this tower of size o(N).
By applying Lemma 4.10 to the root of the tree and using Lemma 4.19, we prove Theorem 4.5 and thus Theorem 4.3.
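To get a feel for how quickly the set sizes outgrow the universe, the sketch below iterates the recurrence in log space. It assumes s_0 = 1 and s_k/s_{k−1} = (r/ε)(re/µ)^{s_{k−1}}, which are the ratios used in the proof of Lemma 4.10; the function name and the sample parameters are illustrative, and r is held fixed as a parameter rather than tied to the level k.

```python
import math

def first_level_exceeding(n, r, eps, mu):
    """Smallest k with s_k > n, where s_0 = 1 and
    s_k = s_{k-1} * (r/eps) * (r*e/mu) ** s_{k-1}.

    Works with ln(s_k) to avoid overflow; assumes n is below the largest
    float (about 1e308) so that every s_k <= n is representable.
    """
    log_n = math.log(n)
    s = 1.0
    k = 0
    while True:
        k += 1
        log_s = math.log(s) + math.log(r / eps) + s * math.log(r * math.e / mu)
        if log_s > log_n:
            return k
        s = math.exp(log_s)

for n in (10, 10**6, 10**300):
    print(n, first_level_exceeding(n, r=2, eps=0.5, mu=0.5))
```

Raising N from a million to 10^300 buys only one more level in this example, which is the tower behaviour behind the log* n scale of the bound.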

Defining Multiplicative Guarantees
Recall Definition 5: the multiplicative difference of a distribution X from uniform is max_T Pr_{x←X}[x ∈ T]/µ(T), where T ranges over all subsets of U.
The multiplicative difference is always a rational in [1, N], where 1 implies the uniform distribution, and N implies one element is chosen with probability 1. The multiplicative difference of a distribution is actually equal to the factor by which a single element's probability of being the protocol's output can be increased from uniform. Formally, writing p(s) = Pr_{x←X}[x = s]:

Lemma 5.1 The multiplicative difference of X from uniform equals max_{s∈U} p(s)/(1/N) = N · max_{s∈U} p(s).

Proof: The multiplicative difference is at least this quantity, since the singleton sets T = {s} are among the sets over which the maximum is taken. For the other direction, let T be the set maximizing Pr_{x←X}[x ∈ T]/µ(T). We claim there exists an element s ∈ T such that the following holds: N · p(s) ≥ Pr_{x←X}[x ∈ T]/µ(T). To see this, consider an arbitrary set T = {x_1, . . . , x_t}. Then we have: Pr_{x←X}[x ∈ T]/µ(T) = (N/t) · Σ_{1≤i≤t} p(x_i). This implies that there exists an i such that N · p(x_i) ≥ Pr_{x←X}[x ∈ T]/µ(T).

Multiplicative and statistical difference are different, but related. Both work by bounding a function of the probability a distribution falls in a set and the density of that set. Even so, multiplicative difference tends to focus on the concentration of probability into small sets (indeed, by Lemma 5.1, sets of size 1), while statistical difference will prove more useful when considering larger subsets (e.g., a constant fraction of the universe).
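The characterization of multiplicative difference by single elements can be checked by brute force on a small distribution. This is a sketch for illustration only: the function name and the example distribution are assumptions, and the enumeration over all nonempty subsets is exponential, so it is usable only for tiny universes.

```python
from fractions import Fraction
from itertools import combinations

def multiplicative_difference(probs):
    """max over nonempty T of Pr[X in T] / mu(T), by exhaustive search."""
    n = len(probs)
    best = Fraction(0)
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            mass = sum(probs[i] for i in subset)
            best = max(best, mass / Fraction(size, n))
    return best

# A distribution over a universe of N = 4 elements.
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
rho = multiplicative_difference(probs)
print(rho, len(probs) * max(probs))
```

Both printed values coincide: the maximum is attained on the singleton holding the heaviest element.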
This said, we can prove some basic relationships between the two metrics that will prove useful:

Lemma 5.2 Let X be an arbitrary distribution over universe U, with N = |U|. Denote by ε the statistical difference of X from uniform and by ρ the multiplicative difference from uniform. Then:

That a distribution will have statistical difference at most 1 − 1/ρ is especially interesting because the relationship contains no dependence on N. This fact implies that a distribution with a constant multiplicative difference will have a constant statistical difference, though the converse is not necessarily true. Put another way, a strong multiplicative guarantee is harder to achieve than a strong statistical guarantee.
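The relationship ε ≤ 1 − 1/ρ can be spot-checked on concrete distributions. The sketch below assumes the standard definition of statistical difference from uniform, max_T (Pr[X ∈ T] − µ(T)), which equals the sum of the positive parts of p(x) − 1/N, and computes ρ as N · max_x p(x); the function names and test distributions are illustrative.

```python
from fractions import Fraction

def statistical_difference(probs):
    """Statistical difference from uniform: sum of positive parts of p(x) - 1/N."""
    n = len(probs)
    return sum(max(p - Fraction(1, n), 0) for p in probs)

def multiplicative_difference(probs):
    """N times the largest single-element probability."""
    return len(probs) * max(probs)

def check_relation(probs):
    """Return (eps, rho, whether eps <= 1 - 1/rho)."""
    eps = statistical_difference(probs)
    rho = multiplicative_difference(probs)
    return eps, rho, eps <= 1 - 1 / rho

skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
point = [Fraction(1), Fraction(0), Fraction(0), Fraction(0)]
print(check_relation(skewed))
print(check_relation(point))
```

The point distribution meets the bound with equality (ε = 3/4, ρ = 4), so the inequality cannot be improved in general.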
Proof of Lemma 5.2: Suppose a distribution X has statistical difference ε from uniform. Let ρ denote the multiplicative difference of X. Then by Lemma 5.1, we have: Plugging back into the above, we have:

Multiplicative Lower Bounds
In this subsection, we concentrate on lower bounds regarding multiplicative guarantees, and indeed show that no protocol exists that provides constant multiplicative guarantees to both players. This is a very strong limitation on the ability of protocols to limit a cheating player's power in this regard.
An Initial Lower Bound. Proposition 4.1 can be adapted to provide a quick lower bound for multiplicative guarantees. Specifically:

Proposition 5.3 In any random selection protocol, (ρ
Proof: The results follow immediately from Proposition 4.1 and from the second part of Lemma 5.2.

This lower bound for multiplicative guarantees is not very strong: ρ is a number from 1 to N, but this lower bound is satisfied (for instance) as long as both ρ_A and ρ_B are at least 2. In Theorem 5.4 we will prove that ρ_A · ρ_B ≥ N, which is a substantially stronger result.
On the other hand, when looking at one player getting a statistical guarantee and the other player getting a multiplicative guarantee, Proposition 5.3 does provide some useful information. Specifically, it tells us that (minus a small 1/N term) we can always expect the statistical guarantee for one player to be worse than the reciprocal of the multiplicative guarantee to the other player. This explains inverse relationships in existing protocols of [DGW94] (where ε = 1/poly(n) and ρ = poly(n)) and [GSV98] (where ε = poly(n) · 2^{−k} and ρ = 2^k for any k).⁶ Notice that these earlier works focus on the case of nonconstant guarantees (ε → 0 and ρ → ∞). Later, we show that the Iterated Random Shift Protocol presented earlier achieves simultaneous constant statistical and multiplicative guarantees, and prove it has optimal round complexity up to a factor of 2 + o(1).
A Tight Lower Bound. The following lower bound follows from the work of Goldreich, Goldwasser, and Linial [GGL98].

Theorem 5.4 In any random selection protocol, ρ_A · ρ_B ≥ N.

Corollary 5.5 In any protocol Π, max{ρ
Recalling the multiplicative guarantee ρ A is the greatest factor by which Bob can improve the probability that a single element is chosen over uniform, we conclude: In any random selection protocol, at least one of the players can improve the probability that a single element is chosen by a factor exponential in the length of the output (which equals log N ).
Goldreich et al. [GGL98] showed a more general result than Theorem 5.4 (for multiparty protocols) using different language and a moderately involved proof. Restricting to the two-party case, as we will see, provides a simple and elegant proof.
⁶ Actually, the protocol of [GSV98] does not provide a multiplicative guarantee of 2^k, but rather ensures that the probability that the output lands in any set T of density µ is at most 2^k · µ + o(1). Our lower bound also applies to this more general type of guarantee.

Proof of Theorem 5.4: Fix some element v of the universe. Now, consider the game tree T of the protocol (see Definition 2.2). At each node z of the tree, denote the protocol induced by beginning at node z by Π_z. Then define:

φ_A^z = max_{A*} Pr[Π_z(A*, B) = v],   φ_B^z = max_{B*} Pr[Π_z(A, B*) = v],   p^z = Pr[Π_z(A, B) = v].

That is, φ_A^z (resp. φ_B^z) is the highest probability with which Alice (resp. Bob) can force the output to be v, given that the protocol is now at node z and that Bob (resp. Alice) is playing honestly, and p^z is the probability that v is chosen starting from z assuming both players play honestly.
The following lemma is the heart of the proof:

Lemma 5.6 For every node z of the game tree, φ_A^z · φ_B^z ≥ p^z.

To prove the theorem from Lemma 5.6, take z to be the root of the tree. Then we have that φ_A · φ_B ≥ p, where φ_A (resp. φ_B) is the probability that Alice (resp. Bob) can force the output to be v, and p is the probability that v is chosen when both players play honestly. Notice that v was arbitrary, so we certainly can choose v such that p ≥ 1/N. By definition, ρ_A ≥ N · φ_B and ρ_B ≥ N · φ_A. But then we have: ρ_A · ρ_B ≥ N² · φ_A · φ_B ≥ N² · p ≥ N.

Proof of Lemma 5.6: We will prove the lemma by backwards induction on the levels of the tree.

When z is a leaf, the protocol is complete. If v is the output of the protocol at leaf z, then φ_A^z = φ_B^z = p^z = 1; otherwise, φ_A^z = φ_B^z = p^z = 0. In either case, the lemma holds. Now, suppose that the lemma holds for all children of z; denote them z_1, . . . , z_m. Thus, we know φ_A^{z_i} · φ_B^{z_i} ≥ p^{z_i} for all children z_i. Suppose also, without loss of generality, that at node z it is Alice's turn.
Suppose an honest Alice chooses child node z_i with probability λ_i. Then p^z = Σ_i λ_i p^{z_i}, and φ_B^z = Σ_i λ_i φ_B^{z_i}. This latter equality holds because the probability v will be chosen when Alice is honest is just the sum of the probabilities v will be chosen from each child, weighted by the probability of reaching that child. When considering φ_A^z, however, Alice will have the option of cheating. She will simply choose the child that affords her the best probability of successfully forcing the output to be v. That is, φ_A^z = max_i φ_A^{z_i}, and so in particular for all i, φ_A^z ≥ φ_A^{z_i}. Now just compute:

φ_A^z · φ_B^z = φ_A^z · Σ_i λ_i φ_B^{z_i} ≥ Σ_i λ_i φ_A^{z_i} φ_B^{z_i} ≥ Σ_i λ_i p^{z_i} = p^z.

This completes the proof of the lemma, and thus the theorem is proven.
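The backward induction in this proof translates into a short recursive computation. The sketch below is illustrative: the tuple encoding of game trees (`("leaf", output)` or `(player, [(probability, child), ...])`) and the example tree are assumptions made here, and exact rational arithmetic is used so that the inequality can be tested without rounding error.

```python
from fractions import Fraction

def analyze(node, v):
    """Return (phi_A, phi_B, p) at `node` for target output v.

    phi_A / phi_B: best probability that a cheating Alice / Bob forces v
    against an honest opponent; p: probability of v when both are honest.
    """
    if node[0] == "leaf":
        hit = Fraction(int(node[1] == v))
        return hit, hit, hit
    player, moves = node
    stats = [(lam, analyze(child, v)) for lam, child in moves]
    p = sum(lam * s[2] for lam, s in stats)
    if player == "A":
        phi_a = max(s[0] for _, s in stats)           # cheating Alice picks the best child
        phi_b = sum(lam * s[1] for lam, s in stats)   # honest Alice moves at random
    else:
        phi_a = sum(lam * s[0] for lam, s in stats)
        phi_b = max(s[1] for _, s in stats)
    return phi_a, phi_b, p

half = Fraction(1, 2)
tree = ("A", [
    (half, ("B", [(half, ("leaf", 0)), (half, ("leaf", 1))])),
    (half, ("B", [(half, ("leaf", 0)), (half, ("leaf", 2))])),
])
for v in range(3):
    phi_a, phi_b, p = analyze(tree, v)
    print(v, phi_a, phi_b, p, phi_a * phi_b >= p)
```

In this example the inequality holds with equality for every target, which matches the single-path intuition for outputs 1 and 2.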
What is the intuition for this result? It is best to first understand the intuition for the lemma, namely that φ_A^z · φ_B^z ≥ p^z. From there it is only symbol manipulation to determine that ρ_A · ρ_B ≥ N. To see why the lemma holds, suppose that there were only one path down the tree that led to v being chosen as the output. At each node along that path, starting from the root, there is a certain probability that the given player will choose the (unique) next node in the path when it plays honestly. So the probability that v is chosen is the product of these probabilities when both players play honestly. If Alice is cheating, then the probability v will be chosen is the product of the probabilities at nodes where it is Bob's turn. Likewise, the probability that v is chosen when Bob is cheating is the product of the probabilities at nodes where it is Alice's turn. In this case, the lemma holds with equality.
Intuitively, then, the probability v is chosen honestly is the product of the two cheating probabilities because choosing v honestly requires both Alice and Bob to happen to choose the right paths (i.e., to the leaf where v is selected), whereas choosing v with one player cheating requires only the honest player to happen to choose correctly (the cheating player will always choose the right path).
It remains to understand why multiple paths yielding $v$ only help the cheaters. This makes sense, because it merely provides more options to the cheating player: if two routes exist that both could yield output $v$, the cheating player can now choose the more attractive option.
Note that, unlike Theorem 4.1, this result relies centrally on the assumption that, when one player is cheating, the other player is playing honestly. This assumption is certainly one of the key drivers of the result, as it then allows us to relate the probability of an element being chosen by both players being honest to the probability it is chosen when one cheats.
This lower bound is tight:

Proposition 5.7 For all positive integers $N \ge K$, there exists a 2-round protocol for selecting from a universe of size $N$ satisfying $\rho_A = K$ and $\rho_B = N/K$.
Proof: Consider the following protocol Π(A, B) over universe U of size N , which we call the Random Set Protocol with parameter K : 1. Alice chooses a random subset T of U of size K.
2. Bob chooses a random element x ∈ T .
3. The output is x.

Claim 5.9 $\rho_A = K$.
Proof of Claim: Again we apply Lemma 5.1. If Alice plays honestly, then in order for $\Pi(A, B^*) = s$, her set $T$ must contain $s$, which occurs with probability $K/N$. Bob can achieve this probability for any $s \in U$ by always selecting $s$ whenever it is available.
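For small parameters, the guarantees of the Random Set Protocol can be checked by brute force. The sketch below is only an illustration under stated assumptions: the function name `random_set_guarantees` is hypothetical, and it measures each guarantee as $N$ times the largest probability with which a cheater can force a single element (singletons suffice, since the probability of a set is the sum of the probabilities of its elements).

```python
from fractions import Fraction
from itertools import combinations

def random_set_guarantees(N, K):
    """Return (rho_A, rho_B) for the Random Set Protocol on universe {0..N-1}."""
    universe = range(N)
    subsets = list(combinations(universe, K))

    # Alice honest, Bob cheating: Bob outputs s whenever s lies in Alice's
    # uniformly random K-subset T.
    bob_forces = max(
        Fraction(sum(1 for T in subsets if s in T), len(subsets))
        for s in universe
    )

    # Bob honest, Alice cheating: Alice sends any T containing s, and Bob
    # then picks s with probability 1/K.
    alice_forces = max(
        max(Fraction(1, K) if s in T else Fraction(0) for T in subsets)
        for s in universe
    )

    return N * bob_forces, N * alice_forces

rho_a, rho_b = random_set_guarantees(6, 2)
print(rho_a, rho_b, rho_a * rho_b)
```

The enumeration is exponential in $N$, so it is meant only as a sanity check on tiny universes.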
Note that one negative aspect of the Random Set Protocol is that it is not efficient: sending a description of the random subset requires communication linear in $N$ (rather than polylog($N$)). It should be emphasized that this is not necessary to achieve $\rho_A \rho_B = N$: other very simple and efficient protocols achieve this tradeoff. Specifically, instead of using all sets of size $K$, we can use any subcollection such that every element of $[N]$ is contained in the same number of sets. For example, if $N = K \cdot L$ for an integer $L$, then we can view the universe as $[K] \times [L]$ and use only the $L$ sets $[K] \times \{j\}$ for $j \in [L]$.

The optimality of such a trivial protocol tells us that, ultimately, multiplicative guarantees are not by themselves very interesting metrics of study for two-party random selection protocols. Optimal protocols are very easy to come by, and protocols that do not feel very effective prove to be the best possible. We must look to other metrics that are more capable of separating out protocols that are intuitively "good" from those that are not.
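The remark about restricting Alice to a subcollection of sets can be illustrated as follows. This is a sketch under assumptions: the partition into $L$ consecutive blocks of size $K$ is one hypothetical choice of subcollection, and the helper names `block_collection` and `guarantees` are introduced here for illustration only. With this choice Alice needs to send only a block index rather than a description of an arbitrary subset.

```python
from fractions import Fraction

def block_collection(K, L):
    """Partition {0..K*L-1} into L consecutive blocks of size K (an assumed choice)."""
    return [frozenset(range(j * K, (j + 1) * K)) for j in range(L)]

def guarantees(collection, N):
    """Return (rho_A, rho_B) when Alice picks a uniform set from the collection
    and Bob picks a uniform element of that set."""
    counts = [sum(1 for T in collection if s in T) for s in range(N)]
    sizes = {len(T) for T in collection}
    # The argument needs every element covered equally often, by equal-size sets.
    assert len(set(counts)) == 1 and len(sizes) == 1
    K = sizes.pop()
    rho_a = N * Fraction(counts[0], len(collection))  # Bob cheats, Alice honest
    rho_b = Fraction(N, K)                            # Alice cheats, Bob honest
    return rho_a, rho_b

print(guarantees(block_collection(3, 4), 12))
```

For $K = 3$ and $L = 4$ the product of the two guarantees is again $N = 12$.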
That said, it is interesting to consider multiplicative guarantees as one half of the equation: producing a protocol with an optimal tradeoff of statistical guarantee to one player and multiplicative guarantee to the other player is certainly nontrivial. Though we will not be able to match the lower bound of Proposition 5.3, in the next subsection we show that the Iterated Random Shift Protocol presented earlier achieves simultaneous constant statistical and multiplicative guarantees, and prove it has optimal round complexity up to a factor of 2 + o(1).

The Multiplicative Guarantees of the Iterated Random Shift Protocol
In this section we discuss the multiplicative guarantees provided by the Iterated Random Shift Protocol. Although we have seen how lower bounds require that one of the players (in this case, Alice) receives a very poor multiplicative guarantee, we can show that Bob receives a very strong guarantee. In this way, we can say something about the ability of a protocol to provide a strong multiplicative guarantee to one player, while providing a strong statistical guarantee to the other. Theorem 4.3 implies a lower bound on the round complexity of protocols achieving simultaneous statistical and multiplicative guarantees:

Theorem 5.10 For every two constants $\epsilon_A < 1$ and $\rho_B$, there exists a constant $c$ such that any protocol $\Pi$ selecting from a universe of size $N$ and achieving statistical guarantee $\epsilon_A$ and multiplicative guarantee $\rho_B$ has at least $\log^* N - \log^* \log^* N - c$ rounds. Similarly for $\rho_A$ and $\epsilon_B$.
This theorem follows immediately from Theorem 4.3 and the second part of Lemma 5.2. We can show that the Iterated Random Shift Protocol achieves this:

Theorem 5.11 There exist constants $\epsilon < 1$ and $\rho$ such that the Iterated Random Shift Protocol with the cutoff parameter $M$ taken to be a sufficiently large constant achieves guarantees $\rho_B \le \rho$ and $\epsilon_A \le \epsilon$.

This is the first protocol we know of that achieves constant statistical and multiplicative guarantees, and according to the lower bound of Theorem 5.10 it has optimal round complexity up to a factor of $2 + o(1)$.
Given Theorem 1.2, it suffices to show the following:

Proposition 5.12 Let $\Pi$ be an Iterated Random Shift Protocol defined with constant cutoff parameter $M$. Then $\Pi$ provides a constant multiplicative guarantee to Bob: as long as Bob plays honestly, the output of the protocol falls in any set $T$ with probability at most $M^2 \cdot \mu(T)$; that is, $\rho_B \le M^2$.
Proof of Proposition 5.12: Fix an arbitrary set $T \subseteq U$. Recall the definition of the following variables in the proof of Theorem 1.2: $N_0, N_1, \ldots, N_{k^*}$ are the universe sizes in an execution of the Iterated Random Shift Protocol, $k^*$ is the first value of $k$ such that $N_k = M^2$, $m_k = \max\{M, \log^3 N_{k-1}\}$ is the parameter used in the $k$'th execution of the Random Shift Protocol, and we define:

The following is the key lemma:

Applying this logic within the Iterated Random Shift Protocol, the lemma is proven (since for given $\mu(T_{k-1})$, we know $E[\mu(T_k)] = \mu(T_{k-1})$).
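By induction, we then have that for all $k$, $E[\mu(T_k)] = \mu(T)$. In particular, this is true for $k = k^*$. Since the universe at stage $k^*$ has size $M^2$, a nonempty $T_{k^*}$ has density at least $1/M^2$, and by Markov's inequality we can then derive:

$$\Pr[|T_{k^*}| \ge 1] = \Pr[\mu(T_{k^*}) \ge 1/M^2] \le M^2 \cdot E[\mu(T_{k^*})] = M^2 \cdot \mu(T).$$

Since the protocol's output cannot possibly fall in $T$ if $|T_{k^*}| = 0$, we conclude that the probability the output falls in $T$ is at most $M^2 \cdot \mu(T)$. This proves the proposition, and thus Theorem 5.11.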
As an aside, notice that by using Lemma 5.2, this logic allows us to conclude one half of Theorem 1.2: the Iterated Random Shift Protocol provides a constant statistical guarantee to Bob.
We can conclude that the Iterated Random Shift Protocol has the following properties:
• It has only $2 \log^* N + O(1)$ rounds.
• It provides both Alice and Bob with constant statistical guarantees (equivalently, it satisfies the Statistical Criterion).
• It provides Bob with a constant multiplicative guarantee.
Notice that in the above proof, we never used the multiplicative guarantee properties of the GGL Protocol; we simply relied on the initial recursions of Random Shift to provide the strong guarantee to Bob.
In fact, by changing the protocol used once the universe reaches size $M^2$ in the definition of the Iterated Random Shift Protocol, we can further improve the multiplicative guarantee given to Bob. The current protocol only implies that Bob gets some constant multiplicative guarantee. If, instead of GGL, we use the Random Set Protocol (see the definition in Proposition 5.7) on the universe of size $M^2$ with parameter $K = M^2/(1 + \gamma)$, Bob achieves a multiplicative guarantee of $1 + \gamma$, while Alice's statistical guarantee remains constant (when $\gamma$ is constant).
Proof: First, bound $r$ by $\log^* N$. Again, for shorthand we will write $s_k$ for $g(k, r, \epsilon, \mu)$. Thus, we have that $s_k = (r/\epsilon)(re/\mu)^{s_{k-1}} s_{k-1}$. Notice that this is no more than $(r^2 e/(\epsilon\mu))^{s_{k-1}}$, since $xy \le x^y$ if $x, y \ge 2$. Letting $d = r^2 e/(\epsilon\mu)$, we can then bound $s_k$ by $t_k$, where $t_k$ is defined by $t_0 = 1$ and $t_k = d^{t_{k-1}}$. This means that we can set $k = \log^*_d N - 1$ and still have $s_k \le t_k \le \log N$ (recall that by our definition, $\log^*_b N$ is always an integer, for any $b$ or $N$). It only remains to relate this to a base 2 logarithm:

Proof: Recall $\log^{(k)} N$ is $k$ iterated logarithms of $N$. We claim the following:

Proof: The base case $k = 0$ is clear. Assume, then, that $\log^{(k-1)} N \le (2 \log d) \log$

Plugging in $k = \log^*_d N$, we have that $\log^{(\log^*_d N)} N \le 2 \log d$. Applying $\log^*(2 \log d)$ logarithms to both sides, we have $\log^{(\log^*_d N + \log^*(2 \log d))} N \le 1$. Since $\log^* N$ is defined to be the least $k$ such that $\log^{(k)} N \le 1$, it follows that $\log^* N \le \log^*_d N + \log^*(2 \log d)$.
Thus we can set $k$ to be at least $\log^*_d N - 1$ and $s_k$ will be no more than $\log N$. Moreover,

$$\log^*_d N - 1 \ge \log^* N - \log^*(2 \log d) - 1 = \log^* N - \log^*(2 \log((r^2 e)/(\epsilon\mu))) - 1 \ge \log^* N - \log^* \log^* N - \zeta(\epsilon, \mu)$$

for an appropriately chosen constant $\zeta$, since we can bound $r$ by $\log^* N$.
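The relation $\log^* N \le \log^*_d N + \log^*(2 \log d)$ used above can be checked numerically on sample values. The sketch below is illustrative only: the helper names `log_star` and `base_change_holds` are hypothetical, and $\log^*_b N$ is taken to be the least $k$ such that $k$ iterated base-$b$ logarithms of $N$ give a value at most 1, following the definition quoted in the proof.

```python
import math

def log_star(n, base=2.0):
    """Least k such that applying log_base to n a total of k times gives a value <= 1."""
    k, x = 0, n
    while x > 1:
        x = math.log(x) / math.log(base)  # math.log also accepts large Python ints
        k += 1
    return k

def base_change_holds(n, d):
    """Check log* n <= log*_d n + log*(2 log d) for one pair (n, d)."""
    return log_star(n) <= log_star(n, d) + log_star(2 * math.log2(d))

for n, d in [(10**100, 1000), (2**10000, 16)]:
    print(log_star(n), log_star(n, d), log_star(2 * math.log2(d)), base_change_holds(n, d))
```

Floating-point logarithms are adequate here because the sample values are chosen away from the boundary where an iterated logarithm equals exactly 1.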
We can thus remove the covariance term from the upper bound: