Open Problem in Voting Theory

Scott Garrabrant

Maximal Lottery-Lotteries

Open Problem in Voting Theory

by Scott Garrabrant

7 min read17th Oct 202216 comments

75

Voting TheoryWorld Optimization

Frontpage

In the previous post, I presented the concept of a maximal lottery-lottery as follows:

We start with a finite set of candidates and a distribution $V \in Δ (C \to [0, 1])$ on utility functions on $C$ . Given lottery-lotteries $A, B \in Δ (Δ (C))$ , we say that $A$ dominates $B$ if $P_{A \sim A, B \sim B, v \sim V} (v (A) > v (B)) \geq P_{A \sim A, B \sim B, v \sim V} (v (B) > v (A))$ . A maximal lottery-lottery is an $M \in Δ (Δ (C))$ such that for all other $L \in Δ (Δ (C))$ , $M$ dominates $L$ .

Conjecture: For any $C$ and $V$ , there always exists a maximal lottery-lottery.

We discussed how we can construct a powerful and elegant new voting system, if only we can guarantee that a maximal lottery-lottery exists. However it is an open problem whether or not they exist. In this post, I will discuss a few different angles of attacking this conjecture.

An Infinite Game

First, consider the following partial result:

Theorem: Let $D$ be any nonempty finite subset of $Δ (C)$ . Then, there exists an $M \in Δ (D)$ such that for all other $L \in Δ (D)$ , $M$ dominates $L$ .

Proof: Consider the following symmetric two-player zero-sum game between Alice and Bob: Alice chooses an $A \in D$ , and Bob chooses a $B \in D$ . Alice gets utility equal to $P_{v \sim V} (v (A) > v (B)) - P_{v \sim V} (v (B) > v (A))$ , and Bob gets the negative of this utility. This game must have a Nash equilibrium, and in this Nash equilibrium, Alice must play a mixed strategy $M \in Δ (D)$ . Since this is a Nash equilibrium in a symmetric zero-sum game, Alice must get at least 0 utility in expectation, no matter what $L \in Δ (D)$ Bob plays. Alice getting at least 0 utility in expectation is exactly saying that $M$ dominates $L$ . $□$

This proof is valid (you can check the details), so why doesn't it generalize to infinite subsets $D$ , and in particular $D = Δ (C)$ ? The problem is that games with infinitely many options need not have Nash equilibria. For example consider the game where Alice and Bob each choose a natural number, and whoever chooses the larger number wins.

Fortunately, often in cases like this, you can get past finiteness assumptions by instead using compactness assumptions, and $Δ (C)$ is compact. Indeed, games with a compact option space and continuous utility functions always have mixed strategy equilibria.

Unfortunately, the game we constructed does not have a continuous utility function! This is easy to see, because if $V$ assigns probability 1 to a single utility function $v$ , if $v (A) > v (B)$ , then Alice wins and gets utility 1, but as we continuously move around $A$ , we can discontinuously jump to the game being a tie, and then to Bob winning.

Fortunately, there are a bunch of papers analyzing situations like this. Discontinuities like this show up a bunch in auctions, which also have an infinite compact option space and discontinuous utilities, resulting from discontinuities in who wins the auction.

Unfortunately, Sam Eisenstat and I spent a while looking through a bunch of papers like this, and failed to find anything that we could apply. Several things looked like they might be able to work, but everything we tried ultimately turned out to be a dead end.

A Limit of Finite Games

The above partial result is actually pretty practically useful on its own. Even if maximal lottery-lotteries do not exist, you can always just take $D = D_{n}$ to be the set of all lotteries that assign each candidate a probability that is a multiple of $\frac{1}{n}$ for some large natural number $n$ . Because of this, practically speaking, we can use the maximal lottery-lotteries voting system today, without ever proving the conjecture!

Further, we can consider what happens as we send $n$ to infinity. If there is an odd number of voters that each have a utility function in general position, we will always get a unique distribution on $M_{n} \in Δ (D_{n})$ that dominates all other distributions on $D_{n}$ , and we can look at what happens to $M_{n}$ as $n$ goes to infinity. Hopefully this converges, but even if it doesn't, we can take any limit point, and ask whether that limit point is a maximal lottery-lottery. My guess is that the sequence does in fact converge, and that the (unique) limit point is a maximal lottery-lottery, but I have not had any luck yet proving either of these.

However, this enables us to get some empirical results! We can just compute some of these finite approximations. Even with three candidates, we have to solve a game with $n^{2}$ different options, and I have not yet optimized it beyond using an off the shelf linear programming solver in Haskell, running for a few minutes, so the highest I got was $n = 45$ . (I am sure we can do much better.)

I took three candidates and three voters, and gave each voter a utility function in general position. Here is a picture of $M_{30}$ . The blank space means that 0 (or near 0) utility is put there. The numbers show the negative of the natural logarithm of the probability that that point is chosen. (Remember, this is a distribution on distributions on a three element set, or equivalently a distribution on the triangle.)

______________________4.4.5.5.5.3.4.4.5.5.5.5.4.5.5.__________
 __________________________7.____________4.6.5.8.5.4.________
  __________________4.5.5.______6.5.__5.__5.9.5.4.4.4.______
   ________________4.5.5.4.______5.____5.__5.__4.4.5.5.____
    ______________5.5.5.5.7.____5.4.5.7.5.108.6.5.4.4.4.5.
     ______________5.________________________5.4.5.____7.
      ____________7.5.4.5.____________6.5.6.__4.__5.6.__
       ______5.4.5.8.__5.8.____6.5.__6.__7.__6.4.6.4.5.
        ____6.7.5.5.5.5.__5.6.8.6.5.7.5.____6.5.5.__4.
         6.6.5.4.5.4.4.5.6.__6.__6.5.__6.____5.6.7.4.
          __5.6.5.________5.__5.6.________5.6.8.5.5.
           4.5.5.__4.6.__5.6.7.5.6.6.____6.4.5.7.4.
            5.4.5.5.5.__5.5.5.7.5.________5.6.4.4.
             6.5.5.5.6.__5.__5.5.6.____7.8.4.7.7.
              5.5.5.5.5.____5.5.5.5.__5.6.7.__4.
               7.4.5.6.__6.5.5.4.6.____4.4.5.5.
                4.__8.5.__5.7.__5.5.5.__6.____
                 5.6.6.6.5.6.5.______________
                  4.________________________
                   ________________________
                    ______________________
                     ____________________
                      __________________
                       ________________
                        ______________
                         ____________
                          __________
                           ________
                            ______
                             ____
                              __

$M_{40}$ for the same candidates and voters looks similar, but with less detail, and some minor changes. This is a lot of data, but since we are going to use it to sample and then sample from the sample, for the purpose of maximal lottery-lotteries, we only really care about the probability each candidate is elected. For the above example, we get (0.262, 0.415, 0.323) for $M_{30}$ and (0.263,0.417,0.320) for $M_{40}$ , which seems like some slight empirical evidence towards converging. Obviously there is a lot of room for a much more careful and larger experimental analysis.

Fractals!?!

I believe that maximal lottery-lotteries exist, and are the limit of the above finite approximations, and (when they don't put all probability mass on the distribution that puts all probability mass on a single candidate) usually have some fractal structure. I think the complexity of the above picture is evidence of this, but I also have some mathematical intuition for why this makes sense (which feels hard to put into words):

Being a Nash equilibrium involves equalizing a bunch of constraints. These constraints are combined with the fact that the distribution is supported entirely within the triangle, and so where probability mass would naturally want to be placed outside the triangle, it gets cut off on the boundary. When it is cut off like this, something has to change inside the triangle to fix new constraints that become broken, and this causes the constraints to sort of reflect off the boundary of the triangle, somewhat like ripples. (Some of this intuition is coming from seeing similar things happen when studying nim.) Sorry that I don't have a more articulate description, but the point is, I think we are getting fractals.

Unfortunately, I think this is going to make it harder to show that maximal lottery-lotteries exist, since we will likely not have simple closed forms of what the maximal lottery-lotteries are.

A Generalization of Colonel Blotto

Another way to look at maximal lottery-lotteries is as a generalization of the Colonel Blotto game.

In the the continuous Colonel Blotto game, there are two opposing officers that each have some resources, that they can distribute between some battlefields. We will assume that they each have 1 total unit of resources. Whoever sends the most resources to any given battlefield wins that battlefield. (If they send the same amount, they each win half the battlefield.) Each officer gets utility equal to the number of battlefields that they win. This game and its variants have been analyzed quite a bit over the last century.

Finding maximal lotteries is like solving a variant of the Colonel Blotto game. In the original game, you choose how to distribute resources from the simplex where each battlefield gets non-negative resources, and the total of all the resources is 1, In this variant, you instead have some other convex polytope representing the allowable ways to distribute resources.

To reduce finding maximal lottery-lotteries to a Colonel Blotto game, we consider each voter to be a battlefield. (We will only discuss the case where there is a finite set of equal weight voters here.) Each candidate can be thought of as assigning a utility to each voter, which we can also interpret as sending a number of resources equal to that utility. Thus, the polytope of allowable ways to send resources to battlefields is just the convex hull of these points that come from the candidates. Each point in this convex hull corresponds to (at least) one distribution on candidates, and a solution to this Colonel Blotto game corresponds to a maximal lottery-lottery.

This old RAND paper analyzing Colonel Blotto, seems like some further evidence that we might be looking at fractals.

A Call for Help

I think that proving the existence of maximal lottery-lotteries would actually be a substantial advance for voting theory. (Although it might take some work to get people interested in something that challenges this many of the assumptions of the field.) However, I am not intending to work on this anymore. If you want to take a shot at proving they exist, and have questions, get in touch with me.

Without a proof of existence, I am probably not going to bother writing this up as more than these blogposts, but if anyone manages to prove it, I would be happy to write a paper together. Also if anyone feels very excited about working on this and getting it published in spite of not having a proof, I could maybe be up for collaborating on something like that, especially if someone collects a bunch of experimental evidence and better fractal pictures and is willing to do a lot of the writing.

Voting TheoryWorld Optimization

Frontpage

75

Maximal Lottery-Lotteries

15 comments72 karma

Mentioned in

80Voting Theory Introduction

15(Geometrically) Maximal Lottery-Lotteries Exist

15(Geometrically) Maximal Lottery-Lotteries Are Probably Not Unique

Open Problem in Voting Theory

New Comment

16 comments, sorted by

top scoring

Click to highlight new comments since: Today at 11:05 PM

[-]paulfchristiano2y170

(I no longer thing this strategy works, see the thread here; sorry for wasting your time Scott!)

Here's an approach which I think could work, though there are definitely some finicky details (including a potentially-fatal question about convergence at the end, which I couldn't deal with---and which I think may not be the weakest link given that the other steps are also a bit shaky).

For lottery-lotteries and $B$ , write:

$f (A, B) = P_{L_{A} \sim A, L_{B} \sim B, u \sim V} [E_{a \sim L_{A}} [u (a)] > E_{B \sim L_{B}} [u (b)]] - P_{L_{A} \sim A, L_{B} \sim B, u \sim V} [E_{B \sim L_{B}} [u (b)] > E_{a \sim L_{A}} [u (a)]]$

We say that B strictly dominates A if $f (A, B) < 0$ .

Lemma: if $B$ strictly dominates $A$ , then there is a continuous $B^{'}$ that also strictly dominates $A$ , i.e. such that $B^{'}$ has a density with respect to Lebesgue measure and $f (A, B^{'}) < 0$ .

(ETA: this lemma is false, see comment from Scott below. I think it's fixed but am not sure.)

Proof: (I'll assume for convenience that there is a finite set of $m$ voters although I think this is easy to relax. We could assume that $B$ is a point mass if we wanted though obviously $B^{'}$ won't be. I think this could be greatly simplified.)

Suppose that $f (A, B) = - ε$ . For each utility function $u$ in the support of the electorate $V$ we'll define $X_{u}$ as the space of values for utility function $u$ that occur with positive probability under $A$ . Note that $X_{u}$ is countable and hence that there is a finite set $X_{u}^{'} \subset X_{u}$ which has at least $1 - \frac{ε}{2 m}$ of the probability mass. Let $L_{u}$ be the space of lotteries that have a utility in $X_{u}^{'}$ . Let $L$ be the union of $L_{u}$ over all the utility functions $u$ .

The idea is that $L$ is just a finite union of hyperplanes, and so it divides the space of lotteries into a finite set of open regions. Within each of these regions, continuously shifting $B$ 's probability between nearby lotteries results in continuous changes in $f (A, B)$ , plus a discontinuous term whose magnitude is at most $m \frac{ε}{2 m} = \frac{ε}{2}$ (corresponding to the positive-probability utilities in $X_{u}$ that aren't in $X_{u}^{'}$ ).

If $B$ assigns probability $0$ to $L$ then it is trivial to construct $B^{'}$ by spreading out $B$ continuously within each of these regions: there is guaranteed to be some spreading that changes $f (A, B)$ by strictly less than $ε$ (since we can make the continuous part of the change arbitrarily small and the discontinuous part is at most $\frac{ε}{2}$ ) and so we can find some way of spreading out mass that results in a continuous $B^{'}$ with $f (A, B^{'}) < 0$ .

If $B$ assigns positive probability to $L$ we need one more step. The idea is that for each point in $L$ we can find a "safe" direction to shift the probability from $B$ so that it's no longer in $L$ : if we pick a random direction each voter prefers to move one way or the other, and so one direction must have majority approval (note that the argument would break at this point if $f (A, B)$ depended on the probability that $B$ was at least as good as $A$ , rather than being defined symmetrically). Once we've picked such a direction for every point on $L$ , we can associate all of the probability from $L$ into one of the adjacent open regions and proceed as before. (ETA: this step fails if you are on the edge of the simplex.)

(This lemma might be enough to prove the result when combined with standard fixed point theorems. I'll finish by using standard tools from CS but that may just be because it's what I'm familiar with. I also can't easily do the convergence analysis at the end so that might not work and/or be much simpler from a mathematical perspective.)

Proving the theorem given the lemma: We'll describe a sequence of distributions $A_{T} \to A_{\infty}$ such that $f (A_{T}, B) < - ε$ implies that $B$ has $- Ω (ε \sqrt{T})$ entropy.

We'd like the sequence to be convergent $A_{T} \to A_{\infty}$ in the sense that for any continuous $B$ , $f (A_{T}, B) \to f (A_{\infty}, B)$ . If this were the case then we'd infer that we can't have $f (A_{\infty}, B) < 0$ for a continuous $B$ , since that would give us an absolute $ε$ which $f (A_{T}, B) < - ε$ for all sufficiently large $T$ , implying that $B$ has $- \infty$ entropy. By the main lemma, we'd conclude that no $B$ (continuous or otherwise) can strictly dominate $A_{\infty}$ . For now I'll present the algorithm to generate $A_{T}$ , in the next section I'll discuss convergence.

Our construction is follow-the-regularized leader with a standard analysis for playing 2-player games. It doesn't converge to a Nash equilibrium but I think it will still be good enough for us.

Let $H (A)$ be the entropy of a lottery-lottery (relative to Lebesgue measure on the simplex) and take $A_{T} = arg {max}_{A} (H (A) + \sum_{t < T} α_{t} f (A, A_{t}))$ for a sequence $α_{t} = Θ (\frac{1}{\sqrt{T}})$ (the exact decay rate of $α_{t}$ isn't important as long as it goes to zero but the sum diverges). Note that $A_{T} (L) \propto exp (- \sum_{t < T} α_{t} f (A, A_{t}))$ . and that $f \in [- 1, 1]$ . As a result, the distributions $A_{T} (L)$ change very slowly, and we have $f (A_{T + 1}, X) = f (A_{T}, X) + O (α_{T})$ for any lottery-lottery $X$ .

Now suppose that $lim f (B, A_{t}) > ε$ . Since $\sum_{t} α_{t}$ diverges, we can find arbitrarily large $T$ such that $\sum_{t < T} α_{t} f (B, A_{t}) \geq Ω (ε \sum_{t < T} α_{t}) = Ω (ε \sqrt{T})$ . Then:

Since $A_{T}$ was picked as the maximizer: $H (B) + \sum_{t < T} α_{t} f (B, A_{t}) \leq H (A_{T}) + \sum_{t < T} α_{t} f (A_{T}, A_{t})$
Then we can split off the last term from that sum: $H (A_{T}) + \sum_{t < T - 1} α_{t} f (A_{T}, A_{t}) + α_{T - 1} f (A_{T}, A_{T - 1})$
Then we can use the fact that $A_{T - 1}$ was itself a maximizer, to replace every $A_{T}$ except the last with $A_{T - 1}$ , and obtain: $H (B) + \sum_{t < T} α_{t} f (B, A_{t}) \leq H (A_{T - 1}) + \sum_{t < T - 1} α_{t} f (A_{T - 1}, A_{t}) + α_{T - 1} f (A_{T}, A_{T - 1})$
Continuing in this way, we get $H (B) + \sum_{t < T} α_{t} f (B, A_{t}) \leq H (A_{0}) + \sum_{t < T} α_{t} f (A_{t + 1}, A_{t})$
But now we use the fact that each summand $f (A_{t + 1}, A_{t}) = f (A_{t}, A_{t}) + O (α_{t}) = O (α_{t})$ , since $f$ is symmetric and $f (A_{t + 1}, X) \approx f (A_{t}, X)$ .
Thus $H (B) + Ω (ε \sqrt{T}) \leq H (B) + \sum_{t < T} α_{t} f (B, A_{t}) \leq H (A_{0}) + \sum_{t < T} O (α_{t}^{2}) = H (A_{0}) + O (log T)$
Thus $H (B) < - Ω (ε \sqrt{T})$

(Note that we used the finite number of candidates at this step to get $H (A_{0}) = O (1)$ . Effectively we use finite candidates to show that there is only finitely much space, and use the continuity of $B$ to argue that we only need to learn up to finite precision, and therefore this is no worse than a finite learning problem.)

Convergence: we want to argue that there exists a distribution $A_{\infty}$ such that $f (A_{T}, B) \to f (A_{\infty}, B)$ for any continuous $B$ . I don't remember my topology very well, but I think we can do this as long as $A_{T} (S)$ converges to a limit for every open set $S$ . I can just take $A_{\infty} (S)$ to be that limit, and under compactness I get a probability distribution.

(ETA: I think we can actually use any limit point and don't need convergence, see below. So I think this is fine but am not sure.)

So we need to rule out the situation where there is some open set $S$ and real numbers $ℓ < h$ such that $A_{T} (S) < ℓ$ and $A_{T} (S) > h$ each happen infinitely often. This seems like a really messed up situation that shouldn't happen, but I don't have a clean proof tonight.

Some notes that make me think we should be fine:

We don't actually need the $A_{T}$ to converge to a limit, we could define $A_{\infty} = lim \frac{\sum α_{t} A_{t}}{\sum α_{t}}$ if that limit exists. And if we want we can take $α_{T} = \frac{1}{T}$ or $α_{T} = \frac{1}{log T}$ and the argument above still goes through.
If $A_{T_{1}} (S) < ℓ$ but $A_{T_{2}} (S) > h$ then the two distributions must be fairly far apart and so the average of the distributions has significantly higher entropy than the average of their entropies. That's a strange situation, since we picked each $A_{T}$ to maximize entropy (plus the linear function $\sum α_{t} f (A, A_{t})$ ).
By sequential compactness, we can extract a pair of lottery-lotteries $A_{ℓ}$ and $A_{h}$ at a significant distance from one another such that the iterates get arbitrarily close to each of them infinitely often. That seems even more messed up.
If we really have a problem here, it seems like we can extend the algorithm above to make it converge more nicely by adding traders who push $A (S)$ back into the interval $[ℓ, h]$ (such that if you go out of the interval infinitely often they make an infinite profit and eventually enforce a hard constraint on $A (S)$ ).

I'll probably revisit this tomorrow, but overall non-convergence seems weird up enough that I have a higher probability of something going wrong at some other step in the argument. I think that the main issues with convergence are really the ones in the first lemma that let us focus on continuous $B$ .

[This comment is no longer endorsed by its author]Reply

[-]Scott Garrabrant2y62

If assigns positive probability to $L$ we need one more step. The idea is that for each point in $L$ we can find a "safe" direction to shift the probability from $B$ so that it's no longer in $L$ : if we pick a random direction each voter prefers to move one way or the other, and so one direction must have majority approval (note that the argument would break at this point if $f (A, B)$ depended on the probability that $B$ was at least as good as $A$ , rather than being defined symmetrically). Once we've picked such a direction for every point on $L$ , we can associate all of the probability from $L$ into one of the adjacent open regions and proceed as before.

Haven't ready through everything yet, but I am skeptical here with respect to points on the boundary of the simplex.

[-]paulfchristiano2y120

I agree that's a bug in the proof (and the lemma obviously can't be true as written given that e.g. if 90% of voters are single-issue voters who hate candidate X, then no continuous lottery-lottery can dominate a lottery that puts 0 probability on X).

I haven't thought about how serious this is. It feels like we could hope to fix things by considering a game where you pick (i) which candidates to give probability 0, which is a discrete choice from a set of size and hence relatively unproblematic, (ii) what distribution to play within the corresponding simplex, where we can make a similar continuity argument to handle the infinitely many options.

I don't know if that plays nicely, I'll probably think about it tomorrow.

At face value it looks like it will be OK: we just define the same algorithm, but now we define a probability distribution over $2^{n}$ simplices and define conditional entropy as the sum of the discrete entropy of the outer choice + the continuous entropy over the chosen simplex. (But there are a bunch of subtleties.)

ETA: you can't just take the topology generated by the open sets of every face and then run my argument in exactly the same way, because that isn't compact and so we don't get a limit point $A_{\infty}$ . But we can define our algorithm over the disjoint union of all the faces which is compact (note that a given lottery-lottery is no longer uniquely represented but that's not a problem). The lemma still shows that if any $B$ strictly dominates $A_{\infty}$ then there must be a continuous $B$ (in the disjoint union) that dominates $A_{\infty}$ . And we can still find arbitrarily large $T$ for which $f (B, \sum_{t < T} α_{t} A_{t}) > ε$ , which leads to a contradiction just as before. So overall I think that this fix works.

[-]Scott Garrabrant2y40

Ok, here are some questions to help me understand/poke holes in this proof. (Don't think too hard on these questions. If the answers are not obvious to you, then I am asking the wrong questions.

Does the argument (or a simple refactorization of the argument you also believe) decompose through "If strictly dominates $A_{\infty}$ , then there is a $B^{'}$ that also strictly dominates $A_{\infty}$ such that the probability of any voter being indifferent between something sampled from $B^{'}$ and something sampled from $A_{\infty}$ is 0 (or negligable)."
If Yes to 1, do you believe the above lemma is also true for an arbitrary $A$ ?
If Yes to 1, do you believe the above lemma is true if we replace "strictly dominates" with " $f (B, A) > x$ " for some fixed $x > 0$ .
If No to 1, is there some minor modification that will give me a similar looking lemma the argument does decompose through?

[-]paulfchristiano2y40

#1: It doesn't. The previous version implied that there was a for which the probability of ties was arbitrarily low, but the new version can have lots of voters who are indifferent. If B puts its mass in the interior of a face F, then we redistribute probability mass within the interior of F, but some voters assign the same utility to everything in F.

#4: The current lemma is:

If B strictly dominates A, then there is a face F of the simplex and a B' which is continuous over F such that B' strictly dominates A.

[-]Scott Garrabrant2y20

I still haven't understood all of your argument, but have you missed the fact that some faces are entirely contained in ?

(Your arguments look similar to stuff we did when trying to apply this paper.)

[-]paulfchristiano2y20

I think this is OK (though still lots of room for subtleties). Spelling this aspect out in more detail:

Fix some arbitrary A which is strictly dominated by B.
We claim that there exists a face F and a continuous B' over F such that B' also dominates A.
Sample some lottery from B to obtain a concrete lottery b that strictly dominates A.
If b is a vertex we are done. Otherwise, let F be the face such that b lies in the interior of F.
For each voter, their level sets are either hyperplanes in F or else they are all of F.
We can ignore the voters who are indifferent within all of F, because any B' supported in F will be the same as b from those voters' perspectives.
Now define as before, but restricting to the voters who have preferences within F.
We obtain a continuous distribution B' for which $f (B^{'}, A) \approx f (b, A)$ if we ignore the voters who were indifferent. But $f (B^{'}, A) = f (b, A)$ for the voters who are indifferent, so we have $f (B^{'}, A) \approx f (b, A)$ overall.
(Of course this just goes through the existence of an open set of lotteries all of which strictly dominate A, we can just take B' uniform over that set.)

This lemma is what we need, because we will run follow the leader over the space of pairs (F, A) where F is a face and A is a distribution over that face. So we conclude that the limit is not dominated by any pair (F, B') where F is a face and B' is a continuous distribution.

[-]Scott Garrabrant2y20

Ok, I believe this version of the Lemma, and am moving on to trying to get the rest of the argument.

[-]paulfchristiano2y20

Actually we could take to be any limit point of $A_{< T} = \frac{\sum_{t < T} α_{t} A_{t}}{\sum_{t < T} α_{t}}$ (in the sense that $f (A_{\infty}, B)$ is a limit point of $f (A_{< T}, B)$ for any continuous $B$ ) and then get the same conclusion. I think compactness guarantees the existence of such a limit point (e.g. choose some countable basis for the topology and then restrict the sequence so that one open set after another has a limit), and so the convergence worries are resolved.

[-]paulfchristiano2y40

Hm, I'm now pretty skeptical about the limit step. In particular, if converges to a limit for every open set $S$ , we can't take $A_{\infty} (S)$ to be the limit of those probabilities (since it doesn't satisfy countable additivity even though the simplex is compact). In general the space of probability distributions is not compact in the relevant topology, and so I think we can't possibly have $f (A_{T}, B) \to f (A_{\infty}, B)$ based only on the fact that $f (\cdot, B)$ is the expectation of a bounded function.

This seems like it's plausibly the same topological problem that would break fixed-point theorems, and so I think it's the most likely candidate for where the whole thing breaks and why this strategy doesn't make any progress.

There are various ways to try to route around the problem, but it feels like it may just be the same thing you've been working on, so it seems probably easier to start with looking for an examples where this breaks my algorithm and then confirming that it's the same problem the other approaches run into.

To really break the strategy, what we'd want is a sequence $A_{T}$ and a lottery $B$ such that $f (A_{T}, B) \to 0$ but there is no way to take the limit $A_{\infty}$ (e.g. in earth mover distance) for which $f (A_{\infty}, B) = 0$ . If we found that then I could still try to argue that such sequences are never produced by the algorithm, but it would at least show I need a different proof strategy and likely indicate that the strategy didn't make progress.

ETA here is easy counterexample:

There are three candidates, X, Y, Z. There is a voter who only likes X and a voter who only likes Y.
B puts 1/2 of its probability mass on Z, and the other half spread uniformly over X/Y lotteries.
$A_{T}$ puts its probability mass on lotteries between X and Y where X is almost certain to win (converging to certainty as $T \to \infty$ ).
In the limit, $A_{T}$ always wins the X voters, and it wins Y voters half of the time (whenever B picks Z) and so it has a 3/4 win probability.
It's clear that $A_{T} \to A_{\infty}$ which puts all of its mass on X. So it always wins the X voters, and ties on the Y voters half of the time (when B samples Z), so it has a 5/8 win probability.
If we add a Z voter with half weight or something, then we'll have $f (A_{T}, B) \to 0$ but $f (A_{\infty}, B) < 0$ .

Still would take work to turn it into a counterexample for the original algorithm. Would be curious to do that and see if I actually believe that game ought to have an equilibrium (or to understand why it can't be done).

[-]paulfchristiano2y40

And here's a significantly worse counterexample, showing that there need not be any near $B$ such that $lim f (A_{< T}, B^{'}) < f (A_{\infty}, B)$ :

There are three candidates X, Y, Z and there are three voters: one only likes X, one has u(X) = 2 and u(Y) = 3, one has u(X) = 2 and u(Z) = 3.
B puts all of its mass on X.
$A_{< T}$ puts 1/3 of its mass on X, 1/3 of its mass on a lottery between Y and Z with the probability of Y approaching 2/3 from above as $T \to \infty$ , and 1/3 on a lottery between Y and Z with the probability of Y approaching 1/3 from below.

Under these conditions:

$A_{\infty}$ puts 1/3 of its mass on X, 1/3 on (2/3 Y, 1/3 Z), and 1/3 on (1/3 Y, 2/3 Z).
We can compute that $A_{\infty}$ never beats $B$ , and loses 4/9 of the time.
On the other hand, every $A_{< T}$ beats $B$ 2/9 of the time (and still loses 4/9 of the time).
If $B^{'}$ puts any mass on gambles where X doesn't get 100% of the probability, it loses the X-loving voter.
As a result, any $B^{'}$ near $B$ must lose against $A_{< T}$ at least 2/9 of the time, and can win at most 5/9 of the time.
So there's no way for it to match $f (A_{\infty}, B) = - 4 / 9$ .

In light of that example I think the basic proof strategy is probably no good. It may be tough to construct a concrete example where the algorithm fails, but if it works it would have to be due to some property other than the fact that no continuous $B$ can beat infinitely many $A_{< T}$ , which is all I really wanted to do.

(I'll retract the original proposal.)

[-]Vanessa Kosoy2y82

Epistemic status: rough sketch, there might be holes but I don't see any atm.

I think you need to slightly weaken the definition of an MLL and then you'll have an existence theorem.

Let's start with general discontinuous games. We have a game between two players, with pure strategy spaces and $X_{2}$ that we assume to be compact Polish spaces and (discontinuous) utility functions $u_{1, 2} : X_{1} \times X_{2} \to [0, 1]$ . Then we can take the closure of the graphs of $u_{1, 2}$ and get upper hemicontinuous multivalued utility functions. I claim that for such multivalued games, Nash equilibria (in some sense) exist.

Let $X$ be a compact Polish space and $f \subseteq X \times [0, 1]$ a multivalued upper hemicontinuos function on it. What does it mean for $x \in X$ to be a "maximum" of $f$ ? We can define it as follows: for any $y \in X$ , the maximal possible value of $f (x)$ is greater or equal to the lowest possible value of $f (y)$ . It is not hard to see that maxima form a non-empty closed set. This leads to a corresponding notion of "best response" in multivalued games, and a corresponding notion of Nash equilibrium.

Notice that since we only care about the maximal and minimal possible values of $f (x)$ , we might as well require that $f$ takes convex multivalues (i.e. closed intervals), since otherwise we can always take pointwise convex hull. Expected values of intervals can be defined by separately taking the expected value of the upper and lower ends of the interval.

I think that the existence of Nash equilibria follows from the Kakutani theorem in the usual way. The key observation is, for an upper hemicontinuous function, the maximal possible value is a (single-valued) upper semicontinuous function and the lowest possible value is a (single-valued) lower semicontinuous function. This implies that the best-response mapping is upper hemicontinuous.

Applying it to MLL, what I think we get is a notion of "weak" dominance where we only require that $Pr [v (B) > v (A)] \leq \frac{1}{2}$ , and otherwise the definition is the same.

[-]Scott Garrabrant2y20

Yeah, I believe this works, and that it feels too weak.

[-]Scott Garrabrant2y40

For example, if there is a unanimous winner, you only have to pick them half the time, and can do whatever you want the other half of the time.

[-]Vanessa Kosoy2y20

Yes, this is a good point. Maybe we can strengthen the "weak-MLL" criterion in other ways while preserving existence. For example, we can consider the "-dominance" condition $Pr [v (B) > v (A)] \leq 1 - p$ and look for an LL that is "weak $p$ -maximal" for the highest possible $p$ . The function on the LHS is lower-semincontinuous, hence there exists a maximal $p$ for which a weak $p$ -maximal LL exists.

[-]Diffractor2y42

I have a reduction of this problem to a (hopefully) simpler problem. First up, establish the notation used.

[n] refers to the set . $n$ is the number of candidates. Use $C$ as an abbreviation for the space $Δ [n]$ , it's the space of probability distributions over the candidates. View $C$ as embedded in $R^{n - 1}$ , and set the origin at the center of $C$ .

At this point, we can note that we can biject the following:
1: Functions of type $[n] \to [0, 1]$
2: Affine functions of type $C \to [0, 1]$
3: Functions of the form $λ x . ⟨ a, x ⟩ + c$ , where $x, a \in R^{n - 1}$ , and and $c \in R$ , and everything's suitably set so that these functions are bounded in $[0, 1]$ over $C$ . (basically, we extend our affine function to the entire space with the Hahn-Banach theorem, and use that every affine function can be written as a linear function plus a constant) We can reexpress our distribution $V$ over utility functions as a distribution over these normal vectors $a$ .

Now, we can reexpress the conjecture as follows. Is it the case that there exists a $μ : Δ C$ s.t. for all $ν : Δ C$ , we have

E_{x, y, a \sim μ \times ν \times V} [sgn (⟨ a, x - y ⟩)] \geq 0

Where $sgn$ is the function that's -1 if the quantity is negative, 0 if 0, and 1 if the quantity is positive. To see the equivalence to the original formulation, we can rewrite things as

E_{x, y, a \sim μ \times ν \times V} [1_{⟨ a, x ⟩ > ⟨ a, y ⟩} - 1_{⟨ a, y ⟩ > ⟨ a, x ⟩}] \geq 0

Where the bold 1 is an indicator function. And split up the expectation and realize that this is a probability, so we get

P_{x, y, a \sim μ \times ν \times V} [⟨ a, x ⟩ > ⟨ a, y ⟩] - P_{x, y, a \sim μ \times ν \times V} [⟨ a, y ⟩ > ⟨ a, x ⟩] \geq 0

P_{x, y, a \sim μ \times ν \times V} [⟨ a, x ⟩ > ⟨ a, y ⟩] \geq P_{x, y, a \sim μ \times ν \times V} [⟨ a, y ⟩ > ⟨ a, x ⟩]

And this then rephrases as

P_{x, y, U \sim μ \times ν \times V} [U (x) > U (y)] \geq P_{x, y, U \sim μ \times ν \times V} [U (y) > U (x)]

Which was the original formulation of the problem.

Abbreviating the function $E_{x, y, a \sim μ \times ν \times V} [sgn (⟨ a, x - y ⟩)]$ as $f (μ, ν)$ , then a necessary condition to have a $μ : Δ C$ that dominates everything is that

sup μ \in Δ C inf ν \in Δ C f (μ, ν) \geq 0

If you have this property, then you might not necessarily have an optimal $μ$ that dominates everything, but there are $μ$ that get a worst-case expectation arbitrarily close to 0. Namely, even if the worst possible $ν$ is selected, then the violation of the defining domination inequality happens with arbitrarily small magnitude. There might not be an optimal lottery-lottery, but there are lottery-lotteries arbitrarily close to optimal where this closeness-to-optimality is uniform over every foe. Which seems good enough to me. So I'll be focused on proving this slightly easier statement and glossing over the subtle distinction between that, and the existence of truly optimal lottery-lotteries.

As it turns out, this slightly easier statement (that sup inf is 0 or higher) can be outright proven assuming the following conjecture.

Stably-Good-Response Conjecture: For every $ν : Δ C$ , and $ϵ > 0$ , there exists a $μ : Δ C$ and a $δ > 0$ s.t.

inf ν^{'} : d (ν, ν^{'}) < δ f (μ, ν^{'}) > - ϵ

Pretty much, for any desired level of suckage and any foe $ν$ , there's a probability distribution $μ$ you can pick which isn't just a good response (this always exists, just pick $ν$ itself), but a stably good response, in the sense that there's some nonzero level of perturbation to the foe where $μ$ remains a good response no matter how the foe is perturbed.

Theorem 1 Assuming the Stably-Good-Response Conjecture, ${sup}_{μ} {inf}_{ν} f (μ, ν) \geq 0$ .

I'll derive a contradiction from the assumption that $0 > {sup}_{μ} {inf}_{ν} f (μ, ν)$ . Accordingly, assume the strict inequality.

In such a case, there is some $ϵ$ s.t. $0 > - ϵ > {sup}_{μ} {inf}_{ν} f (μ, ν)$ . Let the set $A_{μ} := {ν | f (μ, ν) > - ϵ}$ . Now, every $ν$ lies in the interior of $A_{μ}$ for some $μ$ , by the Stably-Good-Response Conjecture. Since $Δ C$ is a compact set, we can isolate a finite subcover and get some finite set $M$ of probability distributions $μ$ s.t. $\forall ν \exists μ \in M : f (μ, ν) > - ϵ$ .

Now, let the set $B_{ν} := {μ \in c . h (M) | f (μ, ν) < - ϵ}$ . Since $- ϵ > {sup}_{μ} {inf}_{ν} f (μ, ν)$ , this family of sets manages to cover all of $c . h (M)$ (convex hull of our finite set.) Further, for any fixed $ν$ , $f (μ, ν)$ is a continuous function $c . h (M) \to R$ (a bit nonobvious, but true nontheless because there's only finitely many vertices to worry about). Due to continuity, all the sets $B_{ν}$ will be open. Since we have an open cover of $c . h (M)$ , which is a finite simplex (and thus compact), we can isolate a finite subcover, to get a finite set $N$ of $ν$ s.t. $\forall μ \in c . h (M) \exists ν \in N : f (μ, ν) < - ϵ$ . And now we can go

- ϵ > max μ \in c . h (M) min ν \in N f (μ, ν) \geq max μ \in c . h (M) min ν \in c . h (N) f (μ, ν) = min ν \in c . h (N) max μ \in c . h (M) f (μ, ν) \geq min ν \in c . h (N) max μ \in M f (μ, ν) > - ϵ

The first strict inequality was from how all $μ \in c . h (M)$ had some $ν \in N$ which made $f (μ, ν)$ get a bad score. The $\geq$ was from expanding the set of options. The $=$ was from how $f$ is a continuous linear function when restricted to $c . h (M) \times c . h (N)$ , both of which are compact convex sets, so the minimax theorem can be applied. Then the next $\geq$ was from restricting the set of options, and the $>$ was from how every $ν \in Δ C$ had some $μ \in M$ that'd make $f (μ, ν)$ get a good score, by construction of $M$ (and compactness to make the inequality a strict one).

But wait, we just showed $- ϵ > - ϵ$ , that's a contradiction. Therefore, our original assumption must have been wrong. Said original assumption was that $0 > {sup}_{μ} {inf}_{ν} f (μ, ν)$ , so negating it, we've proved that

sup μ inf ν f (μ, ν) \geq 0

As desired.

Moderation Log