# Against the Linear Utility Hypothesis and the Leverage Penalty

[Roughly the second half of this is a reply to: Pascal's Muggle]

There's an assumption that people often make when thinking about decision theory, which is that utility should be linear with respect to amount of stuff going on. To be clear, I don't mean linear with respect to amount of money/cookies/etc that you own; most people know better than that. The assumption I'm talking about is that the state of the rest of the universe (or multiverse) does not affect the marginal utility of there also being someone having certain experiences at some location in the uni-/multi-verse. For instance, if 1 util is the difference in utility between nothing existing, and there being a planet that has some humans and other animals living on it for a while before going extinct, then the difference in utility between nothing existing and there being n copies of that planet should be n utils. I'll call this the Linear Utility Hypothesis. It seems to me that, despite its popularity, the Linear Utility Hypothesis is poorly motivated, and a very poor fit to actual human preferences.

The Linear Utility Hypothesis gets implicitly assumed a lot in discussions of Pascal's mugging. For instance, in Pascal's Muggle, Eliezer Yudkowsky says he “[doesn't] see any way around” the conclusion that he must be assigning a probability at most on the order of 1/3↑↑↑3 to the proposition that Pascal's mugger is telling the truth, given that the mugger claims to be influencing 3↑↑↑3 lives and that he would refuse the mugger's demands. This implies that he doesn't see any way that influencing 3↑↑↑3 lives could not have on the order of 3↑↑↑3 times as much utility as influencing one life, which sounds like an invocation of the Linear Utility Hypothesis.

One argument for something kind of like the Linear Utility Hypothesis is that there may be a vast multiverse that you can influence only a small part of, and unless your utility function is weirdly nondifferentiable and you have very precise information about the state of the rest of the multiverse (or if your utility function depends primarily on things you personally control), then your utility function should be locally very close to linear. That is, if your utility function is a smooth function of how many people are experiencing what conditions, then the utility from influencing 1 life should be 1/n times the utility of having the same influence on n lives, because n is inevitably going to be small enough that a linear approximation to your utility function will be reasonably accurate, and even if your utility function isn't smooth, you don't know what the rest of the universe looks like, so you can't predict how the small changes you can make will interact with discontinuities in your utility function. This is a scaled-up version of a common argument that you should be willing to pay 10 times as much to save 20,000 birds as you would be willing to pay to save 2,000 birds. I am sympathetic to this argument, though not convinced of the premise that you can only influence a tiny portion of what is actually valuable to you. More importantly, this argument does not even attempt to establish that utility is globally linear, and counterintuitive consequences of the Linear Utility Hypothesis, such as Pascal's mugging, often involve situations that seem especially likely to violate the assumption that all choices you make have tiny consequences.

I have never seen anyone provide a defense of the Linear Utility Hypothesis itself (actually, I think I've been pointed to the VNM theorem for this, but I don't count that because it's a non sequitur; the VNM theorem is just a reason to use a utility function in the first place, and does not place any constraints on what that utility function might look like), so I don't know of any arguments for it available for me to refute, and I'll just go ahead and argue that it can't be right because actual human preferences violate it too dramatically. For instance, suppose you're given a choice between the following two options: 1: Humanity grows into a vast civilization of 10^100 people living long and happy lives, or 2: a 10% chance that humanity grows into a vast civilization of 10^102 people living long and happy lives, and a 90% chance of going extinct right now. I think almost everyone would pick option 1, and would think it crazy to take a reckless gamble like option 2. But the Linear Utility Hypothesis says that option 2 is much better. Most of the ways people respond to Pascal's mugger don't apply to this situation, since the probabilities and ratios of utilities involved here are not at all extreme.
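
To make the arithmetic in this example explicit, here is a minimal sketch of the comparison under the Linear Utility Hypothesis. It assumes, purely for illustration, that utility is counted in units of one long and happy life and that extinction has utility 0; the function name is hypothetical.

```python
from fractions import Fraction

def linear_expected_utility(lottery):
    """Expected utility when utility equals population size.

    `lottery` is a list of (probability, population) pairs.
    """
    return sum(p * population for p, population in lottery)

# Option 1: a civilization of 10^100 people for certain.
option_1 = [(Fraction(1), 10**100)]
# Option 2: 10% chance of 10^102 people, 90% chance of extinction (population 0).
option_2 = [(Fraction(1, 10), 10**102), (Fraction(9, 10), 0)]

eu_1 = linear_expected_utility(option_1)
eu_2 = linear_expected_utility(option_2)
ratio = eu_2 / eu_1  # how many times better option 2 looks under linear utility
```

Under these assumptions `ratio` comes out to 10, so linear utility ranks the gamble ten times higher than the sure thing, which is the verdict that almost nobody seems to endorse.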

There are smaller-scale counterexamples to the Linear Utility Hypothesis as well. Suppose you're offered the choice between: 1: continue to live a normal life, which lasts for n more years, or 2: live the next year of a normal life, but then instead of living a normal life after that, have all your memories from the past year removed, and experience that year again n more times (your memories getting reset each time). I expect pretty much everyone to take option 1, even if they expect the next year of their life to be better than the average of all future years of their life. If utility is just a naive sum of local utility, then there must be some year which has at least as much utility in it as the average year, and just repeating that year every year would thus increase total utility. But humans care about the relationship that their experiences have with each other at different times, as well as what those experiences are.

Here's another thought experiment that seems like a reasonable empirical test of the Linear Utility Hypothesis: take some event that is familiar enough that we understand its expected utility reasonably well (for instance, the amount of money in your pocket changing by $5), and some ludicrously unlikely event (for instance, the event in which some random person is actually telling the truth when they claim, without evidence, to have magic powers allowing them to control the fates of arbitrarily large universes, and saying, without giving a reason, that the way they use this power is dependent on some seemingly unrelated action you can take), and see if you become willing to sacrifice the well-understood amount of utility in exchange for the tiny chance of a large impact when the large impact becomes big enough that the tiny chance of it would be more important if the Linear Utility Hypothesis were true. This thought experiment should sound very familiar. The result of this experiment is that basically everyone agrees that they shouldn't pay the mugger, not only at much higher stakes than the Linear Utility Hypothesis predicts should be sufficient, but even at arbitrarily large stakes. This result has even stronger consequences than that the Linear Utility Hypothesis is false, namely that utility is bounded. People have come up with all sorts of absurd explanations for why they wouldn't pay Pascal's mugger even though the Linear Utility Hypothesis is true about their preferences (I will address the least absurd of these explanations in a bit), but there is no better test for whether an agent's utility function is bounded than how it responds to Pascal's mugger. 

If you take the claim “My utility function is unbounded”, and taboo “utility function” and “unbounded”, it becomes “Given outcomes A and B such that I prefer A over B, for any probability p>0, there is an outcome C such that I would take B rather than A if it lets me control whether C happens instead with probability p.” If you claim that one of these claims is true and the other is false, then you're just contradicting yourself, because that's what “utility function” means. That can be roughly translated into English as “I would do the equivalent of paying the mugger in Pascal's mugging-like situations”. So in Pascal's mugging-like situations, agents with unbounded utility functions don't look for clever reasons not to do the equivalent of paying the mugger; they just pay up. The fact that this behavior is so counterintuitive is an indication that agents with unbounded utility functions are so alien that you have no idea how to empathize with them.

The “least absurd explanation” I referred to for why an agent satisfying the Linear Utility Hypothesis would reject Pascal's mugger, is, of course, the leverage penalty that Eliezer discusses in Pascal's Muggle. The argument is that any hypothesis in which there are n people, one of whom has a unique opportunity to affect all the others, must imply that a randomly selected one of those n people has only a 1/n chance of being the one who has influence. So if a hypothesis implies that you have a unique opportunity to affect n people's lives, then this fact is evidence against this hypothesis by a factor of 1:n. In particular, if Pascal's mugger tells you that you are in a unique position to affect 3↑↑↑3 lives, the fact that you are the one in this position is 1 : 3↑↑↑3 evidence against the hypothesis that Pascal's mugger is telling the truth.
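
The way the leverage penalty described above cancels the stakes can be sketched numerically. This is an illustration only: `base_prior` is a made-up number, powers of ten stand in for 3↑↑↑3 (which is far too large to represent), and utility is assumed linear in lives affected.

```python
from fractions import Fraction

def expected_lives_with_leverage_penalty(base_prior, n_lives):
    """Expected lives affected when a claim to influence n lives is penalized by 1/n."""
    penalized_probability = base_prior / n_lives  # 1 : n evidence against the claim
    return penalized_probability * n_lives        # linear utility: n lives count as n

base_prior = Fraction(1, 10**6)  # hypothetical prior before applying the penalty
results = [expected_lives_with_leverage_penalty(base_prior, 10**k) for k in (3, 30, 300)]
```

Every entry of `results` equals `base_prior`: under these assumptions the expected stakes stay fixed no matter how many lives the mugger claims to control, which is why the penalty would let a linear-utility agent refuse.
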

I have two criticisms of the leverage penalty: first, that it is not the actual reason that people reject Pascal's mugger, and second, that it is not a correct reason for an ideal rational agent to reject Pascal's mugger.

The leverage penalty can't be the actual reason people reject Pascal's mugger because people don't actually assign probability as low as 1/3↑↑↑3 to the proposition that Pascal's mugger is telling the truth. This can be demonstrated with thought experiments. Consider what happens when someone encounters overwhelming evidence that Pascal's mugger actually is telling the truth. The probability of the evidence being faked can't possibly be less than 1 in 10^10^26 or so (this upper bound on the strength of the evidence was suggested by Eliezer in Pascal's Muggle), so an agent with a leverage prior will still be absolutely convinced that Pascal's mugger is lying. Eliezer suggests two reasons that an agent might pay Pascal's mugger anyway, given a sufficient amount of evidence: first, that once you update to a probability of something like 10^100 / 3↑↑↑3, and multiply by the stakes of 3↑↑↑3 lives, you get an expected utility of something like 10^100 lives, which is worth a lot more than $5, and second, that the agent might just give up on the idea of a leverage penalty and admit that there is a non-infinitesimal chance that Pascal's mugger may actually be telling the truth. Eliezer concludes, and I agree, that the first of these explanations is not a good one. I can actually demonstrate this with a thought experiment. Suppose that after showing you overwhelming evidence that they're telling the truth, Pascal's mugger says “Oh, and by the way, if I was telling the truth about the 3↑↑↑3 lives in your hands, then X is also true,” where X is some (a priori fairly unlikely) proposition that you later have the opportunity to bet on with a third party.
Now, I'm sure you'd be appropriately cautious in light of the fact that you would be very confused about what's going on, so you wouldn't bet recklessly, but you probably would consider yourself to have some special information about X, and if offered good enough odds, you might see a good opportunity for profit with an acceptable risk, which would not have looked appealing before being told X by Pascal's mugger. If you were really as confident that Pascal's mugger was lying as the leverage prior would imply, then you wouldn't assume X was any more likely than you thought before for any purposes not involving astronomical stakes, since your reason for believing X is predicated on you having control over astronomical stakes, which is astronomically unlikely.

And if you do assign a probability of 1/3↑↑↑3 to some proposition, what is the empirical content of this claim? One possible answer is that this means that the odds at which you would be indifferent to betting on the proposition are 1 : 3↑↑↑3, if the bet is settled with some currency that your utility function is close to linear with respect to across such scales. But the existence of such a currency is under dispute, and the empirical content to the claim that such a currency exists is that you would make certain bets with it involving arbitrarily extreme odds, so this is a very circular way to empirically ground the claim that you assign a probability of 1/3↑↑↑3 to some proposition. So a good empirical grounding for this claim is going to have to be in terms of preferences between more familiar outcomes. And in terms of payoffs at familiar scales, I don't see anything else that the claim that you assign a probability of 1/3↑↑↑3 to a proposition could mean other than that you expect to continue to act as if the probability of the proposition is 0, even conditional on any observations that don't give you a likelihood ratio on the order of 1/3↑↑↑3. If you claim that you would superupdate long before then, it's not clear to me what you could mean when you say that your current probability for the proposition is 1/3↑↑↑3.

There's another way to see that bounded utility functions, not leverage priors, are Eliezer's (and also pretty much everyone's) true rejection to paying Pascal's mugger, and that is the following quote from Pascal's Muggle: “I still feel a bit nervous about the idea that Pascal's Muggee, after the sky splits open, is handing over five dollars while claiming to assign probability on the order of 10^9/3↑↑↑3 that it's doing any good.” This is an admission that Eliezer's utility function is bounded (even though Eliezer does not admit that he is admitting this) because the rational agents whose utility functions are bounded are exactly (and tautologically) characterized by those for which there exists a probability p>0 such that the agent would not spend [fixed amount of utility] for probability p of doing any good, no matter what the good is. An agent satisfying the Linear Utility Hypothesis would spend $5 for a 10^9/3↑↑↑3 chance of saving 3↑↑↑3 lives. Admitting that it would do the wrong thing if it was in that situation, but claiming that that's okay because you have an elaborate argument that the agent can't be in that situation even though it can be in situations in which the probability is lower and can also be in situations in which the probability is higher, strikes me as an exceptionally flimsy argument that the Linear Utility Hypothesis is compatible with human values.

I also promised a reason that the leverage penalty argument is not a correct reason for rational agents (regardless of computational constraints) satisfying the Linear Utility Hypothesis to not pay Pascal's mugger. This is that in weird situations like this, you should be using updateless decision theory, and figure out which policy has the best a priori expected utility and implementing that policy, instead of trying to make sense of weird anthropic arguments before updatefully coming up with a strategy.
Now consider the following hypothesis: “There are 3↑↑↑3 copies of you, and a Matrix Lord will approach one of them while disguised as an ordinary human, inform that copy about his powers and intentions without offering any solid evidence to support his claims, and then kill the rest of the copies iff this copy declines to pay him $5. None of the other copies will experience or hallucinate anything like this.” Of course, this hypothesis is extremely unlikely, but there is no assumption that some randomly selected copy coincidentally happens to be the one that the Matrix Lord approaches, and thus no way for a leverage penalty to force the probability of the hypothesis below 1/3↑↑↑3. This hypothesis and the Linear Utility Hypothesis suggest that having a policy of paying Pascal's mugger would have consequences 3↑↑↑3 times as important as not dying, which is worth well over $5 in expectation, since the probability of the hypothesis couldn't be as low as 1/3↑↑↑3. The fact that actually being approached by Pascal's mugger can be seen as overwhelming evidence against this hypothesis does nothing to change that.

Edit: I have written a follow-up to this.


I'm not sure your refutation of the leverage penalty works. If there really are 3 ↑↑↑ 3 copies of you, your decision conditioned on that may still not be to pay. You have to compare

P(A real mugging will happen) x U(all your copies die)

against

P(fake muggings happen) x U(lose five dollars) x (expected number of copies getting fake-mugged)

where that last term will in fact be proportional to 3 ↑↑↑ 3. Even if there is an incomprehensibly vast matrix, its Dark Lords are pretty unlikely to mug you for petty cash. And this plausibly does make you pay in the Muggle case, since P(fake muggings happen) is way down if 'mugging' involves tearing a hole in the sky.
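
The comparison in this comment can be written out as a small sketch. Every number below is hypothetical and chosen only to show the structure: a power of ten stands in for 3 ↑↑↑ 3, the value per copy is a placeholder, and the function name is not from the original discussion.

```python
from fractions import Fraction

def paying_is_better(p_real, u_all_copies_die, p_fake, u_lose_five, expected_fake_muggings):
    """Compare the two products from the comment above.

    True when the expected loss avoided by paying a real mugger exceeds
    the expected total cost of paying off fake muggers.
    """
    loss_avoided = p_real * u_all_copies_die
    cost_of_paying_fakes = p_fake * u_lose_five * expected_fake_muggings
    return loss_avoided > cost_of_paying_fakes

N_COPIES = 10**50        # stand-in for 3↑↑↑3 (assumption)
VALUE_PER_COPY = 10**7   # hypothetical dollar-equivalent value of one copy's life
fake_mugged = N_COPIES * Fraction(1, 10**9)  # both sides scale with the number of copies

# Ordinary mugging: fake muggers are common, so paying loses.
ordinary = paying_is_better(Fraction(1, 10**30), N_COPIES * VALUE_PER_COPY,
                            Fraction(1, 2), 5, fake_mugged)
# "Muggle" case: a fake who can tear a hole in the sky is very unlikely.
muggle = paying_is_better(Fraction(1, 10**30), N_COPIES * VALUE_PER_COPY,
                          Fraction(1, 10**40), 5, fake_mugged)
```

With these placeholder values `ordinary` is `False` and `muggle` is `True`. Because both products are proportional to the number of copies, that number cancels and the decision turns on the ratio of the two probabilities.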

Yes, it looks like you're right. I'll think about this and probably write a follow-up later. Edit: I have finally written that follow-up.

I think I disagree with your approach here.

I, and I think most people in practice, use reflective equilibrium to decide what our ethics are. This means that we can notice that our ethical intuitions are insensitive to scope, but also that upon reflection it seems like this is wrong, and thus adopt an ethics different from that given by our naive intuition.

When we're trying to use logic to decide whether to accept an ethical conclusion counter to our intuition, it's no good to document what our intuition currently says as if that settles the matter.

A priori, 1,000 lives at risk may seem just as urgent as 10,000. But we think about it, and we do our best to override it.

And in fact, I fail pretty hard at it. I'm pretty sure the amount I give to charity wouldn't be different in a world where the effectiveness of the best causes were an order of magnitude different. I suspect this is true of many; certainly anyone following the Giving What We Can pledge is using an ancient Schelling Point rather than any kind of calculation. But that doesn't mean you can convince me that my "real" ethics doesn't care how many lives are saved.

When we talk about weird hypotheticals like Pascallian deals, we aren't trying to figure out what our intuition says; we're trying to figure out whether we should overrule it.

When you use philosophical reflection to override naive intuition, you should have explicit reasons for doing so. A reason for valuing 10,000 lives 10 times as much as 1,000 lives is that both of these are tiny compared to the total number of lives, so if you valued them at a different ratio, this would imply an oddly sharp bend in utility as a function of lives, and we can tell that there is no such bend because if we imagine that there were a few thousand more or fewer people on the planet, our intuitions about that particular tradeoff would not change. This reasoning does not apply to decisions affecting astronomically large numbers of lives, and I have not seen any reasoning that does which I find compelling.

It is also not true that people are trying to figure out whether to overrule their intuition when they talk about Pascal's mugging; typically, they are trying to figure out how to justify not overruling their intuition. How else can you explain the preponderance of shaky "resolutions" to Pascal's mugging that accept the Linear Utility Hypothesis and nonetheless conclude that you should not pay Pascal's mugger, when "I tried to estimate the relevant probabilities fairly conservatively, multiplied probabilities times utilities, and paying Pascal's mugger came out far ahead" is usually not considered a resolution?

Good points (so I upvoted), but the post could be half as long and make the same points better.

Absolutely it is the case that utility should be bounded. However as best I can tell you've left out the most fundamental reason why, so I think I should explain that here. (Perhaps I should make this a separate post?)

The basic question is: Where do utility functions come from? Like, why should one model a rational agent as having a utility function at all? The answer of course is either the VNM theorem or Savage's theorem, depending on whether or not you're pre-assuming probability (you shouldn't, really, but that's another matter). Right, both these theorems take the form of, here's a bunch of conditions any rational agent should obey, let's show that such an agent must in fact be acting according to a utility function (i.e. trying to maximize its expected value).

Now here's the thing: The utility functions output by Savage's theorem are always bounded. Why is that? Well, essentially, because otherwise you could set up a St. Petersburg paradox that would contradict the assumed rationality conditions (in short, you can set up two gambles, both of "infinite expected utility", but where one dominates the other, and show that both A. the agent must prefer the first to the second, but also B. the agent must be indifferent between them, contradiction). Thus we conclude that the utility function must be bounded.
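
A minimal numerical sketch of the divergence behind the St. Petersburg paradox is below. It only illustrates why expected utility fails to converge for an unbounded utility function and does converge for a bounded one; it does not reproduce the dominance argument. The gamble (payoff 2^k with probability 2^-k) is the standard textbook version, and the bounded utility x / (1 + x) is a hypothetical choice made for illustration.

```python
def truncated_expected_utility(utility, n_terms):
    """Partial sum of E[U] for a gamble paying 2**k with probability 2**-k, k = 1..n_terms."""
    return sum(utility(2**k) * 2.0**-k for k in range(1, n_terms + 1))

def unbounded(x):
    return float(x)       # utility linear in the payoff

def bounded(x):
    return x / (1.0 + x)  # hypothetical bounded utility, always below 1

unbounded_sums = [truncated_expected_utility(unbounded, n) for n in (10, 100, 1000)]
bounded_sums = [truncated_expected_utility(bounded, n) for n in (10, 100, 1000)]
```

Each term of the unbounded sum contributes exactly 1, so the partial sums grow without limit as more outcomes are included, while the bounded partial sums stay below 1 and settle down quickly.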

OK, but what if we base things around the VNM theorem, then? It requires pre-assuming the notion of probability, but the utility functions output by the VNM theorem aren't guaranteed to be bounded.

Here's the thing: The VNM theorem only guarantees that the utility function it outputs works for finite gambles. Seriously. The VNM theorem gives no guarantee that the agent is acting according to the specified utility function when presented with a gamble with infinitely many possible outcomes, only when presented with a gamble with finitely many outcomes.

Similarly, with Savage's theorem, the assumption that forces utility functions to be bounded -- P7 -- is the same one that guarantees that the utility function works for infinite gambles. You can get rid of P7, and you'll no longer be guaranteed to get a bounded utility function, but neither will you be guaranteed that the utility function will work for gambles with infinitely many possible outcomes.

This means that, fundamentally, if you want to work with infinite gambles, you need to only be talking about bounded utility functions. If you talk about infinite gambles in the context of unbounded utility functions, well, you're basically talking nonsense, because there's just absolutely no guarantee that the utility function you're using applies in such a situation. The problems of unbounded utility that Eliezer keeps pointing out, that he insists we need to solve, really are just straight contradictions arising from him making bad assumptions that need to be thrown out. Like, they all stem from him assuming that unbounded utility functions work in the case of infinite gambles, and there simply is no such guarantee; not in the VNM theorem, not in Savage's theorem.

If you're assuming infinite gambles, you need to assume bounded utility functions, or else you need to accept that in cases of infinite gambles the utility function doesn't actually apply -- making the utility function basically useless, because, well, everything has infinitely many possible outcomes. Between a utility function that remains valid in the face of infinite gambles, and unbounded utility, it's pretty clear you should choose the former.

And between Savage's axiom P7 and unbounded utility, it's pretty clear you should choose the former. Because P7 is an assumption that directly describes a rationality condition on the agent's preferences, a form of the sure-thing principle, one we can clearly see had better be true of any rational agent; while unbounded utility... means what, exactly, in terms of the agent's preferences? Something, certainly, but not something we obviously need. And in fact we don't need it.

As best I can tell, Eliezer keeps insisting we need unbounded utility functions out of some sort of commitment to total utilitarianism or something along the lines of such (that's my summary of his position, anyway). I would consider that to be on much shakier ground (there are so many nonobvious large assumptions for something like that to even make sense, seriously I'm not even going into it) than obvious things like the sure-thing principle, or that a utility function is nearly useless if it's not valid for infinite gambles. And like I said, as best I can tell, Eliezer keeps assuming that the utility function is valid in such situations even though there's nothing guaranteeing this; and this assumption is just in contradiction with his assumption of an unbounded utility function. He should keep the validity assumption (which we need) and throw out the unboundedness one (which we don't).

That, to my mind, is the most fundamental reason we should only be considering bounded utility functions!

"The problems of unbounded utility that Eliezer keeps pointing out, that he insists we need to solve, really are just straight contradictions arising from him making bad assumptions that need to be thrown out. Like, they all stem from him assuming that unbounded utility functions work in the case of infinite gambles"

Just to be clear, you're not thinking of 3↑↑↑3 when you talk about infinite gambles, right?

I'm not sure I know what argument of Eliezer's you're talking about when you reference infinite gambles. Is there an example you can link to?

He means gambles that can have infinitely many different outcomes. This causes problems for unbounded utility functions because of the Saint Petersburg paradox.

But the way you solve the St Petersburg paradox in real life is to note that nobody has infinite money, nor infinite time, and therefore it doesn't matter if your utility function spits out a weird outcome for it because you can have a prior of 0 that it will actually happen. Am I missing something?

"you can have a prior of 0 that it will actually happen"

No.

I'm not familiar with Savage's theorem, but I was aware of the things you said about the VNM theorem, and in fact, I often bring up the same arguments you've been making. The standard response that I hear is that some probability distributions cannot be updated to without an infinite amount of information (e.g. if a priori the probability of the nth outcome is proportional to 1/3^n, then there can't be any evidence that could occur with nonzero probability that would convince you that the probability of the nth outcome is 1/2^n for each n), and there's no need for a utility function to converge on gambles that it is impossible even in theory for you to be convinced are available options.
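
The parenthetical example about 1/3^n and 1/2^n can be checked with a short sketch. It assumes outcomes are indexed n = 1, 2, ... (so the prior normalizes to 2/3^n) and uses Bayes' rule in the form posterior(n) / prior(n) = P(E | n) / P(E) ≤ 1 / P(E); the function names are hypothetical.

```python
from fractions import Fraction

def prior(n):
    return Fraction(2, 3**n)  # proportional to 1/3^n, normalized over n = 1, 2, ...

def target(n):
    return Fraction(1, 2**n)  # already sums to 1 over n = 1, 2, ...

def max_evidence_probability(n_max):
    """Upper bound on P(E) for evidence E that moves prior to target on outcomes 1..n_max.

    Since target(n) / prior(n) <= 1 / P(E), we need P(E) <= prior(n) / target(n) for every n.
    """
    return min(prior(n) / target(n) for n in range(1, n_max + 1))

bounds = [max_evidence_probability(n_max) for n_max in (1, 10, 100)]
```

The bound equals 2·(2/3)^n at the largest outcome considered, so it shrinks toward zero as more outcomes are included. Under these assumptions, any evidence that produced the 1/2^n distribution exactly would have to have probability zero.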

When I ask why they assume that their utility function should be valid on those infinite gambles that are possible for them to consider, if they aren't assuming that their preference relation is closed in the strong topology (which implies that the utility function is bounded), they'll say something like that their utility function not being valid where their preference relation is defined seems weirdly discontinuous (in some sense that they can't quite formalize and definitely isn't the preference relation being closed in the strong topology), or that the examples I gave them of VNM-rational preference relations for which the utility function isn't valid for infinite gambles all have some other pathology, like that there's an infinite gamble which is considered either better than all of or worse than all of the constituent outcomes, and there might be a representation theorem saying something like that has to happen, even though they can't point me to one.

Anyway, I agree that that's a more fundamental reason to only consider bounded utility functions, but I decided I could probably be more convincing by abandoning that line of argument, and showing that if you sweep convergence issues under the rug, unbounded utility functions still suggest insane behavior in concrete situations.

This is all basically right.

However, as I said in a recent comment, people do not actually have utility functions. So in that sense, they have neither a bounded nor an unbounded utility function. They can only try to make their preferences less inconsistent. And you have two options: you can pick some crazy consistency very different from normal, or you can try to increase normality at the same time as increasing consistency. The second choice is better. And in this case, the second choice means picking a bounded utility function, and the first choice means choosing an unbounded one, and going insane (because agreeing to be mugged is insane.)

Yes, that's true. The fact that humans are not actually rational agents is an important point that I was ignoring.

"For instance, suppose you're given a choice between the following two options: 1: Humanity grows into a vast civilization of 10^100 people living long and happy lives, or 2: a 10% chance that humanity grows into a vast civilization of 10^102 people living long and happy lives, and a 90% chance of going extinct right now. I think almost everyone would pick option 1, and would think it crazy to take a reckless gamble like option 2. But the Linear Utility Hypothesis says that option 2 is much better. "

This seems like selectively choosing a utility function that gives no weight to the negative utility of the state of 'anti-lives' that having NO civilization of people living long and happy lives at all in the entire universe would represent.

I think you could tune a linear relationship of these negative values and accurately get the behavior that people have regarding these options.

This seems a lot like picking the weakest and least credible possible argument to use as a way to refute the entire idea. Which made it much more difficult for me to read the rest of your article with the benefit of the doubt I would have preferred to have held throughout.

The Linear Utility Hypothesis does imply that there is no extra penalty (on top of the usual linear relationship between population and utility) for the population being zero, and it seems to me that it is common for people to assume the Linear Utility Hypothesis unmodified by such a zero-population penalty. Furthermore, a zero-population penalty seems poorly motivated to me, and still does not change the answer that Linear Utility Hypothesis + zero-population penalty would suggest in the thought experiment that you quoted, since you can just talk about populations large enough to dwarf the zero-population penalty.

Refuting a weak argument for a hypothesis is not a good way to refute the hypothesis, but that's not what I'm doing; I'm refuting weak consequences of the Linear Utility Hypothesis, and "X implies Y, but not Y" is a perfectly legitimate form of argument for "not X".

I promoted this to featured, in small part because I'm happy about additions to important conversations (Pascal's mugging), in medium part because of the good conversation it inspired (both in the comments and in this post by Zvi), and in large part because I want people to know that these types of technical posts are really awesome and totally a good fit for the LW frontpage, and if they contain good ideas they'll totally be promoted.

I've changed my mind. I agree with this.

The probability of you getting struck by lightning and dying while making your decision is .

The probability of you dying by a meteor strike, by an earthquake, by .... is .

The probability that you don't get to complete your decision for one reason or the other is .

It doesn't make sense then to entertain probabilities vastly lower than , but not entertain probabilities much higher than