Real World Solutions to Prisoners' Dilemmas

Why should there be real world solutions to Prisoners' Dilemmas? Because such dilemmas are a real-world problem.

If I am assigned to work on a school project with a group, I can either cooperate (work hard on the project) or defect (slack off while reaping the rewards of everyone else's hard work). If everyone defects, the project doesn't get done and we all fail - a bad outcome for everyone. If I defect but you cooperate, then I get to spend all day on the beach and still get a good grade - the best outcome for me, the worst for you. And if we all cooperate, then it's long hours in the library but at least we pass the class - a “good enough” outcome, though not quite as good as me defecting against everyone else's cooperation. This exactly mirrors the Prisoner's Dilemma.
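As a rough sketch of what "mirrors the Prisoner's Dilemma" means here, the snippet below checks the ordering of the four outcomes from my point of view. The numbers are made up for illustration (nothing above assigns actual values); only their ordering matters, and the function names are my own.

```python
# Hypothetical payoffs to me for (my move, your move); only the ordering matters.
payoff = {
    ("D", "C"): 4,  # I spend all day on the beach, you do the work
    ("C", "C"): 3,  # long hours in the library, but we pass
    ("D", "D"): 1,  # nobody works and we all fail
    ("C", "D"): 0,  # I do the work while you slack off
}

def is_prisoners_dilemma(p):
    """True when temptation > reward > punishment > sucker's payoff."""
    return p[("D", "C")] > p[("C", "C")] > p[("D", "D")] > p[("C", "D")]

def defection_dominates(p):
    """True when defecting pays me more whatever the other player does."""
    return all(p[("D", other)] > p[("C", other)] for other in ("C", "D"))

print(is_prisoners_dilemma(payoff), defection_dominates(payoff))
```

Under these assumed numbers both checks come out true: defecting is individually better in every case, even though mutual cooperation beats mutual defection.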

Diplomacy - both the concept and the board game - involves Prisoners' Dilemmas. Suppose Ribbentrop of Germany and Molotov of Russia agree to a peace treaty that demilitarizes their mutual border. If both cooperate, they can move their forces to other theaters, and have moderate success there - a good enough outcome. If Russia cooperates but Germany defects, it can launch a surprise attack on an undefended Russian border and enjoy spectacular success there (for a while, at least!) - the best outcome for Germany and the worst for Russia. But if both defect, then neither has any advantage at the German-Russian border, and they lose the use of those troops in other theaters as well - a bad outcome for both. Again, the Prisoner's Dilemma.

Civilization - again, both the concept and the game - involves Prisoners' Dilemmas. If everyone follows the rules and creates a stable society (cooperates), we all do pretty well. If everyone else works hard and I turn barbarian and pillage you (defect), then I get all of your stuff without having to work for it and you get nothing - the best solution for me, the worst for you. If everyone becomes a barbarian, there's nothing to steal and we all lose out. Prisoner's Dilemma.

If everyone who worries about global warming cooperates in cutting emissions, climate change is averted and everyone is moderately happy. If everyone else cooperates in cutting emissions, but one country defects, climate change is still mostly averted, and the defector is at a significant economic advantage. If everyone defects and keeps polluting, the climate changes and everyone loses out. Again a Prisoner's Dilemma.

Prisoners' Dilemmas even come up in nature. In baboon tribes, when a female is in “heat”, males often compete for the chance to woo her. The most successful males are those who can get a friend to help fight off the other monkeys, and who then help that friend find his own monkey loving. But these monkeys are tempted to take their friend's female as well. Two males who cooperate each seduce one female. If one cooperates and the other defects, the defector has a good chance at both females. But if the two can't cooperate at all, then they will be beaten off by other monkey alliances and won't get to have sex with anyone. Still a Prisoner's Dilemma!

So one might expect the real world to have produced some practical solutions to Prisoners' Dilemmas.

One of the best known such systems is called “society”. You may have heard of it. It boasts a series of norms, laws, and authority figures who will punish you when those norms and laws are broken.

Imagine that the two criminals in the original example were part of a criminal society - let's say the Mafia. The Godfather makes Alice and Bob an offer they can't refuse: turn against one another, and they will end up “sleeping with the fishes” (this concludes my knowledge of the Mafia). Now the incentives are changed: defecting against a cooperator doesn't mean walking free, it means getting murdered.

Both prisoners cooperate, and amazingly the threat of murder ends up making them both better off (this is also the gist of some of the strongest arguments against libertarianism: in Prisoner's Dilemmas, threatening force against rational agents can increase the utility of all of them!).
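Here is a minimal sketch of how the Godfather's offer changes the incentives. The jail terms and the size of the Mafia penalty are assumptions chosen for illustration, not figures from the original example, and I apply the penalty to any defection for simplicity.

```python
# Hypothetical years in jail for (Alice's move, Bob's move); lower is better.
# "C" = stay silent, "D" = turn on your partner.
YEARS = {
    ("C", "C"): (1, 1),
    ("C", "D"): (10, 0),
    ("D", "C"): (0, 10),
    ("D", "D"): (5, 5),
}
MAFIA_PENALTY = 100  # assumed stand-in for "sleeping with the fishes"

def cost(alice, bob, godfather):
    """Cost to Alice, in jail-year equivalents."""
    years = YEARS[(alice, bob)][0]
    if godfather and alice == "D":
        years += MAFIA_PENALTY
    return years

def best_reply(bob, godfather):
    """Alice's cheapest move given Bob's move."""
    return min(("C", "D"), key=lambda a: cost(a, bob, godfather))

for godfather in (False, True):
    print(godfather, {bob: best_reply(bob, godfather) for bob in ("C", "D")})
```

Without the Godfather, Alice's best reply is to defect whatever Bob does; with him, it is to cooperate whatever Bob does, and the two end up at the one-year outcome instead of the five-year one.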

Even when there is no godfather, society binds people by concern about their “reputation”. If Bob got a reputation as a snitch, he might never be able to work as a criminal again. If a student gets a reputation for slacking off on projects, she might get ostracized on the playground. If a country gets a reputation for backstabbing, others might refuse to make treaties with them. If a person gets a reputation as a bandit, she might incur the hostility of those around her. If a country gets a reputation for not doing enough to fight global warming, it might...well, no one ever said it was a perfect system.

Aside from humans in society, evolution is also strongly motivated to develop a solution to the Prisoner's Dilemma. The Dilemma troubles not only lovestruck baboons, but ants, minnows, bats, and even viruses. Here the payoff is denominated not in years of jail time, nor in dollars, but in reproductive fitness and number of potential offspring - so evolution will certainly take note.

Most people, when they hear the rational arguments in favor of defecting every single time on the iterated 100-crime Prisoner's Dilemma, will feel some kind of emotional resistance: thoughts like “Well, maybe I'll try cooperating anyway a few times, see if it works”, or “If I promised to cooperate with my opponent, then it would be dishonorable for me to defect on the last turn, even if it helps me out”, or even “Bob is my friend! Think of all the good times we've had together, robbing banks and running straight into waiting police cordons. I could never betray him!”

And if two people with these sorts of emotional hangups play the Prisoner's Dilemma together, they'll end up cooperating on all hundred crimes, getting out of jail in a mere century and leaving rational utility maximizers to sit back and wonder how they did it.

Here's how: imagine you are a supervillain designing a robotic criminal (who's that go-to supervillain Kaj always uses for situations like this? Dr. Zany? Okay, let's say you're him). You expect to build several copies of this robot to work as a team, and expect they might end up playing the Prisoner's Dilemma against each other. You want them out of jail as fast as possible so they can get back to furthering your nefarious plots. So rather than have them bumble through the whole rational utility maximizing thing, you just insert an extra line of code: “in a Prisoner's Dilemma, always cooperate with other robots”. Problem solved.

Evolution followed the same strategy (no it didn't; this is a massive oversimplification). The emotions we feel around friendship, trust, altruism, and betrayal are partly a built-in hack to succeed in cooperating on Prisoner's Dilemmas where a rational utility-maximizer would defect a hundred times and fail miserably. The evolutionarily dominant strategy is commonly called “Tit-for-tat” - basically, cooperate on the first round, and after that cooperate if and only if your opponent did so last time.
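To make Tit-for-tat concrete, here is a small sketch of a 100-round game. The per-round payoffs are assumed values with the usual Prisoner's Dilemma ordering, and the sketch only compares three pairings; it doesn't show that the strategy wins against every possible opponent.

```python
# Hypothetical per-round payoffs (mine, theirs); higher is better.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"

def always_defect(opponent_history):
    """Defect every round, ignoring what the opponent has done."""
    return "D"

def play(strategy_a, strategy_b, rounds=100):
    """Return total scores for both strategies over a fixed number of rounds."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each side sees only the other's past moves
        b = strategy_b(moves_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))      # (300, 300)
print(play(always_defect, always_defect))  # (100, 100)
print(play(tit_for_tat, always_defect))    # (99, 104)
```

Two Tit-for-tat players do three times as well as two constant defectors, while a Tit-for-tat player facing a defector loses only the first round before switching to defection itself.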

This so-called “superrationality” appears even more clearly in the Ultimatum Game. Two players are given $100 to distribute among themselves in the following way: the first player proposes a distribution (for example, “Fifty for me, fifty for you”) and then the second player either accepts or rejects the distribution. If the second player accepts, the players get the money in that particular ratio. If the second player refuses, no one gets any money at all.

The first player's reasoning goes like this: “If I propose $99 for myself and $1 for my opponent, that means I get a lot of money and my opponent still has to accept. After all, she prefers $1 to $0, which is what she'll get if she refuses.”

In the Prisoner's Dilemma, when players were able to communicate beforehand they could settle upon a winning strategy of precommitting to reciprocate: to take an action beneficial to their opponent if and only if their opponent took an action beneficial to them. Here, the second player should consider the same strategy: precommit to an ultimatum (hence the name) that unless Player 1 distributes the money 50-50, she will reject the offer.

But as in the Prisoner's Dilemma, this fails when you have no reason to expect your opponent to follow through on her precommitment. Imagine you're Player 2, playing a single Ultimatum Game against an opponent you never expect to meet again. You dutifully promise Player 1 that you will reject any offer less than 50-50. Player 1 offers 80-20 anyway. You reason “Well, my ultimatum failed. If I stick to it anyway, I walk away with nothing. I might as well admit it was a good try, give in, and take the $20. After all, rejecting the offer won't magically bring my chance at $50 back, and there aren't any other dealings with this Player 1 guy for it to influence.”

This is seemingly a rational way to think, but if Player 1 knows you're going to think that way, she offers 99-1, same as before, no matter how sincere your ultimatum sounds.
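Player 1's reasoning can be sketched as a search over whole-dollar offers. The helper names and the idea of a single "real threshold" for Player 2 are simplifying assumptions of mine.

```python
TOTAL = 100  # dollars to split, as in the game described above

def responder_accepts(offer, min_acceptable):
    """Player 2 accepts any offer at or above her real threshold."""
    return offer >= min_acceptable

def best_offer(min_acceptable):
    """Offer to Player 2 (whole dollars) that maximizes Player 1's take."""
    def proposer_payoff(offer):
        return TOTAL - offer if responder_accepts(offer, min_acceptable) else 0
    return max(range(TOTAL + 1), key=proposer_payoff)

# A responder who prefers $1 to $0 has a real threshold of 1, whatever she says.
print(best_offer(1))   # 1, i.e. a 99-1 split
# A responder who really would reject anything under 50 gets the even split.
print(best_offer(50))  # 50
```

What matters is the threshold Player 1 believes Player 2 will actually act on, not the one she announces.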

Notice all the similarities to the Prisoner's Dilemma: playing as a "rational economic agent" gets you a bad result, it looks like you can escape that bad result by making precommitments, but since the other player can't trust your precommitments, you're right back where you started.

If evolutionary solutions to the Prisoners' Dilemma look like trust or friendship or altruism, solutions to the Ultimatum Game involve different emotions entirely. The Sultan presumably does not want you to elope with his daughter. He makes an ultimatum: “Touch my daughter, and I will kill you.” You elope with her anyway, and when his guards drag you back to his palace, you argue: “Killing me isn't going to reverse what happened. Your ultimatum has failed. All you can do now by beheading me is get blood all over your beautiful palace carpet, which hurts you as well as me - the equivalent of pointlessly passing up the last dollar in an Ultimatum Game where you've just been offered a 99-1 split.”

The Sultan might counter with an argument from social institutions: “If I let you go, I will look dishonorable. I will gain a reputation as someone people can mess with without any consequences. My choice isn't between bloody carpet and clean carpet, it's between bloody carpet and people respecting my orders, or clean carpet and people continuing to defy me.”

But he's much more likely to just shout an incoherent stream of dreadful Arabic curse words. Because just as friendship is the evolutionary solution to a Prisoner's Dilemma, so anger is the evolutionary solution to an Ultimatum Game. As various gurus and psychologists have observed, anger makes us irrational. But this is the good kind of irrationality; it's the kind of irrationality that makes us pass up a 99-1 split even though the decision costs us a dollar.

And if we know that humans are the kind of life-form that tends to experience anger, then if we're playing an Ultimatum Game against a human, and that human precommits to rejecting any offer less than 50-50, we're much more likely to believe her than if we were playing against a rational utility-maximizing agent - and so much more likely to give the human a fair offer.
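One way to put numbers on "much more likely to believe her" is to give Player 1 a probability that Player 2 really follows through on her threat. This is a sketch under that assumption; the probabilities are invented, and it only compares the two offers discussed above.

```python
TOTAL = 100  # dollars to split

def expected_take(offer, p_follows_through, threshold=50):
    """Player 1's expected dollars if low offers are rejected with some probability."""
    keep = TOTAL - offer
    if offer >= threshold:
        return keep  # a fair offer is always accepted
    return (1 - p_follows_through) * keep

def best_offer(p_follows_through):
    """Compare only the 99-1 and 50-50 splits from the text."""
    return max((1, 50), key=lambda offer: expected_take(offer, p_follows_through))

print(best_offer(0.1))  # 1: the threat is barely credible, so offer 99-1
print(best_offer(0.9))  # 50: the threat is credible, so offer 50-50
```

With these numbers the fair offer becomes the better bet once Player 1 thinks there is roughly a coin flip's chance that the threat is real.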

It is distasteful and a little bit contradictory to the spirit of rationality to believe it should lose out so badly to simple emotion, and the problem might be correctable. Here we risk crossing the poorly charted border between game theory and decision theory and reaching ideas like timeless decision theory: that one should act as if one's choices determined the output of the algorithm one instantiates (or more simply, you should assume everyone like you will make the same choice you do, and take that into account when choosing).
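The simpler version in the parenthetical can be shown as a toy comparison. This is only an illustration of "assume everyone like you makes the same choice", not an implementation of timeless decision theory, and the payoffs are assumed.

```python
# Hypothetical payoffs to me for (my move, their move); higher is better.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}

def choice_treating_other_as_fixed(their_move):
    """Pick my best move as if the other player's move did not depend on mine."""
    return max(("C", "D"), key=lambda mine: PAYOFF[(mine, their_move)])

def choice_assuming_same_choice():
    """Pick my best move as if a player like me will choose exactly what I choose."""
    return max(("C", "D"), key=lambda mine: PAYOFF[(mine, mine)])

print(choice_treating_other_as_fixed("C"), choice_treating_other_as_fixed("D"))
print(choice_assuming_same_choice())
```

The first rule defects in both cases; the second only ever compares mutual cooperation with mutual defection, so it cooperates.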

More practically, however, most real-world solutions to Prisoner's Dilemmas and Ultimatum Games still hinge on one of three things: threats of reciprocation when the length of the game is unknown, social institutions and reputation systems that make defection less attractive, and emotions ranging from cooperation to anger that are hard-wired into us by evolution. In the next post, we'll look at how these play out in practice.

Comments

Yay Dr. Zany! And a good post in general.

However, Western behavior in the Ultimatum Game seems to be a cultural, not biological, phenomenon.

By the mid‐1990s researchers were arguing that a set of robust experimental findings from behavioral economics were evidence for set of evolved universal motivations (Fehr & Gächter 1998, Hoffman et al. 1998). Foremost among these experiments, the Ultimatum Game, provides a pair of anonymous subjects with a sum of real money for a one‐shot interaction. One of the pair—the proposer—can offer a portion of this sum to a second subject, the responder. Responders must decide whether to accept or reject the offer. If a responder accepts, she gets the amount of the offer and the proposer takes the remainder; if she rejects both players get zero. If subjects are motivated purely by self‐interest, responders should always accept any positive offer; knowing this, a self‐interested proposer should offer the smallest non‐zero amount. Among subjects from industrialized populations—mostly undergraduates from the U.S., Europe, and Asia—proposers typically offer an amount between 40% and 50% of the total, with a modal offer of usually 50% (Camerer 2003). Offers below about 30% are often rejected.

With this seemingly robust empirical finding in their sights, Nowak, Page and Sigmund (2000) constructed an evolutionary analysis of the Ultimatum Game. When they modeled the Ultimatum Game exactly as played, they did not get results matching the undergraduate findings. However, if they added reputational information, such that players could know what their partners did with others on previous rounds of play, the analysis predicted offers and rejections in the range of typical undergraduate responses. They concluded that the Ultimatum Game reveals humans’ species‐specific evolved capacity for fair and punishing behavior in situations with substantial reputation influence. But, since the Ultimatum Game is typically done one‐shot without reputational information, they argued that people make fair offers and reject unfair offers because their motivations evolved in a world where such interactions were not fitness relevant—thus, we are not evolved to fully incorporate the possibility of non‐reputational action in our decision‐making, at least in such artificial experimental contexts.

Recent comparative work has dramatically altered this initial picture. Two unified projects (which we call Phase 1 and Phase 2) have deployed the Ultimatum Game and other related experimental tools across thousands of subjects randomly sampled from 23 small‐scale human societies, including foragers, horticulturalists, pastoralists, and subsistence farmers, drawn from Africa, Amazonia, Oceania, Siberia and New Guinea (Henrich et al. 2005, Henrich et al. 2006). Three different experimental measures show that the people in industrialized societies consistently occupy the extreme end of the human distribution. Notably, small‐scale societies with only face‐to‐face interaction behaved in a manner reminiscent of Nowak et. al.’s analysis before they added the reputational information. That is, these populations made low offers and did not reject. [...]

Analyses of these data show that a population’s degree of market integration and its participation in a world religion both independently predict higher offers, and account for much of the variation between populations. Community size positively predicts greater punishment (Henrich et al. n.d.). The authors suggest that norms and institutions for exchange in ephemeral interactions culturally coevolved with markets and expanding larger‐scale sedentary populations. In some cases, at least in their most efficient forms, neither markets nor large population were feasible before such norms and institutions emerged. That is, it may be that what behavioral economists have been measuring in such games is a specific set of social norms, evolved for dealing with money and strangers, that have emerged since the origins of agriculture and the rise of complex societies.

I'm guessing that the results would be significantly affected by the perceived relative status, as the offer can be more about signaling than rational choice. If the two players happen to perceive the relative status similarly, or if the second player perceives equal or larger status disparity, s/he will likely accept. Maybe even think of the first player as foolish for offering too much and being a lousy bargainer. A rejection would often be due to the status-related outrage ("Who does s/he think s/he is to offer me only a pittance?")

So, if you think that being in control of how much to offer raises your status, you are likely to offer less, and if you think that not having any say in the amount automatically makes you lower status, you would be likely to accept a low offer.

Thus I would expect that in a society where equality is not considered an unalienable right, but is rather determined by material possessions, the average accepted offer would be lower. Not sure if this matches the experimental results.

Thanks for posting one of the comparative ultimatum game studies! I knew they were out there but didn't remember quite where.

There are many problems here.

At the end of paragraph 2 and the other examples, you say

This exactly mirrors the Prisoner's Dilemma.

But it doesn't, as you point out later in the post, because the payoff matrix isn't D-C > C-C > D-D, as you explain, but rather C-C > D-C > C-D, because of reputational effects, which is not a prisoner's dilemma. "Prisoner's dilemma" is a very specific term, and you are inflating it.

evolution is also strongly motivated [...] evolution will certainly take note.

I doubt that quite strongly!

The evolutionarily dominant strategy is commonly called “Tit-for-tat” - basically, cooperate if and only if you expect your opponent to do so.

That is not tit-for-tat! Tit-for-tat is start with cooperate and then parrot the opponent's previous move. It does not do what it "expects" the opponent to do. Furthermore, if you categorically expect your opponent to cooperate, you should defect (just like you should if you expect him to defect). You only cooperate if you expect your opponent to cooperate if he expects you to cooperate ad nauseam.

This so-called "superrationality” appears even more [...]

That is not superrationality! Superrationality achieves cooperation by reasoning that you and your opponent will get the same result for the same reasons, so you should cooperate in order to logically bind your result to C-C (since C-C and D-D are the only two options). What is with all this misuse of terminology? You write like the agents in the examples of this game are using causal decision theory (which defects all the time no matter what) and then bring up elements that cannot possibly be implemented in causal decision theory, and it grinds my gears!

And if two people with these sorts of emotional hangups play the Prisoner's Dilemma together, they'll end up cooperating on all hundred crimes, getting out of jail in a mere century and leaving rational utility maximizers to sit back and wonder how they did it.

This is in direct violation of one of the themes of Less Wrong. If "rational expected utility maximizers" are doing worse than "irrational emotional hangups", then you're using a wrong definition of "rational". You do this throughout the post, and it's especially jarring because you are or were one of the best writers for this website.

playing as a "rational economic agent" gets you a bad result

9_9

[...] anger makes us irrational. But this is the good kind of irrationality [...]

"The good kind of irrationality" is like "the good kind of bad thing". An oxymoron, by definition.

[...] if we're playing an Ultimatum Game against a human, and that human precommits to rejecting any offer less than 50-50, we're much more likely to believe her than if we were playing against a rational utility-maximizing agent

Bullshit. A rational agent is going to do what works. We know this because we stipulated that it was rational. If you mean to say a "stupid number crunching robot that misses obvious details like how to play ultimatum games" then sure it might do as you describe. But don't call it "rational".

It is distasteful and a little bit contradictory to the spirit of rationality to believe it should lose out so badly to simple emotion, and the problem might be correctable.

You think?

Downvoted.

I agree with pretty much everything you've said here, except:

You only cooperate if you expect your opponent to cooperate if he expects you to cooperate ad nauseam.

You don't actually need to continue this chain - if you're playing against any opponent which cooperates iff you cooperate, then you want to cooperate - even if the opponent would also cooperate against someone who cooperated no matter what, so your statement is also true without the "ad nauseam" (provided the opponent would defect if you defected).

You're right. I assumed symmetry, which was wrong.

You're reading this uncharitably. There are also parts that are unclear on Yvain's part, sure, but not to the extent that you claim.

The original group project situation Yvain explores does mirror the Prisoner's Dilemma. Then, later, he introduces reputational effects to illustrate one of the Real World Solutions to the Prisoner's Dilemma that we have already developed.

It's not made crystal clear....

So one might expect the real world to have produced some practical solutions to Prisoners' Dilemmas. One of the best known such systems is called “society”. You may have heard of it.

Well, actually it is.

...

Evolution

I understood Yvain to be speaking metaphorically, or perhaps tongue-in-cheek, when talking about what evolution would take note of. I believe this was his intention, and furthermore is a reasonable reading given our knowledge of Yvain.

This is in direct violation of one of the themes of Less Wrong. If "rational expected utility maximizers" are doing worse than "irrational emotional hangups", then you're using a wrong definition of "rational". You do this throughout the post, and it's especially jarring because you are or were one of the best writers for this website.

I expect that Yvain used 'rational' against the theme of LW on purpose, to create a tension - rationality failing to outperform emotional hangups is a contradiction, that would motivate readers to find the false premise or re-analyse the situation.

I do concur with your point about tit-for-tat. Similarly for super-rationality; although it's possible Yvain is not familiar with Hofstadter's definition and was using 'super' as an intensifier, it seems unlikely.

You think?

Downvoted.

I had this downvoted based on form and irritating tone before I looked closely and decided enough of the quotes from Yvain are, indeed, plainly wrong and I encourage hearty dismissal.

You do this throughout the post, and it's especially jarring because you are or were one of the best writers for this website.

Agree. Who is he and what has he done to the real Yvain?

Great critique!

The first time I read the post, I stopped reading when "tit-for-tat" and "superrationality" were misused in two consecutive sentences. Sadly, that part seems to be still inaccurate after Yvain edited it, because TFT is not dominant in the 100-fold repeated PD, if the strategy pool contains strategies that feed on TFT.

The first time I read the post, I stopped reading when "tit-for-tat" and "superrationality" were misused in two consecutive sentences. Sadly, that part seems to be still inaccurate after Yvain edited it, because TFT is not dominant in the 100-fold repeated PD, if the strategy pool contains strategies that feed on TFT.

To be fair he doesn't seem to make the claim that TFT is dominant in the fixed length iterated PD. (I noticed how outraged I was that Yvain was making such a basic error so I thought I should double check before agreeing emphatically!) Even so I'm not comfortable with just saying TFT is "evolutionarily dominant" in completely unspecified circumstances.

You'll notice I used scare quotes around most of the words you objected to. I'm trying to point out the apparent paradox, using the language that game theorists and other people not already on this website would use, without claiming that the paradox is real or unsolvable.

"This so-called "superrationality” " in the post is still wrong, I think. Would work without "so-called", since the meaning is clear from the context, but it's not conventional usage.

By definition?

I am largely appreciative of your overall comment, but "rational" is a historically legitimate term to describe naive utility-maximisers in this manner. The original post introduced it in inverted commas, suggesting a special usage of the term. While there are less ambiguous ways this could have been expressed, it seems to me the main benefit of doing so would be to pre-empt people complaining about an unfavourable usage of the term "rational". Your response to it seems excessive.

Agreed.

I want to point out that Eliezer's (and LW's general) use of the word 'rationality' is entirely different from the use of the word in the game theory literature, where it usually means VNM-rationality, or is used to elaborate concepts like sequential rationality in SPNE-type equilibria.

ETA: Reading Grognor's reply to the parent, it seems that much of the negative affect is due to inconsistent use of the word 'rational(ity)' on LW. Maybe it's time to try yet again to taboo LW's 'rationality' to avoid the namespace collision with academic literature.

I want to point out that Eliezer's (and LW's general) use of the word 'rationality' is entirely different from the use of the word in the game theory literature

And the common usage of 'rational' on lesswrong should be different to what is used in a significant proportion of game theory literature. Said literature gives advice, reasoning and conclusions that is epistemically, instrumentally and normatively bad. According to the basic principles of the site it is in fact stupid and not-rational to defect against a clone of yourself in a true Prisoner's Dilemma. A kind of stupidity that is not too much different to being 'rational' like Spock.

ETA: Reading Grognor's reply to the parent, it seems that much of the negative affect is due to inconsistent use of the word 'rational(ity)' on LW. Maybe it's time to try yet again to taboo LW's 'rationality' to avoid the namespace collision with academic literature.

No. The themes of epistemic and instrumental rationality are the foundational premise of the site. It is right there in the tagline on the top of the page. I oppose all attempts to replace instrumental rationality with something that involves doing stupid things.

I do endorse avoiding excessive use of the word.

Said literature gives advice, reasoning and conclusions that is epistemically, instrumentally and normatively bad.

This is a recurring issue, so perhaps my instructor and textbooks were atypical: we never discussed or even cared whether someone should defect on PD in my game theory course. The bounds were made clear to us in lecture – game theory studies concepts like Nash equilibria and backward induction (using the term 'rationality' to mean VNM-rationality) and applies them to situations like PD; that is all. The use of any normative language in homework sets or exams was pretty much automatically marked incorrect. What one 'should' or 'ought' to do was instead relegated to other courses in, e.g., economics, philosophy, political science. I'd like to know from others if this is a typical experience from a game theory course (and if anyone happens to be working in the field: if this is representative of the literature).

And the common usage of 'rational' on lesswrong should be different to what is used in a significant proportion of game theory literature.

No. The themes of epistemic and instrumental rationality are the foundational premise of the site. It is right there in the tagline on the top of the page. I oppose all attempts to replace instrumental rationality with something that involves doing stupid things.

Upon reflection, I tend to agree with these statements. In this case, perhaps we should taboo 'rationality' in its game theoretic meaning – use the phrase 'VNM-rationality' whenever that is meant instead of LW's 'rationality'.

Game theory is particularly interesting because it adds up to normalcy so fast - very simple math on very simple situations very rapidly describes real life, macro-level behaviour.

Tom Siegfried has a powerful quote: Game theory captures something about how the world works. (note: bizarre HTML book setup, read pg 73-5)

The evolutionarily dominant strategy is commonly called “Tit-for-tat” - basically, cooperate if and only if you expect your opponent to do so.

No, Tit-for-Tat co-operates if and only if the other player co-operated last time. It works only in an iterated Prisoner's Dilemma, where you have multiple interactions with the same player.

Co-operate if and only if you expect the other player to co-operate (because of reputation, emotional behaviour etc.) is a quite different strategy. Strategies with some reputational or prediction element like this will work in cases where you have a one-shot interaction with each player, but multiple interactions with a community of players.

Strategies where you "commit" yourself to co-operating but only with players who have similarly committed themselves are different yet again, and work on true one-shot dilemmas. These are the "super-rational" strategies. Variants of these (TDT, UDT etc.) avoid the need for explicit commitments, since they always do what they wish they'd committed to doing.

Basically, I think you are mixing up quite different solutions here to the Prisoner's Dilemma.

EDIT: I see that Wedifrid made much the same points already. Sorry for the repetition.

The evolutionarily dominant strategy is commonly called “Tit-for-tat” - basically, cooperate if and only if you expect your opponent to do so.

That strategy is neither evolutionarily dominant nor "tit-for-tat". Tit-for-tat is applicable in the Iterated Prisoner's Dilemma with unknown duration and involves cooperating on the first round and thereafter doing whatever the opponent did in the round before the current round. As the name implies it is somewhat like a specific implementation of "eye for an eye".

As for evolutionary dominance the strategy "cooperate if and only if you expect your opponent to do so" is strictly worse than "If cooperating gives the expected result of your opponent also cooperating then do so unless they will cooperate anyway". ie. Your proposed "completely-not-tit-for-tat" strategy cooperates with CooperateBot which is a terrible move. ("CooperateBot" may represent a much smaller or lower status monkey, for example.)

This sequence is really great - thank you for writing it!

You may want to be more careful about using game theory on real-world problems. Game theory makes a lot of assumptions (some explicit, others implicit) that most of the time are not given in real life.

You will even have a hard time finding good examples of real-life prisoners who are in a game-theoretic PD. In reality, most of the time the prisoner's dilemma looks rather like this: same payoff matrix as the classical PD, BUT both prisoners may choose to break their silence at any time. Once a prisoner has confessed, there is no going back to silence. This situation will probably yield cooperation (unless the prisoners cannot get accurate information).

  • Most real-world situations fit neither one-shot games nor iterated games. Real life rarely has discrete game turns.

  • Many real-world situations are about behaviours that have some duration and can be stopped/changed.

  • Many real-life situations allow a player to switch from cooperation to defection on learning that the other player has defected. In many of these situations the betrayed player can change his own action fast enough to deny the defector any advantage from defecting.

Of course, you can still model this with game theory, but you need to break "turns" into smaller units (Planck seconds, if you want to go all the way). As for iterated versus one-shot games, you could say either is universally the case, depending on how you define what counts as a repetition of the same game and what is different enough to qualify as a new scenario.

So game theory is not broken for real-world problems, but, like any theory I have seen, when you scale it up from a simple puzzle to interactions with the universe the problem becomes more difficult.

I like it, but suggest that you link back to the previous entry in the sequence and/or the sequence index.

Tangentially, I now understand exactly what I don't like about Eric S. Raymond's morality:

I am among those who fear... that the U.S. response to 9/11 was not nearly as violent and brutal as it needed to be. To prevent future acts of this kind, it is probably necessary that those who consider them should shit their pants with fear at the mere thought of the U.S.’s reaction.

... the correct response to a person who says “You do not own yourself, but are owned by society (or the state), and I am society (or the state) speaking.” is to injure him as gravely as you think you can get away with ... In fact, I think if you do not do violence in that situation you are failing in a significant ethical duty.

A precommitment to retribution is effective when dealing with "rational" agents or CDT agents. In fact a self-interested TDT agent in a world of CDT agents would do well by retaliating against all injuries with disproportionate force. (And also issuing extortionate threats; to be fair, ESR doesn't advocate this.) If you buy Gary Drescher's reduction of morality to decision theory, this is where the moral duty of revenge comes from. But a superrational agent in a world of superrational agents will rarely need to exercise retribution at all.

Human beings might be thought of as superrational agents who make mistakes (moral errors). I don't know of a good technical model for this, but I feel like one recommendation that would come out of it is to not punish people disproportionately, because what if others did that to you when you make a mistake?

I think the utility-maximizing reasons to avoid disproportionate punishment in such cases among humans have more to do with likely perpetrators being somewhat blind to such disincentives (remember, these are people who attack others by killing themselves) and the fact that nations operate in a reputation system where acts of disproportionality which are too large tend to attract negative reputation.

Since humans tend to operate under friendship/anger/fairness formulations rather than utility maximizing ones, a sultan who honors a precommitment to cut off the head of a man who elopes with his daughter is seen as reasonable in circumstances (including a historically common degree of protectiveness of one's offspring) where one who massacred the man's entire village to be even more sure of deterring other attempts to cross him would be viewed as cruel and tyrannical.

I'd be curious to see what the results of a tournament of iterated PD with noise (i.e., each move is flipped with probability 5% -- and the opponent will never know the pre-noise move) would be.
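A rough sketch of how such a tournament could be set up, not a result. The 5/3/1/0 payoffs, the three entrants, and the round-robin format are all my assumptions; each executed move is flipped with probability `noise`, and a player only ever sees the opponent's post-noise move.

```python
import random

# (payoff_a, payoff_b) for the moves actually executed; assumed values.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def flip(move):
    return "D" if move == "C" else "C"

def tit_for_tat(seen):
    return seen[-1] if seen else "C"

def always_defect(seen):
    return "D"

def always_cooperate(seen):
    return "C"

def noisy_match(strat_a, strat_b, rounds, noise, rng):
    """Play one match; each move is flipped with probability `noise`."""
    seen_by_a, seen_by_b = [], []   # opponent's post-noise moves
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strat_a(seen_by_a)
        move_b = strat_b(seen_by_b)
        if rng.random() < noise:
            move_a = flip(move_a)
        if rng.random() < noise:
            move_b = flip(move_b)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        seen_by_a.append(move_b)
        seen_by_b.append(move_a)
    return score_a, score_b

def round_robin(strategies, rounds=200, noise=0.05, seed=0):
    """Every strategy meets every other once; returns total scores by name."""
    rng = random.Random(seed)
    names = list(strategies)
    totals = {name: 0 for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score_a, score_b = noisy_match(
                strategies[a], strategies[b], rounds, noise, rng)
            totals[a] += score_a
            totals[b] += score_b
    return totals
```

Calling `round_robin({"TFT": tit_for_tat, "AllD": always_defect, "AllC": always_cooperate})` gives one noisy sample; averaging over many seeds and a richer pool of entrants would be needed before drawing any conclusion.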

Prisoners' Dilemmas even come up in nature. In baboon tribes, when a female is in “heat”, males often compete for the chance to woo her. The most successful males are those who can get a friend to help fight off the other monkeys, and who then helps that friend find his own monkey loving. But these monkeys are tempted to take their friend's female as well. Two males who cooperate each seduce one female. If one cooperates and the other defects, he has a good chance at both females. But if the two can't cooperate at all, then they will be beaten off by other monkey alliances and won't get to have sex with anyone. Still a Prisoner's Dilemma!

If anything this is closer to Parfit's Hitchhiker. While both are game theory problems and a good decision theory will result in cooperation on both of them (assuming adequate prediction capabilities), the non-sequential nature of the game matters and some decision making strategies will handle them differently.

Many uses of the word "rational" here were fine ("rational economic agent" is understandable), but others really bothered me ("It is distasteful and a little bit contradictory to the spirit of rationality to believe it should lose out so badly to simple emotion" -- why perpetuate the Spock myth? I want to show this to my friends!). I have no specific suggestion at hand, but circumlocuting around the word in some of the cases above would bring the article from excellent to perfection.

We all enjoy it when a salesman defects: instead of cooperating with his competitors to hold the price high, he lowers it to gain sales.

Defection in the economy shows up in this pricing mechanism.

Defecting is just as crucial!

In this way, defection seems to have two social meanings:

Defecting proactively is betrayal. Defecting reactively is punishment.

We seem to have strong negative opinions of the former and somewhat positive opinions of the latter. I think in your salesman example you're talking about punishment being crucial. In fact, the defection of the customer is only necessary as a response to the salesman's original defection.

I am curious as to whether you have a similarly real life example of where proactive defection (i.e. betrayal) is crucial (for some societal or group benefit)?

Defecting proactively is betrayal. Defecting reactively is punishment.

We seem to have strong negative opinions of the former and somewhat positive opinions of the latter.

And for this reason we tend to be predisposed to interpreting the behavior of enemies as 'proactive/betrayal' and our own as 'reactive/punishment' (where we acknowledge that we have defected at all).

Excellent article, but the image is too wide so it gets cropped by LW's fixed-width content section. Looks like this. Using Firefox 9.0.

If I defect but you cooperate, then I get to spend all day on the beach and still get a good grade - the best outcome for me, the worst for you.

No, it's not. The "you" in this example prefers getting a good grade and privately fuming about having to do the work themselves to failing and not having to do the work. (And actually, for many of the academic group projects I've been involved in, it's less and happier work for the responsible member to do everything themselves, because there's too little work for too many people otherwise.)

The basic solution to the prisoner's dilemma: the only winning move is not to play. When you find a payoff matrix that looks like the prisoner's dilemma when denominated in years, add other considerations to the game until it is no longer the prisoner's dilemma when denominated in utility.
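As a sketch of what "add other considerations until it is no longer the prisoner's dilemma" means: a game is a PD when temptation > reward > punishment > sucker's payoff. The sentence lengths and the size of the extra cost below are made-up numbers, and the helper names are hypothetical.

```python
def is_prisoners_dilemma(temptation, reward, punishment, sucker):
    """True when the payoffs have the PD ordering T > R > P > S."""
    return temptation > reward > punishment > sucker

def add_defection_cost(temptation, reward, punishment, sucker, cost):
    """Subtract `cost` from every outcome in which I defect
    (temptation and punishment); leave my cooperating outcomes alone."""
    return temptation - cost, reward, punishment - cost, sucker

# Payoffs denominated in years of prison (negative = years served).
in_years = (0, -1, -2, -3)

# The same game in utility, once defecting carries an extra cost
# (guilt, lost reputation, retaliation), here worth 2 years.
in_utility = add_defection_cost(*in_years, cost=2)
```

With these numbers the game is a PD in years, but in utility defecting against a cooperator is now worth less than mutual cooperation, so the dilemma is gone.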

What's neat about that statement of the solution is that it naturally extends in every direction. Precommitments are an attempt to add another consideration, and their credibility is how much utility it looks like you can bestow on them. "My utility for punishing my daughter's defiler is higher than my utility for having clean carpets" is easy to believe, "my utility for keeping my commitments is higher than my utility for being $20,000 richer" is difficult to believe. (Scale should impact willingness to accept on an ultimatum game.)

Reputation is a way to change many one-time Prisoner's Dilemmas into one big Iterated Prisoner's Dilemma, where mutual cooperation is the best strategy for rational players. But how exactly does it work in real life?

I guess it works better when a small group of people interact again and again; and it works worse in a large group of people where many interactions are with strangers. So we should expect more cooperation in a village than in a big city.

Even in big cities people can create smaller units and interact more frequently within these units. So they would trust their neighbors, coworkers, etc. more. But here is an opportunity for exploitation by people who don't mind frequent migration or changing jobs -- they often reset their social karma, so they should be trusted less. We should also be suspicious about other karma-resetting moves, such as when a company changes its name.

Even if we don't know someone in person, we can do some probabilistic reasoning by thinking about their reference class: "Do I have experience of people with traits X, Y, Z cooperating or defecting?" However, this kind of reasoning may be frowned upon socially, and is sometimes even illegal. It is said that people should not be punished for having a bad reference class because of a trait they didn't choose voluntarily and cannot change. Sometimes it is considered wrong to judge people by changeable traits, for example how they are dressed. On the other hand, reference classes like "people having a university diploma" are socially allowed.

Here is an interesting (potentially mindkilling) prediction: When it is legally forbidden to use reference classes and other forms of evaluating prestige, the rate of defection increases. (An extreme situation would be when you are legally required to cooperate regardless of what your opponent does.)

Even in big cities people can create smaller units and interact more frequently within these units. So they would trust their neighbors, coworkers, etc. more. But here is an opportunity for exploitation by people who don't mind frequent migration or changing jobs -- they often reset their social karma, so they should be trusted less.

And they are -- itinerant people are universally less trusted than the ones with home addresses.

Another, mostly unrelated comment: the ultimatum game can actually tell you two different things. First, what divisions do people propose, and second, what divisions do people accept?

Presumably, everyone accepts fair divisions. Different groups of people have different percentages that reject unfair divisions, and different percentages that offer unfair divisions (a simplification, since the degree of fairness can also be varied). There are four potential clusters: groups that propose fair and accept unfair, groups that propose fair and reject unfair, groups that propose unfair and accept unfair, and groups that propose unfair and reject unfair. (Empirically, I believe only three of these clusters show up, but it'd take a lit search to verify that. The first two groups can be differentiated by having confederates propose unfair divisions)
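To keep the two questions apart (what is proposed, what is accepted), here is a minimal sketch of one ultimatum round plus the four clusters listed above. The pot size, the acceptance threshold, and the function names are illustrative assumptions, not taken from any study.

```python
from itertools import product

def ultimatum(pot, offer, min_acceptable):
    """One round: the proposer offers `offer` out of `pot`; the responder
    accepts iff the offer meets their threshold, else both get nothing.
    Returns (proposer_payoff, responder_payoff)."""
    if offer >= min_acceptable:
        return pot - offer, offer
    return 0, 0

def clusters():
    """Enumerate the four proposer/responder combinations."""
    return [f"propose {proposal}, {response} unfair"
            for proposal, response in product(("fair", "unfair"),
                                              ("accept", "reject"))]
```

For example, with a pot of 10, a responder whose threshold is 3 takes an even split but turns down an offer of 1, leaving both players with nothing.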

These differences between people groups have real-world implications.

[edit] Note that Kaj_Sotala has found at least one paper on the subject.