Unpacking the Concept of "Blackmail"

Keep in mind: Controlling Constant Programs, Notion of Preference in Ambient Control.

There is a reasonable game-theoretic heuristic, "don't respond to blackmail" or "don't negotiate with terrorists". But what is actually meant by the word "blackmail" here? Does it have a place as a fundamental decision-theoretic concept, or is it merely an affective category, a class of situations activating a certain psychological adaptation that expresses disapproval of certain decisions and on net protects (benefits) you, like those adaptations that respond to "being rude" or "offense"?

We, as humans, have a concept of a "default", a "do nothing" strategy. Other plans can be compared against the moral value of the default. Doing harm would be something worse than the default, doing good something better than the default.

Blackmail is then a situation where by decision of another agent ("blackmailer"), you are presented with two options, both of which are harmful to you (worse than the default), and one of which is better for the blackmailer. The alternative (if the blackmailer decides not to blackmail) is the default.

Compare this with the same scenario, but with the "default" action of the other agent being worse for you than the given options. This would be called normal bargaining, as in trade, where both parties benefit from exchange of goods, but to a different extent depending on which cost is set.
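
As a rough illustration of the default-based definition above (a minimal sketch, not a formal proposal; the `classify` function and all payoff numbers are hypothetical), the two cases differ only in how the offered options compare with your payoff under the default:

```python
def classify(default_payoff, option_payoffs):
    """Label a situation by comparing your payoffs with the default.

    default_payoff: your payoff if the other agent does nothing (assumed known).
    option_payoffs: your payoffs for each option the other agent presents.
    """
    if all(p < default_payoff for p in option_payoffs):
        return "blackmail"    # every option is worse than the default
    if all(p > default_payoff for p in option_payoffs):
        return "bargaining"   # every option is better than the default
    return "unclear"          # the definition above does not cover mixed cases


# Hypothetical numbers: pay up (-10) or have your car blown up (-100); default is 0.
print(classify(0, [-10, -100]))     # blackmail
# The same two options, but with a default that is even worse for you (-200).
print(classify(-200, [-10, -100]))  # bargaining
```

Note that the label depends entirely on `default_payoff`, which is the one input the payoffs of the offered options do not supply.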

Why is the "default" special here? If bargaining or blackmail did happen, we know that "default" is impossible. How can we tell two situations apart then, from their payoffs (or models of uncertainty about the outcomes) alone? It's necessary to tell these situations apart to manage not responding to threats, but at the same time cooperating in trade (instead of making things as bad as you can for the trade partner, no matter what it costs you). Otherwise, abstaining from doing harm looks exactly like doing good. A charitable gift of not blowing up your car and so on.

My hypothesis is that "blackmail" is what the suggestion of your mind to not cooperate feels like from the inside, the answer to a difficult problem computed by cognitive algorithms you don't understand, and not a simple property of the decision problem itself. By saying "don't respond to blackmail", you are pushing most of the hard work into intuitive categorization of decision problems into "blackmail" and "trade", with only correct interpretation of the results of that categorization left as an explicit exercise.

(A possible direction for formalizing these concepts involves introducing some kind of notion of resources, maybe amount of control, and instrumental vs. terminal spending, so that the "default" corresponds to less instrumental spending of controlled resources, but I don't see it clearly.)

(Let's keep on topic and not refer to powerful AIs or FAI in this thread, only discuss the concept of blackmail in itself, in decision-theoretic context.)

Comments


I wonder if this question is related to the revulsion many people feel against certain kinds of price discrimination tactics. I mean things like how in the 19th century, train companies would put intentionally uncomfortable benches in the 3rd class carriages in order to encourage people to buy 2nd class tickets, or nowadays software that comes with arbitrary, programmed-in restrictions that can be removed by paying for the "professional" version.

People really don't like that! It seems like there is some folk-ethics norm that "if you can make me better off with no effort on your part, then you have an obligation to do so", which seems like part of a "no blackmail" condition.

That makes sense from a reciprocal altruism perspective. If someone can benefit you at no cost to themself, and doesn't, that probably indicates a lack of intent to cooperate under all circumstances. The natural response is hostility.

In Bombay, the only difference between first- and second-class cars is the price. The second-class cars are more crowded. I've been trying to think of a nice analogy to blackmail but haven't found one.

It seems that we cry blackmail when a Schelling point already exists, and the other agent is threatening to force us below it. The moral outrage functions as a precommitment to punish the clear defection.

In normal human life, 'do nothing' is the Schelling point, because most people don't interact with most people. But sometimes the Schelling point does move, and it seems what constitutes blackmail does too: if a child's drowning in a pond, and I tell you I'll only fish him out if you give me $1,000, it seems like I'm blackmailing you.

Sometimes both sides feel like they're being blackmailed though; like when firefighters go on strike, and both city hall and the union accuse the other of endangering people. Could this be put down to coordination problems?

if a child's drowning in a pond, and I tell you I'll only fish him out if you give me $1,000, it seems like I'm blackmailing you.

Perhaps a borderline case like this is most helpful. Is this extortion? Even though the default case in this case isn't 'doing nothing'. The default is saving the child. Because that is what someone should do.

So maybe the word is difficult to unpack because it has morality behind it. A person shouldn't bomb your car, and shouldn't expose your private secrets. On the other hand, they needn't give you food, so it's OK to ask for money for that.

If I demand money for being faithful to my husband, then that is extortion because I'm supposed to be faithful. If, however, I want a divorce and would divorce him, I'm allowed to let him pay me for faithfulness. Such gray areas indicate to me that it is indeed about some notion of expected/moral behavior.

Selling food to starving families -- when they become so poor that you ought to give them food for free -- then that is extortion.

So: demanding more compensation when you should do it for less (or demanding any when you should do it for free).

Don't even get me started on how ill-defined and far from being formally understood the concept of "Schelling point" is. It's very useful in informal game theory of course.

This is a clever idea, but I don't think it works: you need to unpack the question of why a decision algorithm would deem cooperation non-optimal, and see if it coincides with a special class of problems where cooperation is generally non-optimal.

So I think what gets an offer labeled as blackmail is the recognition that cooperation would lead the other party to repeatedly use their discretion to force my next remaining options to be even worse. So blackmail and trade differ in that:

  • If I cooperate with a blackmailer, they are more likely to spend resources "digging up dirt" on me, kidnapping my loved ones, etc. I don't want to be in that position, regardless of what I decide to do then.
  • If I trade with a trade-offerer, they are more likely to spend resources acquiring goods that I may want to trade for. I do want to be in the position where others make things available to me that I want (except for where I'd be competing with them in that process.)

And yes, these two situations are equivalent, except for what I want the offerer to do, which I think is what yields the distinction, not the concept of a baseline in the initial offer.

You can phrase blackmail as a sort of addiction situation where dynamic inconsistency potentially leaves me vulnerable to exploitation. My preferences at any time t are:

1) Not have an addiction.
2) Have an addiction, and take some more of the drug.
3) Have an addiction, and not take the drug.

where I'm addicted at time t, and taking the drug will make me addicted at time t+1 (and I otherwise won't be addicted at t+1).

In this light, one can view the classification of something as blackmail, as being any feeling or mechanism that makes me choose 3) over 2). "2 looks appealing, but I feel a strong compulsion to do 3." Agents with such a mechanism gain a resistance to dynamic inconsistency.

In contrast, if "addiction" were good, and the item in 1) were moved below 3) in my preference ranking, then I wouldn't benefit from a mechanism that makes me choose 3 over 2. That would feel like trade.
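
A toy simulation of the addiction framing above (a sketch only; the utility numbers, the `total_utility` function, and the assumption that an agent who gets clean stays clean are all mine, chosen just to respect the ranking 1 > 2 > 3):

```python
def total_utility(policy, steps, u_clean=0, u_take=-1, u_abstain=-2):
    """Sum utility over `steps` time steps, starting out addicted.

    policy: "take" (always pick outcome 2) or "abstain" (pick outcome 3).
    The default utilities are hypothetical and only encode 1 > 2 > 3.
    """
    addicted = True
    total = 0
    for _ in range(steps):
        if not addicted:
            total += u_clean      # outcome 1: no addiction
        elif policy == "take":
            total += u_take       # outcome 2: better now, still addicted next step
        else:
            total += u_abstain    # outcome 3: worse now, not addicted next step
            addicted = False
    return total


print(total_utility("take", 1), total_utility("abstain", 1))  # -1 -2
print(total_utility("take", 5), total_utility("abstain", 5))  # -5 -2
```

Over a single step, 2 beats 3; over several steps, the agent with a mechanism that forces 3 comes out ahead, which is the sense in which that mechanism protects against dynamic inconsistency.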

I know this isn't quite rigorous, but if I can calculate the counterfactual "what would the other player's strategy be if ze did not model me as an agent capable of responding to incentives," blackmail seems easy to identify by comparison to this.

Perhaps this can be what we mean by 'default'?

I think this ties into Larks' point -- if Larks didn't think I responded to incentives, I think ze'd just help the child, so asking me for $1,000 would be blackmail. Clippy would not help the child, and so asking me for $1,000 is trade.

To first order, this means that folks playing decision-theoretic games against me actually have an incentive to self-modify to be all-else-equal sadistic, so that their threats can look like offers. But then I can assume that they would not have so modified in the first place if they hadn't modelled me as responding to incentives, etc. etc.

"an agent incapable of responding to incentives" is not a well-defined agent. What do you respond to? A random number generator? Subliminal messages? Pie?

Pie?

I respond to pie. Are you offering pie?

Should you find yourself in the greater Boston area, drop me a line and I will give you some pie.

(I suspect that there is a context to this comment, and I might even find it interesting if I were to look it up, but I'm sort of enjoying the comment in isolation. Hopefully it isn't profoundly embarrassing or anything.)

Can I take you up on that as well? You can never have too much pie.

Well, you're certainly free to drop me a line if you're in the area, but I'm far less likely to respond, let alone respond with pie.

I really wish "blackmail" were not used to mean extortion.

I had the same reaction, thinking blackmail is a special form of extortion in which the threat is a threat of exposure. But when I sought support from the dictionary, I was disappointed.

Dictionaries are histories of usage, not arbiters of meaning. If they were, language would not change in meaning (only add new words) from the moment the first dictionaries were made.

See here

The default is special because it costs the other person time/money/effort to do anything other than the default.

Hence, not blowing up your car is the default, but so is not giving you food.

I feel like what people call blackmail is largely related to intentionality. The blackmailer goes out of their way to harm you should you not cooperate.

In the trade example, by contrast, if someone wants to trade and you don't, and you need the object but don't trade, we don't attribute that to the other person trying to harm you.

Current guess.

Blackmailing is a class of situations similar to Counterfactual Mugging, where you are willing to sacrifice utility in the actual world in order to lower its probability, so that the counterfactual worlds (which have higher utility) gain as much probability as possible, improving the overall expected utility even as the utility of the actual world becomes lower.

Or, simply, you are being blackmailed when you wish this wouldn't be happening, and the correct actions are those that make the reality as improbable as possible.

(In Counterfactual Mugging, you are sacrificing utility in the actual world in order to improve utility of the counterfactual world, while in blackmailing, you are doing the same in order to improve its probability.)
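
A small numeric sketch of that trade-off (the probabilities and utilities are made up for illustration, and it assumes the blackmailer is less likely to try at all against someone known to refuse):

```python
def expected_utility(p_blackmail, u_blackmailed, u_default=0.0):
    """Expected utility over the 'blackmailed' world and the 'left alone' world."""
    return p_blackmail * u_blackmailed + (1.0 - p_blackmail) * u_default


# Hypothetical policy "pay": you lose 10 when blackmailed, and it happens often.
u_pay = expected_utility(p_blackmail=0.9, u_blackmailed=-10.0)

# Hypothetical policy "refuse": the threat is carried out (you lose 50), but
# blackmail is rarely attempted against you in the first place.
u_refuse = expected_utility(p_blackmail=0.05, u_blackmailed=-50.0)

print(u_pay, u_refuse)
print("refusing is better in expectation:", u_refuse > u_pay)
```

With these numbers, refusing is worse in the actual (blackmailed) world, -50 against -10, yet better in expectation, because it moves probability toward the world where no blackmail happens.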

Isn't it because you want to incentivize people to bargain with you but incentivize them not to blackmail you?

It seems to me the relevant difference is that in blackmail one or both parties end up worse off. So a group of individuals who blackmail each other tend to get poorer over time, compared to a group that successfully deters blackmail.

Agent 1 negotiates with agent 2. Agent 1 can take option A or B, while agent 2 can take option C or D. Agent 1 communicates that they will take option A if agent 2 takes option C and will take option B if agent 2 takes option D.

If utilities are such that for

  • agent 1: A > B, C < D, A+C < B + D

and for

  • agent 2: A < B, C > D, A+C < B + D

or

  • agent 1: A < B, C > D, A+C > B + D
  • agent 2: A > B, C < D, A+C > B + D

this is an offer.

If

  • agent 1: A < B, C < D, A+C < B + D
  • agent 2: A < B, C > D, A+C < B + D

or

  • agent 1: A > B, C > D, A+C > B + D
  • agent 2: A > B, C < D, A+C > B + D

this is blackmail by agent 1.

If

  • agent 1: A > B, C < D, A+C < B + D
  • agent 2: A < B, C < D, A+C < B + D

or

  • agent 1: A < B, C > D, A+C > B + D
  • agent 2: A > B, C > D, A+C > B + D

this is agent 1 giving in to agent 2's blackmail.

I don't think I mentioned anything about any "default" anywhere?

(Unless I overlooked something, in the other cases there is either no reason to negotiate, no prospect of success in negotiating, or at least one party acting irrationally. It is implicitly assumed that preferences between combinations of the options only depend on the preferences between the individual options.)
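
The six cases above can be written as a lookup table. This is only a sketch: it assumes additive utilities over the options (stronger than what is stated above), and the names `pattern`, `CASES`, and `classify`, as well as the example numbers, are hypothetical.

```python
def sign(x, y):
    """+1 if x > y, -1 if x < y, 0 if equal."""
    return (x > y) - (x < y)


def pattern(u):
    """Preference pattern of one agent: (A vs B, C vs D, A+C vs B+D)."""
    return (sign(u["A"], u["B"]),
            sign(u["C"], u["D"]),
            sign(u["A"] + u["C"], u["B"] + u["D"]))


# Keys are (agent 1 pattern, agent 2 pattern), copied from the cases above.
CASES = {
    ((+1, -1, -1), (-1, +1, -1)): "offer",
    ((-1, +1, +1), (+1, -1, +1)): "offer",
    ((-1, -1, -1), (-1, +1, -1)): "blackmail by agent 1",
    ((+1, +1, +1), (+1, -1, +1)): "blackmail by agent 1",
    ((+1, -1, -1), (-1, -1, -1)): "agent 1 giving in to agent 2's blackmail",
    ((-1, +1, +1), (+1, +1, +1)): "agent 1 giving in to agent 2's blackmail",
}


def classify(u1, u2):
    """u1, u2: each agent's utilities for options A, B, C, D (assumed additive)."""
    return CASES.get((pattern(u1), pattern(u2)), "other")


# Trade: each side gives up its preferred own action to get the other's.
print(classify({"A": 1, "B": 0, "C": 0, "D": 3},
               {"A": 0, "B": 3, "C": 1, "D": 0}))    # offer
# Threat: agent 1 would itself prefer not to carry out A.
print(classify({"A": -1, "B": 0, "C": 0, "D": 2},
               {"A": -5, "B": 0, "C": 1, "D": 0}))   # blackmail by agent 1
```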

Notice that under this definition punishing someone for a crime is a form of blackmail.

I'm not sure that's a problem.

Or maybe: Change blackmail in the above to threat, and define blackmail as a threat not legitimized by social conventions.

Well, at least we've unpacked the concept of "default" into the concept of social conventions.

Suppose that Blackmail is

merely an affective category, a class of situations activating a certain psychological adaptation

-- then we should ask what features of the ancestral environment caused us to evolve it. We might understand it better in that case.

I suspect that the ancestral environment came with a very strong notion of a default outcome for a given human, in the absence of there being any particular negotiation, and also came with a clear notion of negative interaction (stabbing, hitting, kicking) versus positive interaction (giving fish, teaching how to hunt better, etc).

My take: what we call "extortion" or "blackmail" is where agent A1 offers A2 a choice between X and Y, both of which are harmful to A2, and where A1 has selected X to be less harmful to A2 than Y with the intention of causing A2 to choose X.

"Not responding to blackmail" comprises A2 choosing Y over X whenever A2 suspects this is going on.

A1 can still get A2 to choose X over Y, even if A2 has a policy of not responding to blackmail, by not appearing to have selected X... that is, by not appearing to be blackmailing A2.

For example, if instead of "I will hurt you if you don't give me money" A1 says "I've just discovered that A3 is planning to hurt you! I can prevent it by taking certain steps on your behalf, but those steps are expensive, and I have other commitments for my money that are more important to me than averting your pain. But if you give me the money, I can take those steps, and you won't get hurt," A2 may not recognize this as blackmail, in which case A1 can finesse A2's policy.

Of course, any reasonably sophisticated human will recognize that as likely blackmail, so a kind of social arms race ensues. Real-world blackmail attempts can be very subtle. (ETA: That extortion is illegal also contributes to this, of course... subtle extortion attempts can reduce A1's legal liability, even when they don't actually fool anyone.)

(Indeed, in some cases A1 can fool themselves, which brings into question whether it's still blackmail. IMHO, the best way to think about cases like that is to stop treating people fooling themselves as unified agents, but that's way off-topic.)

Why is the "default" special here?

Because in a blackmail, I do not wish the trade to happen at all. Let the "default" outcome for a trade T be one where the trade doesn't happen. Assume that my partner (the Baron) gets to decide whether T happens or not.

If T is a blackmail, then every option is worse than not-T. So, if I can commit to ensuring that T is also negative for the Baron, then the Baron won't let T happen. This gives a definition for blackmail: a trade T where every option is worse than not-T, but where I can commit to actions that ensure that T is negative for the person that decides whether T happens or not.
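
A minimal sketch of that definition (the function name, the payoff lists, and the zero "not-T" baselines are hypothetical; it also ignores whether the commitment is credible):

```python
def is_blackmail(my_payoffs, baron_payoffs, my_no_trade=0.0, baron_no_trade=0.0):
    """Check the two conditions of the definition above.

    my_payoffs[i], baron_payoffs[i]: payoffs to me and to the Baron if T
    happens and I respond with option i. The *_no_trade values are the
    payoffs of not-T.
    """
    # Condition 1: every option within T is worse for me than not-T.
    every_option_worse = all(p < my_no_trade for p in my_payoffs)
    # Condition 2: some response of mine makes T worse than not-T for the Baron,
    # so committing to it removes his reason to start T.
    can_deter = any(p < baron_no_trade for p in baron_payoffs)
    return every_option_worse and can_deter


# Hypothetical threat: I pay (-10, Baron +10) or refuse and he follows through (-50, Baron -5).
print(is_blackmail([-10, -50], [10, -5]))  # True
# Hypothetical ordinary trade: accepting helps us both, declining changes nothing.
print(is_blackmail([5, 0], [5, 0]))        # False
```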

Let's contrast this with another trade T, with no blackmail elements to it, where I am a monopolist or monopsonist. It is still to my advantage to credibly commit to rejecting everything if I don't get 99% of the profit. However, I am limited by the fact that I want the trade to happen; I can't commit to any option that is actually harmful to the Baron. He will trade with me as long as he doesn't lose; his 'default' ensures that I have to give him something.

Finally, most trades are not monopolist or monopsonist. In this case, it is not to my advantage to precommit to taking more than "my fair (market) share" of the profit, as that will cause the trade to fail; the Baron's default is higher (he can trade with others) so I have to offer him at least that.

Now, I don't want to go down the rabbit hole of dueling pre-commitments, or the proper decision-theoretic way of resolving the issue (blackmailing someone or precommitting to avoid blackmail are very similar processes). But it does show why you would want to precommit to a particular action in blackmail situations, but not in others: you do not control if the trade happens, and blackmails are trades that you do not want to see happen. You can call not-trading the 'default' if you wish, but the salient fact is that it is better for you, not that it is default.