Bad intent is a disposition, not a feeling

It’s common to think that someone else is arguing in bad faith. In a recent blog post, Nate Soares claims that this intuition is both wrong and harmful:

I believe that the ability to expect that conversation partners are well-intentioned by default is a public good. An extremely valuable public good. When criticism turns to attacking the intentions of others, I perceive that to be burning the commons. Communities often have to deal with actors that in fact have ill intentions, and in that case it's often worth the damage to prevent an even greater exploitation by malicious actors. But damage is damage in either case, and I suspect that young communities are prone to destroying this particular commons based on false premises.

To be clear, I am not claiming that well-intentioned actions tend to have good consequences. The road to hell is paved with good intentions. Whether or not someone's actions have good consequences is an entirely separate issue. I am only claiming that, in the particular case of small high-trust communities, I believe almost everyone is almost always attempting to do good by their own lights. I believe that propagating doubt about that fact is nearly always a bad idea.

It would be surprising, if bad intent were so rare in the relevant sense, that people would be so quick to jump to the conclusion that it is present. Why would that be adaptive?

What reason do we have to believe that we’re systematically overestimating this? If we’re systematically overestimating it, why should we believe that it’s adaptive to suppress this?

There are plenty of reasons why we might make systematic errors on things that are too infrequent or too inconsequential to yield a lot of relevant-feeling training data or matter much for reproductive fitness, but social intuitions are a central case of the sort of things I would expect humans to get right by default. I think the burden of evidence is on the side disagreeing with the intuitions behind this extremely common defensive response, to explain what bad actors are, why we are on such a hair-trigger against them, and why we should relax this.

Nate continues:

My models of human psychology allow for people to possess good intentions while executing adaptations that increase their status, influence, or popularity. My models also don’t deem people poor allies merely on account of their having instinctual motivations to achieve status, power, or prestige, any more than I deem people poor allies if they care about things like money, art, or good food. […]

One more clarification: some of my friends have insinuated (but not said outright as far as I know) that the execution of actions with bad consequences is just as bad as having ill intentions, and we should treat the two similarly. I think this is very wrong: eroding trust in the judgement or discernment of an individual is very different from eroding trust in whether or not they are pursuing the common good.

Nate's argument is almost entirely about mens rea - about subjective intent to make something bad happen. But mens rea is not really a thing. He contrasts this with actions that have bad consequences, which are common. But there’s something in the middle: following an incentive gradient that rewards distortions. For instance, if you rigorously A/B test your marketing until it generates the presentation that attracts the most customers, and don’t bother to inspect why they respond positively to the result, then you’re simply saying whatever words get you the most customers, regardless of whether they’re true. In such cases, whether or not you ever formed a conscious intent to mislead, your strategy is to tell whichever lie is most convenient; there was nothing in your optimization target that forced your words to be true ones, and most possible claims are false, so you ended up making false claims.
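A toy simulation may make the selection argument above concrete. This is only a sketch under made-up assumptions: the 10% base rate of true claims, the pool of 50 candidate messages, and the premise that a claim's appeal is independent of its truth are illustrative choices, not measurements, and the function names are hypothetical.

```python
import random

def pick_best_converting_claim(num_claims, true_fraction, rng):
    """Return (is_true, appeal) for the claim an appeal-only optimizer selects."""
    claims = []
    for _ in range(num_claims):
        is_true = rng.random() < true_fraction  # assumption: most candidate claims are false
        appeal = rng.random()                   # assumption: appeal is independent of truth
        claims.append((is_true, appeal))
    # The optimization target mentions only appeal; truth never enters it.
    return max(claims, key=lambda claim: claim[1])

def fraction_of_winners_that_are_true(trials=2000, num_claims=50,
                                      true_fraction=0.1, seed=0):
    """Repeat the selection many times and report how often the winner is true."""
    rng = random.Random(seed)
    winners = [pick_best_converting_claim(num_claims, true_fraction, rng)
               for _ in range(trials)]
    return sum(1 for is_true, _ in winners if is_true) / trials

if __name__ == "__main__":
    print(fraction_of_winners_that_are_true())
```

Under these assumptions the winning claim is true about as often as a randomly drawn one: selecting on appeal alone does nothing to select for truth, even though no step in the process involves an intent to mislead.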

More generally, if you try to control others’ actions, and don’t limit yourself to doing that by honestly informing them, then you’ll end up with a strategy that distorts the truth, whether or not you meant to. The default state for any given constraint is that it has not been applied to someone's behavior. To say that someone has the honest intent to inform is a positive claim about their intent. It's clear to me that we should expect this to sometimes be the case - sometimes people perceive a convergent incentive to inform one another, rather than a divergent incentive to grab control. But, if you do not defend yourself and your community against divergent strategies unless there is unambiguous evidence, then you make yourself vulnerable to those strategies, and should expect to get more of them.

I’ve been criticizing EA organizations a lot for deceptive or otherwise distortionary practices (see here and here), and one response I often get is, in effect, “How can you say that? After all, I've personally assured you that my organization never had a secret meeting in which we overtly resolved to lie to people!”

Aside from the obvious problems with assuring someone that you're telling the truth, this is generally something of a non sequitur. Your public communication strategy can be publicly observed. If it tends to create distortions, then I can reasonably infer that you’re following some sort of incentive gradient that rewards some kinds of distortions. I don’t need to know about your subjective experiences to draw this conclusion. I don’t need to know your inner narrative. I can just look, as a member of the public, and report what I see.

Acting in bad faith doesn’t make you intrinsically a bad person, because there’s no such thing. And besides, it wouldn't be so common if it required an exceptionally bad character. But it has to be OK to point out when people are not just mistaken, but following patterns of behavior that are systematically distorting the discourse - and to point this out publicly so that we can learn to do better, together.

(Cross-posted at my personal blog.)

[EDITED 1 May 2017 - changed wording of title from "behavior" to "disposition"]

Comments

I agree that most relevant bad behavior isn't going to feel from the inside like an attempt to mislead, and I think that rationalists sometimes either ignore this or else have an unfounded optimism about nominal alignment.

It would be surprising, if bad intent were so rare in the relevant sense, that people would be so quick to jump to the conclusion that it is present. Why would that be adaptive?

In the evolutionary context, our utterances and conscious beliefs are optimized for their effects on others, and not merely for accuracy. Believing and claiming bad things about competitors is a typical strategy. Prima facie, accusations of bad faith are particularly attractive since they can be levied on sparse evidence yet are rationally compelling. Empirically, accusations of bad faith are particularly common.

Acting in bad faith doesn’t make you intrinsically a bad person, because there’s no such thing.

This makes an interesting contrast with the content of the post. The feeling that some people are bad is a strong and central social intuition. Do you think you've risen to the standard of evidence you are requesting here? It seems to me that you are largely playing the same game people normally play, and then trying to avoid norms that regulate the game by disclaiming "I'm not playing the game."

For the most part these procedural issues seem secondary to disputes about facts on the ground. But all else equal they're a reason to prefer object-level questions to questions about intent, logical argument and empirical data to intuition, and private discussion to public discussion.

The feeling that some people are bad is a strong and central social intuition. Do you think you've risen to the standard of evidence you are requesting here?

Nope! Good point.

It seems to me that you are largely playing the same game people normally play, and then trying to avoid norms that regulate the game by disclaiming "I'm not playing the game."

Here's a specific outcome I would like to avoid: ganging up on the individuals saying misleading things, replacing them with new individuals who have better track records, but doing nothing to alter the underlying incentives. That would be really bad. I think we actually have exceptionally high-integrity individuals in critical leadership positions now, in ways that make the problem of perceived incentive to distort much easier to solve than it might otherwise be.

I don't actually know how not to play the same old game yet, but I am trying to construct a way.

I don't actually know how not to play the same old game yet, but I am trying to construct a way.

I see you aiming to construct a way and making credible progress, but I worry that you're trying to do too many things at once and are going to cause lasting damage by the time you figure it out.

Specifically, the "confidence game" framing of the previous post moved it from "making an earnest good faith effort to talk about things" to "the majority of the post's content is making a status move"[1] (in particular in the context of your other recent posts, and is exacerbated by this one), and if I were using the framing of this current post I'd say both the previous post and this one have bad intent.

I don't think that's a good framing - I think it's important that you (and folk at OpenPhil and at CEA) do not just have an internally positive narrative but are actually trying to do things that actually cache out to "help each other" (in a broad sense of "help each other"). But I'm worried that this will not remain the case much longer if you continue on your current trajectory.

A year ago, I was extremely impressed with the work you were doing and points you were making, and frustrated that those points were not having much impact.

My perception was "EA Has A Lying Problem" was an inflection point where a) yeah, people started actually paying attention to the class of criticism you're doing, but the mechanism by which people started paying attention was by critics invoking rhetoric and courting controversy, which was approximately as bad as the problem it was trying to solve. (or at least, within an order of magnitude as bad)

[1] I realize there was a whole lot of other content of the Confidence Game post that was quite good. But, like, the confidence game part is the part I remember easily. Which is the problem.

Could you say more about which things you think I should be doing separately instead of together, and why?

Things I notice you doing:

  1. Meta discussion of how to have conversations / high quality discourse / why this is important
  2. Evaluating OpenPhil and CEA as institutions, in a manner that's aiming to be evenhanded and fair
  3. Making claims and discussing OpenPhil and CEA that seem pretty indistinguishable from "punishing them and building public animosity towards them."

Because of #3, I think it's a lot harder to have credibility when doing #1 or #2. I think there is now enough history with #3 (perceived or actual, whatever your intent), that if you want to be able to do #1 or #2 you need to signal pretty hard that you're not doing #3 anymore, and specifically take actions aiming to rebuild trust. (And if you were doing #3 by accident, this includes figuring out why your process was outputting something that looked like 3)

I have thoughts about "what to do to cause OpenPhil and CEA to change their behavior" which'll be a response to tristanm's comment.

Making claims and discussing OpenPhil and CEA that seem pretty indistinguishable from "punishing them and building public animosity towards them."

Um, this notion that publicly criticizing organizations such as OpenPhil and CEA amounts to unhelpfully "punishing them and building public animosity towards them", and thus is per se something to be avoided, is exactly one of the glaring issues "EA has a Lying Problem" (specifically, the subsection Criticizing EA orgs is harmful to the movement) was about. Have we learned nothing since then?

I think I'm mostly going to have to retreat to "this is a very important conversation that I would very much like to have over skype but I think online text is not a good medium for it."

But we've had this conversation online, when EA Has A Lying Problem was first posted. Some worthwhile points were raised that are quite close to your position here, such as the point that unrealistic standards of idealism/virtue, honesty and prompt response to criticism (that is, unrealistic for broadly any real-world institution) could undermine the very real progress that EA orgs are hopefully making, compared to most charitable organizations. This is very much true, but the supposed implication that any and all internal critiques are per se harmful simply doesn't follow!

I wasn't saying any and all critiques are harmful - the specific thing I was saying was "these are three things I see you doing right now, and I don't think you can do all of those within a short timespan."

Independently, I also think some-but-not-all of the specific critiques you are making are harmful, but that wasn't the point I was making at the time.

The reason I'd much prefer to have the conversation in person is because by now the entire conversation is emotionally charged (at least for me, and it looks like for you), in a way that is counterproductive. Speaking only for myself, I know that in an in person conversation where I can read facial expressions, I can a) more easily maintain empathy throughout the process, b) as soon as I hit a point where either we disagree, or where the conversation is getting heated, it's a lot easier to see that, step back and say "okay let's stop drop and doublecrux." (And, hopefully, often realize that something was a simple misunderstanding rather than a disagreement)

Online, there are two options at any interval: write out a short point, or write out a long point. If I write out a short point, it won't actually address all the things I'm trying to point at. If I write a long point, at least one thing will probably be disagreed with or misunderstood, which will derail the whole post.

A) I think this is probably a good thing to do when an online conversation is accumulating drama and controversy.

B) Even if it's not, I very much want to test it out and find out if it works.

Overall, a lot of this feels to me like asking me to do more work, with no compensation, and no offers of concrete independent help, and putting the burden of making the interaction go well on the critic.

A year ago, I was extremely impressed with the work you were doing and points you were making, and frustrated that those points were not having much impact.

It would have been very, very helpful at that time to have public evidence that anyone at all agreed or at least thought that particular points I was making needed an answer. I'm getting that now, I wasn't getting that then, so I find it hard to see the appeal in going back to a style that wasn't working.

My perception was "EA Has A Lying Problem" was an inflection point where a) yeah, people started actually paying attention to the class of criticism you're doing, but the mechanism by which people started paying attention was by critics invoking rhetoric and courting controversy, which was approximately as bad as the problem it was trying to solve. (or at least, within an order of magnitude as bad)

That was a blog post by Sarah Constantin. I am not Sarah Constantin. I wrote my own post in response and about the same things, which no one is bringing up here because no one remembers it. It got a bit of engagement at the time, but I think most of that was spillover from Sarah's post.

If you want higher-quality discourse, you can engage more publicly with what you see as the higher-quality discourse. My older posts are still available to engage with on the public internet, and were written to raise points that would still be relevant in the future.

I agree that the "confidence game" framing, and particularly the comparison to a ponzi scheme seemed to me like surprisingly charged language, and not the kind of thing you would do if you wanted a productive dialogue with someone.

I'm not sure whether Benquo means for it to come across that way or not. (Pro: maybe he has in fact given up on direct communication with OpenPhil, and thinks his only method of influence is riling up their base. Con: maybe he just thought it was an apt metaphor and didn't model it as a slap-in-the-face, like I did. Or maybe something else I'm missing.)

But all else equal they're a reason to prefer object-level questions to questions about intent, logical argument and empirical data to intuition, and private discussion to public discussion.

I agree with the first two items, but consider that the content of these private discussions, and whatever the conclusions that are being drawn from them, are probably only visible to the wider community in the form of the decisions being made at the highest levels. Therefore, how do you ensure that when these decisions are made, and the wider community is expected to support them, that there will not be disagreement or confusion? Especially since the reasoning behind them is probably highly complex.

Then this raises the question: what distribution of private and public discussion is most preferred? Certainly if all discussion was kept private, then the wider community (especially the EA community) would have no choice but to support decisions on faith alone. And at the other extreme, there is the high cost of writing and publishing documentation of reasoning, the risk of wide misunderstanding and confusion, and the difficulty associated with trying to retract or adjust statements that are no longer supported.

And if "private" discussion wasn't constrained to just a small circle, but rather simply meant that you would have to respond to each individual inquiry separately, then that may come at an even greater cost than simply publishing your thoughts openly, because it would require you to devote your attention and effort to multiple, possibly numerous individual discussions, each of which requires modeling the other person's level of knowledge and understanding.

I essentially don't think the answer is going to be as simple as "private" vs "public", but I tend to err on the side of transparency, though this may reflect more of a value than a belief based on strong empirical data.

It would be surprising, if bad intent were so rare in the relevant sense, that people would be so quick to jump to the conclusion that it is present. Why would that be adaptive?

Human punishment of free riders helps ensure there are few free riders. Our fear and surprise responses are ridiculously oversensitive, because of the asymmetric consequences of type 1 vs type 2 errors. Etc...
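The point about type 1 vs type 2 errors can be made a bit more concrete with a small expected-cost calculation. This is a rough sketch, not a model of actual human psychology: the cost numbers are invented for illustration, and the function names are hypothetical.

```python
def alarm_threshold(cost_false_alarm, cost_missed_threat):
    """Probability of a real threat above which reacting has lower expected cost."""
    # React when p * cost_missed_threat > (1 - p) * cost_false_alarm,
    # which rearranges to p > cost_false_alarm / (cost_false_alarm + cost_missed_threat).
    return cost_false_alarm / (cost_false_alarm + cost_missed_threat)

def should_react(probability_of_threat, cost_false_alarm, cost_missed_threat):
    """True when reacting is the cheaper choice in expectation."""
    return probability_of_threat > alarm_threshold(cost_false_alarm, cost_missed_threat)

if __name__ == "__main__":
    # Illustrative numbers only: a missed threat is assumed to cost 99x a false alarm.
    print(alarm_threshold(1, 99))
    print(should_react(0.05, 1, 99))
```

When a missed threat is assumed to be far more costly than a false alarm, the threshold for reacting drops well below 50%, so a detector that looks oversensitive can still be minimizing expected cost.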

Evolution, too, is into massive A/B testing with no optimisation target that includes truth.

That seems plausible, and suggests that the low rate of free-riders is causally related to our readiness to call out suspected ones.

This suggests that the right thing to do is to try to reduce the cost, rather than the rate, of false-positives. And surely not to demolish this Chesterton's Fence without a good replacement fix for the underlying problem.

This suggests it's more useful to compare human groups and see how they manage the problem, rather than trying to parse the ins and outs of evolutionary psychology.

I think the burden of evidence is on the side disagreeing with the intuitions behind this extremely common defensive response

Note also that most groups treat their intuitions about whether or not someone is acting in bad faith as evidence worth taking seriously, and that we're remarkable in how rarely we tend to allow our bad-faith-detecting intuitions to lead us to reach the positive conclusion that someone is acting in bad faith. Note also that we have a serious problem with not being able to effectively deal with Gleb-like people, sexual predators, etc, and that these sorts of people reliably provoke person-acting-in-bad-faith-intuitions in people with (both) strong and accurate bad-faith-sensing intuitions. (Note that having strong bad-faith-detecting intuitions correlates somewhat with having accurate ones, since having strong intuitions here makes it easier to pay attention to your training data, and thus build better intuitions with time). Anyways, as a community, taking intuitions about when someone's acting in bad faith more seriously on the margin could help with this.

Now, one problem with this strategy is that many of us are out of practice at using these intuitions! It also doesn't help that people without accurate bad-faith-detecting intuitions often typical-mind fallacy their way into believing that there aren't people who have exceptionally accurate bad-faith-detecting intuitions. Sometimes this gets baked into social norms, such that criticism becomes more heavily taxed, partly because people with weak bad-faith-detecting intuitions don't trust others to direct their criticism at people who are actually acting in bad faith.

Of course, we currently don't accept person-acting-in-bad-faith-intuitions as useful evidence in the EA/LW community, so people who provoke more of these intuitions are relatively more welcome here than in other groups. Also, for people with both strong and accurate bad-faith-detecting intuitions, being around people who set off their bad-faith-sensing intuitions isn't fun, so such people feel less welcome here, especially since a form of evidence they're good at acquiring isn't socially acknowledged or rewarded, while it is acknowledged and rewarded elsewhere. And when you look around, you see that we in fact don't have many people with strong and accurate bad-faith-detecting intuitions; having more of these people around would have been a good way to detect Gleb-like folks much earlier than we tend to.

How acceptable bad-faith-detecting intuitions are in decision-making is also highly relevant to the gender balance of our community, but that's a topic for another post. The tl;dr of it is that, when bad-faith-detecting intuitions are viewed as providing valid evidence, it's easier to make people who are acting creepy change how they're acting or leave, since "creepiness" is a non-objective thing that nevertheless has a real, strong impact on who shows up at your events.

Anyhow, I'm incredibly self-interested in pointing all of this out, because I have very strong (and, as of course I will claim, very accurate) bad-faith-detecting intuitions. If people with stronger bad-faith-detecting intuitions are undervalued because our skill at detecting bad actors isn't recognized, then, well, this implies people should listen to us more. :P

effectively deal with Gleb-like people

Here on LW Gleb got laughed at almost immediately as he started posting. Did he actually manage to make any inroads into EA/Bay Area communities? I know EA ended up writing a basically "You are not one of us, please go away" post/letter, but it took a while.

Good observation.

Amusingly, one possible explanation is that the people who gave Gleb pushback on here were operating on bad-faith-detecting intuitions--this is supported by the quick reaction time. I'd say that those intuitions were good ones, if they led those folks to give Gleb pushback on a quick timescale, and I'd also say that those intuitions shaped healthy norms to the extent that they nudged us towards establishing a quick reality-grounded social feedback loop.

But the people who did give Gleb pushback framed things in terms of bad-faith-detecting intuitions less often than you'd have guessed if those intuitions were what actually convinced them that pushback was worth their time--they pointed to specific behaviors, and so on, when calling him out. But how many of these people actually decided to give Gleb feedback because they System-2-noticed that he was implementing a specific behavior, and how many of us decided to give Gleb feedback because our bad-faith-detecting intuitions noticed something was up, which led us to fish around for a specific bad behavior that Gleb was doing?

If more of us did the latter, this suggests that we have social incentives in place that reward fishing around and finding specific bad behaviors. But to me, fishing around for bad behaviors (i.e. fishing through data) like this doesn't seem too different from p-hacking, except that fishing around for social data is way harder to call people out on. And suppose our real reasons for reaching the correct conclusion that Gleb needed pushback were based in bad-faith-detecting intuitions, and not in System 2 noticing bad behaviors. Then it might be a good idea to give social allowance for the mechanism that actually led some of us to detect Gleb a bit earlier to do its work on its own in the future, rather than requiring its use to be backed up by evidence of bad behaviors (junk data) that can either be p-hacked by those who want to criticize regardless of what is true, or hidden by those with more skill than Gleb.

At a minimum, being honest with ourselves about what our real reasons are ought to help us understand our minds a bit better.

But how many of these people actually decided to give Gleb feedback because they System-2-noticed that he was implementing a specific behavior, and how many of us decided to give Gleb feedback because our bad-faith-detecting intuitions noticed something was up, which led us to fish around for a specific bad behavior that Gleb was doing?

I don't know if you can separate it this cleanly. Sometimes you get a smells-funny feeling and then your System 2 goes to investigate. But sometimes -- and I think this was the case with Gleb -- both System 1 and System 2 look at each other and chorus "Really, dude?" :-)

nod. This does seem like it should be a continuous thing, rather than System 1 solely figuring things out in some cases and System 2 figuring it out alone in others.

I sent a few private notes to him early on about the way I reacted to his posts. This wasn't a "bad faith" detector (I don't actually buy the premise - such a thing is VERY uncommon compared to honest incorrect values and beliefs), this was a pattern match to an overzealous overconfident newbie, possibly with under-developed social skills. You know, just like all of us a few years (or in my case decades) ago.

I'd guess the same fraction of people reacted disrespectfully to Gleb in each community (i.e. most but not all). The difference was more that in an EA context, people worried that he would shift money away from EA-aligned charities, but on LW he only wasted people's time.

Agree in theory, but, lacking an effective bad faith detector myself, how do I know whose intuitions to trust? :(

I'm very glad that you asked this! I think we can come up with some decent heuristics:

  • If you start out with some sort of inbuilt bad faith detector, try to see when, in retrospect, it's given you accurate readings, false positives, and false negatives. I catch myself doing this without having planned to on a System 1 level from time to time. It may be possible, if harder, to do this sort of intuition reshaping in response to evidence with System 2. Note that it sometimes takes a long time to figure out whether or not your bad-faith-detecting intuitions were correct, and that sometimes you never do.
  • There's debate about whether a bad-faith-detecting intuition that fires when someone "has good intentions" but ends up predictably acting in ways that hurt you (especially to their own benefit) is "correct". My view is that the intuition is correct; defining it as incorrect and then acting in social accordance with it being incorrect incentivizes others to manipulate you by being/becoming good at making themselves believe they have good intentions when they don't, which is a way of destroying information in itself. Hence why allowing people to get away with too many plausibly deniable things destroys information: if plausible deniability is a socially acceptable defense when it's obvious someone has hurt you in a way that benefits them, they'll want to blind themselves to information about how their own brains work. (This is a reason to disagree with many suggestions made in Nate's post. If treating people like they generally have positive intentions reduces your ability to do collaborative truth-seeking with others on how their minds can fail in ways that let you down--planning fallacy is one example--then maybe it would be helpful to socially disincentivize people from misleading themselves this way by giving them critical feedback, or at least not tearing people down for being ostracizers when they do the same).
  • Try to evaluate others' bad faith detectors by the same mechanism as in the first point; if they give lots of correct readings and not many false ones (especially if they share their intuitions with you before it becomes obvious to you whether or not they're correct), this is some sort of evidence that they have strong and accurate bad-faith-detecting intuitions.
  • The above requires that you know someone well enough for them to trust you with this data, so a quicker way to evaluate others' bad-faith-detecting intuitions is to look at who they give feedback to, criticize, praise, etc. If they end up attacking or socially qualifying popular people who are later revealed to have been acting in bad faith, or if they end up praising or supporting people who are socially suspected of being up to something and who are later revealed to have been acting in good faith, these are strong signals of them having accurate bad-faith-detecting intuitions.
  • Done right, bad-faith-detecting intuitions should let you make testable predictions about who will impose costs or provide benefits to you and your friends/cause; these intuitions become more valuable as you become more accurate at evaluating them. Bad-faith-detecting intuitions might not "taste" like Officially Approved Scientific Evidence, and we might not respect them much around here, but they should tie back into reality, and be usable to help you make better decisions than you'd been able to make without using them.
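
For anyone who wants to do the bookkeeping from the first and third points explicitly, here is a minimal sketch in Python. Everything in it is my own assumption for illustration (the `Reading` record, the `score` function, the idea that each case eventually gets a yes/no verdict or stays unresolved); it's a toy tally, not a validated method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reading:
    """One logged intuition (hypothetical record format)."""
    flagged: bool             # did the bad-faith intuition fire at the time?
    outcome: Optional[bool]   # later verdict: True = bad faith, False = not, None = never resolved


def score(readings: List[Reading]) -> dict:
    """Tally accurate readings, false positives, and false negatives."""
    resolved = [r for r in readings if r.outcome is not None]
    tp = sum(r.flagged and r.outcome for r in resolved)          # fired, and was right
    fp = sum(r.flagged and not r.outcome for r in resolved)      # fired, but was wrong
    fn = sum(not r.flagged and r.outcome for r in resolved)      # stayed quiet, but should have fired
    tn = sum(not r.flagged and not r.outcome for r in resolved)  # stayed quiet, correctly
    return {
        "true_pos": tp,
        "false_pos": fp,
        "false_neg": fn,
        "true_neg": tn,
        "unresolved": len(readings) - len(resolved),
        # Share of firings that turned out right (None if it never fired).
        "precision": tp / (tp + fp) if (tp + fp) else None,
        # Share of real bad-faith cases that it caught (None if there were none).
        "recall": tp / (tp + fn) if (tp + fn) else None,
    }


if __name__ == "__main__":
    log = [
        Reading(flagged=True, outcome=True),
        Reading(flagged=True, outcome=False),
        Reading(flagged=False, outcome=True),
        Reading(flagged=True, outcome=None),  # still don't know
    ]
    print(score(log))
```

The unresolved count is kept separate on purpose: as noted above, sometimes you never find out, and those cases shouldn't quietly count for or against the detector.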

The binary classification leads to problems. We should distinguish cooperative intent, defective intent, and hostile intent. The person who optimizes his marketing for conversion without regard for the truth is acting with defective intent, which is neither cooperative nor hostile.

There's such a thing as hostile intent. Some people are intent on causing harm to other people, but those aren't the people with whom we have problems in this community.

I found this helpful. Distinguishing between treating people as friendly agents, tools, and enemy agents seems quite a bit better than the binary good/bad faith distinction. I think a lot of bad faith accusations feel like people are saying "this person is treating me like an enemy agent," but are properly evidence for "this person is treating me like a tool," which is itself sufficient reason to distrust and build common knowledge about them.

In some ways, "enemy agent" and "friendly agent" are more similar attitudes than either is to "tool".

I would probably define "in bad faith" as "trying to deliberately mislead" (which itself is basically lying, just widened a bit to include cases like "but technically speaking this is a true statement" and "but I didn't say anything, just wiggled my eyebrows suggestively"). Do you think it's more complicated than that?

I'm skeptical of the work "deliberately" is doing there. If the whole agent determining someone's actions is following a decision procedure that tries to push my beliefs away from the truth when convenient, then there's a sense in which the whole agent is acting in bad faith, even if they've never consciously deliberated on the matter. At least, it's materially different from unmotivated error, in a way that makes it similar to consciously lying.

Harry Frankfurt's "On Bullshit" introduced the distinction between lies and bullshit. The liar wants to deceive you about the world (to get you to believe false statements), whereas the bullshitter wants to deceive you about his intentions (to get you to take his statements as good-faith efforts, when they are merely meant to impress).

We may need to introduce a third member of this set. Along with lies told by liars, and bullshit spread by bullshitters, there is also spam emitted by spambots.

Like the bullshitter (but unlike the liar), the spambot doesn't necessarily have any model of the truth of its sentences. However, unlike the bullshitter, the spambot doesn't particularly care what (or whether) you think of it. But it optimizes its sentences to cause you to do a particular action.

To me it seems the troll is also an important category. Most journalists don't care whether you believe what they write but care that you engage with their writing. Whether you love it or hate it is secondary when you share the post on Facebook and Twitter.

This is a bit of a definitions dispute, but I want to distinguish between someone whose values/interests/goals do not coincide with yours but who's quite open about it on the one hand, and someone who wants to manipulate you without you realizing what's going on on the other hand. I wouldn't apply the expression "in bad faith" to the former case (that could be a "hostile agent", but that's a different thing), but I would to the latter case.

tries to push my beliefs away from the truth when convenient

So if someone thinks the truth is different from what you think it is and tries to "push your beliefs away", is he acting in bad faith? Consider e.g. your standard sincere Christian missionaries.

Just loudly repeating what you said using my own words... when we talk about optimizing for truth (or any other X), there are essentially 3 options (and of course any mix of them)...

  • optimizing explicitly for X;
  • optimizing neither for X nor against X (but perhaps for something else, or nothing at all); or
  • optimizing explicitly against X.

And while it is bad form to accuse someone of optimizing against truth, it makes sense to suspect that people are simply not optimizing for truth... which -- especially when they optimize for something else -- usually ends with some misleading, even if there was no conscious intention to mislead.

This said, how to communicate this conclusion of "you need to explicitly optimize for truth, otherwise you will probably end up misleading people even if your intentions are pure"?

This probably needs to be communicated differently among rationalists than outside of our small community. Either way, it helps to emphasize that we are talking about "misleading unintentionally" or perhaps just "misunderstanding", i.e. to put high priority on communicating that we are not accusing the other side of having bad intentions, merely that... what they said is not what the perfect version of them would say in a perfect world, and that we would like them to get closer to that.

It would be surprising, if bad intent were so rare in the relevant sense, that people would be so quick to jump to the conclusion that it is present. Why would that be adaptive?

You may not be wrong, but I don't think it would necessarily be surprising. We adapted under social conditions that are radically different from those that exist today. Jumping to that conclusion may no longer be adaptive.

Hypothesis: In small tribes and family groups, assumptions of bad faith may have served to help negotiate away from unreasonable positions, while strong familial ties and respected third parties mostly mitigated the harms. Conflicts between tribes without familial connections may have tended to escalate, however (although there are ways to mitigate this too).

Hypothesis: Perhaps assumptions of good and bad faith were reasonably accurate in small tribal and familial groups but in intertribal disagreements there was a tendency to assume bad faith because the cost of assuming good faith and being wrong was so much higher than assuming bad faith and being wrong.

So, a couple of thoughts:

1) Ascribing intent to behavior is one of the best ways to control someone's behavior, and it's deeply baked into our reactions. You are much more likely to get someone to conform to your desires if you say "you're intentionally behaving badly, stop it" than if you say "I don't like that outcome, but you didn't mean it". Your mind is biased toward seeing (and believing, so you can more forcefully make the accusation) much stronger intent than actually exists.

2) Intent is a much better predictor of future behavior than simple observation. It's far easier to punish or cut ties with someone who has bad intent than with someone who's just a little incompetent but means well. Therefore, your mind is biased toward seeing intent so you can take more forceful actions to protect your interests.

3) Nate doesn't make the point directly, but "good" and "bad" are massively oversimplified to the point of being misleading. There are many dimensions to evaluate about a person or organization's likelihood of helping or harming your goals in the future, and in figuring out how best to influence them to be more aligned with your values and beliefs.

4) I'm torn about the object-level objection about statements of value that differ from what behaviors imply. Most humans are not beings of pure thought, and the fact that there are any actions that affect others which are not purely information-sharing doesn't seem that surprising to me.

Nate doesn't make the point directly, but "good" and "bad" are massively oversimplified to the point of being misleading. There are many dimensions to evaluate about a person or organization's likelihood of helping or harming your goals in the future, and in figuring out how best to influence them to be more aligned with your values and beliefs.

This struck me as being such an important oversight that it almost turned Nate's whole post into an academic exercise.

Any given interpersonal disagreement that culminates in an argument is going to have some kind of difficult-to-reconcile opposition of values and/or mutual knowledge at its core. Both parties are generally going to try to use persuasion in some form to manipulate their opponent's sense of the relevant values, or their perception of the details of the situation, or their knowledge and interpretation of the facts. From the other side, this will very often look like a bad-faith attempt to undermine your values and beliefs, and you can't necessarily even say that it isn't.

In the ideal case, a disagreement can be solved purely by sharing all of the relevant facts. This may be the only case where you can actually expect people to come to an agreement without any tinge of feeling that their opponent is acting in bad faith or being manipulative.

In the less ideal case, all the facts may be shared, but a difference in perspective or weighting of various details necessitates further argument to try to come to an agreement. Since you are trying to address your opponent's thinking and perceptions, you are by definition attempting to manipulate their mind. This is true regardless of the "goodness" of your intentions.

In the something-like-worst-case, fundamentally felt values are in opposition, and no amount of sharing of facts and interpretations is going to lead to agreement. At this point it is difficult to even say that you are acting in good faith even if you think that you are, because you're (perhaps knowingly) trying to persuade someone of something that they believe is wrong and would still believe to be wrong upon indefinite reflection.

The endpoints of "pure good faith" and "pure bad faith" are probably very rare; the middle ground of muddled manipulativeness and self-justification better describes most arguments.

For more explanation on how incentive gradients interact with and allow the creation of mental modules that can systematically mislead people without intent to mislead, see False Faces.