Richard_Ngo

Former AI safety research engineer, now AI governance researcher at OpenAI. Blog: thinkingcomplete.com

Sequences

Stories
Meta-rationality
Replacing fear
Shaping safer goals
AGI safety from first principles

Comments

The thing that distinguishes the coin case from the wind case is how hard it is to gather additional information, not how much more information could be gathered in principle. In theory you could run all sorts of simulations that would give you informative data about an individual flip of the coin; it's just that doing so would be really hard, and very few people are able to. I don't think the entropy of the posterior captures this dynamic.

The variance over time depends on how you gather information in the future, making it less general. For example, I may literally never learn enough about meteorology to update my credence about the winds from 0.5. Nevertheless, there's still an important sense in which this credence is more fragile than my credence about coins, because I could update it.

I guess you could define it as something like "the variance if you investigated it further". But defining what it means to investigate further seems about as complicated as defining the reference class of people you're trading against. Also variance doesn't give you the same directional information—e.g. OP would bet on doom at 2% or bet against it at 16%.

Overall though, as I said above, I don't know a great way to formalize this, and would be very interested in attempts to do so.

Answer by Richard_Ngo, Apr 19, 2024

I don't think there's a very good precise way to do so, but one useful concept is bid-ask spreads, which are a way of protecting yourself from adverse selection of bets. E.g. consider the following two credences, both of which are 0.5.

  1. My credence that a fair coin will land heads.
  2. My credence that the wind tomorrow in my neighborhood will be blowing more northwards than southwards (I know very little about meteorology and have no recollection of which direction previous winds have mostly blown).

Intuitively, however, the former is very difficult to change, whereas the latter might swing wildly given even a little bit of evidence (e.g. someone saying "I remember in high school my teacher mentioned that winds often blow towards the equator.")

Suppose I have to decide on a policy that I'll accept bets for or against each of these propositions at X:1 odds (i.e. my opponent puts up $X for every $1 I put up). For the first proposition, I might set X to be 1.05, because as long as I have a small edge I'm confident I won't be exploited.

By contrast, if I set X=1.05 for the second proposition, then probably what will happen is that people will only decide to bet against me if they have more information than me (e.g. from checking weather forecasts), and so they'll end up winning a lot of money from me. And so I'd actually want X to be something more like 2 or maybe higher, depending on who I expect to be betting against, even though my credence right now is 0.5.

In your case, you might formalize this by talking about your bid-ask spread when trading against people who know about these bottlenecks.
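To make the adverse-selection point concrete, here's a rough sketch of the betting policy above. The numbers are made up for illustration: I assume half of my counterparties are informed, that an informed bettor's side wins 80% of the time for the wind (e.g. they checked a forecast) and 50% for the coin (nobody can do better), and that informed bettors only bet when it's profitable for them. The function name and parameters are hypothetical.

```python
def ev_per_counterparty(x, informed_fraction, informed_win_prob):
    """My expected profit per counterparty who sees my offer, per $1 I stake.

    x: the opponent stakes $x for every $1 I stake, on whichever side they pick.
    informed_fraction: assumed share of counterparties with better information.
    informed_win_prob: assumed chance that an informed counterparty's side wins.
    """
    # Uninformed counterparties pick a side no better than chance:
    # I win $x half the time and lose my $1 half the time.
    ev_vs_uninformed = 0.5 * x - 0.5

    # Informed counterparties only take the bet when it is profitable for them;
    # whatever they expect to win, I expect to lose.
    their_ev = informed_win_prob * 1.0 - (1.0 - informed_win_prob) * x
    ev_vs_informed = -their_ev if their_ev > 0 else 0.0

    return (informed_fraction * ev_vs_informed
            + (1 - informed_fraction) * ev_vs_uninformed)


# Illustrative assumptions only: 50% informed bettors, 80%-accurate wind forecasts.
coin_tight = ev_per_counterparty(1.05, 0.5, 0.5)  # coin, X = 1.05
wind_tight = ev_per_counterparty(1.05, 0.5, 0.8)  # wind, X = 1.05
wind_wide = ev_per_counterparty(2.0, 0.5, 0.8)    # wind, X = 2
print(coin_tight, wind_tight, wind_wide)
```

Under these assumptions the coin policy at X=1.05 comes out slightly positive, the wind policy at X=1.05 loses money, and the wind policy only turns positive once X is around 1.86 or higher. The exact threshold depends entirely on who you expect to be betting against, which is the part I don't know how to formalize well.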

I think the two things that felt most unhealthy were:

  1. The "no forgiveness is ever possible" thing, as you highlight. Almost all talk about ineradicable sin should, IMO, be seen as a powerful psychological attack.
  2. The "our sins" thing feels like an unhealthy form of collective responsibility—you're responsible even if you haven't done anything. Again, very suspect on priors.

Maybe this is more intuitive for rationalists if you imagine a SJW writing a song about how, even millions of years in the future, anyone descended from westerners should still feel guilt about slavery: "Our sins can never be undone. No single death will be forgiven." I think this is the psychological exploit that's screwed up leftism so much over the last decade, and feels very analogous to what's happening in this song.

Just read Bostrom's Deep Utopia (though not too carefully). The book is structured with about half being transcripts of fictional lectures given by Bostrom at Oxford, about a quarter being stories about various woodland creatures striving to build a utopia, and another quarter being various other vignettes and framing stories.

Overall, I was a bit disappointed. The lecture transcripts touch on some interesting ideas, but Bostrom's style is generally one which tries to classify and taxonomize, rather than characterize (e.g. he has a long section trying to analyze the nature of boredom). I think this doesn't work very well when describing possible utopias, because they'll be so different from today that it's hard to extrapolate many of our concepts to that point, and also because the hard part is making it viscerally compelling.

The stories and vignettes are somewhat esoteric; it's hard to extract straightforward lessons from them. My favorite was a story called The Exaltation of ThermoRex, about an industrialist who left his fortune to the benefit of his portable room heater, leading to a group of trustees spending many millions of dollars trying to figure out (and implement) what it means to "benefit" a room heater.

Fantastic work :)

Some thoughts on the songs:

  • I'm overall super impressed by how well the styles of the songs fit the content—e.g. the violins in FHI, the British accent works really well for More Dakka, the whisper for We Do Not Wish, the Litany of Tarrrrski, etc.
  • My favorites to listen to are FHI at Oxford, Nihil Supernum, and Litany of Tarrrrski, because they have both messages that resonate a lot and great tunes.
  • IMO Answer to Job is the best-composed on artistic merits, and will have the most widespread appeal. Tune is great, style matches the lyrics really well (particular shout-out to the "or labor or lust" as a well-composed bar). Only change I'd make is changing "upon lotus thrones" to "on lotus thrones" to scan better.
  • Dath Ilan's Song feels... pretty unhealthy, tbh.
  • I thought Prime Factorization was really great until the bit about the car and the number, which felt a bit jarring.

If it was the case that there was important public information attached to Scott's full name, then this argument would make sense to me.

In general having someone's actual name public makes it much easier to find out other public information attached to them. E.g. imagine if Scott were involved in shady business dealings under his real name. This is the sort of thing that the NYT wouldn't necessarily discover just by writing the profile of him, but other people could subsequently discover after he was doxxed.

To be clear, btw, I'm not arguing that this doxxing policy is correct, all things considered. Personally I think the benefits of pseudonymity for a healthy ecosystem outweigh the public value of transparency about real names. I'm just arguing that there are policies consistent with the NYT's actions which are fairly reasonable.

But it wasn't a cancellation attempt. The issue at hand is whether a policy of doxxing influential people is a good idea. The benefits are transparency about who is influencing society, and in which ways; the harms include the ones you've listed above, about chilling effects.

It's hard to weigh these against each other, but one way you might do so is by following a policy like "doxx people only if they're influential enough that they're probably robust to things like losing their job". The correlation between "influential enough to be newsworthy" and "has many options open to them" isn't perfect, but it's strong enough that this policy seems pretty reasonable to me.

To flip this around, let's consider individuals who are quietly influential in other spheres. For example, I expect there are people who many news editors listen to, when deciding how their editorial policies should work. I expect there are people who many Democrat/Republican staffers listen to, when considering how to shape policy. In general I think transparency about these people would be pretty good for the world. If those people happened to have day jobs which would suffer from that transparency, I would say "Look, you chose to have a bunch of influence, which the world should know about, and I expect you can leverage this influence to end up in a good position somehow even after I run some articles on you. Maybe you're one of the few highly-influential people for whom this happens to not be true, but it seems like a reasonable policy to assume that if someone is actually pretty influential then they'll land on their feet either way." And the fact that this was true for Scott is some evidence that this would be a reasonable policy.

(I also think that taking someone influential who didn't previously have a public profile, and giving them a public profile under their real name, is structurally pretty analogous to doxxing. Many of the costs are the same. In both cases one of the key benefits is allowing people to cross-reference information about that person to get a better picture of who is influencing the world, and how.)

I don't think the NYT thing played much of a role in Scott being better off now. My guess is a small minority of people are subscribed to his Substack because of the NYT thing (the dominant factor is clearly the popularity of his writing).

What credence do you have that he would have started the Substack at all without the NYT thing? I don't have much information, but probably less than 80%. The timing sure seems pretty suggestive.

(I'm also curious about the likelihood that he would have started his startup without the NYT thing, but that's less relevant since I don't know whether the startup is actually going well.)

My guess is the NYT thing hurt him quite a bit and made the potential consequences of him saying controversial things a lot worse for him.

Presumably this is true of most previously-low-profile people that the NYT chooses to write about in not-maximally-positive ways, so it's not a reasonable standard to hold them to. And so as a general rule I do think "the amount of adversity that you get when you used to be an influential yet unknown person but suddenly get a single media feature about you" is actually fine to inflict on people. In fact, I'd expect that many (or even most) people in this category will have a worse time of it than Scott—e.g. because they do things that are more politically controversial than Scott, have fewer avenues to make money, etc.
