This is a brief review of On the Chatham House Rule by Scott Garrabrant.

I tend to be very open about my thoughts and beliefs. However, I am naturally still discreet about a lot of things - things my friends told me privately, personal things about myself, and so on.

This has never been a big deal, figuring out the norms around secrecy. For most of my life it's seemed pretty straightforward, and I've not had problems with it. We all have friends who tell us things in private, and we're true to our word. We've all discovered a fact about someone that's maybe a bit embarrassing or personal, where they don't know we know, and so we've e.g. carefully moved a group conversation around not putting pressure on that person to explain why they were unavailable last Tuesday. Discretion is a tool we all use implicitly, and often successfully. 

And yet, in the last two years, I've had to think very carefully about norms around secrecy, and found increasingly difficult problems that have substantially changed my behaviour.

I think the main change is that I much more regularly start interactions with 5-25 minutes of conversation about the expected disclosure norms. And furthermore, I regularly decline information that comes with strong nondisclosure norms attached, after spending a minute talking about the terms and conditions of secrecy.

Now, why do I do this? And what has changed? I'm not certain.

I think I've not enjoyed the weird politicisation of language that happens behind secrecy, where people use abstract words a lot, and then don't correct you when you misuse them, because they can't tell you the thing they were actually thinking of. I especially don't like talking with people who aren't even telling me that they're hiding secret information that informs their opinions; sometimes this feels outright deceptive.

More than either of these though, I don't like being in that position myself. I like to just say my thoughts out loud in conversation. It's very, very limiting to not be allowed to just answer when someone asks me why I believe what I believe. Note that there's a big difference between not letting certain information out, and recomputing how you would reply and what thoughts you would have if you had never heard that information in the first place.

Riffing off of meta-honesty: it's very hard to be meta-open about openness norms. 

"I see, so you don't want to tell me your opinion on <topic> because you have secret information. I think this is mostly pretty bad for your and my ability to talk and coordinate on this topic. Here are <detailed reasons>. Can you tell me why you disagree with those reasons?" 

"No, because that would be giving away my secret information about <topic>."

This is a conversation I've seen happen, where a researcher I respect sat down with a friend of mine to discuss secrecy, and 10 mins into the conversation the researcher realised they weren't able to say their reasons because of secrecy concerns.

One of the main points of Bostrom's career is that not all ideas should be let out, certainly not in any order. We must become much wiser as a civilization before the maximally free exploration of ideas is universally safe, and I don't think that many people are wrong to keep information secret. My main concern about secrecy is that people do not put in the work to ensure that the public discourse is maintained once people and organisations go dark. But that's a story for another day.

Figuring out how to do secrecy well is important, and hard. Scott's post On the Chatham House Rule takes a fairly common set of explicit secrecy norms, and shows how much more demanding they are than anybody expected, and how lack of clarity around that has led to their failure to work. It's a helpful example of how trying to have such norms is difficult and requires careful thought. I think secrecy is very hard, and we now know that just saying "Chatham House Rules" does not work.

(This post is similar to Lessons from the Cold War on Information Hazards: Why Internal Communications is Critical. However, my affect when reading that post is more like "Hah, those silly Americans, they don't know how to run a government," whereas my affect toward the Chatham House Rule post is more like "Wow, this time it was us who thought this secrecy norm made sense and got it wrong," so this post provokes a more visceral reaction in me.)

When it was first posted I thought it was just an irritating technicality, and didn't think much of the post. But now I see it as part of a broader pattern where secrecy norms have much more wide-ranging implications than one often expects, and require a lot of careful thought to get right. I appreciate having such a clear public example to point to of how secrecy norms that everyone thinks are simple turn out not to be simple at all: they are confusing, and they break easily. It is nothing like the last word on the subject, but a useful early step in the conversation.

10 comments

I guess this is sort of an agreement with the post... but I don't think the post goes far enough.

Whoever "you guys" are, all you'll do by adopting a lot of secrecy is slow yourselves down radically, while making sure that people who are better than you are at secrecy, who are better than you are at penetrating secrecy, who have more resources than you do, and who are better at coordinated action than you are, will know nearly everything you do, and will also know many things that you don't know.

They will "scoop" you at every important point. And you have approximately zero chance of ever catching up with them on any of their advantages.

The best case long term outcome of an emphasis on keeping dangerous ideas secret would be that particular elements within the Chinese government (or maybe the US government, not that the corresponding elements would necessarily be much better) would get it right when they consolidated their current worldview's permanent, unchallengeable control over all human affairs. That control could very well include making it impossible for anyone to even want to change the values being enforced. The sorts of people most likely to be ahead throughout any race, and most likely to win if there's a hard "end", would be completely comfortable with re-educating you to cure your disharmonious counter-revolutionary attitudes. If they couldn't do that, they'd definitely arrange things so that you couldn't ever communicate those attitudes or coordinate around them.

The worst case outcome is that somebody outright destroys the world in a way you might have been able to talk them out of.

Secrecy destroys your influence over people who might otherwise take warnings from you. Nobody is going to change any actions without a clear and detailed explanation of the reasons. And you can't necessarily know who needs to be given such an explanation. In fact, people you might consider members of "your community" could end up making nasty mistakes because they don't know something you do.

I've spent a lot of my career on the sorts of things where people try to keep secrets, and my overall impression of the AI risk and X-risk communities (including Nick Bostrom) is that they have a profoundly unrealistic, sometimes outright romanticized, view of what secrecy is and what it can do for them (and an unduly rosy view of their prospects for unanimous action in general).

all you'll do by adopting a lot of secrecy is slow yourselves down radically, while making sure that people who are better than you are at secrecy, who are better than you are at penetrating secrecy, who have more resources than you do, and who are better at coordinated action than you are, will know nearly everything you do, and will also know many things that you don't know.

So, there are a few types of secrecy. Here are three.

  • The sort of secrecy you have with friends when you gossip, which most of the time works fine.
  • The sort of secrecy where nobody really knows what is being worked on within companies like Apple and Facebook, whereas there's way more openness about e.g. Google.
  • The sort of secrecy where you're trying to protect yourself from foreign governments, which is way harder.

I'm pretty sure secrecy has been key for Apple's ability to control its brand, and it's not just slowed itself down. I think it's plausible to achieve similar levels of secrecy, and that this has many uses. But what you're talking about is secrecy from governmental groups actively trying to hack you.

I largely agree that when a major government wants your info, they can get it, though I'm not sure it's impossible to keep secrets from them with a massive amount of work (I have not thought about it too much). I do question your assumption that governments will end up taking over the world; I think with deeply revolutionary tech like nanotech, AI, and others, different groups can end up taking over the world. So I don't view things as clearly falling toward the outcome of Chinese/US/etc. hegemony.

I don't think Apple is a useful model here at all.

I'm pretty sure secrecy has been key for Apple's ability to control its brand,

Well, Apple thinks so anyway. They may or may not be right, and "control of the brand" may or may not be important anyway. But it's true that Apple can keep secrets to some degree.

and it's not just slowed itself down,

Apple is a unitary organization, though. It has a boundary. It's small enough that you can find the person whose job it is to care about any given issue, and you are unlikely to miss anybody who needs to know. It has well-defined procedures and effective enforcement. Its secrets have a relatively short lifetime of maybe as much as 2 or 3 years.

Anybody who is spying on Apple is likely to be either a lot smaller, or heavily constrained in how they can safely use any secret they get. If I'm at Google and I steal something from Apple, I can't publicize it internally, and in fact I run a very large risk of getting fired or turned in to law enforcement if I tell it to the wrong person internally.

Apple has no adversary with a disproportionate internal communication advantage, at least not with respect to any secrets that come from Apple.

The color of the next iPhone is never going to be as interesting to any adversary as an X-risk-level AI secret. And if, say, MIRI actually has a secret that is X-risk-level, then anybody who steals it, and who's in a position to actually use it, is not likely to feel the least bit constrained by fear of MIRI's retaliation in using it to do whatever X-risky thing they may be doing.

There's also the sort of secrecy you have when you signed an NDA because you consult with a company. I would expect a person like Nick Bostrom to have access to information about what happens inside DeepMind that's protected by promises of secrecy.

I can tell you that if you just want to walk into DeepMind (i.e. past the security gate), you have to sign an NDA.

These seem like important considerations, but they aren't really engaging with what Chatham House rules are trying to do, which is not to keep secrets, just to keep people's identities obfuscated enough that people feel comfortable speaking freely.

I've spent a lot of my career on the sorts of things where people try to keep secrets, and my overall impression of the AI risk and X-risk communities (including Nick Bostrom) is that they have a profoundly unrealistic, sometimes outright romanticized, view of what secrecy is and what it can do for them (and an unduly rosy view of their prospects for unanimous action in general).

I guess I'd be interested in discussing specifics with you, but I can't think of any good public examples of secrecy in AI-risk and x-risk related communities.

Oh actually, yes I can, MIRI's written about going non-disclosed by default. I expect you to think this is fine and probably good and not too relevant, because it's not (as far as the writeup suggests) an attempt to keep secrets from the US government, and you expect they'd fail at that. Is that right?

And OpenAI is attempting to push more careful release practices into the Overton window of discussion in the ML communities (my summary is here). While I agree that if a foreign government wants that tech, they can probably get it, I still think not releasing things has major effects. For example, there are lots of great researchers in the world that aren't paid by governments, and those people cannot get the ideas, which means that overall progress on potentially dangerous tech slows down considerably. I'm not sure what I expect you to think of this one. Am curious if you think this seems unrealistic/romanticised.

MIRI's written about going non-disclosed by default. I expect you to think this is fine and probably good and not too relevant, because it's not (as far as the writeup suggests) an attempt to keep secrets from the US government, and you expect they'd fail at that. Is that right?

No, I think it's probably very counterproductive, depending on what it really means in practice. I wasn't quite sure what the balance was between "We are going to actively try to keep this secret" and "It's taking too much of our time to write all of this up".

On the secrecy side of that, the problem isn't whether or not MIRI's secrecy works (although it probably won't)[1]. The problem is with the cost and impact on their own community from their trying to do it. I'm going to go into that further down this tome.

And OpenAI is attempting to push more careful release practices into the Overton window of discussion in the ML communities (my summary is here). [...] For example, there are lots of great researchers in the world that aren't paid by governments, and those people cannot get the ideas [...]

That whole GPT thing was just strange.

OpenAI didn't conceal any of the ideas at all. They held back the full version of the actual trained network, but as I recall they published all of the methods they used to create it. Although a big data blob like the network is relatively easy to keep secret, if your goal is to slow down other research, controlling the network isn't going to be effective at all.

... and I don't think that slowing down follow-on research was their goal. If I remember right, they seemed to be worried that people would abuse the actual network they'd trained. That was indeed unrealistic. I've seen the text from the full network, and played with giving it prompts and seeing what comes out. Frankly, the thing is useless for fooling anybody and wouldn't be worth anybody's time. You could do better by driving a manually created grammar with random numbers, and people already do that.

Treating it like a Big Deal just made OpenAI look grossly out of touch. I wonder how long it took them to get the cherry-picked examples they published when they made their announcement...

So, yes, I thought OpenAI was being unrealistic, although it's not the kind of "romanticization" I had in mind. I just can't figure out what they could have stood to gain by that particular move.

All that said, I don't think I object to "more careful release practices", in the sense of giving a little thought to what you hand out. My objections are more to things like--

  1. Secrecy-by-default, or treating it as cost-free to make something secret. Having too many secrets is impractical, and tends to dilute your protection for the secrets you actually do need. In the specific case of AI risk, I think it also changes the balance of speed between you and your adversaries... for the worse. I'll explain more about that below when I talk about MIRI.

  2. The idea that you can just "not release things", without very strict formal controls and institutional boundaries, and have that actually work in any meaningful way. There seems to be a lot of "illusion of control" thinking going on. Real secrecy is hard, and it gets harder fast if it has to last a long time.

To set the frame for the rest, I'm going to bloviate a bit about how I've seen secrecy work in general.

One of the "secrets of secrecy" is that, at any scale beyond two or three people, it's more about controlling diffusion rates than about creating absolute barriers. Information interesting enough to care about will leak eventually.

You have some amount of control over the diffusion rate within some specific domains, and at their boundaries. Once information breaks out into a domain you do not control, it will spread according to the conditions in that new domain regardless of what you do. When information hits a new community, there's a step change in how fast it propagates.

Which brings up the next not-very-secret secret: I'm wrong to talk about a "diffusion rate". The numbers aren't big enough to smooth out random fluctuations the way they are for molecules. Information tends to move in jumps for lots of reasons. Something may stay "secret" for a really long time just because nobody notices it... and then become big news when it gets to somebody who actively propagates it, or to somebody who sees an implication others didn't. A big part of propagation is the framing and setting; if you pair some information with an explanation of why it matters, and release it into a community with a lot of members who care, it will move much, much faster than if you don't.[2]
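If it helps, here is a minimal toy sketch of that picture (every community name, size, and rate below is a made-up illustrative number, not a claim about any real group): information grows roughly exponentially within a domain until it saturates, and occasionally jumps across a boundary, at which point the new domain starts diffusing it at its own rate.

```python
# Toy sketch of "diffusion with jumps": purely illustrative numbers, not a model
# of any real organisation. Information spreads inside a community until it
# saturates, and occasionally jumps across a boundary into the next community,
# which then spreads it at its own (often much faster) rate.
import random

random.seed(0)

# (name, community size, internal growth rate per step, per-knower chance of leaking outward)
communities = [
    ("originating group", 20, 0.10, 0.01),
    ("allied researchers", 200, 0.15, 0.02),
    ("large adversary org", 5000, 0.40, 0.005),
]

knowers = [2.0, 0.0, 0.0]  # how many people in each community know the secret

for step in range(1, 101):
    for i, (name, size, growth, leak) in enumerate(communities):
        k = knowers[i]
        if k == 0:
            continue
        # internal diffusion: roughly exponential until the community saturates
        knowers[i] = min(size, k * (1 + growth))
        # boundary crossing: chance that at least one current knower leaks outward
        if i + 1 < len(communities) and knowers[i + 1] == 0:
            if random.random() < 1 - (1 - leak) ** k:
                knowers[i + 1] = 1.0  # the step change: a new domain starts diffusing
                print(f"step {step}: secret jumps from '{name}' to '{communities[i + 1][0]}'")

print("final counts:", {name: round(k) for (name, *_), k in zip(communities, knowers)})
```

The numbers don't matter; the point is that the interesting events are the jumps across boundaries, not the smooth spread inside any one domain.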

So, now, MIRI's approach...

The problem with what MIRI seems to be doing is that it disproportionately slows the movement of information within their own community and among their allies. In most cases, they will probably hurt themselves more than they hurt their "adversaries".

Ideas will still spread among the "good guys", but unreliably, slowly, through an unpredictable rumor mill, with much negotiation and everybody worrying at every turn about what to tell everybody else [3]. That keeps problems from getting solved. It can't be fixed by telling the people who "need to know", because MIRI (or whoever) won't know who those people are, especially-but-not-only if they're also being secretive.

Meanwhile, MIRI can't rely on keeping absolute secrets from anybody for any meaningful amount of time. And they'll probably have a relatively small effect on institutions that could actually do dangerous development. Assuming it's actually interesting, once one of MIRI's secrets gets to somebody who happens to be part of some "adversary" institution, it will be propagated throughout that institution, possibly very quickly. It may even get formally announced in the internal newsletter. It even has a chance of moving on from there into that first institution's own institutional adversaries, because they spy on each other.

But the "adversaries" are still relatively good at secrecy, especially from non-peers, so any follow-on ideas they produce will be slower to propagate back out into the public where MIRI et al can benefit from them.

The advantage the AI risk and X-risk communities have is, if you will, flexibility: they can get their heads around new ideas relatively quickly, adapt, act on implications, build one idea on another, and change their course relatively rapidly. The corresponding, closely related disadvantage is weakness in coordinating work on a large scale toward specific, agreed-upon goals (like say big scary AI development projects).

Worrying too much about secrecy throws away the advantage, but doesn't cure the disadvantage. Curing the disadvantage requires a culture and a set of material resources that I don't believe MIRI and friends can ever develop... and that would probably torpedo their effectiveness if they did develop them.

By their nature, they are going to be the people who are arguing against some development program that everybody else is for. Maybe against programs that have already got a lot of investment behind them before some problem becomes clear. That makes them intrinsically less acceptable as "team players". And they can't easily focus on doing a single project; they have to worry about any possible way of doing it wrong. The structures that are good at building dangerous projects aren't necessarily the same as the structures that are good at stopping them.

If the AI safety community loses its agility advantage, it's not gonna have much left.

MIRI will probably also lose some donors and collaborators, and have more trouble recruiting new ones as time goes on. People will forget they exist because they're not talking, and there's a certain reluctance to give people money or attention in exchange for "pigs in pokes"... or even to spend the effort to engage and find out what's in the poke.

A couple of other notes:

Sometimes people talk about spreading defensive ideas without spreading the corresponding offensive ideas. In AI, that comes out as wanting to talk about safety measures without saying anything about how to increase capability.

In computer security, it comes out as cryptic announcements to "protect this port from this type of traffic until you apply this patch"... and it almost never works for long. The mere fact that you're talking about some specific subject is enough to get people interested and make them figure out the offensive side. It can work for a couple of weeks for a security bug announcement, but beyond that it will almost always just backfire by drawing attention. And it's very rare to be able to improve a defense without understanding the actual threat.

Edited the next day in an attempt to fix the footnotes... paragraphs after the first in each footnote were being left in the main flow.


  1. As for keeping secrets from any major government...

    First, I still prefer to talk about the Chinese government. The US government seems less likely to be a player here. Probably the most important reason is that most parts of the US government apparatus see things like AI development as a job for "industry", which they tend to believe should be a very clearly separate sphere from "government". That's kind of different from the Chinese attitude, and it matters. Another reason is that the US government tends to have certain legal constraints and certain scruples that limit their effectiveness in penetrating secrecy.

    I threw the US in as a reminder that China is far from the only issue, and I chose them because they used to be more interesting back during the Cold War, and perhaps could be again if they got worried enough about "national security".

    But if any government, including the US, decides that MIRI has a lot of important "national security" information, and decides to look hard at them, then, yes, MIRI will largely fail to keep secrets. They may not fail completely. They may be able to keep some things off the radar, for a while. But that's less likely for the most important things, and it will get harder the more people they convince that they may have information that's worth looking at. Which they need to do.

    They'll probably even have information leaking into institutions that aren't actively spying on them, and aren't governments, either.

    But all that just leaves them where they started anyway. If there were no cost to it, it wouldn't be a problem. ↩︎

  2. You can also get independent discoveries creating new, unpredictable starting points for diffusion. Often independent discoveries get easier as time goes on and the general "background" information improves. If you thought of something, even something really new, that can be a signal that conditions are making it easier for the next person to think of the same thing. I've seen security bugs with many independent discoveries.

    Not to mention pathologies like one community thinking something is a big secret, and then seeing it break out from some other, sometimes much larger community that has treated it as common knowledge for ages. ↩︎

  3. If you ever get to the point where mostly-unaffiliated individuals are having to make complicated decisions about what should be shared, or having to think hard about what they have and have not committed themselves not to share, you are 95 percent of the way to fully hosed.

    That sort of thing kind of works for industrial NDAs, but the reason it works is that, regardless of what people have convinced themselves to believe, most industrial "secret sauce" is pretty boring, and the rest tends to be so specific and detailed that it's obviously covered by any NDA. AND you usually only care about relatively few competitors, most of whose employees don't get paid enough to get sued. That's very different from some really inobvious world-shaking insight that makes the difference between low-power "safe" AI and high-power "unsafe" AI. ↩︎

we’ve e.g. carefully moved a group the conversation around not putting pressure on them to explain why they were unavailable last Tuesday.

What do you mean by "a group the conversation"?

Ah, there was an extra 'the' in that sentence. Edited. Let me know if it's still unclear.