CFAR’s new focus, and AI Safety

A bit about our last few months:

  • We’ve been working on getting a simple, clear mission and an organization that actually works.  We think of our goal as analogous to the transition that the old Singularity Institute underwent under Lukeprog (during which chaos was replaced by a simple, intelligible structure that made it easier to turn effort into forward motion).
  • As part of that, we’ll need to find a way to be intelligible.
  • This is the first of several blog posts aimed at causing our new form to be visible from outside.  (If you're in the Bay Area, you can also come meet us at tonight's open house.) (We'll be talking more about the causes of this mission-change; the extent to which it is in fact a change, etc. in an upcoming post.)

Here's a short explanation of our new mission:
  • We care a lot about AI Safety efforts in particular, and about otherwise increasing the odds that humanity reaches the stars.

  • Also, we[1] believe such efforts are bottlenecked more by our collective epistemology than by the number of people who verbally endorse or act on "AI Safety", or any other "spreadable viewpoint" disconnected from its derivation.

  • Our aim is therefore to find ways of improving both individual thinking skill, and the modes of thinking and social fabric that allow people to think together.  And to do this among the relatively small sets of people tackling existential risk. 


To elaborate a little:

Existential wins and AI safety

By an “existential win”, we mean humanity creates a stable, positive future.  We care a heck of a lot about this one.

Our working model here accords roughly with the model in Nick Bostrom’s book Superintelligence.  In particular, we believe that if general artificial intelligence is at some point invented, it will be an enormously big deal.

(Lately, AI Safety is being discussed by everyone from The Economist to Newsweek to Obama to an open letter from eight thousand.  But we’ve been thinking on this, and backchaining partly from it, since before that.)

Who we’re focusing on, why

Our preliminary investigations agree with The Onion’s; despite some looking, we have found no ultra-competent group of people behind the scenes who have fully got things covered.

What we have found are:
  • AI and machine learning graduate students, researchers, project-managers, etc. who care; who can think; and who are interested in thinking better;
  • Students and others affiliated with the “Effective Altruism” movement, who are looking to direct their careers in ways that can do the most good;
  • Rationality geeks, who are interested in seriously working to understand how the heck thinking works when it works, and how to make it work even in domains as confusing as AI safety.
These folks, we suspect, are the ones who can give humanity the most boost in its survival-odds per dollar of CFAR’s present efforts (which is a statement partly about us, but so it goes).  We’ve been focusing on them.  

(For the sake of everyone.  Would you rather: (a) have bad rationality skills yourself; or (b) be killed by a scientist or policy-maker who also had bad rationality skills?)

Brier-boosting, not Signal-boosting

Everyone thinks they’re right.  We do, too.  So we have some temptation to take our own favorite current models of AI Safety strategy and to try to get everyone else to shut up about their models and believe ours instead.

This understandably popular activity is often called “signal boosting”, “raising awareness”, or doing “outreach”.

At CFAR, though, we force ourselves not to do “signal boosting” in this way.  Our strategy is to spread general-purpose thinking skills, not our current opinions.  It is important that we get the truth-seeking skills themselves to snowball across relevant players, because ultimately, creating a safe AI (or otherwise securing an existential win) is a research problem.  Nobody, today, has copyable opinions that will get us there.

We like to call this “Brier boosting”, because a “Brier score” is a measure of predictive accuracy. 

Lever and World


[1] By "We believe X", we do not mean to assert that every CFAR staff member individually believes X.  (Similarly for "We care about Y".)  We mean rather that CFAR as an organization is planning/acting as though X is true.  (Much as if CFAR promises you a rationality T-shirt, that isn't an individual promise from each of the individuals at CFAR; it is rather a promise from the organization as such.)

If we're going to build an art of rationality, we'll need to figure out how to create an organization where people can individually believe whatever the heck they end up actually believing as they chase the evidence, while also having the organization qua organization be predictable/intelligible.

ETA:
You may also want to check out two documents we posted in the days since this post:

Comments

This is just a guess, but I think CFAR and the CFAR-sphere would be more effective if they focused more on hypothesis generation (or "imagination", although that term is very broad). E.g., a year or so ago, a friend of mine in the Thiel-sphere proposed starting a new country by hauling nuclear power plants to Antarctica, and then just putting heaters on the ground to melt all the ice. As it happens, I think this is a stupid idea (hot air rises, so the newly heated air would just blow away, pulling in more cold air from the surroundings). But it is an idea, and the same person came up with (and implemented) a profitable business plan six months or so later. I can imagine HPJEV coming up with that idea, or Elon Musk, or von Neumann, or Google X; I don't think most people in the CFAR-sphere would. It's just not the kind of thing I think they've focused on practicing.

There's a difference between optimizing for truth and optimizing for interestingness. Interestingness is valuable for truth in the long run because the more hypotheses you have, the better your odds of stumbling on the correct hypothesis. But naively optimizing for truth can decrease creativity, which is critical for interestingness.

I suspect "having ideas" is a skill you can develop, kind of like making clay pots. In the same way your first clay pots will be lousy, your first ideas will be lousy, but they will get better with practice.

...creation is embarrassing. For every new good idea you have, there are a hundred, ten thousand foolish ones, which you naturally do not care to display.

Source.

If this is correct, this also gives us clues about how to solve Less Wrong's content problem.

Online communities do not have a strong comparative advantage in compiling and presenting facts that are well understood. That's the sort of thing academics and journalists are already paid to do. If online communities have a comparative advantage, it's in exploring ideas that are neglected by the mainstream--things like AI risk, or CFARish techniques for being more effective.

Unfortunately, LW's culture has historically been pretty antithetical to creativity. It's hard to tell in advance whether an idea you have is a good one or not. And LW has often been hard on posts it considers bad. This made the already-scary process of sharing new ideas even more fraught with the possibility of embarrassment.

If a single individual present is unsympathetic to the foolishness that would be bound to go on at such a [brainstorming] session, the others would freeze. The unsympathetic individual may be a gold mine of information, but the harm he does will more than compensate for that. It seems necessary to me, then, that all people at a session be willing to sound foolish and listen to others sound foolish.

Same source.

I recommend recording ideas in a private notebook. I've been doing this for a few years, and I now have way more ideas than I know what to do with.

Elon Musk

Relevant: http://waitbutwhy.com/2015/11/the-cook-and-the-chef-musks-secret-sauce.html

Definitely agree with the importance of hypothesis generation and the general lack of it–at least for me, I would classify this as my main business-related weakness, relative to successful people I know.

headline: CFAR considering colonizing Antarctica.

History repeats itself. Seafarers have always been fond of colonizing distant lands.

Colonizing Antarctica and making a whole slew of new countries is actually a good idea IMO, but it doesn't have enough appeal. The value to humanity of creating new countries that can innovate on institutions is large.

You can think of Mars colonization as a more difficult version of Antarctic colonization which is actually going to be attempted because it sounds cooler.

I can imagine thinking of such an idea. If you start with the assumption that colonizing Mars is really hard it's the next step to think about what we could colonize on earth.

There's much empty land in Australia that could be colonized more easily than the Arctic.

For the sake of counterfactual historical accuracy, if anyone came up with it, it would be Leo Szilard.

A few nitpicks on choice of "Brier-boosting" as a description of CFAR's approach:

Predictive power is maximized when Brier score is minimized

Brier score is the sum of squared differences between probabilities assigned to events and indicator variables that are 1 or 0 according to whether the event did or did not occur. Good calibration therefore corresponds to minimizing Brier score rather than maximizing it, and "Brier-boosting" suggests maximization.

What's referred to as "quadratic score" is essentially the same as the negative of Brier score, and so maximizing quadratic score corresponds to maximizing predictive power.

Brier score fails to capture our intuitions about assignment of small probabilities

A more substantive point is that even though the Brier score is minimized by being well-calibrated, the way in which it varies with the probability assigned to an event does not correspond to our intuitions about how good a probabilistic prediction is. For example, suppose four observers A, B, C and D assigned probabilities 0.5, 0.4, 0.01 and 0.000001 (respectively) to an event E occurring, and the event turns out to occur. Intuitively, B's prediction is only slightly worse than A's prediction, whereas D's prediction is much worse than C's prediction. But the difference between the increase in B's Brier score and the increase in A's Brier score is 0.36 - 0.25 = 0.11, which is much larger than the corresponding difference for D and C, which is approximately 0.02.
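For anyone who wants to check the arithmetic, here is a minimal Python sketch of the four-observer example. The function and variable names (`brier_penalty`, `forecasts`, `gap_b_vs_a`, `gap_d_vs_c`) are illustrative assumptions, not from any library; the penalty is the squared difference between the assigned probability and the 0/1 outcome.

```python
def brier_penalty(p: float, occurred: bool = True) -> float:
    """Squared error between the assigned probability and the 0/1 outcome."""
    outcome = 1.0 if occurred else 0.0
    return (p - outcome) ** 2


# Probabilities that observers A-D assigned to event E, which then occurred.
forecasts = {"A": 0.5, "B": 0.4, "C": 0.01, "D": 0.000001}
penalties = {name: brier_penalty(p) for name, p in forecasts.items()}

# How much worse each "slightly lower" forecast is scored than its neighbour.
gap_b_vs_a = penalties["B"] - penalties["A"]
gap_d_vs_c = penalties["D"] - penalties["C"]

for name, penalty in penalties.items():
    print(f"{name}: Brier penalty = {penalty:.6f}")
print(f"B minus A: {gap_b_vs_a:.4f}")
print(f"D minus C: {gap_d_vs_c:.4f}")
```

The gap between D and C comes out several times smaller than the gap between B and A, even though D was off by four more orders of magnitude than C.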

Brier score is not constant across mathematically equivalent formulations of the same prediction

Suppose that a basketball player is to make three free throws, observer A predicts that the player makes each one with probability p and suppose that observer B accepts observer A's estimate and notes that this implies that the probability that the player makes all three free throws is p^3, and so makes that prediction.

Then if the player makes all three free throws, observer A's Brier score increases by

3*(1 - p)^2

while observer B's Brier score increases by

(1 - p^3)^2

But these two expressions are not equal in general, e.g. for p = 0.9 the first is 0.03 and the second is 0.073441. So changes to Brier score depend on the formulation of a prediction as opposed to the prediction itself.
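A short Python sketch of the free-throw comparison, under the same assumptions as the example (every throw is made, and observer B's joint probability is simply p cubed). The names `brier_separate` and `brier_joint` are hypothetical helpers introduced only for illustration.

```python
def brier_separate(p: float, throws: int = 3) -> float:
    """Observer A: one prediction of probability p per throw, every throw made."""
    return throws * (1.0 - p) ** 2


def brier_joint(p: float, throws: int = 3) -> float:
    """Observer B: a single prediction p**throws that all throws are made."""
    return (1.0 - p ** throws) ** 2


p = 0.9
separate = brier_separate(p)
joint = brier_joint(p)

print(f"A (three separate predictions): {separate:.6f}")
print(f"B (one joint prediction):       {joint:.6f}")
```

Trying other values of `p` shows that the two penalties generally disagree, which is the formulation-dependence described above.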

======

The logarithmic scoring rule handles small probabilities well, and is invariant under changing the representation of a prediction, and so is preferred. I first learned of this from Eliezer's essay A Technical Explanation of a Technical Explanation.
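As a rough check of both claims, here is a Python sketch that applies a logarithmic penalty to the two examples above. It assumes the convention that the penalty is the negative natural log of the probability assigned to what actually happened (lower is better), and it treats the three free throws as independent, as the example does. All names are illustrative.

```python
import math


def log_penalty(p: float) -> float:
    """Negative log of the probability assigned to the outcome that occurred."""
    return -math.log(p)


# Small probabilities: observers A-D from the earlier example.
forecasts = {"A": 0.5, "B": 0.4, "C": 0.01, "D": 0.000001}
log_penalties = {name: log_penalty(p) for name, p in forecasts.items()}
log_gap_b_vs_a = log_penalties["B"] - log_penalties["A"]
log_gap_d_vs_c = log_penalties["D"] - log_penalties["C"]
print(f"B minus A: {log_gap_b_vs_a:.3f}")
print(f"D minus C: {log_gap_d_vs_c:.3f}")

# Representation: three separate free-throw predictions vs. one joint prediction.
p = 0.9
log_separate = 3 * log_penalty(p)
log_joint = log_penalty(p ** 3)
print(f"three separate predictions: {log_separate:.6f}")
print(f"one joint prediction:       {log_joint:.6f}")
```

Under this rule D is penalized far more than C, and the separate and joint formulations agree up to floating-point error, because the log of a product is the sum of the logs.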

Minimizing logarithmic score is equivalent to maximizing the likelihood function for logistic regression / binary classification. Unfortunately, the phrase "likelihood boosting" has one more syllable than "Brier boosting" and doesn't have the same alliterative ring to it, so I don't have an actionable alternative suggestion :P.

Good point!

(And thanks for explaining clearly and noting where you learned about logarithmic scoring.)

I would suggest that "helping people think more clearly so that they'll find truth better, instead of telling them what to believe" already has a name, and it's "the Socratic method." It's unfortunate that this has the connotation of "do everything in a Q&A format", though.

"Brier scoring" is not a very natural scoring rule (log scoring is better; Jonah and Eliezer already covered the main reasons, and it's what I used when designing the Credence Game for similar reasons). It also sets off a negative reaction in me when I see someone naming their world-changing strategy after it. It makes me think the people naming their strategy don't have enough mathematician friends to advise them otherwise... which, as evidenced by these comments, is not the case for CFAR ;) Possible re-naming options that contrast well with "signal boosting":

  • Score boosting
  • Signal filtering
  • Signal vetting

I don't think the first problem is a big deal. No-one worries about "I boosted that from a Priority 3 to a Priority 1 bug".

If CFAR will be discontinuing/de-emphasizing rationality workshops for the general educated public, then I'd like to see someone else take up that mantle, and I'd hope that CFAR would make it easy for such a startup to build on what they've learned so far.

We'll be continuing the workshops, at least for now, with less direct focus, but with probably a similar amount of net development time going into them even if the emphasis is on more targeted programs. This is partly because we value the existence of an independent rationality community (varied folks doing varied things adds to the art and increases its integrity), and partly because we’re still dependent on the workshop revenue for part of our operating budget.

Re: others taking up the mantle: we are working to bootstrap an instructor training; have long been encouraging our mentors and alumni to run their own thingies; and are glad to help others do so. Also Kaj Sotala seems to be developing some interesting training thingies designed to be shared.

Feedback from someone who really enjoyed your May workshop (and I gave this same feedback then, too): Part of the reason I was willing to go to CFAR was that it is separate (or at least pretends to be separate, even though they share personnel and office space) from MIRI. I am 100% behind rationality as a project but super skeptical of a lot of the AI stuff that MIRI does (although I still follow it because I do find it interesting, and a lot of smart people clearly believe strongly in it so I'm prepared to be convinced.) I doubt I'm the only one in this boat.

Also, I'm super uncomfortable being associated with AI safety stuff on a social level because it has a huge image problem. I'm barely comfortable being associated with "rationality" at all because of how closely associated it is (in my social group, at least) with AI safety's image problem. (I don't exaggerate when I say that my most-feared reaction to telling people I'm associated with "rationalists" is "oh, the basilisk people?")

I had mixed feelings towards this post, and I've been trying to process them.

On the positive side:

  • I think AI safety is important, and that collective epistemology is important for this, so I'm happy to know that there will be some attention going to this.
  • There may be synergies to doing some of this alongside more traditional rationality work in the same org.

On the negative side:

  • I think there is an important role for pursuing rationality qua rationality, and that this will be harder to do consistently under an umbrella with AI safety as an explicit aim. For example one concern is that there will be even stronger pressure to accept community consensus that AI safety is important rather than getting people to think this through for themselves. Since I agree with you that the epistemology matters, this is concerning to me.
  • With a growing community, my first inclination would be that one could support both organisations, and that it would be better to have something new focus on epistemology-for-AI, while CFAR in a more traditional form continues to focus more directly on rationality (just as Open Phil split off from GiveWell rather than replacing the direction of GiveWell). I imagine you thought about this; hopefully you'll address it in one of the subsequent posts.
  • There is potential reputational damage by having these things too far linked. (Though also potential reputational benefits. I put this in "mild negative" for now.)

On the confused side:

  • I thought the post did an interesting job of saying more reasonable things than the implicature. In particular I thought it was extremely interesting that it didn't say that AI safety was a new focus. Then in the ETA you said "Even though our aim is explicitly AI Safety..."

I think framing matters a lot here. I'd feel much happier about a CFAR whose aim was developing and promoting individual and group rationality in general and particularly for important questions, one of whose projects was focusing on AI safety, than I do about a CFAR whose explicit focus is AI safety, even if the basket of activities they might pursue in the short term would look very similar. I wonder if you considered this?

Thanks for the thoughts; I appreciate it.

I agree with you that framing is important; I just deleted the old ETA. (For anyone interested, it used to read:

ETA: Having talked just now to people at our open house, I would like to clarify: Even though our aim is explicitly AI Safety...
CFAR does still need an art of rationality, and a community of rationality geeks that support that. We will still be investing at least some in that community. We will also still be running some "explore" workshops of different sorts aiming at patching gaps in the art (funding permitting), not all of which will be deliberately and explicitly backchained from AI Safety (although some will). Play is generative of a full rationality art. (In addition to sometimes targeting things more narrowly at particular high-impact groups, and otherwise more directly backchaining.) (More in subsequent posts.)

I'm curious where our two new docs leave you; I think they make clearer that we will still be doing some rationality qua rationality.

Will comment later re: separate organizations; I agree this is an interesting idea; my guess is that there isn't enough money and staff firepower to run a good standalone rationality organization in CFAR's stead, and also that CFAR retains quite an interest in a standalone rationality community and should therefore support it... but I'm definitely interested in thoughts on this.

Julia will be launching a small spinoff organization called Convergence, facilitating double crux conversations between EAs and EA-adjacent people in, e.g., tech and academia. It'll be under the auspices of CFAR for now but will not have opinions on AI. I'm not sure if that hits any of what you're after.

Thanks for engaging. Further thoughts:

I agree with you that framing is important; I just deleted the old ETA.

For what it's worth I think even without saying that your aim is explicitly AI safety, a lot of people reading this post will take that away unless you do more to cancel the implicature. Even the title does this! It's a slightly odd grammatical construction which looks an awful lot like CFAR’s new focus: AI Safety; I think without being more up-front about alternative interpretation it will sometimes be read that way.

I'm curious where our two new docs leave you

Me too! (I assume that these have not been posted yet, but if I'm just failing to find them please let me know.)

I think they make clearer that we will still be doing some rationality qua rationality.

Great. Just to highlight that I think there are two important aspects of doing rationality qua rationality:

  • Have the people pursuing the activity have this as their goal. (I'm less worried about you failing on this one.)
  • Have external perceptions be that this is what you're doing. I have some concern that rationality-qua-rationality activities pursued by an AI safety org will be perceived as having an underlying agenda relating to that. And that this could e.g. make some people less inclined to engage, even relative to if they're run by a rationality org which has a significant project on AI safety.

my guess is that there isn't enough money and staff firepower to run a good standalone rationality organization in CFAR's stead

I feel pretty uncertain about this, but my guess goes the other way. Also, I think if there are two separate orgs, the standalone rationality one should probably retain the CFAR brand! (as it seems more valuable there)

I do worry about transition costs and losing synergies of working together from splitting off a new org. Though these might be cheaper earlier than later, and even if it's borderline right now whether there's enough money and staff to do both I think it won't be borderline within a small number of years.

Julia will be launching a small spinoff organization called Convergence

This sounds interesting! That's a specialised enough remit that it (mostly) doesn't negate my above concerns, but I'm happy to hear about it anyway.

Even the title does this! It's a slightly odd grammatical construction which looks an awful lot like CFAR’s new focus: AI Safety; I think without being more up-front about alternative interpretation it will sometimes be read that way.

Datapoint: it wasn't until reading your comment that I realized that the title actually doesn't read "CFAR's new focus: AI safety".

I am annoyed by this post because you describe it as, "we had a really good idea and then we decided to post this instead of getting to that idea".

I don't see the point of building anticipation. I like to quote, "start as close to the end, then go forward."

To coordinate we need a leader that many of us would sacrifice for. The obvious candidates are Eliezer Yudkowsky, Peter Thiel, and Scott Alexander. Perhaps we should develop a process by which a legitimate, high-quality leader could be chosen.

Edit: I see mankind as walking towards a minefield. We are almost certainly not in the minefield yet, but at our current rate we will almost certainly hit it this century. Lots of people don't think the minefield exists, or think that fate or God will protect us from it, and competitive pressures (Moloch) make lots of people individually better off if they push us a bit faster towards it.

I disagree. The LW community already has capable high-status people who many others in the community look up to and listen to suggestions from. It's not clear to me what the benefit is from picking a single leader. I'm not sure what kinds of coordination problems you had in mind, but I'd expect that most such problems that could be solved by a leader issuing a decree could also be solved by high-status figures coordinating with each other on how to encourage others to coordinate. High-status people and organizations in the LW community communicate with each other a fair amount, so they should be able to do that.

And there are significant costs to picking a leader. It creates a single point of failure, making the leader's mistakes more costly, and inhibiting innovation in leadership style. It also creates PR problems; in fact, LW already has faced PR problems regarding being an Eliezer Yudkowsky personality cult.

Also, if we were to pick a leader, Peter Thiel strikes me as an exceptionally terrible choice.

The ten up-votes you have for this post are a signal that either we shouldn't have a leader or, if we should, it would be difficult for him/her to overcome the opposition in the rationality movement to having a leader.

Also, if we were to pick a leader, Peter Thiel strikes me as an exceptionally terrible choice.

I agree we shouldn't pick a leader, but I'm curious why you think this. He's the only person on the list who's actually got leadership experience (CEO of PayPal), and he did a pretty good job.

Leading a business and leading a social movement require different skill sets, and Peter Thiel is also the only person on the list who isn't even part of the LW community. Bringing in someone only tangentially associated with a community as its leader doesn't seem like a good idea.

The key to deciding if we need a leader is to look at historically similar situations and see if they benefited from having a leader. Given that we would very much like to influence government policy, Peter Thiel strikes me as the best possible choice if he would accept. I read somewhere that when Julius Caesar was going to attack Rome several Senators approached Pompey the Great, handed him a sword, and said "save Rome." I seriously think we should try something like this with Thiel.

Given that we would very much like to influence government policy

How would the position of leader of the LW community help Peter Thiel do this? Also, Peter Thiel's policy priorities seem to differ a fair amount from those of the average lesswronger, and I'd be pretty surprised if he agreed to change priorities substantially in order to fit with his role as LW leader.

Given that we would very much like to influence government policy

Is this actually a thing that we would want? It seems to me like this line of reasoning depends on a lot of assumptions that don't seem all that shared.

(I do think that rationalists should coordinate more, but I don't think rationalists executing the "just obey authority" action is likely to succeed. That seems like a recipe for losing a lot of people from the 'rationalist' label. I think there are other approaches that are better suited to the range of rationalist personalities and that still have enough tradition behind them to be likely to work; the main inspirations here are Norse þings and Quaker meetings.)

I read somewhere that when Julius Caesar was going to attack Rome several Senators approached Pompey the Great, handed him a sword, and said "save Rome." I seriously think we should try something like this with Thiel.

At the moment Peter Thiel should spend all his available time recruiting people for the Trump administration to fill those 4000 places that are open. Asking him to spend any time elsewhere is likely not effective.

If I remember correctly, history records Caesar as having been relentlessly successful in that campaign?

If Alyssa Vance is correct that the community is bottlenecked on idea generation, I think this is exactly the wrong way to respond. My current view is that increasing hierarchy has the advantage of helping people coordinate better, but it has the disadvantage that people are less creative in a hierarchical context. Isaac Asimov on brainstorming:

If a single individual present has a much greater reputation than the others, or is more articulate, or has a distinctly more commanding personality, he may well take over the conference and reduce the rest to little more than passive obedience. The individual may himself be extremely useful, but he might as well be put to work solo, for he is neutralizing the rest.

I believe this has already happened to the community through the quasi-deification of people like Eliezer, Scott, and Gwern. It's odd, because I generally view the LW community as quite nontraditional. But when I look at academia, I get the impression that college professors are significantly closer in status to their students than our intellectual leadership.

This is my steelman of people who say LW is a cult. It's not a cult, but large status differences might be a sociological "code smell" for intellectual communities. Think of the professor who insists that they always be addressed as "Dr. Jones" instead of being called by their first name. This is rarely the sort of earnest, energetic, independent-minded person who makes important discoveries. "The people I know who do great work think that they suck, but that everyone else sucks even more."

The problem is compounded by the fact that Eliezer, Scott, and Gwern are not actually leaders. They're high status, but they aren't giving people orders. This leads to leadership vacuums.

My current guess is that we should work on idea generation at present, then transform into a more hierarchical community when it's obvious what needs to be done. I don't know what the best community structure for idea generation is, but I suspect the university model is a good one: have a selective admissions process, while keeping the culture egalitarian for people who are accepted. At least this approach is proven.

I shall preface by saying that I am neither a rationalist nor an aspiring rationalist. Instead, I would classify myself as a "rationality consumer" - I enjoy debating philosophy and reading good competence/insight porn. My life is good enough that I don't anticipate much subjective value from optimizing my decisionmaking.

I don't know how representative I am. But I think if you want to reach "people who have something to protect" you need to use different approaches from "people who like competence porn", and I think while a site like LW can serve both groups we are to some extent running into issues where we may have a population that is largely the latter instead of the former - people admire Gwern, but who wants to be Gwern? Who wants to be like Eliezer or lukeprog? We may not want leaders, but we don't even have heroes.

I think possibly what's missing, and this is especially relevant in the case of CFAR, is a solid, empirical, visceral case for the benefit of putting the techniques into action. At the risk of being branded outreach, and at the very real risk of significantly skewing their post-workshop stats gathering, CFAR should possibly put more effort into documenting stories of success through applying the techniques. I think the main focus of research should be full System-1 integration, not just for the techniques themselves but also for CFAR's advertisement. I believe it's possible to do this responsibly if one combines it with transparency and System-2 relevant statistics. Contingent, of course, on CFAR delivering the proportionate value.

I realize that there is a chicken-and-egg problem here where for reasons of honesty, you want to use System-1-appealing techniques that only work if the case is solid, which is exactly the thing that System-1 is traditionally bad at! I'm not sure how to solve that, but I think it needs to be solved. To my intuition, rationality won't take off until it's value-positive for S1 as well as S2. If you have something to protect you can push against S1 in the short-term, but the default engagement must be one of playful ease if you want to capture people in a state of idle interest.

CFAR should possibly put more effort into documenting stories of success through applying the techniques.

They do put effort into this; I do wonder how communicable it is, though.

For example, at one point Anna described a series of people all saying something like "well, I don't know if it had any relationship to the workshop, but I did X, Y, and Z" during followups that, across many followups, seemed obviously due to the workshop. But it might be a vague thing that's easier to see when you're actually doing the followups rather than communicating statistics about followups.

I shall preface by saying that I am neither a rationalist nor an aspiring rationalist. Instead, I would classify myself as a "rationality consumer" - I enjoy debating philosophy and reading good competence/insight porn. My life is good enough that I don't anticipate much subjective value from optimizing my decisionmaking.

Thanks so much for saying this! Thinking about this distinction you made, I feel there may be actually four groups of LW readers, with different needs or expectations from the website:

"Science/Tech Fans" -- want more articles about new scientific research and new technologies. "Has anyone recently discovered a new particle, or built a new machine? Give me a popular science article about it!"

"Competence/Insight Consumers" -- want more articles about pop psychology theories and life hacks. They feel they are already doing great, and only want to improve small details. "What do you believe is the true source of human motivation, and how do you organize your to-do lists? But first, give me your credentials: are you a successful person?"

"Already Solving a Problem" -- want feedback on their progress, and information specifically useful for them. Highly specific; two people in the same category working on completely different problems probably wouldn't benefit too much from talking to each other. If they achieve critical mass, it would be best to make a subgroup for them (except that LW currently does not support creating subgroups).

"Not Started Yet" -- inspired by the Sequences, they would like to optimize their lives and the universe, but... they are stuck in place, or advancing very very slowly. They hope for some good advice that would make something "click", and help them leave the ground.

Maybe it's poll time... what do you want to read about?


If anyone's mind is in a place where they think they'd be more productive or helpful if they sacrificed themselves for a leader, then, with respect, I think the best thing they can do for protecting humanity's future is to fix that problem in themselves.

The way people normally solve big problems is to have a leader people respect, follow, and are willing to sacrifice for. If there is something in rationalists that prevents us from accepting leadership then the barbarians will almost certainly beat us.

I support this, whole-heartedly :) CFAR has already created a great deal of value without focusing specifically on AI x-risk, and I think it's high time to start trading the breadth of perspective CFAR has gained from being fairly generalist for some more direct impact on saving the world.

Hi Anna, could you please explain how CFAR decided to focus on AI safety, as opposed to other plausible existential risks like totalitarian governments or nuclear war?

Coming up. Working on a blog post about it; will probably have it up in ~4 days.

I intend to donate to MIRI this year; do you anticipate that upcoming posts or other reasoning/resources might or should persuade people like myself to donate to CFAR instead?

Our aim is therefore to find ways of improving both individual thinking skill, and the modes of thinking and social fabric that allow people to think together. And to do this among the relatively small sets of people tackling existential risk.

I get the impression that 'new ways of improving thinking skill' is a task that has mostly been saturated. The reasons people perhaps don't have great thinking skill might be:

1) Reality provides extremely sparse feedback on 'the quality of your/our thinking skills' so people don't see it as very important.

2) For a human, who represents 1/7 billionth of our species, thinking rationally is often a worse option than thinking irrationally in the same way as a particular group of humans, so as to better facilitate group membership via shared opinions. It's very hard to 'go it alone'.

3) (related to 2) Most decisions that a human has to make have already been faced by innumerable previous humans and do not require a lot of deep, fundamental-level thought.

These effects seem to present challenges to level-headed, rational thinking about the future of humanity. I see a lot of #2 in bad, broken thinking about AI risk, where the topic is treated as a proxy war for prosecuting various political/tribal conflicts.

Actually it is possible that the worst is yet to come in terms of political/tribal conflict influencing AI risk thinking.