OpenAI makes humanity less safe

If there's anything we can do now about the risks of superintelligent AI, then OpenAI makes humanity less safe.

Once upon a time, some good people were worried about the possibility that humanity would figure out how to create a superintelligent AI before they figured out how to tell it what we wanted it to do.  If this happened, it could lead to literally destroying humanity and nearly everything we care about. This would be very bad. So they tried to warn people about the problem, and to organize efforts to solve it.

Specifically, they called for work on aligning an AI’s goals with ours - sometimes called the value alignment problem, AI control, friendly AI, or simply AI safety - before rushing ahead to increase the power of AI.

Some other good people listened. They knew they had no relevant technical expertise, but what they did have was a lot of money. So they did the one thing they could do - throw money at the problem, giving it to trusted parties to try to solve the problem. Unfortunately, the money was used to make the problem worse. This is the story of OpenAI.

Before I go on, two qualifiers:

  1. This post will be much easier to follow if you have some familiarity with the AI safety problem. For a quick summary you can read Scott Alexander’s Superintelligence FAQ. For a more comprehensive account see Nick Bostrom’s book Superintelligence.
  2. AI is an area in which even most highly informed people should have lots of uncertainty. I wouldn't be surprised if my opinion changes a lot after publishing this post, as I learn relevant information. I'm publishing this because I think this process should go on in public.

The story of OpenAI

Before OpenAI, there was DeepMind, a for-profit venture working on “deep learning” techniques. It was widely regarded as the advanced AI research organization. If any current effort was going to produce superhuman intelligence, it was DeepMind.

Elsewhere, industrialist Elon Musk was working on more concrete (and largely successful) projects to benefit humanity, like commercially viable electric cars, solar panels cheaper than ordinary roofing, cheap spaceflight with reusable rockets, and a long-run plan for a Mars colony. When he heard the arguments people like Eliezer Yudkowsky and Nick Bostrom were making about AI risk, he was persuaded that there was something to worry about - but he initially thought a Mars colony might save us. But when DeepMind’s head, Demis Hassabis, pointed out that this wasn't far enough to escape the reach of a true superintelligence, he decided he had to do something about it:

Hassabis, a co-founder of the mysterious London laboratory DeepMind, had come to Musk’s SpaceX rocket factory, outside Los Angeles, a few years ago. […] Musk explained that his ultimate goal at SpaceX was the most important project in the world: interplanetary colonization.

Hassabis replied that, in fact, he was working on the most important project in the world: developing artificial super-intelligence. Musk countered that this was one reason we needed to colonize Mars—so that we’ll have a bolt-hole if A.I. goes rogue and turns on humanity. Amused, Hassabis said that A.I. would simply follow humans to Mars.

[…]

Musk is not going gently. He plans on fighting this with every fiber of his carbon-based being. Musk and Altman have founded OpenAI, a billion-dollar nonprofit company, to work for safer artificial intelligence.

OpenAI’s primary strategy is to hire top AI researchers to do cutting-edge AI capacity research and publish the results, in order to ensure widespread access. Some of this involves making sure AI does what you meant it to do, which is a form of the value alignment problem mentioned above.

Intelligence and superintelligence

No one knows exactly what research will result in the creation of a general intelligence that can do anything a human can, much less a superintelligence - otherwise we’d already know how to build one. Some AI research is clearly not on the path towards superintelligence - for instance, applying known techniques to new fields. Other AI research is more general, and might plausibly be making progress towards a superintelligence. It could be that the sort of research DeepMind and OpenAI are working on is directly relevant to building a superintelligence, or it could be that their methods will tap out long before then. These are different scenarios, and need to be evaluated separately.

What if OpenAI and DeepMind are working on problems relevant to superintelligence?

If OpenAI is working on things that are directly relevant to the creation of a superintelligence, then its very existence makes an arms race with DeepMind more likely. This is really bad! Moreover, sharing results openly makes it easier for other institutions or individuals, who may care less about safety, to make progress on building a superintelligence.

Arms races are dangerous

One thing nearly everyone thinking seriously about the AI problem agrees on, is that an arms race towards superintelligence would be very bad news. The main problem occurs in what is called a “fast takeoff” scenario. If AI progress is smooth and gradual even past the point of human-level AI, then we may have plenty of time to correct any mistakes we make. But if there’s some threshold beyond which an AI would be able to improve itself faster than we could possibly keep up with, then we only get one chance to do it right.

AI value alignment is hard, and AI capacity is likely to be easier, so anything that causes an AI team to rush makes our chances substantially worse; if they get safety even slightly wrong but get capacity right enough, we may all end up dead. But if you’re worried that the other team will unleash a potentially dangerous superintelligence first, then you might be willing to skip some steps on safety to preempt them. And they, having more reason to trust themselves than you, might notice that you’re rushing ahead, get worried that your team will destroy the world, and rush their (probably safe but they’re not sure) AI into existence.

OpenAI promotes competition

DeepMind used to be the standout AI research organization. With a comfortable lead on everyone else, they would be able to afford to take their time to check their work if they thought they were on the verge of doing something really dangerous. But OpenAI is now widely regarded as a credible close competitor. However dangerous you think DeepMind might have been in the absence of an arms race dynamic, this makes them more dangerous, not less. Moreover, by sharing their results, they are making it easier to create other close competitors to DeepMind, some of whom may not be so committed to AI safety.

We at least know that DeepMind, like OpenAI, has put some resources into safety research. What about the unknown people or organizations who might leverage AI capacity research published by OpenAI?

For more on how openly sharing technology with extreme destructive potential might be extremely harmful, see Scott Alexander’s Should AI be Open?, and Nick Bostrom’s Strategic Implications of Openness in AI Development.

What if OpenAI and DeepMind are not working on problems relevant to superintelligence?

Suppose OpenAI and DeepMind are largely not working on problems highly relevant to superintelligence. (Personally I consider this the more likely scenario.) By portraying short-run AI capacity work as a way to get to safe superintelligence, OpenAI’s existence diverts attention and resources from things actually focused on the problem of superintelligence value alignment, such as MIRI or FHI.

I suspect that in the long run this will make it harder to get funding for long-run AI safety organizations. The Open Philanthropy Project just made its largest grant ever, to OpenAI, to buy a seat on OpenAI’s board for Open Philanthropy Project executive director Holden Karnofsky. This is larger than their recent grants to MIRI, FHI, FLI, and the Center for Human-Compatible AI all together.

But the problem is not just money - it’s time and attention. The Open Philanthropy Project doesn’t think that OpenAI is underfunded or that it could do more good with the extra money. Instead, it seems to think that Holden can be a good influence on OpenAI. This means that of the time he's allocating to AI safety, a fair amount has been diverted to OpenAI.

This may also make it harder for organizations specializing in the sort of long-run AI alignment problems that don't have immediate applications to attract top talent. People who hear about AI safety research and are persuaded to look into it will have a harder time finding direct efforts to solve key long-run problems, since an organization focused on increasing short-run AI capacity will dominate AI safety's public image.

Why do good inputs turn bad?

OpenAI was founded by people trying to do good, and has hired some very good and highly talented people. It seems to be doing genuinely good capacity research. To the extent to which this is not dangerously close to superintelligence, it’s better to share this sort of thing than not – they could create a huge positive externality. They could construct a fantastic public good. Making the world richer in a way that widely distributes the gains is very, very good.

Separately, many people at OpenAI seem genuinely concerned about AI safety, want to prevent disaster, and have done real work to promote long-run AI safety research. For instance, my former housemate Paul Christiano, who is one of the most careful and insightful AI safety thinkers I know of, is currently employed at OpenAI. He is still doing AI safety work – for instance, he coauthored Concrete Problems in AI Safety with, among others, Dario Amodei, another OpenAI researcher.

Unfortunately, I don’t see how those two things make sense jointly in the same organization. I’ve talked with a lot of people about this in the AI risk community, and they’ve often attempted to steelman the case for OpenAI, but I haven’t found anyone willing to claim, as their own opinion, that OpenAI as conceived was a good idea. It doesn’t make sense to anyone, if you’re worried at all about the long-run AI alignment problem.

Something very puzzling is going on here. Good people tried to spend money on addressing an important problem, but somehow the money got spent on the thing most likely to make that exact problem worse. Whatever is going on here, it seems important to understand if you want to use your money to better the world.

(Cross-posted at my personal blog.)

Comments

A guy I know, who works in one of the top ML groups, is literally less worried about superintelligence than he is about getting murdered by rationalists. That's an extreme POV. Most researchers in ML simply think that people who worry about superintelligence are uneducated cranks addled by sci fi.

I hope everyone is aware of that perception problem.

Let me be as clear as I can about this. If someone does that, I expect it will make humanity still less safe. I do not know how, but the whole point of deontological injunctions is that they prevent you from harming your interests in hard to anticipate ways.

As bad as a potential arms race is, an arms race fought by people who are scared of being murdered by the AI safety people would be much, much worse. Please, if anyone reading this is considering vigilante violence against AI researchers, don't.

The right thing to do is tell people your concerns, like I am doing, as clearly and openly as you can, and try to organize legitimate, above-board ways to fix the problem.

I may be an outlier, but I've worked at a startup company that did machine learning R&D, and which was recently acquired by a big tech company, and we did consider the issue seriously. The general feeling of the people at the startup was that, yes, somewhere down the line the superintelligence problem would eventually be a serious thing to worry about, but like, our models right now are nowhere near becoming able to recursively self-improve independently of our direct supervision. Actual ML models basically need a ton of fine-tuning and engineering and are not really independent agents in any meaningful way yet.

So, no, we don't think people who worry about superintelligence are uneducated cranks... a lot of ML people do take it seriously enough that we've had casual lunch room debates about it. Rather, the reality on the ground is that right now most ML models have enough trouble figuring out relatively simple tasks like Natural Language Understanding, Machine Reading Comprehension, or Dialogue State Tracking, and none of us can imagine how solving those practical problems with say, Actor-Critic Reinforcement Learning models that lack any sort of will of their own, will lead suddenly to the emergence of an active general superintelligence.

We do still think that eventually things will likely develop, because people have been burned underestimating what A.I. advances will occur in the next X years, and when faced with the actual possibility of developing an AGI or ASI, we're likely to be much more careful in the future when things start to get closer to being realized. That's my humble opinion anyway.

This seems like a good place to point out the unilateralist's curse. If you're thinking about taking an action that burns a commons and notice that no one else has done it yet, that's pretty good evidence that you're overestimating the benefits or underestimating the costs.

This perception problem is a big part of the reason I think we are doomed if superintelligence will soon be feasible to create.

If my anecdotal evidence is indicative of reality, the attitude in the ML community is that people concerned about superhuman AI should not even be engaged with seriously. Hopefully that, at least, will change soon.

If you think there is a chance that he would accept, could you please tell the guy you are referring to that I would love to have him on my podcast. Here is a link to this podcast, and here is me.

Edited thanks to Douglas_Knight

Are you describing me? It fits to a T except my dayjob isn't ML. I post using this shared anonymous account here because in the past when I used my real name I received death threats online from LW users. In a meetup I had someone tell me to my face that if my AGI project crossed a certain level of capability, they would personally hunt me down and kill me. They were quite serious.

I was once open-minded enough to consider AI x-risk seriously. I was unconvinced, but ready to be convinced. But you know what? Any ideology that leads to making death threats against peaceful, non-violent open source programmers is not something I want to let past my mental hygiene filters.

If you, the person reading this, seriously care about AI x-risk, then please do think deeply about what causes this, and ask yourself what can be done to put a stop to this behavior. Even if you haven't done so yourself, it is something about the rationalist community which causes this behavior to be expressed.

--

I would be remiss not to lay out my own hypothesis. I believe much of this comes directly from ruthless utilitarianism and the "shut up and multiply" mentality. It's very easy to justify murder of one individual, or the threat of it even if you are not sure you'd carry it through, if it is offset by some imagined saving of the world. The problem here is that nobody is omniscient, and yet AI x-riskers are willing to be swayed by utility calculations that in reality have so much uncertainty that they should never be taken seriously. Vaniver's reference to the unilateralist's curse is spot-on.

Death threats are a serious matter and such behavior must be called out. If you really have received 3 or more death threats as you claim, you should be naming names of those who have been going around making death threats and providing documentation, as should be possible since you say at least two of them were online. (Not because the death threats are particularly likely to be acted on - I've received a number of angry death threats myself over my DNM work and they never went anywhere, as indeed >99.999% of death threats do - but because it's a serious violation of community norms, specific LW policy against 'threats against specific groups', and merely making them greatly poisons the community, sowing distrust and destroying its reputation.)

Especially since, because they are so serious, it is also serious if someone is hoaxing fake death threats and concern-trolling while hiding behind a throwaway... That sort of vague unspecific but damaging accusation is how games of telephone get started and, for example, why, 7+ years later, we still have journalists writing BS about how 'the basilisk terrified the LW community' (thanks to our industrious friends over on Ratwiki steadily inflating the claims from 1 or 2 people briefly worried to a community-wide crisis). I am troubled by the coincidence that, almost simultaneously with these claims, /r/slatestarcodex, probably the most active post-LW discussion forum, is also arguing over a long post - by another throwaway account - claiming that it is regarded as a cesspit of racism by unnamed experts, following hard on the heels of Caplan/Cowen slamming LW for the old chestnut of being a 'religion'. "You think people would do that? Just go on the Internet and tell lies?" Nor are these the first times that pseudonymous people online have shown up to make damaging but false or unsubstantiated accusations (su3su2su1 comes to mind as making similar claims and turning out to have 'lied for Jesus' about his credentials and the unnamed experts, as does whoever was behind that attempt to claim MIRI was covering up rape).

I agree with the 1st paragraph. You could have done without the accusations of concern trolling in the 2nd.

If, as you say, you agree with the first paragraph, it might behoove you to follow the advice given in said paragraph--naming the people who threatened you and providing documentation.

And call more attention to myself? No. What's good for the community is not the same as what protects myself and my family. Maybe you're missing the larger point here: this wasn't an isolated occurrence, or some unhinged individual. I didn't feel threatened by individuals making juvenile threats, I felt threatened by this community. I'm not the only one. I have not, so far, been stalked by anyone I think would be capable of doing me harm. Rather it is the case that multiple times in casual conversation it has come up that if the technology I work on advanced beyond a certain level, it would be a moral obligation to murder me to halt further progress. This was discussed just as one would debate the most effective charity to donate to. That the dominant philosophy here could lead to such outcomes is a severe problem with both the LW rationality community and x-risk in particular.

I'm curious if this is recent or in the past. I think there has been somewhat of a shift in the community since it became more associated with the fluffier EA movement.

You could get someone trusted to post the information anonymised on your behalf. I probably don't fit that bill though.

Are you describing me?

Unlikely. Generally speaking, people who work in ML, especially the top ML groups, aren't doing anything close to 'AGI'. (Many of them don't even take the notion of AGI seriously, let alone any sort of recursive self-improvement.) ML research is not "general" at all (the 'G' in AGI): even the varieties of "deep learning" that are said to be more 'general' and to be able to "learn their own features" only work insofar as the models are fit for their specific task! (There's a lot of hype in the ML world that sometimes obscures this, but it's invariably what you see when you look at which models approach SOTA, and which do poorly.) It's better to think of it as a variety of stats research that's far less reliant on formal guarantees and more focused on broad experimentation, heuristic approaches and an appreciation for computational issues.

We've returned various prominent AI researchers alive the last few times, we can't be that murderous.

I agree that there's a perception problem, but I think there are plenty of people who agree with us too. I'm not sure how much this indicates that something is wrong versus is an inevitable part of the dissemination (or, if I'm wrong, the eventual extinction) of the idea.

A friend of mine, who works in one of the top ML groups, is literally less worried about superintelligence than he is about getting murdered by rationalists.

That's not as irrational as it might seem! The point is, if you think (as most ML researchers do!) that the probability of current ML research approaches leading to any kind of self-improving, super-intelligent entity is low enough, the chances of evil Unabomber cultists being harbored within the "rationality community", however low, could easily be judged to be higher than that. (After all, given that Christianity endorses being peaceful and loving one's neighbors even when they wrong you, one wouldn't think that some of the people who endorse Christianity could bomb abortion clinics; yet these people do exist! The moral being, Pascal's mugging can be a two-way street.)

I think that perception will change once AI surpasses a certain threshold. That threshold won't necessarily be AGI - it could be narrow AI that is given control over something significant. Perhaps an algorithmic trading AI suddenly gains substantial control over the market and a small hedge fund becomes one of the richest in history overnight. Or AI-based tech companies begin to dominate and monopolize entire markets due to their substantial advantage in AI capability. I think that once narrow AI becomes commonplace in many applications, jobs begin to be lost due to robotic replacements, and AI makes many corporations too hard to compete with (Amazon might already be an example), the public will start to take interest in control over the technology and there will be less optimism about its use.

Thanks for saying what (I assume) a lot of people were thinking privately.

I think the problem is that Elon Musk is an entrepreneur not a philosopher, so he has a bias for action, "fail fast" mentality, etc. And he's too high-status for people to feel comfortable pointing out when he's making a mistake (as in the case of OpenAI). (I'm generally an admirer of Mr. Musk, but I am really worried that the intuitions he's honed through entrepreneurship will turn out to be completely wrong for AI safety.)

and now think about some visionary entrepreneur/philosopher coming in the past with OpenTank, OpenRadar, OpenRocket, OpenNuke... or OpenNanobot in the future

certainly the public will ensure proper control of the new technology

think about some visionary entrepreneur/philosopher coming in the past with OpenTank, OpenRadar, OpenRocket, OpenNuke... or OpenNanobot in the future

How about do-it-yourself genetic engineering?

My main problem with OpenAI is that it's one thing for them to not be focused on AI alignment, but are they even really focused on AI "safety" even in the loose sense of the word? Most of their published research has to do with tweaks and improvements to deep learning techniques that enhance their performance but do not really aid our theoretical understanding of them. (Which makes it pretty much the same as Google Brain, FAIR, and DeepMind in that regard.) It even turned out that Ian Goodfellow, the discoverer of GANs and the primary researcher on adversarial attacks on deep learning systems, left OpenAI and went back to Google because Google researchers were more interested than OpenAI in working on deep learning security issues...

On the $30 million grant from Open Philanthropy: I've seen it discussed on Hacker News and Reddit but not much here, and it seems like there's plenty of confusion about what's going on. After all, it is quite a large amount, but OpenAI seems to be quite well funded already. So the obvious question people have is, is this a ploy for the AI risk people to gain more control over OpenAI's research direction? And one thing I'm worried about is that there could be plenty of push-back on that, because it was such a bold move and the reasons given by Open Philanthropy for the grant do not indicate that this is what they were doing. And it seems there's quite a lot of hostility towards AI safety research in general.

The linked quote from Ian Goodfellow:

Yes, I left OpenAI at the end of February and returned to Google Brain. I enjoyed my time at OpenAI and am proud of the work my OpenAI colleagues and I accomplished. I returned to Google Brain because as time went on I found that my research focus on adversarial examples and related technologies like differential privacy saw me collaborate predominantly with colleagues at Google.

AI alignment isn't really OpenAI's primary mission. They're seeking to democratize access to AI technology, by developing AI technologies in the open (on GitHub, etc.) with permissive licenses. AI alignment is sort of a side research area that they are committing a small amount of time and resources to.

It says right on OpenAI's about me page:

OpenAI is a non-profit AI research company, discovering and enacting the path to safe artificial general intelligence.

That as stated looks like AI alignment to me, although I agree with you that in practice they are doing exactly what you said.

They're buying Holden a seat on the board in order to exercise unspecified influence over OpenAI. This is pretty clear from their grant writeup. I plan to write a bit about this soon.

The OpenAI people I've talked to say that they're less open than the name would suggest, and are willing to do things less openly to the extent that that makes sense to them. On the other hand, Gym and Universe are in fact pretty open and I think they probably made the world slightly worse, by slightly accelerating AI progress. It's possible that this might be offset by benefits to OpenAI's reputation if they're more willing to spread safety memes as they acquire more mind share.

Your story of OpenAI is incomplete in at least one important respect: Musk was actually an early investor in DeepMind before it was acquired by Google.

Finally, what do people think about the prospects of influencing OpenAI to err more on the side of safety from the inside? It's possible people like Paul can't do much about this yet by virtue of not having acquired sufficient influence within the company, and maybe just having more people like Paul working at OpenAI could strengthen that influence enough to matter.

I think our prospects for influence in a good direction are nonzero only if we make it common knowledge that no one credible thinks the original mandate of OpenAI promoted long-run AI safety. Beyond that I don't know.

I thought OpenAI was more about open sourcing deep learning algorithms and ensuring that a couple of rich companies/individuals weren't the only ones with access to the most current techniques. I could be wrong, but from what I understand OpenAI was never about AI safety issues as much as balancing power. Like, instead of building Jurassic Park safely, it let anyone grow a dinosaur in their own home.

to buy a seat on OpenAI’s board

I wish we lived in a world where the Open Philanthropy Project page could have just said it like that, instead of having to pretend that no one knows what "initiates a partnership between" means.

That world is called the planet Vulcan.

Meanwhile, on earth, we are subject to common knowledge/signalling issues...

Arguments for openness:

  • Everyone can see the bugs/ logical problems with your design.
  • Decreases the chance of an arms race, depending upon the psychology of the participants. It also decreases the chance of black ops against your team. If I think people are secretly developing an intelligence breakthrough I wouldn't trust them and would develop my own in secret. And/or attempt to sabotage their efforts and steal their technology (and win). If it is out there, there is little benefit to neutralizing your team of safety researchers.
  • If something is open you are more likely to end up in a multi-polar world. And if the intelligence that occurs only has a chance of being human aligned you may want to reduce variance by increasing the number of poles.
  • If an arms race is likely despite your best efforts, it is better that all the competitors have your control technology; this might require them to have your tech stack.

If someone is developing in the open, it is good proof that they are not unilaterally trying to impose their values on the future.

The future is hard, I'm torn on the question of openness.

I am curious about the frequency with which the second and fourth points get brought up as advantages. In the historical case, multipolar conflicts are the most destructive. Forestalling an arms race by giving away technology also sets that technology as the mandatory minimum.

As a result, every country that has a computer science department in their universities is now a potential belligerent, and violent conflict without powerful AI has been effectively ruled out.

Decreases the chance of arms race, depending upon psychology of the participants.

This may be a good argument in general, but given the actual facts on the ground when OpenAI was created, the reverse seems to have occurred.

I think the basic argument for OpenAI is that it is more dangerous for any one organization or world power to have an exclusive monopoly on A.I. technology, and so OpenAI is an attempt to safeguard against this possibility. Basically, it reduces the probability that someone like Alphabet/Google/DeepMind will establish an unstoppable first mover advantage and use it to dominate everyone else.

OpenAI is not really meant to solve the Friendly/Unfriendly AI problem. Rather it is meant to mitigate the dangers posed by for-profit corporations or nationalistic governments made up of humans doing what humans often do when given absurd amounts of power.

Personally I think OpenAI doesn't actually solve this problem sufficiently well because they are still based in the United States and thus beholden to U.S. laws, and wish that they'd chosen a different country, because right now the bleeding edge of A.I. technology is being developed primarily in a small region of California, and that just seems like putting all your eggs in one basket.

I do think however that the general idea of having a non-profit organization focused on AI technology is a good one, and better than the alternative of continuing to merely trust Google to not be evil.

So, um, you think that the arms race is likely to be between DeepMind and OpenAI?

And not between a highly secret organization funded by the US government and another similar organization funded by the Chinese government?

If there's anything we can do now about the risks of superintelligent AI, then OpenAI makes humanity less safe.

I feel quite strongly that people in the AI risk community are overly affected by the availability or vividness bias relating to an AI doom scenario. In this scenario some groups get into an AI arms race, build a general AI without solving the alignment problem, the AGI "fooms" and then proceeds to tile the world with paper clips. This scenario could happen, but some others could also happen:

  • An asteroid is incoming and going to destroy Earth. AI solves a complex optimization problem to allow us to divert the asteroid.
  • Terrorists engineer a virus to kill all persons with genetic trait X. An AI agent helps develop a vaccine before billions die.
  • By analyzing systemic risk in the markets, an AI agent detects and allows us to prevent the Mother of all Financial Meltdowns, that would have led to worldwide economic collapse.
  • An AI agent helps SpaceX figure out how to build a Mars colony for two orders of magnitude less money than otherwise, thereby enabling the colony to be built.
  • An AI system trained on vast amounts of bioinformatics and bioimaging data discovers the scientific cause of aging and also how to prevent it.
  • An AI climate analyzer figures out how to postpone climate change for millennia by diverting heat into the deep oceans, and gives us an inexpensive way to do so.
  • etc etc etc

These scenarios are equally plausible, involve vast benefit to humanity, and require only narrow AI. Why should we believe that these positive scenarios are less likely than the negative scenario?

Consider the difference between the frame of expected value/probability theory and the frame of bounded optimality/error minimization. Under the second frame the question becomes "how can I manipulate my environment such that I wind up in close proximity to the errors that I have a comparative advantage in spotting?"

Great post. I even worry about the emphasis on FAI, as it seems to depend on friendly superintelligent AIs effectively defending us against deliberately criminal AIs. Scott Alexander speculated:

For example, it might program a virus that will infect every computer in the world, causing them to fill their empty memory with partial copies of the superintelligence, which when networked together become full copies of the superintelligence.

But way before that, we will have humans looking to get rich programming such a virus, and you better believe they won't be using safeguards. It won't take over every computer in the world - just the ones that aren't defended by a more-powerful superintelligence (i.e. almost all computers) and that aren't interacting with the internet using formally verified software. We'll be attacked by a superintelligence running on billions of smart phones. Might be distributed initially through a compromised build of the hottest new social app for anonymous VR fucking.

Ugh. When I first heard about this I naively thought it was great news. Now I see it's a much harder question.