Will AGI surprise the world?

Cross-posted from my blog.

Yudkowsky writes:

In general and across all instances I can think of so far, I do not agree with the part of your futurological forecast in which you reason, "After event W happens, everyone will see the truth of proposition X, leading them to endorse Y and agree with me about policy decision Z."

...

Example 2: "As AI gets more sophisticated, everyone will realize that real AI is on the way and then they'll start taking Friendly AI development seriously."

Alternative projection: As AI gets more sophisticated, the rest of society can't see any difference between the latest breakthrough reported in a press release and that business earlier with Watson beating Ken Jennings or Deep Blue beating Kasparov; it seems like the same sort of press release to them. The same people who were talking about robot overlords earlier continue to talk about robot overlords. The same people who were talking about human irreproducibility continue to talk about human specialness. Concern is expressed over technological unemployment the same as today or Keynes in 1930, and this is used to fuel someone's previous ideological commitment to a basic income guarantee, inequality reduction, or whatever. The same tiny segment of unusually consequentialist people are concerned about Friendly AI as before. If anyone in the science community does start thinking that superintelligent AI is on the way, they exhibit the same distribution of performance as modern scientists who think it's on the way, e.g. Hugo de Garis, Ben Goertzel, etc.

My own projection goes more like this:

As AI gets more sophisticated, and as more prestigious AI scientists begin to publicly acknowledge that AI is plausibly only 2-6 decades away, policy-makers and research funders will begin to respond to the AGI safety challenge, just like they began to respond to CFC damages in the late 70s, to global warming in the late 80s, and to synbio developments in the 2010s. As for society at large, I dunno. They'll think all kinds of random stuff for random reasons, and in some cases this will seriously impede effective policy, as it does in the USA for science education and immigration reform. Because AGI lends itself to arms races and is harder to handle adequately than global warming or nuclear security are, policy-makers and industry leaders will generally know AGI is coming but be unable to fund the needed efforts and coordinate effectively enough to ensure good outcomes.

At least one clear difference between my projection and Yudkowsky's is that I expect AI-expert performance on the problem to improve substantially as a greater fraction of elite AI scientists begin to think about the issue in Near mode rather than Far mode.

As a friend of mine suggested recently, current elite awareness of the AGI safety challenge is roughly where elite awareness of the global warming challenge was in the early 80s. Except, I expect elite acknowledgement of the AGI safety challenge to spread more slowly than it did for global warming or nuclear security, because AGI is tougher to forecast in general, and involves trickier philosophical nuances. (Nobody was ever tempted to say, "But as the nuclear chain reaction grows in power, it will necessarily become more moral!")

Still, there is a worryingly non-negligible chance that AGI explodes "out of nowhere." Sometimes important theorems are proved suddenly after decades of failed attempts by other mathematicians, and sometimes a computational procedure is sped up by 20 orders of magnitude with a single breakthrough.

Comments


A third possibility is that AGI becomes the next big scare.

There's always a market for the next big scare, and a market for people who'll claim putting them in control will save us from the next big scare.

Having the evil machines take over has always been a scare. When AI gets more embodied and starts working together autonomously, people will be more likely to freak, IMO.

Getting beat on Jeopardy is one thing, watching a fleet of autonomous quad copters doing their thing is another. It made me a little nervous, and I'm quite pro AI. When people see machines that seem like they're alive, like they think, communicate among themselves, and cooperate in action, many will freak, and others will be there to channel and make use of that fear.

That's where I disagree with EY. He's right that a smarter talking box will likely just be seen as a nonthreatening curiosity. Watson 2.0, big deal. But embodied intelligent things that communicate and take concerted action will press our base primate "threatening tribe" buttons.

"Her" would have had a very different feel if all those AI operating systems had bodies, and got together in their own parallel and much more quickly advancing society. Kurzweil is right in pointing out that with such advanced AI, Samantha could certainly have a body. We'll be seeing embodied AI well before any human level of AI. That will be enough for a lot of people to get their freak out on.

Yeah, this becomes plausible if some analogue of Chernobyl happens. Maybe self-driving cars cause some kind of horrible accident due to algorithms behaving unexpectedly.

There's always a market for the next big scare, and a market for people who'll claim putting them in control will save us from the next big scare.

That's how I've always viewed SIAI/MIRI, at least in terms of a significant subset of those who send them money...

Self-driving cars are already inspiring discussion of AI ethics in mainstream media.

Driving is something that most people in the developed world feel familiar with — even if they don't themselves drive a car or truck, they interact with people who do. They are aware of the consequences of collisions, traffic jams, road rage, trucker or cabdriver strikes, and other failures of cooperation on the road. The kinds of moral judgments involved in driving are familiar to most people — in a way that (say) operating a factory or manipulating a stock market are not.

I don't mean to imply that most people make good moral judgments about driving — or that they will reach conclusions about self-driving cars that an AI-aware consequentialist would agree with. But they will feel like having opinions on the issue, rather than writing it off as something that programmers or lawyers should figure out. And some of those people will actually become more aware of the issue, who otherwise (i.e. in the absence of self-driving cars) would not.

So yeah, people will become more and more aware of AI ethics. It's already happening.


Self-driving cars will also inevitably catalyze discussion of the economic morality of AI deployment. Or rather, self-driving trucks will, as they put millions of truck drivers out of work over the course of five to ten years — long-distance truckers first, followed by delivery drivers. As soon as the ability to retrofit an existing truck with self-driving is available, it would be economic idiocy for any given trucking firm to not adopt it as soon as possible. Robots don't sleep or take breaks.

So, who benefits? The owners of the trucking firm and the folks who make the robots. And, of course, everyone whose goods are now being shipped twice as fast because robots don't sleep or take breaks. (The AI does not love you, nor does it hate you, but you have a job that it can do better than you can.)

As this level of AI — not AGI, but application-specific AI — replaces more and more skilled labor, faster and faster, it will become increasingly impractical for the displaced workers to retrain into the fewer and fewer remaining jobs.

This is also a moral problem of AI ...

Whether we should do otherwise-obviously-suboptimal things solely because it'd result in more jobs is a question that long predates self-driving cars...

Well, I want to end up in the future where humans don't have to labor to survive, so I'm all for automating more and more jobs away. But in order to end up in that future, the benefits of automation have to also accrue to the displaced workers. Otherwise you end up with a shrinking productive class, a teeny-tiny owner class, and a rapidly growing unemployable class — who literally can't learn a new trade fast enough to work at it before it is automated away by accelerating AI deployment.

As far as I can tell, the only serious proposal that might make the transition from the "most adult humans work at jobs to make a living" present to the "robots do most of the work and humans do what they like" future — without the sort of mass die-off of the lower class that someone out there probably fantasizes about — is something like Friedman's basic income / negative income tax proposal. If you want to end up in a future where humans can screw off all day because the robots have the work covered, you have to let some humans screw off all day. May as well be the displaced workers.
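For concreteness, here's a minimal Python sketch of the mechanism behind a negative income tax; the threshold and rate are illustrative numbers I've picked, not Friedman's actual proposal:

    # Minimal sketch of a negative income tax (NIT). The threshold and
    # rate are made-up illustrative parameters, not Friedman's figures.
    def nit_transfer(income: float, threshold: float = 30_000, rate: float = 0.5) -> float:
        """Net payment to (positive) or from (negative) the taxpayer.

        Below the threshold, the government pays out `rate` times the
        shortfall; above it, the same flat rate applies as ordinary tax.
        """
        return rate * (threshold - income)

    # A displaced worker with no income receives 0.5 * 30,000 = 15,000;
    # someone earning exactly the threshold neither pays nor receives.
    for income in (0, 15_000, 30_000, 60_000):
        print(income, nit_transfer(income))

The point of the design is that support phases out smoothly as earnings rise, so displaced workers are never penalized for taking whatever work remains.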

I agree. (Yvain wrote about that in more detail here, and a followup here.)

I'd prefer something like Georgism to negative income tax, but the former has fewer chances of actually being implemented any time soon.

Whether we should do otherwise-obviously-suboptimal things solely because it'd result in more jobs is a question that long predates self-driving cars...

It long predates Milton Friedman, too

I don't think the linked PCP thing is a great example. Yes, the first time someone seriously writes an algorithm to do X it typically represents a big speedup on X. The prediction of the "progress is continuous" hypothesis is that the first time someone writes an algorithm to do X, it won't be very economically important (otherwise someone would have done it sooner), and this example conforms to that trend pretty well.

The other issue seems closer to relevant; mathematical problems do go from being "unsolved" to "solved" with comparatively little warning. I think this is largely because they are small enough problems that they are one-man jobs (which would not be plausible if anyone really cared about the outcome), but that may not be the whole story, and at any rate something is going on here.

In the PCP case, the relevantly similar outcome would be the situation where theoretical work on interactive proofs turned out to be useful right out of the box. I'm not aware of any historical cases where this has happened, but I could be missing some, and I don't really understand why it would happen as rarely as it does. It would be nice to understand this possibility better.

As for "people can't tell the difference between watson and being close to broadly human-level AI,” I think this is unlikely. At the very least the broader intellectual community is going to have little trouble distinguishing between watson and economically disruptive AI, so this is only plausible if we get a discontinuous jump. But even assuming a jump, the AI community is not all that impressed by watson and I expect this is an important channel by which significant developments would affect expectations.

As you noted on your blog, Elon Musk is concerned about unfriendly AI, and from his comments about how escaping to Mars won't be a solution because "The A.I. will chase us there pretty quickly," he might well share MIRI's fear that the AI will seek to capture all of the free energy of the universe. Peter Thiel, a major financial supporter of yours, probably also has this fear.

If after event W happens, Elon Musk, Peter Thiel, and a few of their peers see the truth of proposition X and decide that they and everything they care about will perish if policy Z doesn't get enacted, they will with high probability succeed in getting Z enacted.

I don't know if this story is true, but I read somewhere that when Julius Caesar was marching on the Roman Republic several Senators went to Pompey the Great, handed him a sword and said "save Rome." Perhaps when certain Ws happen we should make an analogous request of Musk or Thiel.

I would believe that as soon as AGI becomes near (if it ever will), predictions by experts will start to converge to some fixed date, rather than the usual "15-20 years in the future".

(Nobody was ever tempted to say, "But as the nuclear chain reaction grows in power, it will necessarily become more moral!")

Apologies for asking an off-topic question that has certainly been discussed somewhere before. If advanced decision theories are logically superior, then they are in some sense universal, in that a large subspace of mindspace will adopt them once those minds become intelligent enough ("Three Worlds Collide" seems to indicate that this is EY's opinion, at least for minds that evolved). In that case, even a paperclip maximiser would assign some nontrivial component of its utility function to match humanity's, iff we would have done the same in the counterfactual case where FAI came first (I think this also has to assume that at least one party has a sublinear utility curve).
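Here's a toy numerical sketch of the sublinear-utility point; the prior win probabilities and the square-root utility function are assumptions of mine for illustration, not figures from any of the sources discussed:

    import math

    # Toy model: two maximizers with sublinear utility u(x) = sqrt(x)
    # over the fraction x of resources they control. The priors below
    # are invented for illustration.
    p_clippy, p_fai = 0.9, 0.1      # assumed a priori win probabilities
    u = math.sqrt                   # assumed sublinear utility curve

    # Option A: fight, winner takes everything.
    eu_fight_clippy = p_clippy * u(1.0)   # 0.9
    eu_fight_fai    = p_fai    * u(1.0)   # 0.1

    # Option B: a deal splitting resources in proportion to the priors.
    eu_deal_clippy = u(p_clippy)    # sqrt(0.9) ~ 0.949 > 0.9
    eu_deal_fai    = u(p_fai)       # sqrt(0.1) ~ 0.316 > 0.1

    print(eu_fight_clippy, eu_deal_clippy)
    print(eu_fight_fai, eu_deal_fai)
    # Both sides prefer the deal, but the stronger bargainer (Clippy)
    # still keeps the lion's share.

Under these assumptions, both agents gain in expectation from the deal, which is why sublinear utility makes some concession rational even for the favorite.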

In this sense, it seems that as entities grow in intelligence, they are at least likely to become more cooperative/moral.

Of course, FAI is vastly preferable to an AI that might be partially cooperative, so I am not trying to diminish the importance of FAI. I'd still like to know whether the consensus opinion is that this is plausible.

Actually, I think I know one place where it has been discussed before: Clippy promised friendliness and someone else promised him a lot of paperclips. But I don't know of a serious discussion.

Just a historical note, I think Rolf Nelson was the earliest person to come up with that idea, back in 2007. Though it was phrased in terms of simulation warfare rather than acausal bargaining at first.

Cooperative play (as opposed to morality) strongly depends on the position from which you're negotiating. For example if the FAI scenario is much less likely (a priori) than a Clippy scenario, then there's no reason for Clippy to make strong concessions.

But if a "paperclips" maximizer, as opposed to "tables", "cars", or "alien sex toys" maximizer, is just one of many unfriendly maximizers, then maximizing "human values" is just one of many unlikely outcomes. In other words, you can't just say that unfriendly AIs are more likely than friendly AIs when it comes to cooperation. Since the opposition between a paperclip maximizer and an "alien sex toy" maximizer is the same as the opposition between the former and an alien or human friendly AI. Since all of them want to maximize their opposing values. And even if there turns out to be a subset of values shared by some AIs, other groups could cooperate to outweigh their leverage.

But since there is an exponentially huge set of random maximisers, the probability of each individual one is infinitesimal. OTOH, human values have a high probability density in mindspace because people are actually working towards them.
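To make the arithmetic concrete, here's a toy calculation; the size of the maximizer space and both probabilities are numbers I've invented for illustration:

    # Toy illustration: compare the a priori weight of any *single*
    # random maximizer against a value system that many agents are
    # deliberately working towards. All numbers are assumptions.
    n_random_value_systems = 2 ** 50   # assumed size of the random-maximizer space
    p_any_unfriendly_ai = 0.9          # assumed: unfriendly outcomes dominate as a class
    p_human_values = 0.1               # assumed: concentrated mass from deliberate effort

    p_one_specific_maximizer = p_any_unfriendly_ai / n_random_value_systems
    print(f"P(paperclipper specifically) ~ {p_one_specific_maximizer:.2e}")  # ~8e-16
    print(f"P(human values)              ~ {p_human_values:.2e}")            # 1e-01
    # Even if unfriendly AI is more likely as a class, any particular
    # maximizer negotiates from a far weaker prior than human values do.
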

I've posted about this before, but there are many aspects of AI safety that we can research much more effectively once strong AI is nearer to realization. If people today say "AI could be a risk but it would be hard to get a good ROI on research dollars invested in AI safety today", I'm inclined to agree.

Therefore, it won't simply be interest in X-risk, but the feasibility of concrete research plans for reducing it, that will advance any AI safety agenda.