The project of Friendly AI would benefit from being approached in a much more down-to-earth way. Discourse about the subject seems to be dominated by a set of possibilities which are given far too much credence:
- A single AI will take over the world
- A future galactic civilization depends on 21st-century Earth
- 10^n-year lifespans are at stake, n greater than or equal to 3
- We might be living in a simulation
- Acausal deal-making
- Multiverse theory
Add up all of that, and you have a great recipe for enjoyable irrelevance. Negate every single one of those ideas, and you have an alternative set of working assumptions that are still consistent with the idea that Friendly AI matters, and which are much more suited to practical success:
- There will always be multiple centers of power
- What's at stake is, at most, the future centuries of a solar-system civilization
- No assumption that individual humans can survive even for hundreds of years, or that they would want to
- Assume that the visible world is the real world
- Assume that life and intelligence are about causal interaction
- Assume that the single visible world is the only world we affect or have reason to care about
The simplest reason to care about Friendly AI is that we are going to be coexisting with AI, and so we should want it to be something we can live with. I don't see that anything important would be lost by strongly foregrounding the second set of assumptions, and treating the first set of possibilities just as possibilities, rather than as the working hypothesis about reality.
[Earlier posts on related themes: practical FAI, FAI without "outsourcing".]
That is only a superficial difference, a difference of scenario considered. If you put a bad actor from ordinary machine ethics into a possible world where you can torture someone forever, or if you put a UFAI into a possible world where the most harm it can do is blow you up once, this difference goes away.
Designing an "ethical computer program" or a "friendly AI" is not about which possible world the program inhabits, it's about the internal causality of the program and the choices it makes. The valuable parts of FAI research culture are all on this level. Associating FAI with the possible world of "post-singularity hell", as if that is the essence of what distinguishes the approach, is an example of what I want to combat in this post.
The key difference is that in the case of a Seed AI, you need to find a way to make a goal system stable under recursive self-improvement. In the case of a toaster, you do not.
It's useful to keep Friendly AI concerns in mind when designing ethical robots, since they potentially become a risk when they start to get more autonomous. But when you're giving a robot a gun, the relevant ethical concerns are things like whether it will shoot civilians. The scope is relevantly different.
Really, there is a whole field out there of Machine Ethics, and it is generally recognized to be doing a different sort of thing than what SIAI is doing. While some folks still conflate "Friendly AI" and "Machine Ethics", I think it's much better to maintain the distinction and consider FAI a subfield of Machine Ethics.