I recently wrote an essay about AI risk, targeted at other academics:
Long-Term and Short-Term Challenges to Ensuring the Safety of AI Systems
I think it might be interesting to some of you, so I am sharing it here. I would appreciate any feedback, especially from others who do AI / machine learning research.
Extinction is much more costly to society as a whole than to any individual (especially if we count future unborn people). For example, a purely selfish individual might value the cost of extinction the same as their own death, which is on average around $10 million, as estimated by how much you have to pay people to compensate them for an increased risk of death. For society as a whole, this cost is at least quadrillions of dollars, if not astronomically more. So selfish individuals would be willing to take much bigger extinction risks than is socially optimal, if doing so provides them with private benefits. This is a tragedy of the commons scenario.
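To make the gap concrete, here is a minimal back-of-the-envelope sketch (my own illustration, not part of the essay). The $10 million figure is the individual valuation mentioned above; the $1 quadrillion societal cost and the $1 million private benefit are stand-in figures chosen only for illustration.

```python
# Illustrative sketch: the extinction risk a purely selfish actor would accept
# versus the socially optimal threshold, for the same private benefit.
# Figures are stand-ins, not estimates from the essay.

INDIVIDUAL_COST = 10e6   # ~value of a statistical life, per the argument above
SOCIETAL_COST = 1e15     # "at least quadrillions of dollars" (lower bound)
PRIVATE_BENEFIT = 1e6    # hypothetical private gain (fame, prestige, profit)

# An actor accepts extinction risk p whenever the expected private benefit
# exceeds the expected cost to that actor: PRIVATE_BENEFIT > p * cost_to_actor.
max_risk_selfish = PRIVATE_BENEFIT / INDIVIDUAL_COST   # 0.1, i.e. a 10% risk
max_risk_social = PRIVATE_BENEFIT / SOCIETAL_COST      # 1e-9

print(f"Selfish actor tolerates risk up to {max_risk_selfish:.0%}")
print(f"Socially optimal threshold is {max_risk_social:.1e}")
print(f"Ratio: {max_risk_selfish / max_risk_social:.0e}x more risk tolerated")
```

Under these stand-in numbers, the selfish actor tolerates roughly eight orders of magnitude more extinction risk than would be socially optimal for the same benefit.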
In the slow takeoff scenario, I think a similar tragedy of the commons dynamic is likely to play out. If humanity as a whole could coordinate and wait until we fully solve the AI control / value alignment problem before creating autonomous AIs, then humane values could eventually control all or most of the universe. But instead we're likely to create such AIs as soon as we can extract private benefits (fame, prestige, profit, etc.) from creating them. Once we do, they'll take over a larger and larger share of the economy and eventually the universe. (Nobody currently owns the universe, so again it's a classic commons.)
But a single purely selfish individual is unlikely to create a competitive AI project. For a medium-to-large organization made up of people who care at least about their own lives and the lives of their kin, the cost of extinction will be so high that it will offset any benefits they may hope to obtain.