Today, the Machine Intelligence Research Institute is launching a new forum for research discussion: the Intelligent Agent Foundations Forum! It's already been seeded with a bunch of new work on MIRI topics from the last few months.
We've covered most of the (what, why, how) subjects on the forum's new welcome post and the How to Contribute page, but this post is an easy place to comment if you have further questions (or if, maths forbid, there are technical issues with the forum instead of on it).
But before that, go ahead and check it out!
(Major thanks to Benja Fallenstein, Alice Monday, and Elliott Jin for their work on the forum code, and to all the contributors so far!)
EDIT 3/22: Jessica Taylor, Benja Fallenstein, and I wrote forum digest posts summarizing and linking to recent work (on the IAFF and elsewhere) on reflective oracle machines, on corrigibility, utility indifference, and related control ideas, and on updateless decision theory and the logic of provability, respectively! These are pretty excellent resources for reading up on those topics, in my biased opinion.
First of all, purposefully limiting scope to protecting against only the runaway superintelligence scenario is preventing a lot of good that could be done right now, and keeps your work from having practical applications it otherwise would have. For example, right now somewhere deep in Google and Facebook there are machine learning recommendation engines that are suggesting the display of whisky ads to alcoholics. Learning how to create even a simple recommendation engine whose output is constrained by the values of its creators would be a large step forward and would help society today. But I guess that's off-topic.
Second, even if you buy the argument that existential risk trumps all and we should ignore problems that could be solved today such as that recommendation engine example, it is demonstrably not the case in history that the fastest way to develop a solution is to ignore all practicalities and work from theory backwards. No, in almost every case what happens is the practical and the theoretical move forward hand in hand, with each informing progress in the other. You solve the recommendation engine example not because it has the most utilitarian direct outcomes, but because the theoretical and practical outcomes are more likely to be relevant to the larger problem than an ungrounded problem chosen by different means. And on the practical side, you will have engineers coming forward the beginnings of solutions -- "hey I've been working on feedback controls, and this particular setup seems to work very well in the standard problem sets..." In the real world theoreticians more often than not spend their time proving the correctness of the work of a technologist, and then leveraging that theory to improve upon it.
Third, there are specific concerns I have about the approach. Basically time spent now on unbounded AIXI constructs is probably completely wasted. Real AGIs don't have Solomonoff inductors or anything resembling them. Thinking that unbounded solutions could be modified to work on a real, computable superintelligence betrays a misunderstanding of the actual utility of AIXI. AIXI showed that all the complexity of AGI lies in the practicalities, because the pure uncomputable theory is dead simple but utterly divorced from practice. AIXI brought some respectability to the field by having some theoretical backing, even if that theory is presently worse than useless in as much as it is diverting otherwise intelligent people from making meaningful contributions.
Finally, there's the simple matter that an ignore-all-practicalities theory-first approach is useless until it nears completion. My current trajectory places the first AGI at 10 to 15 years out, and the first self-improving superintelligence shortly thereafter. Will MIRI have practical results in that time frame? The schedule is not going to stop and wait for perfection. So if you want to be relevant, then stay relevant.
I think something showing how to do value learning on a small scale like this would be on topic. It might help to expose the advantages and disadvantages of algorithms like inverse reinforcement learning.
I also agree that, if there are more practical applications of AI safety ideas, this will increase interest and resources devoted to AI safety. I don't really see those applications yet, but I will look out for them. Thanks for bringing this to my attention.
I don't have a great understanding of the history of engineering, but I get the impression that working from the theory backwards can often be helpful. For example, Turing developed the basics of computer science before sufficiently general computers existed.
My current impression is that solving FAI with a hypercomputer is a fundamentally easier problem that solving it with a bounded computer, and it's hard to say much about the second problem if we haven't made steps towards solving the first one. On the other hand, I do think that concepts developed in the AI field (such as statistical learning theory) can be helpful even for creating unbounded solutions.
I would really like it if the pure uncomputable theory of Friendly AI were dead simple!
Anyway, AIXI has been used to develop more practical algorithms. I definitely approach many FAI problems with the mindset that we're going to eventually need to scale this down, and this makes issues like logical uncertainty a lot more difficult. In fact, Paul Christiano has written about tractable logical uncertainty algorithms, which is a form of "scaling down an intractable theory". But it helped to have the theory in the first place before developing this.
Solutions that seem to work for practical systems might fail for superintelligence. For example, perhaps induction can yield acceptable practical solutions for weak AIs, but does not necessarily translate to new contexts that a superintelligence might find itself in (where it has to make pivotal decisions without training data for these types of decisions). But I do think working on these is still useful.
I consider AGI in the next 10-15 years fairly unlikely, but it might be worth having FAI half-solutions by then, just in case. Unfortunately I don't really know a good way to make half-solutions. I would like to hear if you have a plan for making these.