Stronger-than-human artificial intelligence would be dangerous to humanity. It is vital that any such intelligence’s goals are aligned with humanity's goals. Maximizing the chance that this happens is a difficult, important and under-studied problem.
To encourage more and better work on this important problem, we (Zvi Mowshowitz and Vladimir Slepnev) are announcing a $5000 prize for publicly posted work advancing understanding of AI alignment, funded by Paul Christiano.
This prize will be awarded based on entries gathered over the next two months. If the prize is successful, we will award further prizes in the future.
The prize is not backed by or affiliated with any organization.
Rules
Your entry must be published online for the first time between November 3 and December 31, 2017, and contain novel ideas about AI alignment. Entries have no minimum or maximum size. Important ideas can be short!
Your entry must be written by you, and submitted before 9pm Pacific Time on December 31, 2017. Submit your entries either as links in the comments to this post, or by email to apply@ai-alignment.com. We may provide feedback on early entries to allow improvement.
We will award $5000 to between one and five winners. The first place winner will get at least $2500. The second place winner will get at least $1000. Other winners will get at least $500.
Entries will be judged subjectively. Final judgment will be by Paul Christiano. Prizes will be awarded on or before January 15, 2018.
What kind of work are we looking for?
AI Alignment focuses on ways to ensure that future smarter-than-human intelligence will have goals aligned with the goals of humanity. Many approaches to AI Alignment deserve attention. These include technical and philosophical topics, as well as strategic research about related social, economic or political issues. A non-exhaustive list of technical and other topics can be found here.
We are not interested in research dealing with the dangers of existing machine learning systems, commonly called AI, that do not have smarter-than-human intelligence. These concerns are also understudied, but are not the subject of this prize except in the context of future smarter-than-human intelligence. We are also not interested in general AI research. We care about AI alignment, which may or may not also advance the cause of general AI research.
(Addendum: the results of the prize and the rules for the next round have now been announced.)
Yeah, I had an initial gut sense of "oh man this seems important but I'm worried it'd quietly fade out of consciousness by default." Much of my advice would be whpearson's. Some additional thoughts (I think mostly fleshing out why I think whpearson's suggestions are important):
i. Big Activation Costs
You are asking people to do a hard thing. You're providing money to incentivize them, but people are lazy - they will forget, or start doing it but not get around to finishing, or not finish until too late.
Anything to reduce the activation cost is good.
1) Maybe have the first thing you ask be for people to apply if they might be interested, with as low a cost to doing so as possible (while gaining at least some information about people and weeding out dead wood).
This gets people slightly committed, and gives you the opportunity to spam a much narrower subset of people to remind them. (see spam section)
2) It's ambiguous to me what kind of writing you're looking for, which in turn makes me unsure if it'd be a good use of my time to work on this, which makes me hesitate. (I'm currently assuming that this is not the right use of my talents both for altruistic and selfish reasons, but I can imagine a slightly different version of me for whom it'd be ambiguous.)
Whpearson's "list good existing articles, as diverse as possible" helps counteract part of this, but still doesn't answer questions like "should I be doing this if this is currently my day job? Presumably the point is to get more people working on this." (and the corollary: if professional AI safety workers are submitting, what chance do I have of contributing something useful?)
(Relatedly - I'd originally thought you should spell out what sort of questions you were looking to resolve, then saw you had linked to Paul Christiano's doc. I think attempting to summarize the doc might accidentally focus on too narrow a domain, but the current linking of the doc is so small I missed it the first time)
ii. Spam vs Valuable-Self-Promoting-Machinery
By default, you need to spam things a lot. One way to get the word out is to post on all the relevant FB groups, discords, etc - multiple times, so that when people forget and it fades to the back burner, it doesn't disappear forever.
Being forced to spam everyone once a week is a bad equilibrium. If you can figure out how to spam exactly the people who matter (see i.1), that's better.
If you can spam in a way that's providing value rather than sucking up attention, that's better. If you can make the thing spam itself in a way that provides value, better still.
One way of spamming-that-provides-value might be having a couple of followup posts that do things like "provide suggestions and reading lists for people who are considering working on this but don't quite know how to approach the problem." (targeting the sort of person who you think almost has the skills to contribute, and is just missing a few key elements that are easy to teach)
Another might be encouraging people to post their drafts publicly to attract additional attention and comments that keep the thing in public consciousness. (This may work against the contest model though.)