AI Safety Research Camp - Project Proposal

→ Give your feedback on our plans below or in the google doc
Apply to take part in the Gran Canaria camp on 12-22 April (deadline: 12 February)
Join the Facebook group


Aim: Efficiently launch aspiring AI safety and strategy researchers into concrete productivity by creating an ‘on-ramp’ for future researchers.


  1. Get people started on and immersed into concrete research work intended to lead to papers for publication.
  2. Address the bottleneck in AI safety/strategy, where few experts are available to train or organize aspiring researchers, by using expert time efficiently.
  3. Create a clear path from ‘interested/concerned’ to ‘active researcher’.
  4. Test a new method for bootstrapping talent-constrained research fields.

Method: Run an online research group culminating in a two-week intensive in-person research camp. Participants will work in groups on tightly-defined research projects on the following topics:

  • Agent foundations
  • Machine learning safety
  • Policy & strategy
  • Human values

Projects will be proposed by participants prior to the start of the program. Expert advisors from AI Safety/Strategy organisations will help refine them into proposals that are tractable, suitable for this research environment, and answer currently unsolved research questions. This allows for time-efficient use of advisors’ domain knowledge and research experience, and ensures that research is well-aligned with current priorities.

Participants will then split into groups to work on these research questions in online collaborative groups over a period of several months. This period will culminate in a two-week in-person research camp aimed at turning this exploratory research into first drafts of publishable research papers. This will also allow for cross-disciplinary conversations and community building, although the goal is primarily research output. Following the two-week camp, advisors will give feedback on manuscripts, guiding first drafts towards completion and advising on next steps for researchers.

Example: During the application process, multiple participants submit a research proposal on interruptibility, or otherwise express an interest in it and in working on machine learning-based approaches. During the initial idea generation phase, these researchers read one another’s research proposals and decide to collaborate based on their shared interests. They decide to code up and test a variety of novel approaches on the relevant AI safety gridworld. These approaches get formalised in a research plan.

This plan is circulated among advisors, who identify the most promising elements to prioritise and point out flaws that render some proposed approaches unworkable. Participants feel encouraged by expert advice and support, and research begins on the improved research proposal.

Researchers begin formalising and coding up these approaches, sharing their work in a GitHub repository that they can use as evidence of their engineering ability. It becomes clear that a new gridworld is needed to investigate issues arising from research so far. After a brief conversation, their advisor is able to put them in touch with the relevant engineer at DeepMind, who gives them some useful tips on creating this.

At the research camp the participants are able to discuss their findings and put them in context, as well as solve some technical issues that were impossible to resolve part-time and remotely. They write up their findings into a draft paper and present it at the end of the camp. The paper is read and commented on by advisors, who give suggestions on how to improve the paper’s clarity. The paper is submitted to NIPS 2018’s Aligned AI workshop and is accepted.

Expected outcome: Each research group will aim to produce results that can form the kernel of a paper at the end of the July camp. We don’t expect every group to achieve this, as research progress is hard to predict.

  1. At the end of the camp, from five groups, we would expect three to have initial results and a first draft of a paper that the expert advisors find promising.
  2. Within six months following the camp, three or more draft papers have been written that are considered to be promising by the research community.
  3. Within one year following the camp, three or more researchers who participated in the project obtain funding or research roles in AI safety or strategy.

Next steps following the camp: When teams have produced promising results, camp organizers and expert advisors will endeavour to connect the teams to the right parties to help the research shape up further and be taken to conclusion.

Possible destinations for participants who wish to remain in research after the camp would likely be some combination of:

  1. Full-time internships in areas of interest, for instance at DeepMind, FHI or CHAI
  2. Full-time research roles at AI safety/strategy organisations
  3. Obtaining research funding such as OpenPhil or FLI research grants - successful publications may unlock new sources of funding
  4. Independent remote research
  5. Research engineering roles at technical AI safety organisations

Research projects can be tailored towards participants’ goals - for instance, researchers who are interested in engineering or machine learning-related approaches to safety can structure a project to include a significant coding element, leading to (for instance) a GitHub repo that can be used as evidence of engineering skill. This is also a relatively easy way for people who are unsure if research work is for them to try it out without the large time investment and opportunity cost of a PhD or master’s program, although we do not see it as a full replacement for these.


Timeline: We anticipate this project having 4 main phases (dates are currently open for discussion):

  1. Plan and develop the project, recruit researchers and look for advisors - December 2017 to April 2018
  2. Testing and refinement of event design during a small-scale camp at Gran Canaria - April 12-22
  3. Project selection, refinement and exploration (online) - April 2018 to July 2018
  4. Research camp (in person) - July/August 2018

Recruiting: We plan to have approximately 20 researchers working in teams of 3-5 people, with projects in agent foundations, machine learning, strategy/policy and human values/cognition. Based on responses to a registration form we have already posted online (link here) we expect to be able to easily meet this number of participants.

Each team will be advised by a more experienced researcher in the relevant area. However, we expect this won’t be as tightly-coupled a relationship as that between PhD students and their supervisors - the aim is to maximise the usefulness of the relatively scarce advisor time and to develop as much independence in researchers as possible.

Project selection and exploration: Once the initial recruitment phase is complete, researchers and advisors can choose a project to work on and refine it into a single question answerable within the timeframe. We recognise the need for strong project planning skills and careful project choice and refinement here, and this project choice is a potential point of failure (see Important Considerations below). Following project selection, researchers will begin exploring the research project they’ve chosen in the months between project choice and the research camp. This would probably require five to ten hours a week of commitment from researchers, mostly asynchronously but with a weekly ‘scrum’ meeting to share progress within a project team. Regular sharing of progress and forward planning will be important to keep momentum going.

Research camp: Following the selection and exploration, we will have a two-week intensive camp assembling all participants in-person at a retreat to do focused work on the research projects. Exploratory work can be done asynchronously, but finishing research projects can be hard work and require intensive communication which can more easily be done in person. This also makes the full-time element of this project much more bounded and manageable for most potential participants. An in-person meeting also allows for much better communication between researchers on different projects, as well as helping form lasting and fruitful connections between researchers.

Important Considerations

Shaping the research question: Selecting good research questions for this project will be challenging, and is one of the main potential points of failure. The non-traditional structure of the event brings with it some extra considerations. We expect that most projects will be:

  1. Tractable to allow progress to be made in a short period of time, rather than conceptually complex or open-ended
  2. Closely related to current work, e.g. suggestions found in ‘further work’ or ‘open questions’ sections from recent papers
  3. Parallelisable across multiple researchers, e.g. evaluating multiple possible solutions to a single problem or researching separate aspects of a policy proposal

This biases project selection towards incremental research, i.e. extending previous work rather than finding completely new approaches. This is hard to avoid in these circumstances, and we are optimising at least partly for the creation of new researchers who can go on to do more risky, less incremental research in the future. Furthermore, a look at the ‘future work/open questions’ sections of many published safety papers will reveal a broad selection of interesting, useful questions that still meet the criteria above. So although this is a tradeoff, we do not expect it to be overly limiting. A good example of this in the Machine Learning subfield would be evaluating multiple approaches to one of the problems listed in DeepMind’s recent AI Safety gridworlds paper.

Finding advisors: Although we intend this to be relatively self-contained, some amount of advice from active researchers will be beneficial at both the project selection and research stages, as well as at the end of the camp. The most useful periods for advisor involvement will be at the initial project selection/shaping phase and at the end of the camp - the former allows for better, more tractable projects as well as conveying previously unpublished relevant information and a sense of what’s considered interesting. The latter will be useful for preparing papers and integrating new researchers into the existing community. Informal enquiries suggest that it is likely to be possible to recruit advisors for these stages, but ongoing commitments will be more challenging.

The expected commitment during project selection and shaping would be one or two sessions of several hours spent evaluating and commenting on proposed research projects. This could be done asynchronously or by video chat. Commitment at the end of the research camp is likely to be similar - responding to initial drafts of papers with suggestions of improvements or further research in a similar way to the peer review process.

Costs: The main costs for the Gran Canaria camp (the AirBnBs, meals and low-income travel reimbursements) have now been covered by two funders. The July camp will likely take place in the UK at the EA Hotel, a co-working hub planned by Greg Colbourn (for other options, see here). For this, we will publish a funding proposal around April. Please see here for the draft budgets.

Long-term and wider impacts

If the camp proves to be successful, it could serve as the foundation for yearly recurring camps to keep boosting aspiring researchers into productivity. It could become a much-needed additional lever to grow the fields of AI safety and AI strategy for many years to come. The research camp model could also be used to grow AI safety research communities in places where none presently exist but where there is a strong need, such as China. By using experienced coordinators and advisors in conjunction with local volunteers, it may be possible to organise a research camp without the need for pre-existing experts in the community. A camp provides a coordination point for interested participants, signals support for community building, and, if previous camps have been successful, provides social proof for participants.

In addition, scaling up research into relatively new cause areas is a problem that will need to be solved many times in the effective altruist community. This could represent an efficient way to ‘bootstrap’ a larger research community from a small pre-existing one, and so could be a useful addition to the tool set available to the EA community.

This project serves as a natural complement to other AI safety projects currently in development, such as RAISE, that aim to teach researchers the foundational knowledge they will need to begin research. Once an aspiring AI safety researcher completes one of these courses, they might consider a research camp as a natural next step on the road to becoming a practicing researcher.


Thanks to Ryan Carey, Chris Cundy, Victoria Krakovna and Matthijs Maas for reading and providing helpful comments on this document.


Tom McGrath

Tom is a maths PhD student in the Systems and Signals group at Imperial College, where he works on statistical models of animal behaviour and physical models of inference. He will be interning at the Future of Humanity Institute from Jan 2018, working with Owain Evans. His previous organisational experience includes co-running Imperial’s Maths Helpdesk and running a postgraduate deep learning study group.

Remmelt Ellen


Remmelt is the Operations Manager of Effective Altruism Netherlands, where he coordinates national events, works with organisers of new meetups and takes care of mundane admin work. He also oversees planning for the team at RAISE, an online AI Safety course. He is a Bachelor intern at the Intelligent & Autonomous Systems research group.

In his spare time, he’s exploring how to improve the interactions within multi-layered networks of agents to reach shared goals – especially approaches to collaboration within the EA community and the representation of persons and interest groups by negotiation agents in sub-exponential takeoff scenarios.

Linda Linsefors

Linda has a PhD in theoretical physics, which she obtained at Université Grenoble Alpes for work on loop quantum gravity. Since then she has studied AI and AI Safety online for about a year. Linda is currently working at Integrated Science Lab in Umeå, Sweden, developing tools for analysing information flow in networks. She hopes to be able to work full time on AI Safety in the near future.

Nandi Schoots

Nandi has a research master’s degree in pure mathematics and a minor in psychology from Leiden University. Her master’s focused on algebraic geometry and her thesis was in category theory. Since graduating she has been steering her career in the direction of AI safety. She is currently employed as a data scientist in the Netherlands. In parallel to her work she is part of a study group on AI safety and involved with the reinforcement learning section of RAISE.

David Kristoffersson

David has a background as R&D Project Manager at Ericsson where he led a project of 30 experienced software engineers developing many-core software development tools. He liaised with five internal stakeholder organisations, worked out strategy, made high-level technical decisions and coordinated a disparate set of subprojects spread over seven cities on two different continents. He has a further background as a Software Engineer and has a BS in Computer Engineering. In the past year, he has contracted for the Future of Humanity Institute, has explored research projects in ML and AI strategy with FHI researchers, and is currently collaborating on existential risk strategy research with Convergence.

Chris Pasek

After graduating in mathematics and theoretical computer science, Chris ended up touring the world in search of meaning and self-improvement, and finally settled on working as a freelance researcher focused on AI alignment. Chris is also currently running a rationalist shared housing project on the tropical island of Gran Canaria and continuing to look for ways to gradually self-modify in the direction of a superhuman FDT-consequentialist entity with a goal to save the world.

9 comments

Awesome! Great to hear more people are doing stuff like this.

Yeah - I unfortunately don't have much in the way of more specific comments, but am definitely excited to see this sort of thing happening, and it sounds like you're approaching it in a generally sane way.

Thanks for mentioning it.

If later you happen to see a blind spot or a failure mode we should work on covering, we'd like to learn about it!

I can’t make it in person due to school commitments, but I’m extremely interested

Do you mean for the Gran Canaria camp?

We're also working towards a camp 2.0 in late July in the UK. I assume that's during summer break for you.

I’d probably be able to make that! Depends how long I could get off work and whether I’m able to make a CFAR workshop like I planned

Hey David, just a note that I've moved this post to your personal blog. One of the key properties that makes a post a fit for the frontpage is whether the primary purpose of the post is explaining a concept versus anything else, like attempting to persuade the readers to take a particular action, or announcing a new project. This post is primarily a combination of those two, which is why I moved it back to your personal blog.

(There are very occasionally exceptions to these guidelines, but if it falls in the middle I'd encourage folks to err on the side of posting to their personal blog and letting the mods move it to the frontpage.)

I found this post from the frontpage view just now. Is that not expected for personal blog posts?

Yep, unexpected behavior. Turns out MongoDB distinguishes between "unset", "null" and "false" in weird ways...