What is the Machine Intelligence Research Institute's research agenda?

The Machine Intelligence Research Institute (MIRI) is an alignment research organization that aims to "ensure that the creation of smarter-than-human artificial intelligence has a positive impact." MIRI's main research agendas are "Agent Foundations for Aligning Machine Intelligence with Human Interests" and "Alignment for Advanced Machine Learning Systems". Together, these focus on three groups of technical problems:

  • Highly reliable agent design: learning how to specify highly autonomous systems that reliably pursue some fixed goal;

  • Value specification: supplying autonomous systems with the intended goals; and

  • Error tolerance: making such systems robust to programmer error.

Since 2018, MIRI's work has been nondisclosed-by-default, though they still publish some new mathematical results. Some of MIRI's recent research output includes:

  • Embedded Agency (Demski & Garrabrant, 2019) presents a framework for understanding an agent as part of the environment it interacts with, rather than as separate from it (as many formalizations of agency, such as AIXI, assume).

  • Cartesian Frames (Garrabrant, 2021) develops a basis for reasoning about the sets of possible actions agents could take, and about operations on those sets, as a possible means of understanding how action spaces can evolve over time.

  • Evan Hubinger’s research agenda [Interpretability] develops “acceptability desiderata” that can be checked with interpretability techniques. Two examples are myopia (the agent does not look at, or receive signals from, future time steps) and broad corrigibility (the agent actively helps you determine whether things are going well and clarify your preferences). Ideally, so long as these conditions are met, there will not be catastrophic issues.

  • Vanessa Kosoy’s research agenda [Agent Foundations, Reinforcement Learning] works toward a general abstract theory of intelligence, of which AI alignment seems to be a part; uses that framework to formulate alignment problems within learning theory; and evaluates candidate solutions by their formal properties. Example work: Delegative Reinforcement Learning (Kosoy, 2019), which avoids assuming that the environment contains no traps, and avoids relying on episodic regret bounds, by allowing the agent to optionally delegate certain actions to an advisor.
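To give a flavor of the Cartesian Frames formalism mentioned above, here is an informal restatement of its basic object (a rough sketch following the definitions in Garrabrant's sequence, not a complete treatment):

```latex
% Sketch of the core definition (informal restatement):
% a Cartesian frame over a set of possible worlds W is a triple
C = (A, E, \diamond), \qquad \diamond : A \times E \to W
% where A is a set of possible "agent" choices, E is a set of possible
% "environment" choices, and a \diamond e \in W is the world that results
% when the agent chooses a and the environment chooses e.
```

Operations on the sets $A$ and $E$ (taking subsets, products, and so on) are then used to study how the agent/environment boundary, and hence the agent's action space, can be redrawn.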
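The "myopia" desideratum from Hubinger's agenda can be illustrated with a toy example. The tiny deterministic MDP below is invented for illustration (it is not MIRI code, and the state/action names are made up): a myopic agent optimizes only the immediate reward, while a two-step planner also considers what the next state makes available.

```python
# Toy illustration of myopia: an agent that optimizes only the
# immediate reward, ignoring future time steps. The MDP is invented
# purely for this example.

# transitions[(state, action)] = (next_state, immediate_reward)
transitions = {
    ("start", "grab"): ("start", 1.0),   # small immediate reward
    ("start", "wait"): ("vault", 0.0),   # no immediate reward
    ("vault", "grab"): ("vault", 10.0),  # large reward, one step later
    ("vault", "wait"): ("vault", 0.0),
}
actions = ["grab", "wait"]

def myopic_action(state):
    """Pick the action with the best immediate reward only."""
    return max(actions, key=lambda a: transitions[(state, a)][1])

def two_step_return(state, action):
    """Immediate reward plus the best reward available at the next state."""
    next_state, reward = transitions[(state, action)]
    return reward + max(transitions[(next_state, a)][1] for a in actions)

def planning_action(state):
    """Pick the action with the best two-step return."""
    return max(actions, key=lambda a: two_step_return(state, a))

print(myopic_action("start"))    # → grab (takes the small reward now)
print(planning_action("start"))  # → wait (forgoes it for the larger one)
```

A myopic agent cannot be motivated by long-term consequences, which is part of why myopia is proposed as a checkable safety property.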

Outreach and communication

MIRI has done a significant amount of public communication about AI alignment. In particular, MIRI co-founder Eliezer Yudkowsky is among the most prolific popularizers of the ideas of AI risk and AI alignment.

They generally express the view that solving alignment is very difficult and that AGI development will end in "ruin" by default, and they popularize applying the "security mindset" to AI research. They have also expressed pessimism about policy solutions in the past, but have more recently begun to advocate for an indefinite moratorium on new large training runs.

They maintain a blog on their website, and have published conversations between MIRI staff and other AI researchers, collected at the links below: