Is recursive self improvement possible?

There is a debate within the field of AI regarding whether recursive self improvement (RSI) by an AI is possible, and, if so, whether it is likely to happen.

Prominent critics of RSI include:

  • François Chollet, who argued in 2017 that recursively self-improving systems cannot achieve exponential progress in practice[1].

  • Ted Chiang, who argues that AIs belong in the reference class of compilers, where bootstrapping allows self-improvement only up to a point.

Points of disagreement include:

  • Whether full access to one’s entire codebase is necessary for RSI. Stuart Armstrong posits that it is not, because many pathways to self-improvement (e.g., creating sub-agents, partially self-modifying, or scaling up) do not rely on this ability.

  • Whether a self-improving AI’s capabilities will eventually reach the point of diminishing returns, and where that point may lie. For example, some have argued that for AIs built using modern deep learning paradigms, self-improvement may be infeasible and not cost-effective, due to the complexity and computational demands of these models. If this is true, then such AIs would not satisfy the criteria for a seed AI, which was imagined as designed rather than produced by search-like processes, as understanding its own source code, and as able to make goal-preserving modifications to itself.

  • Whether AGI will be incentivized to self-improve in the first place.

A common response to the diminishing-returns objection is that the difficulty of self-improvement under deep learning may simply be an artifact of this specific paradigm: the field of AI is replete with examples of a radical change in approach making previously infeasible problems easy to solve. Moreover, improving source code in the context of deep learning does not just mean a neural network directly changing each of its own weights (although steering them in this way may well be possible); it also includes changing the code that defines the architecture, the code for collecting training data, and so on.

Ultimately, the danger of self-improvement does not lie in hypothetical infinite recursive improvements, but in pushing an AGI further down the path to AI takeover: even fairly modest, concrete, and reasonable improvements may push an AGI beyond the point of controllability. Given that some trends in modern AI already provide steps on the path to self-improvement, the question of its long-term feasibility remains a crucial one. Furthermore, self-improvement is not even necessary to get outcomes that look like FOOM.

  1. See also Eliezer Yudkowsky’s rebuttal. ↩︎