Is recursive self-improvement possible?

There is ongoing debate within the field of AI about whether recursive self-improvement by an AI is possible and, if so, whether it is likely to happen.

Prominent skeptics of recursive self-improvement include:

  • François Chollet, who argued in 2017 that recursively self-improving systems cannot achieve exponential progress in practice.¹

  • Ted Chiang, who argues that AIs belong in the reference class of compilers, where bootstrapping allows self-improvement, but only up to a point.

Points of disagreement include:

  • Whether an AI would need full access to its entire codebase for recursive self-improvement. Stuart Armstrong posits that it would not, because many pathways to self-improvement (e.g., creating sub-agents, partially modifying itself, or increasing its scale, among others) do not rely on this ability.

  • Whether a self-improving AI’s capabilities will eventually hit diminishing returns, and where that point might lie. For example, some have argued that for AIs built within the modern deep learning paradigm, self-improvement may be infeasible or not cost-effective because of these systems’ complexity and computational demands. If this is true, such AIs wouldn’t satisfy the criteria for being a seed AI, which was imagined to be designed rather than selected for by search-like processes, to understand its own source code, and to be able to make goal-preserving modifications to itself.

  • Whether AGI will have incentives to self-improve in the first place.

A common response to the second point is that self-improvement may be easier outside the deep learning paradigm: the history of AI is replete with examples of a radical change in approach making previously infeasible problems easy to solve. Moreover, improving source code in the context of deep learning does not just mean a neural network directly changing each of its weights (although steering weights in this way may well be possible); it also includes things like the code defining the architecture or the code for collecting training data. The toy sketch below illustrates this kind of improvement over configuration rather than weights.
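
This toy Python sketch (entirely hypothetical; the config fields and the stand-in evaluation function are made up for illustration) shows what "improving the code around a model" could look like: a loop that proposes small changes to an architecture-defining configuration and keeps any change that improves a validation score. It is a minimal sketch of search over configuration under these assumptions, not a description of any real system's self-improvement mechanism.

```python
import random

# Hypothetical illustration: "self-improvement" aimed at the configuration
# that defines a model, rather than at the weights themselves.

def train_and_evaluate(config):
    """Stand-in for an expensive training run returning a validation score.
    A real system would train a model from `config`; here we just fake a
    score that peaks at an arbitrary 'good' configuration."""
    return 1.0 - 0.05 * abs(config["layers"] - 6) - abs(config["width"] - 512) / 4096

def propose_modification(config):
    """Propose a small change to the architecture-defining config."""
    candidate = dict(config)
    if random.random() < 0.5:
        candidate["layers"] = max(1, candidate["layers"] + random.choice([-1, 1]))
    else:
        candidate["width"] = max(32, candidate["width"] + random.choice([-64, 64]))
    return candidate

config = {"layers": 2, "width": 128}
score = train_and_evaluate(config)

for _ in range(50):
    candidate = propose_modification(config)
    candidate_score = train_and_evaluate(candidate)
    if candidate_score > score:   # keep only changes that improve the score
        config, score = candidate, candidate_score

print(config, round(score, 3))
```

The point of the sketch is only that the object being modified is code and configuration surrounding the model, which sidesteps the objection that directly editing billions of weights is intractable.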

Ultimately, the danger of self-improvement does not lie in hypothetical infinite recursive improvement, but in pushing an AGI further down a path toward AI takeover: even fairly modest, concrete, and reasonable improvements may push an AGI beyond the point of controllability. Given that some trends in modern AI already provide steps on the path to self-improvement, the question of its long-term feasibility remains a crucial one. Furthermore, self-improvement isn’t even necessary to get outcomes that look like FOOM.


  1. See also Eliezer Yudkowsky’s rebuttal.