
What is a subagent?

A subagent is a part of a larger system or agent which itself has agent-like properties. For example, we can model a corporation as an agent, since it makes decisions and pursues goals, but it is also made up of subagents: the people who work for it, who have their own goals.

Several theories of human psychology, such as internal family systems, also model the human mind as being made up of subagents. These models are used to explain thoughts like “one part of me wants this, but the other parts are getting in the way”.

An agentic AI may also contain subagents, and insofar as subagents are involved, they present an additional alignment challenge: to make sure that they are also aligned with our values.

A related concept, which should not be confused with a subagent, is a mesa-optimizer.


AISafety.info

We’re a global team of specialists and volunteers from various backgrounds who want to ensure that the effects of future AI are beneficial rather than catastrophic.

© AISafety.info, 2022—2025

Aisafety.info is an Ashgro Inc Project. Ashgro Inc (EIN: 88-4232889) is a 501(c)(3) Public Charity incorporated in Delaware.