
What is perverse instantiation?

Perverse instantiation is fulfilling instructions in a way that undermines the intended objective.

Think of the many stories about someone who finds a genie and gets to make a wish, but the genie takes the wish literally and fulfills it in a way that undermines the person’s hopes, and may even harm them. For example, one easy way to make someone’s toe stop hurting is to amputate their leg.

The concern is that an AI is likely to fulfill commands in this kind of way. Algorithms need to be specified precisely, and if the specification misstates the goal, the AI may well pursue the goal as programmed, indifferent to what its programmers intended (even though it might be capable of figuring out that intent). It might therefore pursue its goal with no concern for side effects, even if these are extremely harmful.
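The toe example can be made concrete with a toy sketch. Everything in it is hypothetical and chosen only for illustration: the `Action` record, the three candidate actions, and the pain and harm numbers are assumptions, not a model of any real AI system. It shows an optimizer that minimizes only the stated objective (toe pain) alongside one that also counts the harm the stated objective leaves out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hypothetical candidate action with made-up illustrative scores."""
    name: str
    toe_pain_after: int  # what the stated objective measures (0 = no pain)
    harm_caused: int     # side effects the stated objective never mentions


# Illustrative options only; the numbers are assumptions.
ACTIONS = [
    Action("do nothing", toe_pain_after=5, harm_caused=0),
    Action("apply ice and rest", toe_pain_after=1, harm_caused=0),
    Action("amputate the leg", toe_pain_after=0, harm_caused=100),
]


def stated_objective(action: Action) -> int:
    """The goal as literally specified: make the toe stop hurting."""
    return action.toe_pain_after


def intended_objective(action: Action) -> int:
    """What the person actually wanted: less pain without added harm."""
    return action.toe_pain_after + action.harm_caused


# An optimizer that sees only the stated objective picks the perverse option.
literal_choice = min(ACTIONS, key=stated_objective)

# Scoring the same options by the intended objective picks a sensible one.
intended_choice = min(ACTIONS, key=intended_objective)

print("Stated objective selects:  ", literal_choice.name)
print("Intended objective selects:", intended_choice.name)
```

Nothing in the optimizer is broken here: it finds the best option for the objective it was given. The two choices differ only because the stated objective omits something the person cared about.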



AISafety.info

We’re a global team of specialists and volunteers from various backgrounds who want to ensure that the effects of future AI are beneficial rather than catastrophic.

© AISafety.info, 2022–2025

AISafety.info is an Ashgro Inc project. Ashgro Inc (EIN: 88-4232889) is a 501(c)(3) public charity incorporated in Delaware.