What are Responsible Scaling Policies (RSPs)?

METR¹ defines a responsible scaling policy (RSP) as a specification of “what level of AI capabilities an AI developer is prepared to handle safely with their current protective measures, and conditions under which it would be too dangerous to continue deploying AI systems and/or scaling up AI capabilities until protective measures improve.”

Anthropic published an RSP in September 2023. In November 2023, OpenAI announced that it planned to publish a similar document, which it calls a Risk-Informed Development Policy (RDP)². Google DeepMind published a similar document in May 2024, which it calls a Frontier Safety Framework.

RSPs have drawn both positive and negative reactions from the AI safety community. Evan Hubinger of Anthropic, for instance, argues that they are “pauses done right”; others are more skeptical. Objections include that RSPs serve to relieve regulatory pressure and that they shift the “burden of proof” from the people working on capabilities to the people concerned about safety.

  1. Formerly known as ARC Evals.

  2. They published a beta version in December 2023.