What are Responsible Scaling Policies (RSPs)?

ARC Evals defines a Responsible Scaling Policy (RSP) as a specification of “what level of AI capabilities an AI developer is prepared to handle safely with their current protective measures, and conditions under which it would be too dangerous to continue deploying AI systems and/or scaling up AI capabilities until protective measures improve.”

Anthropic has published an RSP, and, as of November 2023, OpenAI plans to publish a similar document, which it calls a Risk-Informed Development Policy.

RSPs have elicited mixed reactions: Evan Hubinger, for instance, has called them “pauses done right,” while others are more skeptical. Objections include that RSPs serve to relieve regulatory pressure and shift the burden of proof away from the people developing AI capabilities and onto those concerned about safety.