What is AI Safety via Debate?

This text was automatically imported from a tag on LessWrong.

Debate is a proposed technique for allowing human evaluators to get correct and helpful answers from experts, even if the evaluator is not themselves an expert or able to fully verify the answers.[1]

The technique was suggested as part of an approach to building advanced AI systems that are aligned with human values, and to safely applying machine learning to problems that are high-stakes but not well-defined (such as advancing science or increasing a company's revenue).[2][3]
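Below is a minimal sketch of the basic debate loop, assuming hypothetical `ask_debater` and `ask_judge` helpers that stand in for querying a debater model and collecting a human judgment. It illustrates the structure of the protocol, not an implementation from any of the cited sources.

```python
# Minimal sketch of a debate: two debaters take turns arguing about a question,
# and a judge who sees only the transcript picks the winner.
# `ask_debater` and `ask_judge` are hypothetical stand-ins, not real APIs.

def ask_debater(name: str, question: str, transcript: list[str]) -> str:
    """Stand-in for querying a debater model for its next argument."""
    return f"{name}: argument #{len(transcript) + 1} about {question!r}"

def ask_judge(question: str, transcript: list[str]) -> str:
    """Stand-in for a human judge reading the transcript and naming a winner."""
    return "Debater A"

def run_debate(question: str, num_rounds: int = 3) -> str:
    """Run a fixed number of rounds, then return the judge's verdict.

    The hope behind debate is that arguing for the true answer is easier than
    defending a lie against a capable opponent, so the judge can reach correct
    conclusions without being an expert on the question.
    """
    transcript: list[str] = []
    for _ in range(num_rounds):
        transcript.append(ask_debater("Debater A", question, transcript))
        transcript.append(ask_debater("Debater B", question, transcript))
    return ask_judge(question, transcript)

if __name__ == "__main__":
    print(run_debate("Which of two proposed experiments is more informative?"))
```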


[1] https://www.lesswrong.com/posts/Br4xDbYu4Frwrb64a/writeup-progress-on-ai-safety-via-debate-1
[2] https://ought.org/mission
[3] https://openai.com/blog/debate/
