In recent years, we’ve seen AI systems grow increasingly capable. They may soon reach human-level and even strongly superhuman skill in a wide range of domains. Such systems could bring great benefits, but if their goals don’t line up with human values, they could also cause unprecedented disasters, including human extinction.

This website exists to answer questions about AI safety, which focuses on preventing such disasters.

Rapid progress in the capabilities of current AI systems has pushed the topic of existential risk from AI into the mainstream. GPT-4 and other recent systems display abilities that until recently seemed far out of reach, including playing Go, composing human-like text, writing code, and predicting protein structures. AI labs now explicitly aim to create “artificial general intelligence” in the not-too-distant future, and many top researchers are warning about its dangers.

As far as we know, even once AI becomes as smart as humans in most domains, there’s nothing to stop it from getting smarter still: just as current AI vastly outperforms us at arithmetic, future AI could vastly outperform us in science, technology, economic competition, and strategy. And if AI becomes capable of doing most of the work involved in AI research, it could greatly accelerate that research, potentially producing a “superintelligence” within a short time.

A superintelligent AI could be an incredibly powerful aid to human flourishing, if its actions are in line with human values. But it’s not guaranteed that they will be. A central concern of AI safety is making sure that AI systems try to do what we want, and that they keep doing so even if their circumstances change fundamentally – for example, if their cognitive capabilities exceed those of humans. This is called the “AI alignment problem”, and it’s widely regarded as unsolved and difficult.

AI alignment researchers have not yet figured out how to take an objective of our choosing and ensure that a powerful AI system will reliably pursue that exact objective. Worse, the way the most capable systems are trained today makes it hard to understand how they work internally. The research community is working on these problems, trying to develop techniques and concepts for building safe systems.

It’s unclear whether these problems can be solved before a misaligned system causes an irreversible catastrophe, but success becomes more likely if more people make well-informed efforts to help. We made this site to help people understand the challenges at hand and the solutions being worked on. The related questions below are a good place to start learning more, or, if there’s a specific topic you’re curious about, you can type a question into the search bar.