What are the main sources of AI existential risk?
While we can't predict the full story of how AI will affect the future, there are several broad dynamics that seem like plausible building blocks of an existential catastrophe.
One perspective is to look at the different ways a dangerous AI could come about. Examples include:
- Training could produce a misaligned mesa-optimizer.
- We could accidentally misspecify our goals.
- AIs could be misused.
Another perspective focuses on features of the world that could make avoiding a disaster harder, such as:
- Insufficient time to solve the problem.
- A lack of coordination between the most important actors.
- The acceleration of progress through cheaper computing hardware, algorithmic improvements, and increased investment.
One could also look at the different dangerous uses an AI could be put to, such as locking in undesirable values or inventing powerful weapons. Some kinds of errors could persist in an AI even as its capabilities become highly advanced, such as mistaken assumptions about metaethics, decision theory, or metaphilosophy.
Finally, a post-AGI world could settle into broad patterns in which human values lose influence, such as new competitive pressures or a concentration of power.