Why, in outline form, should we be concerned about advanced AI?

We sketch here a high-level argument for how and why future AI might pose great risks.

First, it seems theoretically possible to build AIs capable of reproducing most of the cognitive abilities of a typical human. AI researchers also generally believe that it is possible to build superintelligent AI that far exceeds human intelligence. It is possible that increased amounts of compute are the main ingredient needed to build such an AI, in which case it might become technologically feasible to build one soon.

Second, actors such as corporations or governments will have incentives to build these powerful AIs as soon as they can[1]. Some of these AIs are likely to be agentic[2], since agentic AIs are expected to be more profitable.[3]

At this point, even if these AIs are not agentic, there is a risk that they could be misused by some actor, for instance to conduct invasive surveillance, perform acts of terror, or simply watch the world burn.

Third, the first agentic AIs that are more generally capable than humans[4] may be misaligned.[5]

Finally, there’s a risk that such an AI will be driven to gain a decisive strategic advantage over humanity in order to further its misaligned goals. It is possible that it could be capable of doing so, leading to our disempowerment and, possibly, extinction. There are many ways this could play out, and it is hard to be confident in any specific scenario, but there are broad reasons to expect something like it.


  1. When AI that surpasses humans is perceived to be imminent, these actors might rush to be the first to build such an AI, triggering an arms race in which safety is de-emphasized. ↩︎

  2. It has been argued (e.g. here and here) that a non-agentic tool AI is likely to become agentic, either through its creators finding an agentic version more useful or through the AI modifying itself. ↩︎

  3. We note elsewhere that things can go wrong even with non-agentic AI, although this seems less likely. ↩︎

  4. The line for what counts as human-level is quite fuzzy, but in this scenario it is the combination of capability and generality that makes the AI dangerous. ↩︎

  5. It’s also possible that the first superhuman AI that is deployed is safe, either because it is reasonably well aligned or because it is not agentic. But if it does not perform a pivotal act, another actor might shortly afterward deploy a second such AI that is unsafe. ↩︎