Would AI technology favor offense or defense in a conflict?

The offense-defense balance refers to whether a given situation between two potential adversaries favors the attacking party or the defending party. For instance, a well-fortified medieval castle favored defense over offense because capturing it took many more attackers than it took defenders to hold it.
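
One classic toy model that makes this attacker-to-defender ratio precise is Lanchester's square law. The sketch below applies it to the castle example under the strong assumptions of aimed fire and fixed per-soldier effectiveness; the law is standard, but its use here is purely illustrative:

```latex
% Lanchester's square law: a standard toy model of attrition.
% A(t), D(t): attacker and defender force sizes.
% \alpha, \beta: per-soldier effectiveness of attackers and defenders.
\frac{dA}{dt} = -\beta D, \qquad \frac{dD}{dt} = -\alpha A
% The quantity \alpha A^2 - \beta D^2 is conserved over time, so the
% defenders prevail whenever \beta D_0^2 > \alpha A_0^2; equivalently,
% the attackers need an initial force of
A_0 > \sqrt{\tfrac{\beta}{\alpha}}\, D_0
% If fortifications make each defender k times as effective
% (\beta = k\alpha), the attackers must outnumber the defenders by a
% factor of \sqrt{k} just to break even.
```

On this model, a castle that makes each defender nine times as effective forces an attacker to field three times as many soldiers just to reach parity; real sieges were messier, but the ratio logic is the same.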

In the context of conflicts between humans using AI,[1] we don’t know whether future AI advances will favor offense (the ability to gain strategic advantages, control resources, or cause harm) or defense (the ability to protect assets, maintain control, or prevent harm). Some dynamics that might favor offense include:

  • Biosecurity: AI could be used to engineer dangerous new pathogens.[2]

  • Cybersecurity: AI could automate both the discovery of vulnerabilities and the deployment of cyberattacks (see the sketch below).

  • Autonomous weapons: AI-controlled drone swarms or other lethal autonomous weapons might be used to attack targets in ways that are hard to defend against.

  • Social engineering: AI can automate phishing and facilitate impersonation attacks.

On the flip side, defense could be made easier: for instance, AI could also be used to find and patch security vulnerabilities faster than attackers can exploit them.[3]

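To make the cybersecurity point concrete on both sides, here is a minimal Python sketch of automated bug-finding. Everything in it is hypothetical toy code rather than a real tool: `parse_header` is a deliberately buggy target, and `fuzz` blindly searches for inputs that crash it. The same discovery serves an attacker looking for an exploit and a defender looking for something to patch; AI changes the picture by making the input-generation step far smarter than random guessing.

```python
import random
import string

def parse_header(data: str) -> tuple[str, str]:
    """Toy parser with a planted bug: it assumes ':' always appears."""
    key, value = data.split(":", 1)  # raises ValueError when ':' is absent
    return key, value

def fuzz(target, trials: int = 10_000):
    """Feed random inputs to `target`, collecting any that crash it."""
    crashes = []
    for _ in range(trials):
        candidate = "".join(
            random.choices(string.printable, k=random.randint(0, 16))
        )
        try:
            target(candidate)
        except Exception as exc:
            crashes.append((candidate, exc))
    return crashes

if __name__ == "__main__":
    found = fuzz(parse_header)
    print(f"found {len(found)} crashing inputs")
    # Offense weaponizes the crashes it finds; defense fixes the parser
    # (e.g., by validating the input) before anyone can exploit it.
```
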
These attack vectors exist on multiple levels (nation, organization, individual) for which the offense-defense balance varies. For instance, in the Middle Ages, conflicts between feudal lords might have favored defense because of castles, while conflicts between individuals in towns might have favored offense because they could stab each other by surprise. Some defensive capabilities — e.g., a Leviathan or generic defensive AIs — could effectively protect against attackers on most levels,[4] but if these are not implemented, some of these levels will probably favor offense, which could be quite disruptive.

Conversely, Maxwell Tabarrok argues that previous technologies which we might have expected to disrupt this balance have not generally done so. Tabarrok mentions cybersecurity and biosecurity as areas in which both offensive and defensive capabilities have greatly increased over the last century, without the offense-defense balance substantially shifting or attacks becoming more common relative to the overall use of the technology. He predicts that this might turn out to be the case for AI as well.

  1. This framework is less relevant in the case of a conflict between a misaligned superhuman AI and humanity in general, since such an AI is expected to wield a large technological advantage. ↩︎

  2. Kevin Esvelt contends that current and near-future LLMs may exacerbate such risks, whereas David Thorstad disagrees. ↩︎

  3. Vitalik Buterin argues that in the limit AI would favor defense in cybersecurity since 100% of vulnerabilities would be patched. ↩︎

  4. While this might increase safety, it might be undesirable for other reasons. ↩︎