What are some objections to the importance of AI alignment?

Søren Elverlin has compiled a list of counter-arguments and suggests dividing them into two kinds: "weak" and "strong".

Weak counter-arguments point to problems with the "standard" arguments (as given in, e.g., Bostrom’s Superintelligence), especially shaky models and assumptions that are too strong. These arguments are often of high quality and are often presented by people who themselves worry about AI safety. Elverlin calls these objections “weak” because they do not attempt to show that the probability of a bad outcome is close to zero: “For example, even if you accept Paul Christiano's arguments against ‘fast takeoff’, they only drive the probability of this down to about 20%. Weak counter-arguments are interesting, but the decision to personally focus on AI safety doesn't strongly depend on the probability — anything above 5% is clearly a big enough deal that it doesn't make sense to work on other things.”

Strong counter-arguments claim that the probability of existential catastrophe due to misaligned AI is tiny, usually on the grounds that AGI is impossible, very far away, or both. For example, Michael Littman has suggested that because (he believes) we are so far from AGI, there will be a long period of human history in which we’ll have ample time to grow up alongside powerful AIs and figure out how to align them.

Elverlin argues that “There are few arguments that are both high-quality and strong enough to qualify as an ‘objection to the importance of alignment’.” He suggests Rohin Shah's arguments for “alignment by default” as one of the better candidates.
