At a high level, what is the challenge of AI alignment?
Nick Bostrom, a philosopher who has researched existential risk from AI and other causes, founded and formerly headed the Future of Humanity Institute at Oxford and is the author of the 2014 book Superintelligence: Paths, Dangers, Strategies. He has described the challenge of AI alignment as "philosophy with a deadline."
Many problems relevant to AI alignment are problems philosophers have been dealing with for centuries. To what degree is meaning inherent in language, versus something that requires external context? How do we translate between the logic of formal systems and normal ambiguous human speech? Can morality be reduced to a set of ironclad rules, and if not, how do we know what it is at all?
Existing attempts to answer these questions — from Aristotle, Kant, Mill, Wittgenstein, Quine, and others — may help people understand these issues better, but they are not formal in a way that could be implemented in computer code. Just as a good textbook can help an American learn Chinese but cannot be encoded into machine language to make a Chinese-speaking computer, so the philosophies that help humans are only a starting point for the project of building computers that understand us and share our values.
The field of AI alignment combines formal logic, mathematics, computer science, cognitive science, and philosophy in order to advance that project.
This is the philosophy; the other half of Bostrom's formulation is the deadline. Traditional philosophy has been going on for almost three thousand years; AI alignment must be solved before the development of superintelligent AI, that is, an AI with cognitive abilities far greater than those of humans in a wide range of important domains.
If the alignment problem isn't adequately addressed by then, we are likely to see poorly aligned superintelligences that are unintentionally hostile to the human race, with some of the catastrophic outcomes mentioned above. This is why so many experts urge swift action to bring AI alignment research up to an adequate level.
If it turns out that superintelligence is centuries away and such research is premature, little will have been lost. But if our projections are too optimistic and superintelligence is imminent, then doing the research now rather than later becomes vital.