Intro to AI safety

In recent years, AI has exceeded people’s expectations in a wide variety of domains. Current systems can compose human-like text, play Go, write code, and model protein folding. It may not be long before we create AI systems that are as capable as humans at carrying out most intellectual tasks.

Eventually — perhaps soon after — AI could surpass human intelligence altogether, including in science, technology, economic competition, and strategic planning.

Such advanced AI could provide great benefits, but if deployed recklessly, it could also cause unprecedented disasters — even human extinction.

Making future, highly advanced AI safe will involve technical, political, and social challenges. A key technical component of AI safety is the “alignment problem”: ensuring that AI systems try to do what we want, even if circumstances change fundamentally — for example, if AI becomes smarter than humans and ends up in a position to exert major influence over human society. Many AI researchers regard the alignment problem as unsolved and difficult.

The topic of existential risk from AI has recently entered the mainstream. The leading AI labs are aiming to create “artificial general intelligence” in the not-too-distant future, and many top researchers are warning about its dangers.

The articles in the sidebar make up the first of the site’s eight sections, outlining these issues for newcomers.