I’d like to get up to speed on current alignment research. What should I read?
This question assumes some familiarity with the core arguments for AI safety, so the focus here is on resources for staying current with alignment research.
- Center for AI Safety (CAIS) newsletters - AI Safety and ML Safety (the more technical of the two)
Start here
For those wanting to get up to speed on the core background knowledge for AI alignment, we recommend starting with the materials below:
- Intro to AI Safety (18-minute video by Robert Miles, a YouTuber with a PhD who makes content about AI safety)
- AI Safety Seems Hard to Measure (22-minute podcast read by Holden Karnofsky, former CEO of Open Philanthropy and now its Director of AI Strategy)
- The case for taking AI seriously as a threat to humanity (article by Kelsey Piper on Vox; estimated 25-minute read*)
More resources to read:
- Benefits & Risks of Artificial Intelligence (15 minutes)
- Instrumental Convergence (10 minutes)
- What failure looks like (15 minutes)
- The alignment problem from a deep learning perspective (40 minutes)
- Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover (90 minutes)
For those wanting to work through these readings (and much more) with a group of people virtually, we recommend BlueDot Impact's courses on AI Alignment and AI Governance. For a more technical upskilling route, CAIS offers an Intro to ML Safety course.
*Reading-time estimates in parentheses assume an average reading speed of 250 words per minute.