Where can I find videos about AI safety?
Also available at aisafety.video
Top recommendation: AI Safety Intro Video playlist
Generally good sources
Channels
- Robert Miles AI Safety (and Rob's videos on Computerphile)
- AI Safety Talks (and its playlists)
- Relevant, but less focused on AI existential risk: Rational Animations, Centre for Effective Altruism, Future of Humanity Institute, Center for Security and Emerging Technology, CSER, SSC meetups, Foresight Institute, Science, Technology & the Future, Berkman Klein Center, Schwartz Reisman Institute, Stanford HAI, Carper AI, Lex Fridman, Digital Humanism, Cognitive Revolution Podcast, NIST AI Metrology Colloquia Series
- AI content without much AI safety: AI Explained, Andrej Karpathy, John Tan Chong Min, Edan Meyer, Yannic Kilcher, Mutual Information, Computerphile, CodeEmporium, sentdex, nPlan, Jay Alammar, Assembly AI, Aleksa Gordić, Simons Institute, 2 Minute Papers, Machine Learning Street Talk, ColdFusion, HuggingFace, AI Coffee Break, Alex Smola, Welcome AI Overlords, Valence Discovery, The Alan Turing Institute, Jordan Harrod, Cambridge Ellis Unit, Weights & Biases, UCL CSML Seminar Series, Harvard Medical AI, IARAI, Alfredo Canziani, Andreas Geiger, CMU AI Seminar, Jeremy Howard, Google Research, AI for Good, IPAM UCLA, One world theoretical ML, What's AI, Stanford MedAI, MILA neural scaling seminars, Digital Engine (sometimes misleading), Steve Brunton, PyTorch, What's AI by Louis Bouchard, Eye on AI, Super Data Science, Waterloo AI, Matt Wolfe, DeepLearningAI, TechTechPotato, Asianometry
- Other languages: Karl Olsberg (German)
Lists
- AI Alignment YouTube Playlists – an excellent resource, with slide-light (reordered) and slide-heavy playlists
- the gears to ascension lists many channels for understanding current capabilities trends
- A ranked list of all EA-relevant documentaries, movies, and TV | Brian Tan on EAF (“AI Safety / Risks” section)
- Towards AGI: Scaling, Alignment & Emergent Behaviors in Neural Nets
Specific suggestions
Note that:
- I haven’t watched all of these videos. Feel free to comment with more recommendations!
- This list does not focus on podcasts, although there are a few podcast recommendations. See this page for some AI safety podcasts.
Introductory
- See also AI safety intros for readings
- AI on the Hill: Why artificial intelligence is a public safety issue (Jeremie Harris)
- AI and Evolution (Dan Hendrycks)
- Intro to AI Safety, Remastered (Rob Miles)
- Connor Leahy, AI Fire Alarm, AI Alignment & AGI Fire Alarm - Connor Leahy (ML Street Talk), and Connor Leahy on AI Safety and Why the World is Fragile
- Eliezer Yudkowsky – AI Alignment: Why It's Hard, and Where to Start; Sam Harris 2018 - IS vs OUGHT, Robots of The Future Might Deceive Us with Eliezer Yudkowsky (full transcript here); and 159 - We’re All Gonna Die with Eliezer Yudkowsky
- Brian Christian, Ben Garfinkel, Richard Ngo, and Paul Christiano on the 80,000 Hours Podcast
- Richard Ngo and Paul Christiano on AXRP
- Ajeya Cotra on how Artificial Intelligence Could Cause Catastrophe
- Jeremie Harris - TDS Podcast Finale: The future of AI, and the risks that come with it
- Rohin Shah on the State of AGI Safety Research in 2021 and AI Alignment: An Introduction | Rohin Shah | EAGxOxford 22
- The Alignment Problem: Machine Learning and Human Values with Brian Christian (Q&A section)
- X-Risk Overview (Dan Hendrycks)
- Myths and Facts About Superintelligent AI (Max Tegmark + minutephysics)
- What happens when our computers get smarter than we are? | Nick Bostrom
- Risks from Advanced AI with Jakub Kraus and AI safety intro talk
- What is the alignment problem? (Samuel Albanie)
- SaTML 2023 - Jacob Steinhardt - Aligning ML Systems with Human Intent
- How We Prevent the AI’s from Killing us with Paul Christiano
Landscape
- Current work in AI alignment | Paul Christiano | EA Global: San Francisco 2019
- Paradigms of AI alignment: components and enablers | Victoria Krakovna | EAGxVirtual 2022
- How to build a safe advanced AI (Evan Hubinger) | What's up in AI safety? (Asya Bergal)
Inner alignment
Inner misalignment: when an AI system ends up pursuing a different objective than the one that was specified.
- The OTHER AI Alignment Problem: Mesa-Optimizers and Inner Alignment
- Deceptive Misaligned Mesa-Optimisers? It's More Likely Than You Think...
Outer alignment
Outer alignment: the problem of making sure that the precise formulation of what we train the AI to do matches what we intend it to do.
- 9 Examples of Specification Gaming
- How to Keep Improving When You're Better Than Any Teacher - Iterated Distillation and Amplification
- AI Toy Control Problem (Stuart Armstrong)
- AIS via Debate (Joe Collman)
Agent foundations
Agent foundations: a research agenda which tries to understand the nature of agents and their properties.
- Intro to Agent Foundations (Understanding Infra-Bayesianism Part 4)
- EC'21 Tutorial: Designing Agents' Preferences, Beliefs, and Identities (Part 3) and part 4 from FOCAL at CMU
Interpretability
- See the interpretability playground
- ROME: Locating and Editing Factual Associations in GPT (Paper Explained & Author Interview)
- Feature Visualization & The OpenAI Microscope and Building Blocks of AI Interpretability | Two Minute Papers #234
- 25. Interpretability (MIT 6.S897 Machine Learning for Healthcare, Spring 2019)
- Cohere For AI - Community Talks - Catherine Olsson on Mechanistic Interpretability: Getting Started
- A Walkthrough of A Mathematical Framework for Transformer Circuits
- A Walkthrough of Interpretability in the Wild Part 1/2: Overview (w/ authors Kevin, Arthur, Alex) and part 2
- Reliable and Interpretable Artificial Intelligence -- Lecture 1 (Introduction)
Organizations
- Training machine learning (ML) systems to answer open-ended questions | Andreas Stuhlmuller + Amanda Ngo, Ought | Automating Complex Reasoning (Ought)
Individual researchers
- Peter Railton - A World of Natural and Artificial Agents in a Shared Environment
- Provably Beneficial AI and the Problem of Control and Human-compatible artificial intelligence - Stuart Russell, University of California (Stuart Russell)
- Victoria Krakovna – AGI Ruin, Sharp Left Turn, Paradigms of AI Alignment
- David Krueger – AI Alignment, David Krueger: Existential Safety, Alignment, and Specification Problems
- Holden Karnofsky - Transformative AI & Most Important Century
- Prosaic Intent Alignment (Paul Christiano)
- Timelines for Transformative AI and Language Model Alignment | Ajeya Cotra
- AI Research Considerations for Existential Safety (Andrew Critch)
- Differential Progress in Cooperative AI: Motivation and Measurement (Jesse Clifton and Sammy Martin)
- Open-source learning: A bargaining approach | Jesse Clifton | EA Global: London 2019
- AGISF - Research questions for the most important century - Holden Karnofsky
- Ethan Perez | Discovering language model behaviors with model-written evaluations
Reasoning about future AI
- Optimal Policies Tend To Seek Power (Alex Turner at NeurIPS 2021)
- Why Would AI Want to do Bad Things? Instrumental Convergence
- Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
Frontier AI regulation
International AI governance
US AI Policy
Compute governance
Hardware supply chain
- Notes on AI Hardware - Benjamin Spector | Stanford MLSys #88
- “The Decision of the Century”: Choosing EUV Lithography, Intel & AMD: The First 30 Years, A Brief History of Semiconductor Packaging, What Goes On Inside a Semiconductor Wafer Fab, and many more from Asianometry
- EUV lithography systems (scroll to “How does EUV work?”)
- The AI Hardware Show 2023, Episode 1: TPU, A100, AIU, BR100, MI250X
- All about AI Accelerators: GPU, TPU, Dataflow, Near-Memory, Optical, Neuromorphic & more (w/ Author)
- ASML's Secret: An exclusive view from inside the global semiconductor giant | VPRO Documentary
- Semiconductor Expert Reveals Why US Export Controls Have Failed
- How ASML, TSMC And Intel Dominate The Chip Market | CNBC Marathon
- How We DESTROYED the NVIDIA H100 GPU: The ULTIMATE Comino Tear Down! COMINO H100 WATERBLOCK TEASER
Misc AI governance
- AI Ethics Seminar with Matthijs Maas - Pausing AI & Technological Restraint - April 25, 2023
- Markus Anderljung – Regulating increasingly advanced AI: some hypotheses
- Sam Altman and William G. Gale discuss Taxation Solutions for Advanced AI
- Paul Scharre & Helen Toner on AI Capabilities & the Nature of Warfare
- Why governing AI is our opportunity to shape the long-term future | Jade Leung | TEDxWarwickSalon + Priorities in AGI governance research | Jade Leung | EA Global: SF 22
- An Introduction to AI Governance | Ben Garfinkel | EAGxVirtual 2022
- The Windfall Clause: Sharing the benefits of advanced AI | Cullen O'Keefe + Sharing the Benefits of AI: The Windfall Clause (Rob Miles)
- Margaret Roberts & Jeffrey Ding: Censorship’s Implications for Artificial Intelligence
- Preparing for AI: risks and opportunities | Allan Dafoe | EAG 2017 London + AI Strategy, Policy, and Governance | Allan Dafoe
- Simeon Campos – Short Timelines, AI Governance, Field Building
- More than Deepfakes (Katerina Sedova and John Bansemer)
- AI and the Development, Displacement, or Destruction of the Global Legal Order (Matthijs Maas)
- Future-proofing AI Governance | The Athens Roundtable on AI and the Rule of Law 2022
- Having Our Cake and Eating It Too with Amanda Askell (covers incentives in AI development)
Ethics
Career planning
- How I think students should orient to AI safety | Buck Shlegeris | EA Student Summit 2020
- AGISF - Careers in AI Alignment and Governance - Alex Lawsen
- Catherine Olsson & Daniel Ziegler on the 80,000 Hours Podcast
- AI Safety Careers | Rohin Shah, Lewis Hammond and Jamie Bernardi | EAGxOxford 22
- Early-Career Opportunities in AI Governance | Lennart Heim, Caroline Jeanmaire | EAGxOxford 22
- Artificial Intelligence Career Stories | EA Student Summit 2020
Forecasting
- Will AI end everything? A guide to guessing | Katja Grace | EAG Bay Area 23
- Neural Scaling Laws and GPT-3 (Jared Kaplan)
- Why and How of Scaling Large Language Models | Nicholas Joseph
- Reasons you might think human level AI soon is unlikely | Asya Bergal | EAGxVirtual 2020
- Existential Risk Pessimism and the Time of Perils | David Thorstad | EAGxOxford 22
- Alex Lawsen – Forecasting AI Progress and Alex Lawsen forecasting videos
- Betting on AI is like betting on semiconductors in the 70's | Danny Hernandez | EA Global: SF 22 + Danny Hernandez on the 80,000 Hours Podcast
- Economic Growth in the Long Run: Artificial Intelligence Explosion or an Empty Planet?
- Moore's Law, exponential growth, and extrapolation! (Steve Brunton)
Capabilities
- Satya Nadella Full Keynote Microsoft Ignite 2022 with Sam Altman (start at 12:25)
- Competition-Level Code Generation with AlphaCode (Paper Review)
- The text-to-image revolution, explained + How the World Cup’s AI instant replay works (Vox)
How AI works
- Transformers, explained: Understand the model behind GPT, BERT, and T5; Transformers for beginners | What are they and how do they work (see the attention sketch after this list)
- Reinforcement learning playlist (Steve Brunton)
- How AI Image Generators Work (Stable Diffusion / Dall-E) - Computerphile + Stable Diffusion in Code (AI Image Generation) - Computerphile
- What is a transformer? + Implementing GPT-2 from scratch (Neel Nanda)
- The spelled-out intro to neural networks and backpropagation: building micrograd
- DeepMind x UCL RL Lecture Series - Introduction to Reinforcement Learning [1/13]
- Deep Learning for Computer Vision (Justin Johnson) lecture videos
- CS25 I Stanford Seminar - Transformers United: DL Models that have revolutionized NLP, CV, RL
- Broderick: Machine Learning, MIT 6.036 Fall 2020 + course page
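To complement the transformer explainers above, here is a minimal sketch of scaled dot-product self-attention, the core operation inside GPT-style models. It is illustrative only: the weight matrices are random rather than learned, the dimensions are arbitrary, and real models add multiple attention heads, masking, and stacked layers.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention. X: (seq_len, d_model) token embeddings."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # similarity of every token pair, scaled
    weights = softmax(scores, axis=-1)        # one attention distribution per token
    return weights @ V                        # each output is a weighted mix of value vectors

# Toy example: 4 tokens with 8-dimensional embeddings and random (untrained) weights.
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8): one updated vector per token
```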
China
- Re-deciphering China’s AI dream | Jeffrey Ding | EA Global: London 2019
- Sino-Western cooperation in AI safety | Brian Tse | EA Global: San Francisco 2019
Rationality
- Effective behavior change | Spencer Greenberg | EA Global: San Francisco 2019
- Making high impact decisions | Anna Edmonds | EA Global: SF 22
- Decision-making workshop: learn how to make better decisions | Spencer Greenberg
- Decoupling: a technique for reducing bias | David Manley | EA Student Summit 2020
Debates / discussions between people with different perspectives
Misc
- Slaughterbots + Why We Should Ban Lethal Autonomous Weapons + A.I. Is Making it Easier to Kill (You). Here’s How. | NYT
- Tobias Baumann on Artificial Sentience and Reducing the Risk of Astronomical Suffering
- The Doomsday Argument | PBS Space Time
- Forming your own views on AI safety (without stress!) | Neel Nanda | EA Global: SF 22 – also see Neel's presentation slides and "Inside Views Resources" doc
- Applied Linear Algebra Lectures (John Wentworth)
- AI alignment, philosophical pluralism, and the relevance of non-Western philosophy | Tan Zhi Xuan
- Moloch section of Liv Boeree interview with Lex Fridman (and Ginsberg)
- ChatGPT in Context. Part 1 - The Transformer, a Revolution in Computation (Piero Scaruffi)
- Is AI (ChatGPT, etc.) Sentient? A Perspective from Early Buddhist Psychology
Watching videos in a group
Discussion prompts
- Paul Christiano on AI alignment - discussion + Paul Christiano alignment chart
- Allan Dafoe on AI strategy, policy, and governance - discussion
- Vael Gates: Researcher Perceptions of Current and Future AI - discussion
- Sam Harris and Eliezer Yudkowsky on “AI: Racing Toward the Brink” - discussion
Higher-level meeting tips
- Show the video → people discuss afterwards with prompts
- Active learning techniques: here
- You can skip around through parts of the video!
- See “Discussion Groups” from the EA Groups Resource Center
- Make the official meeting end after 1 hour so people are free to leave, but give people the option to linger and continue their discussion.
- You can also do readings instead of videos, similar to this. Or play around with a model, e.g. test out hypotheses about how a language model works (see the sketch after this list).
- Try to keep the video portion under 20 minutes unless the video is really interesting.
- For a short video you could watch one of Apart’s ML + AI safety updates. Some of these contain many topics, so people can discuss what they find interesting.
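For the “play around with a model” option, here is a minimal sketch of probing a small language model in a group setting. It assumes the Hugging Face transformers library is installed (pip install transformers torch); the model (GPT-2) and the prompt are arbitrary examples, not specific recommendations.

```python
from transformers import pipeline

# Load a small, freely available model; weights download on first run.
generator = pipeline("text-generation", model="gpt2")

# Example hypothesis a group might test: does the model continue a simple
# counting pattern from the prompt? Try variations and compare the outputs.
prompt = "one, two, three, four,"
result = generator(prompt, max_new_tokens=10, do_sample=False)
print(result[0]["generated_text"])
```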