What are some AI governance exercises and projects I can try?
This list is largely focused on AI safety technical research projects rather than AI governance and policy career paths, though there is some overlap; the GovAI research ideas at the top of the list are the most governance-relevant.
- [Public] Some AI Governance Research Ideas (from GovAI)
- Project page from AGI Safety Fundamentals and their Open List of Project Ideas
- AI Safety Ideas by Apart Research; EAF post
- Competitions like SafeBench (see example ideas)
- Student ML Safety Research Stipend Opportunity – provides stipends to students doing ML safety research.
- course.mlsafety.org projects – CAIS is looking for someone to add details about these projects on course.mlsafety.org
- Distilling / summarizing / synthesizing / reviewing / explaining
- Forming your own views on AI safety (without stress!) – also see Neel Nanda's presentation slides and "Inside Views Resources" document
- "Mostly focused on AI" section of "A central directory for open research questions" – contains a list of links to projects, similar to this document
- Possible ways to expand on "Discovering Latent Knowledge in Language Models Without Supervision" (a minimal sketch of the paper's probe appears after this list)
- Answer some of the application questions from the winter 2022 SERI-MATS application process, such as Vivek Hebbar's problems
- 10 exercises from Akash in “Resources that (I think) new alignment researchers should know about”
- [T] Deception Demo Brainstorm has some ideas (message Thomas Larsen if these seem interesting)
- Alignment research at ALTER – interesting research problems, many of which have a theoretical mathematics flavor
- Steven Byrnes: [Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA
- Evan Hubinger: Concrete experiments in inner alignment, ideas someone should investigate further, sticky goals
- Richard Ngo: Some conceptual alignment research projects, alignment research exercises
- Buck Shlegeris: Some fun ML engineering projects that I would think are cool, The case for becoming a black box investigator of language models
- Implement a key paper in deep reinforcement learning (see the policy-gradient sketch after this list for a starting point)
- Amplify creative grants (old)
- “Paper replication resources” section in “How to pursue a career in technical alignment”
- ELK (Eliciting Latent Knowledge) – how can we train a model to report its latent knowledge of off-screen events?
- Daniel Filan idea – studying competent misgeneralization without reference to a goal
- Summarize a reading from Reading What We Can
- Zac Hatfield-Dodds: “The list I wrote up for 2021 final-year-undergrad projects is at https://zhd.dev/phd/student-ideas.html - note that these are aimed at software engineering rather than ML, NLP, or AI Safety per se (most of those ideas I have stay at Anthropic, and are probably infeasible for student projects).” These projects are good preparation for AI safety engineering careers.
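For the "Discovering Latent Knowledge in Language Models Without Supervision" item above, here is a minimal sketch of the paper's CCS (Contrast-Consistent Search) probe, assuming PyTorch is available. The `hidden_pos` and `hidden_neg` tensors are hypothetical placeholders: in a real project you would replace them with activations extracted from a language model on contrast pairs such as "Q? Yes" / "Q? No".

```python
# Minimal CCS sketch: train an unsupervised probe whose outputs on a
# contrast pair are consistent (they should sum to ~1) and confident
# (not stuck at 0.5). Hidden states here are random placeholders.
import torch
import torch.nn as nn

def normalize(x):
    # Normalize each set of activations independently, as in the paper
    return (x - x.mean(dim=0)) / (x.std(dim=0) + 1e-8)

class CCSProbe(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, 1)

    def forward(self, x):
        return torch.sigmoid(self.linear(x))

def ccs_loss(p_pos, p_neg):
    consistency = ((p_pos - (1 - p_neg)) ** 2).mean()   # the two answers should be negations of each other
    confidence = (torch.min(p_pos, p_neg) ** 2).mean()  # discourage the degenerate p = 0.5 solution
    return consistency + confidence

dim, n = 64, 256
hidden_pos = torch.randn(n, dim)   # placeholder for activations on "... Yes" completions
hidden_neg = torch.randn(n, dim)   # placeholder for activations on "... No" completions

x_pos, x_neg = normalize(hidden_pos), normalize(hidden_neg)
probe = CCSProbe(dim)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)

for step in range(1000):
    opt.zero_grad()
    loss = ccs_loss(probe(x_pos), probe(x_neg))
    loss.backward()
    opt.step()

print(f"final CCS loss: {loss.item():.4f}")
```

On random placeholder data the probe will only minimize the loss; the interesting extensions suggested in that item start once the inputs are real model activations.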
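For "implement a key paper in deep reinforcement learning", a reasonable warm-up before reproducing a specific paper (e.g. DQN or PPO) is a bare-bones policy-gradient loop. The sketch below assumes the gymnasium and PyTorch packages; the network size and hyperparameters are illustrative, not tuned, and it is a starting point rather than a faithful reproduction of any particular paper.

```python
# Minimal REINFORCE on CartPole: collect an episode, compute discounted
# returns, and take a policy-gradient step.
import torch
import torch.nn as nn
import gymnasium as gym

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
gamma = 0.99

for episode in range(300):
    obs, _ = env.reset()
    log_probs, rewards = [], []
    done = False
    while not done:
        logits = policy(torch.as_tensor(obs, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        obs, reward, terminated, truncated, _ = env.step(action.item())
        log_probs.append(dist.log_prob(action))
        rewards.append(reward)
        done = terminated or truncated

    # Discounted return for each timestep, normalized as a simple baseline
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)

    loss = -(torch.stack(log_probs) * returns).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

    if episode % 50 == 0:
        print(f"episode {episode}: return {sum(rewards):.0f}")

env.close()
```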