Learning Objectives
- Articulate the orthogonality thesis and instrumental convergence, and explain why they are central to advanced AI risk
- Connect technical alignment failures to real-world harm scenarios at scale
Core Readings
Recommended
AI 2027 · AI Futures Project · 2025
The Problem · MIRI · 2025
Intelligence and Stupidity: The Orthogonality Thesis · Robert Miles · 2018
The Basic AI Drives · Steve Omohundro · 2008
Gradual Disempowerment · Kulveit et al. · 2025
The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents · Nick Bostrom · 2012
Existential Risk from Power-Seeking AI · Joe Carlsmith · 2022
Further Reading
Instrumental Convergence · Eliezer Yudkowsky · 2025
Why Would AI "Aim" To Defeat Humanity? · Cold Takes · 2022
AI Could Defeat All Of Us Combined · Cold Takes · 2022
Two Types of AI Existential Risk: Decisive and Accumulative · Atoosa Kasirzadeh · 2025
International AI Safety Report 2026 · International AI Safety Report · 2026
The Vulnerable World Hypothesis · Nick Bostrom · 2019
AI-Enabled Coups: How a Small Group Could Use AI to Seize Power · Forethought · 2025
Impact of AI on Cyber Threat From Now to 2027 · NCSC · 2025
How AI Threatens Democracy · Kreps & Kriner · 2023
Can Democracy Survive the Disruptive Power of AI? · Carnegie Endowment · 2024
The Authoritarian Risks of AI Surveillance · Lawfare · 2025
The Operational Risks of AI in Large-Scale Biological Attacks · RAND · 2024