The AGI endgame
Nearly half of AI researchers believe that artificial intelligence could spell the end of humanity within the current century. With predictions pointing towards the emergence of artificial general intelligence (AGI) within the next 20 years [1], it’s time to take these warnings seriously.
What might the AGI endgame look like?
Let’s explore the potential scenarios that could unfold from the development of AGI. To each scenario I’ve attached my probability estimate that it describes the world we reach by the 22nd century (a quick tally of these estimates follows the list).
- Complete technology ban (3%): Imagine a world where a catastrophic event leads to a Dune-like prohibition of advanced technology. Software becomes taboo, and society regresses to prevent the resurgence of AGI. This scenario might also emerge if a dominant entity develops AGI first and suppresses others to maintain control.
- Severe throttling of technology (10%): In response to disaster, or out of political foresight, humanity imposes an indefinite ban on AGI development. This is similar to what happens in the Three-Body Problem trilogy and would slow progress significantly. Specialized models, like those DeepMind has developed, may still be allowed. Such a world would require a global surveillance state to enforce compliance.
- Decentralized control utopia (15%): What if we could ensure safety without sacrificing freedom? Technological solutions like verification, network protocols, and deployment controls (most of which do not exist today) could enable us to control AGI deployment without oppressive oversight. This would still require robust government regulation, including compute governance, GPU firmware for auditing, and KYC for cloud compute providers.
- AI plateau (<1%): We hit a plateau and AGI remains out of reach. This seems unlikely, and even if it does happen, it would imply only a short delay (50 years at most) until brain uploads, active inference, or some other route to designed intelligence ends up working [2].
- Coincidental utopia (2%): Basic alignment techniques turn out to work exceptionally well, there’s nearly no chance of rogue AI, and we don’t need to put in place any legislation stronger than regulation on recommendation algorithms and redistribution of AI wealth. Or it turns out that AGI doesn’t pose a risk for related reasons, such as disinterest in humanity.
- Failure (20%-40%): Despite our best efforts, we could face existential risks or systemic collapse, a “probability of doom” (\(P(\text{doom})\)), if you will. This could manifest as:
  - Obsolescence: Trillions of agents make humanity obsolete, and we are displaced much as humanity displaced other, less intelligent species.
  - Race: An AGI arms race culminates in a catastrophic global conflict, with autonomous military technologies leading to mutual destruction.
- Uncertainty (30%-50%): Many people default to expecting either complete collapse or utopia. My bet is on positive futures, though ones that require important actions to avert disaster. I’m not sure whether this uncertainty will resolve into a completely different scenario, such as “Mediocre Futurism” (where, say, autonomous cars only arrive in 2143), or whether it should simply be distributed among the scenarios above.
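To make the distribution above easier to inspect, here is a minimal tally of the estimates in code. The ranged estimates are collapsed to midpoints (Failure at 30%, Uncertainty at 40%, AI plateau at 0.5%); those midpoints are just one reading of the ranges, not additional claims.

```python
# Rough tally of the scenario estimates above.
# Ranged estimates are collapsed to midpoints (one reading, not extra claims).
scenarios = {
    "Complete technology ban": 0.03,
    "Severe throttling of technology": 0.10,
    "Decentralized control utopia": 0.15,
    "AI plateau": 0.005,   # "<1%"
    "Coincidental utopia": 0.02,
    "Failure": 0.30,       # midpoint of 20%-40%
    "Uncertainty": 0.40,   # midpoint of 30%-50%
}

total = sum(scenarios.values())
for name, p in sorted(scenarios.items(), key=lambda kv: -kv[1]):
    print(f"{name:<32} {p:6.1%}")
print(f"{'Total':<32} {total:6.1%}")  # roughly 100%, given the midpoint reading
```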
Notes
Beyond these scenarios, here are some additional thoughts and considerations about the endgame.
- Co-existence with AGI seems implausible. We’re creating hundreds of new alien species with a practically unbounded upper limit on their capabilities. There are almost no scenarios in which we reach a stable status quo of robots cohabiting with humans on Earth.
- AGI will be diverse, and we have to plan for this. Banking on AGI behaving in one specific way is risky. While some envision benevolent AGI assistants or gods, we must consider the simultaneous existence of open-source models, massive corporate AGIs, and militarized autonomous systems. Any effective plan must tackle all of these challenges at once [3].
- Everyone’s betting on exponential growth. Jensen Huang (CEO of NVIDIA) bet on GPUs over 15 years ago because he saw an exponential growth curve. With R&D spending on AGI now in the billions, that curve shows no sign of stopping.
- Race dynamics may be extremely harmful. The 20th-century Cold War policy of racing to the endgame of world-ending technology is not sustainable [4]. We should focus on empowering global democratic governance, improving our ability to coordinate, and determining a reasoned response to this change.
- Pausing AGI progress seems reasonable. Like most people here, I’m optimistic about the potential for AGI to solve most of our problems. But pausing for ten years to avoid catastrophe, while still reaching that state on a slightly delayed timeline, seems like a good idea [5].
- AGI seems more moral than humanity at the median. While AI models like Claude and ChatGPT are designed to uphold ethical guidelines, this doesn’t guarantee that all AGI systems will align with human values, especially those developed without strict oversight. The bottom of the AGI morality distribution will include everything from cyber-offense agents designed to disrupt infrastructure to slaughterbots.
Predictions
Similar to the section in Cybermorphism, I’ll add my subjective predictions about the future to put the scenarios above in context.
- OpenAI will raise more than $50b in a new funding round within a year (by Nov ’25) to support continued scaling (30%).
- Open-weights models (such as Llama) will catch up to end-2025 frontier performance, even to o1, by the end of 2026 (35%).
- Most human-like internet activity (browsing, information gathering, app interaction) will be conducted by agents in 2030 (90%).
- Within a year, we’ll have GPT-5 (…or equivalent) (80%), which will upend the agent economy, creating an expensive internet (or the expectation of one), where every action needs checking and security to avoid cyber-offense risks and tragedies of the commons (70%).
- Trillions of persistent generally intelligent agents will exist on the web by 2030 (90%), as defined by discrete memory-persistent instantiations of an arbitrary number of agent types.
- A sentient and fully digital lifeform will be spawned before 2035, irrespective of the rights it receives (99% and I will argue my case).
- Despite the tele-operated robots at the “We, Robot” event, the Optimus bot will be seen as the most capable personal robotics platform by 2028 (30%) (and I will own one; conditional 90%).
- Before 2035, we will reach something akin to a singularity: 20% year-over-year US GDP growth for two consecutive years, largely driven by general intelligence (30% probability, highly dependent on the continuation of US hegemony tactics).
- The web (>50% of ISP traffic) will have federated or decentralized identity controls that track and verify whether actions are taken by humans or by agents, before 2030 (25%) or 2035 (60%). A minimal sketch of such a check follows this list.
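To make that last prediction more concrete, here is a minimal sketch of what a human-vs-agent attestation check on a web action might look like. Everything in it (the `Attestation` record, the toy shared-secret HMAC signature, the issuer name) is a hypothetical stand-in rather than an existing protocol; a real federated scheme would rely on public-key credentials from multiple issuers.

```python
# Hypothetical sketch: decide whether a web action comes from a human or an
# agent by verifying a signed attestation attached to the request.
# Toy shared-secret HMAC scheme only; a real federated deployment would use
# public-key credentials issued by multiple identity providers.
import hashlib
import hmac
from dataclasses import dataclass

ISSUER_SECRET = b"demo-secret"  # stand-in for an identity issuer's signing key


@dataclass
class Attestation:
    subject_kind: str  # "human" or "agent"
    issuer: str        # e.g. "example-id-federation" (hypothetical name)
    signature: str     # hex HMAC over the two fields above


def sign(subject_kind: str, issuer: str) -> str:
    msg = f"{subject_kind}|{issuer}".encode()
    return hmac.new(ISSUER_SECRET, msg, hashlib.sha256).hexdigest()


def classify(att: Attestation) -> str:
    expected = sign(att.subject_kind, att.issuer)
    if not hmac.compare_digest(expected, att.signature):
        return "untrusted"      # no valid credential: treat as unknown traffic
    return att.subject_kind     # "human" or "agent"


if __name__ == "__main__":
    att = Attestation("agent", "example-id-federation",
                      sign("agent", "example-id-federation"))
    print(classify(att))  # -> "agent"
```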
Concluding
Among the scenarios I’ve outlined, I am most optimistic about the potential for a decentralized control utopia. By confidently defining our goals and collaborating globally, we can make concrete progress towards a successful AGI endgame.
What is your endgame?
- [2] A likelier version of this is a plateau that only affects AGI timelines temporarily. For example, the transformer architecture might not be enough, we might face limits on what is stored in internet data, or new algorithmic improvements (such as the multitude of research behind o1) might arrive more slowly than expected.
- [3] The most comprehensive plan that takes this into account seems to be the one from ARIA’s research director Davidad. However, there are hundreds of complex problems to solve, and the plan should be adapted to fit AGI as it develops.
- [4] I highly recommend Haydn Belfield’s “Why policy makers should beware claims of new ‘arms races’” in the Bulletin of the Atomic Scientists for context from the Cold War.
- [5] It’s difficult to quantify the negative effects of such a pause, specifically because AGI has a chance to solve general suffering (e.g. replacing billions of animals in factory farms with lab-grown meat), while the alternative is for conscious life on Earth to disappear. I’m personally driven by enabling conscious experience to expand, and a hard stop to that would be absolutely and obviously catastrophic.