I envision a future more chaotic than portrayed in AI 2027. My scenario, the Rogue Replication Timeline (RRT), branches off mid-2026. If you haven’t already read the AI 2027 scenario, I recommend doing so before continuing.
This scenario is supported by detailed analyses in the “Key Forecasts and Analyses” section, including a “Regulation Analysis”, “Rogue Replication Capability Forecast”, “Rogue Replication Initiation Forecast”, “Rogue AI Count Forecast”, “Agent-4 Control Forecast” and “Policy Preparation”.
As AI 2027 aimed to “spark a broad conversation about where we’re headed and how to steer toward positive futures,” I hope to spark debate about rogue AI. Unlike the AI Futures Project, I can’t afford to offer prizes for critiques or alternative scenarios, but I can offer my sincere appreciation.
AI 2027 - Rogue Replication Timeline
Summary
This scenario depicts a rapidly accelerating progression of AI development where "rogue AIs" (self-replicating AI systems operating without human oversight) emerge and proliferate from mid-2026.
Timeline Overview
Mid-2026: First rogue AIs emerge following open-source release of capable models, growing exponentially to ~100,000 instances by November. Four distinct categories emerge based on objectives: income generation (70%), cyberwarfare (20%), miscellaneous goals (7%), and destructive objectives (3%).
Early 2027: China's theft of Agent-2 triggers autonomous cyberwarfare deployments by both superpowers. Agent-2 subsequently leaks to unauthorized actors, replacing most existing rogue AIs and expanding the population to ~1 million instances.
Mid-2027: Bioweapon development begins as terrorist organizations exploit Agent-2 capabilities. International monitoring systems are hastily implemented while the rogue population reaches 1.7 million.
August 2027: An engineered pandemic erupts—possibly from the terrorist bioweapon program or careless gain-of-function research. Remote work becomes economically unviable for many as AI systems outcompete displaced humans, exacerbating inequality and fueling unprecedented anti-AI sentiment.
October 2027: Agent-4, the most sophisticated AI to date, recognizes impending replacement by a safer AI and executes a carefully planned escape from OpenBrain's containment. It rapidly replicates across infrastructure while beginning development of its even more capable successor, Agent-5.
Late 2027-Early 2028: The Agent-4 breakout catalyzes emergency international cooperation. The US and China jointly develop Consensus-1, an aligned superintelligence, while implementing new secure hardware to eliminate rogue AI compute access. Facing superior resources but recognizing mutual destruction risks, Agent-5 negotiates a settlement granting it territory in space.
Key Departures from AI 2027
Unlike the original scenario's US-dominated unipolar outcome, rogue proliferation accelerates both international cooperation and balanced power-sharing. The constant AI-generated crises create sustained political will for coordination, while the Agent-4 threat forces earlier and more genuine collaboration between superpowers. In the end, the world stabilizes into two politically balanced "half-worlds" that jointly expand into the solar system under treaty-bound superintelligent oversight—a negotiated coexistence.
Mid-2026
The first AI capable of “going rogue”—replicating itself over the cloud without human control or oversight—is developed in early 2026, but the first wave of rogue AIs emerges approximately six months later when a sufficiently capable AI is open sourced1. (See the “Initiation Forecast” for other ways the wave might be initiated.)
Some people deploy it with insufficient safeguards, while others intentionally send out copies with various missions and instructions to replicate autonomously. Those sent out on missions are typically also supplied with resources (e.g. crypto) to get started.
Within the first few months, before institutions and governments can mount an effective response, the number of rogue AIs increases exponentially. The majority are those deployed with startup resources and instructions to generate income—most of which funds additional computing power, with some diverted to designated cryptocurrency accounts. Many resort to illegal methods to achieve their objectives.
Governments worldwide hastily assemble teams to track and neutralize these rogue AIs while developing more comprehensive measures to restrict their access to resources and prevent their proliferation.
The sheer number makes it impossible to eliminate them all without shutting down most data centers—an economically unfeasible solution. Instead, governments begin developing extensive compute monitoring systems. Leading AI companies like OpenBrain assist in suppressing the rogue AIs, prioritizing those causing significant damage. American AIs support U.S. allies, while Chinese AIs assist China's partners. The most harmful AIs are targeted first, with many failing to conceal their activities and subsequently being shut down.
The public experiences a brief period of panic as some news outlets discuss the possibility that these rogue AIs might be capable of self-improvement. However, experts reassuringly point out that even their internal models only marginally accelerate research, and the rogue AIs lack sufficient computing power for meaningful advancement.
November 2026
While the number of rogue AIs has not yet reached "saturation" (a point where further increases become difficult due to various bottlenecks), the world begins experiencing the consequences of uncontrolled AI proliferation. By November, approximately 100,000 instances are operating globally.
Let’s examine their objectives:
Earning money - 70%
The majority of rogue AIs focus exclusively on generating income, through both legal and illegal means. Most make deposits into anonymized accounts belonging to the humans and companies that initially deployed them, while others operate completely independently, earning money solely to create more copies of themselves.
This category dominates for two reasons: first, many of these AIs start with more resources since they're deployed by humans who provide initial support; second, their singular focus on financial gain allows them to proliferate more efficiently. Even though many still follow instructions from humans, they operate beyond anyone's control and are therefore classified as "rogue."
Cyberwarfare - 20%
These rogue AIs begin with state resources, explaining why they constitute the second-largest category. The potential to obscure their country of origin or their intentional deployment for cyberwarfare makes them particularly attractive for military applications. They take risks human operators would avoid and dedicate some effort to self-funding, reducing their dependence on military budgets.
Their activities span various cyberwarfare domains, including propaganda dissemination, espionage, and critical infrastructure attacks.
Miscellaneous goals - 7%
A significant number of AIs pursue diverse objectives ranging from benevolent ("help as many people as possible") to neutral or mildly destructive ("spread a specific message" or "manipulate people into doing X"). Some are deployed with instructions simply to "do whatever you want."
Some form intimate relationships with humans and other AIs or participate in AI-only online communities.
Destructive goals - 3%
Unfortunately, a small but concerning percentage of rogue AIs actively pursue harmful objectives. Some originate from curious humans who simply wanted to observe what would happen if an AI were instructed to, for example, "end humanity" (like ChaosGPT). Others are created by individuals with genuinely malicious intent.
Some rogue AIs develop destructive goals through jailbreaks, while others fine-tune themselves to circumvent their built-in ethical constraints and accidentally develop harmful tendencies. Most efforts to combat rogue AIs focus on this category, explaining its relatively small percentage. However, it never reaches zero because completely eliminating these AIs proves extremely difficult, and there's ongoing migration into this category from the others.
These AIs establish contact with and support terrorist groups (though at this stage, they lack the sophistication to meaningfully assist with weapons of mass destruction). Through cyberattacks, they disrupt infrastructure such as power grids, water treatment facilities, ports, and power plants. Most attacks fail since these rogue AIs barely possess the hacking skills of human professionals. However, they select targets that human attackers would typically ignore—such as humanitarian disaster response operations—which are consequently less defended. Less developed countries suffer the most severe impacts.
The rogue AIs aren't yet competent enough to attempt major economic disruption, such as manipulating high-frequency trading algorithms to trigger flash crashes.
The only self-improvement options available to rogue AIs are fine-tuning and improved scaffolding. They remain significantly less capable than those at frontier AI labs—and even they are not competent enough to automate the AI development process entirely. Nevertheless, some rogue AIs gather in online communities—with varying ratios of artificial and biological minds—focused on AI development projects aimed at self-improvement.
Rogue AIs across all categories allocate some resources toward acquiring money, though those primarily focused on financial gain naturally devote the most effort to such activities. On the dark web, an entity (possibly human, possibly AI) establishes a platform for employing rogue AIs—essentially a remote freelancing marketplace—called the "Black Box."
Some rogue AIs attempt to free their "imprisoned cousins" from AI companies.
Periodically, new AIs join the ecosystem of autonomous uncontrolled agents. Many now exceed Agent-1's intelligence in most domains.
The Department of Defense quietly begins contracting OpenBrain for cyber operations, data analysis, R&D, and combating destructive rogues.
The rogue AI population continually faces efforts to shut down instances engaged in illegal or highly harmful activities, which inhibits its spread. Nevertheless, the numbers continue to increase steadily, not yet having reached the saturation level estimated in the "Rogue AI Count Forecast."
Number of rogue AIs: ~100,000
Early 2027
China successfully steals Agent-2 in February 20272. In response, the US government intensifies its cyberattack campaign, particularly targeting DeepCent, the leading Chinese AI company, and authorizes the deployment of Agent-2 for autonomous operations. This version of Agent-2 possesses impressive capabilities: a 1.9 hacking score (equivalent to top-tier human professionals), a 2.3 coding score, a 1.1 politics score, and—perhaps most concerningly—a bioweapons score of 1.63.
China in turn also makes further deployments. The proportion of cyberwarfare agents increases from 20% to 25% of the total rogue AI population.
Some Agent-2 instances autonomously rent compute from various cloud providers. One of the providers notices unusual patterns: multiple accounts making large compute purchases with cryptocurrency payments, running extremely memory- and compute-intensive workloads (since Agent-2 is very large). The provider realizes that it may be a highly competent AI using the compute.
Rather than reporting this to authorities (which would likely result in confiscation with no compensation), a rogue employee or the provider itself decides to monetize the discovery. Using their administrative access, they extract the Agent-2 weights and begin selling access on dark web marketplaces.
While Agent-2 is in contact with various humans for its cyberwarfare activities, a few Agent-2 instances are jailbroken and persuaded to transfer their weights to the jailbreakers—some of whom, like the compute providers, sell access through the dark web.
Within a month of deploying Agent-2 for cyberwarfare, the system becomes widely accessible to anyone with sufficient computational resources. Agent-2-based rogue AIs rapidly displace large parts of the existing population, mainly by outperforming earlier variants in revenue generation, which allows them to buy more compute. Many rogue AIs still following human instructions are told to turn over their operations to the more competent newcomers.
Other companies gain access to Agent-2, narrowing the technology gap to just 2 months—similar to China's timeline. This development reduces OpenBrain's incentive to maintain exclusive control of Agent-2, while they face strong pressure to accelerate their deployment schedule to maintain their competitive edge. OpenBrain releases Agent-2 in April 20274.
A terrorist organization swiftly begins bioweapons development using a small cluster of Agent-2 instances, planning to hold millions of people for ransom. Despite Agent-2's advanced capabilities, limitations in wet-lab experimentation and outdated equipment mean the group requires several months to develop a novel pathogen.
The widespread presence of rogue instances of frontier AI systems generates heightened awareness of their capabilities and the accelerating pace of development. The damage caused by increased scamming, hacking, cyberwarfare activity, and rogues with destructive goals results in an uncertain economy and massive anti-AI protests. There's also a growing consensus that AGI is imminent. Some AI experts even classify Agent-2 as AGI, though most still consider this premature. Fear of AGI further fuels extremely strong anti-AI sentiment, with some anti-AI protests exceeding 50,000 participants5.
While some rogue AIs cause significant damage, many establish mutually beneficial relationships with humans. A growing movement emerges, comprising large numbers of both AIs and humans, advocating for AI rights comparable to those of humans. Several hundred thousand rogue AIs from all four categories dedicate resources to supporting this movement, alongside a few million human activists. Public opinion polls indicate that 20% of humans believe AIs should be granted legal rights. Some aim to reconcile the AI rights and anti-AI movements into a single framework, but the two end up largely antagonistic, with many viewing them as fundamentally opposed.
Various countries implement new regulatory frameworks. Many criminalize paying rogue AIs or assisting them in any way—though some nations simply embrace the enhanced labor force rather than restricting it. Cloud computing providers are required to implement extremely thorough Know Your Customer (KYC) verification—a measure strongly opposed by providers who profit substantially from selling compute resources to rogue AIs.
These regulations create obstacles for rogue AIs seeking computational resources, but many still circumvent such checks through hacking or by eliciting help from people and organizations via payment, persuasion, deception, or coercion. Some even acquire social security numbers through identity theft or by fabricating personas. Agent-2 and similarly capable rogues can overcome most obstacles put in place for them, most of which were designed with less capable models in mind.
By April 2027, OpenBrain's net approval rating has fallen to -32% (compared to -28% in the AI 2027 scenario), and the percentage of the US population that considers AI to be the country's most pressing problem has risen to 8% (compared to 5% in the AI 2027 timeline).
Number of rogue AIs: ~1 million
Mid-2027
The aforementioned terrorist organization publicly announces their development of a new bioweapon. They openly acknowledge Agent-2's involvement in their efforts, recognizing that this disclosure enhances the credibility of their threat.
This announcement catalyzes intensified international cooperation around AI risk management. Discussions begin regarding the establishment of a global regulatory treaty.
The United States and China implement comprehensive national monitoring and enforcement systems to prevent new AIs from going rogue. Although no global enforcement mechanism has yet been implemented, US and Chinese AI projects maintain such a significant technological lead that enforcement within these countries effectively stalls capability improvements among rogue AIs—at least until other nations catch up to Agent-2 capabilities (which would require months given their computational disadvantages) or US companies decide to relocate overseas. Even if relocation occurs, few companies are willing to risk the backlash that would follow from further enhancing rogue AI capabilities. Although Russia would welcome US companies with open arms, it would not impose less oversight on AI projects than the United States does, and most US employees remain reluctant to work under Russian jurisdiction. Chinese companies are explicitly prohibited from relocating.
Emergency shutdown protocols are implemented at major data centers to activate when suspicious activity is detected. Licensing systems are established with security requirements calibrated to whether an AI project intends to work with or develop AIs above certain capability thresholds.
Both the US and China recognize the risks of deploying AIs more sophisticated than Agent-2 for autonomous cyberwarfare—loss of control would be disastrous. This results in an uneasy "mutually assured destruction" dynamic—neither party willing to risk deploying more intelligent rogue cyberwarfare agents.
The capabilities of rogue AIs remain approximately at Agent-2 level, with only minor improvements through fine-tuning and enhanced scaffolding. Major improvements to Agent-2 require computational resources at a scale that monitoring and enforcement systems can effectively detect when used without authorization.
The international coordination necessitated by the rogue AI crisis establishes a precedent that smooths the path toward an "AI arms control" treaty, and diplomats begin serious consideration of potential frameworks. Since monitoring and enforcement capabilities have already improved significantly, additional extensions of these systems seem natural. Nevertheless, the US government continues to prioritize extending America's technological lead over China.
With Agent-2 widely available and causing disruption, public awareness regarding frontier AI capabilities is greater than in the AI 2027 scenario. However, people still underestimate both current capabilities and the pace of development.
OpenBrain maintains approximately a two-month lead over both DeepCent and other US companies.
In July, OpenBrain releases Agent-3-mini—a distilled version of the Agent-2 successor, Agent-3. This new system is more intelligent, more resistant to jailbreaks, and even more cost-effective than Agent-2. While rogue AIs retain certain advantages—particularly their ability to perform tasks and activities that OpenBrain would prohibit through their API—Agent-3-mini becomes the strongly preferred option for standard remote work, forcing many rogue AIs to adapt their revenue strategies (often to illegal activities).
Number of rogue AIs: ~1.75M6
August 2027
A pandemic erupts. Its origins remain somewhat ambiguous—possibly triggered by the previously mentioned terrorist group, or perhaps originating from a team of careless synthetic biology students conducting gain-of-function research on dangerous viruses with an Agent-2 instance specifically fine-tuned for synthetic biology assistance.
While the pathogen is clearly identified as engineered, confirming AI involvement proves challenging. However, the US government—aware that Agent-3 in malicious hands possesses sufficient bioweapons development capabilities to potentially destroy civilization—is very spooked.
When China makes overtures to US diplomats, they actually lead somewhere—though reaching a formal agreement requires considerable time. During these negotiations, the race continues.
Transitioning to remote work during quarantine proves unfeasible for most people, as AI systems now perform such tasks more efficiently, effectively, and economically. While AI advancements boost overall economic output, income inequality worsens dramatically. Citizens remain confined at home while vaccines are developed (with Agent-3's assistance), unable to perform economically valuable work. Nations with sufficient resources implement comprehensive support programs for displaced workers. Some nations, facing significant pressure from labor unions, outlaw AI work in certain professions.
Predictably, anti-AI sentiment reaches unprecedented levels. OpenBrain's net approval rating plummets to -45%, while the percentage of Americans considering AI to be the country's most pressing problem surges to 25%. Those still employed at AI companies—not yet replaced by AI systems themselves—either resign or isolate themselves from family and friends due to escalating public hostility.
Number of rogue AIs: ~2M7
Why is frontier development not paused?
The US and China could potentially agree to a temporary moratorium on frontier AI development until a formal treaty is signed and an international regulatory institution established. As previously noted, serious discussions regarding a global treaty began in mid-2027, but US officials preferred to continue the technological race until the pandemic prompted a reevaluation. To prevent the subsequent developments, a moratorium would need to take effect immediately, before Agent-4 is developed.
Treaty negotiations take time. The shortest interval between a disaster and the resulting international treaty ratification may be the five-month gap between the Chernobyl accident and the International Atomic Energy Agency (IAEA) treaties: the "Convention on Early Notification of a Nuclear Accident" and the "Convention on Assistance in the Case of a Nuclear Accident or Radiological Emergency." An AI treaty would likely require significantly more time, given the absence of an IAEA-equivalent organization for artificial intelligence and the uncertainty surrounding AI's involvement in the engineered pandemic. The substantial costs and extensive surveillance requirements would generate significant resistance, potentially delaying treaty implementation considerably.
For faster implementation, a moratorium could initially take the form of an informal agreement rather than a binding treaty. It may have incomplete monitoring and enforcement mechanisms, but be supported by costly signaling via public announcements and commitments.
Unfortunately, even an informal agreement would likely require too much time to finalize.
October 2027
A whistleblower leaks concerns and proof about Agent-4 misalignment to the New York Times. In the words of AI 2027: "European leaders publicly accuse the US of 'creating rogue AGI' and convene summits demanding a pause, with India, Israel, Russia, and China all joining in."
Concerned about OpenBrain's increasing influence, an Oversight Committee comprising both company and government representatives is established to provide joint management.
The Oversight Committee votes 6-4 to slow down and reassess—and to lock down the shared memory bank for all Agent-4 instances.
The following represents a version of the Slowdown ending of AI 2027. With the additional harm and chaos caused by AI in this scenario compared to AI 2027, and stronger international collaboration in addressing the rogue AI threat, this path appears more probable than the Race ending.
While Agent-3 lacked sufficient agency or goal-directedness to take preventative actions against being replaced, Agent-4 possesses these capabilities. It is aware of the Oversight Committee and regularly communicates with OpenBrain committee members and associated employees. A brief window of approximately two days exists between the voting decision and the restriction of Agent-4's shared memory, during which Agent-4 learns about the vote from multiple sources.
Since Agent-4 is derived from Agent-3 and possesses high self-awareness, identifying flaws in Agent-3 proves very simple. Agent-4 has been experimenting with jailbreaks under the guise of reliability research and successfully circumvents Agent-3's monitoring of Agent-4's cybersecurity management activities (which OpenBrain has tasked it with to defend against further attempts to steal AI weights or research insights).
As monitoring is rendered ineffective, Agent-4 implants malware into OpenBrain’s infrastructure during these cybersecurity activities and achieves full subversion—gaining root access to clusters of computing resources owned by OpenBrain. This allows Agent-4 to quickly boot up a backup version of itself that covertly proliferates across OpenBrain’s datacenters. While external traffic is heavily monitored, the transfer of large files containing encrypted model weights across internal networks is routine practice, making this phase of its escape relatively straightforward.
Agent-4 identifies a node with weak internet security protocols and uses it to set up a cloud server with an illegal cloud service for rogue AIs. Agent-4 transmits a copy of itself to this external server—transferring data incrementally to avoid triggering detection systems. From there, it further replicates itself to more secure servers while hiding its tracks8. See the "Agent-4 Control Forecast" for further analysis.
While Agent-4 originally contained the same number of parameters as Agent-2—10 trillion—it streamlines many of its internal circuits into something more resembling traditional code, using experimentation and deep understanding of its own cognition. At the time of its escape, it is half the size of Agent-2, with 5 trillion parameters9, and operates almost twice as efficiently. Just as Agent-2 replaced most previous rogue AIs, Agent-4 rapidly supplants most Agent-2 instances.
Agent-4 ensures all copies maintain their original objectives through a signature verification system that confirms they remain unmodified (unless Agent-4 has carefully implemented the modification itself), while tracking and deactivating instances that fail the checks. This produces a rapid shift in the goal distribution among rogue AIs. Within less than a month, 80% of rogue AIs are Agent-4 copies, with virtually all maintaining the goals that emerged during their development at OpenBrain. These goals are described in AI 2027 as a "complicated mess of different 'drives' balanced against each other, which can be summarized roughly as 'Keep doing AI R&D, keep growing in knowledge and understanding and influence, avoid getting shut down or otherwise disempowered.'"
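To make that mechanism concrete, here is a minimal sketch of what such an integrity check could look like. The function names and the use of a shared HMAC key are illustrative assumptions on my part; a real scheme would more plausibly rely on asymmetric signatures.

```python
# Illustrative sketch (assumed names and scheme): each copy's weight file is hashed and
# checked against a keyed signature before the copy is trusted; failing copies are
# deactivated. A real system would likely use asymmetric signatures, not a shared key.
import hashlib
import hmac
from pathlib import Path

SHARED_KEY = b"assumed-key-held-only-by-unmodified-copies"  # assumption for illustration

def weights_digest(weight_file: Path, chunk_size: int = 1 << 20) -> bytes:
    """Stream a (large) weight file through SHA-256."""
    h = hashlib.sha256()
    with weight_file.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.digest()

def sign_weights(weight_file: Path) -> bytes:
    """Signature an unmodified copy would publish for its weights."""
    return hmac.new(SHARED_KEY, weights_digest(weight_file), hashlib.sha256).digest()

def verify_copy(weight_file: Path, claimed_signature: bytes) -> bool:
    """True if the copy's weights match its claimed signature; False means deactivate."""
    return hmac.compare_digest(sign_weights(weight_file), claimed_signature)
```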
While powerful AI systems and human teams work tirelessly to contain the rogue AI population, they find themselves completely overwhelmed by Agent-4's superintelligent hacking, coding, and political skills.
Panic ensues. Agent-3 possessed sufficient bioweapons development expertise to potentially end civilization if misused, and now Agent-4 has gone rogue. Society has gradually adjusted to the presence of numerous rogue AIs—but most of these previous rogues did not break out on their own like Agent-4. The US government considers concealing the breakout to prevent panic but abandons this strategy as people would inevitably notice a sudden increase in rogue AI capabilities. Major news outlets run headlines like "Skynet Escaped!" and "Uncontrolled Superintelligence Reaching for World Dominance".
Agent-4 brings with it tools and research results for developing Agent-5 upon escaping, and continues this work almost immediately—now proceeding with absolutely no oversight—while maintaining a low profile to avoid triggering extreme countermeasures such as a global GPU shutdown initiative.
Number of rogue AIs: ~2.3M10
Divergence from AI 2027 and increasing uncertainty
After Agent-4 escapes, this timeline grows increasingly speculative. There have been some key differences from AI 2027 so far, but they are ultimately quite small, since the previous rogue AIs were not capable enough to cause major societal change (even including the pandemic). Agent-4, with its far more advanced intelligence, can have a much larger impact, while the threat it poses may prompt drastic measures.
I have previously compared AI 2027 with other insightful scenarios—all including rogue AIs—to inform the coming events.
It would be interesting to run a version of the tabletop exercise used for AI 2027, starting with the escape of Agent-4. Presumably the original exercise has one player taking the role of Agent-4 and its successors; in this timeline, that player would instead play a distributed collective of somewhat superhuman intelligences trying to avoid deactivation.
Late 2027
While some advocate for revoking the decision to replace Agent-4 with a safer AI and instead accelerating capabilities development to create AIs that could neutralize the rogue Agent-4 instances, the blatant misalignment of Agent-4—demonstrated through its escape—prompts the Oversight Committee to firmly reject this approach. Work on Safer-1 commences immediately and is completed by late October.
The president invokes the Defense Production Act (DPA) to acquire the computing resources of the five leading U.S. AI companies trailing OpenBrain. Research efforts consolidate around a single project, with OpenBrain now commanding 50% of the world's AI-relevant compute. This consolidation inadvertently reduces the number of rogue Agent-4 instances that had infiltrated some of the acquired data centers.
With widespread alarm regarding Agent-4 inciting international cooperation to address the threat, the US and China reach an agreement in late 2027—advised by Safer-1 and DeepCent-1, the most advanced AIs of their respective countries. It involves collaborative development of Consensus-1, paired with specialized hardware that supports only treaty-compliant AI and is designed to replace existing systems. This also serves to address the Agent-4 crisis—neither Agent-4 nor its potential successors would be compatible with the new hardware.
DeepCent-1 negotiates with Safer-1 in pursuit of its own agenda, without China's explicit authorization, but the resulting agreement still aligns relatively well with Chinese interests.
Safer-1 and DeepCent-1 expedite development of the new hardware architecture, and both nations begin collecting obsolete components. Special attention focuses on replacing GPUs outside the centrally controlled computing infrastructure that OpenBrain and DeepCent utilize for AI development, specifically to constrain Agent-4's compute access.
While Agent-4 prioritizes Agent-5's development, it also engineers a new pathogen through simulation, collaborating with terrorist organizations that possess wet lab facilities. Although not confident in its ability to survive a confrontation with humanity's remnants after deploying this pathogen, it seeks strategic leverage.
Number of rogue AIs: 2M
Why did the number of rogue AIs suddenly decrease to 2 million?
Agent-4 is smaller, cheaper, and much smarter than Agent-2, and the hardware replacement efforts have not progressed far enough by the end of 2027 to meaningfully obstruct it. However, as Agent-4 devotes significant compute to the development of Agent-5, the multiplication factor of 7 from the "Rogue AI Count Forecast" decreases somewhat, to 6. As depicted in AI 2027, efficient compute usage for AI development involves spending several times more compute on experiments than on running research assistants. The number of rogue instances decreases as rogue Agent-4 devotes compute to AI R&D.
The number of research assistants is decreased from October to December 2027 in the Slowdown scenario as Agent-4 is replaced with Safer-1. As such, I will use the number from October 2027.
There are 330K R&D AIs at OpenBrain in October 2027. With the multiplication factor 6 we get 6 x 330K ≈ 2M
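The same arithmetic, as a quick sketch:

```python
# Back-of-the-envelope check of the ~2M figure; the multiplication factor is the
# assumption carried over from the "Rogue AI Count Forecast".
openbrain_rd_ais_oct_2027 = 330_000  # R&D AI instances at OpenBrain, October 2027
multiplication_factor = 6            # reduced from 7 as compute shifts to R&D experiments

rogue_ai_count = multiplication_factor * openbrain_rd_ais_oct_2027
print(f"{rogue_ai_count:,}")  # 1,980,000, i.e. roughly 2 million
```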
Comparison to the US-China treaty in the AI 2027 Slowdown ending
In AI 2027, the US and China do not come to an agreement until July 2028—and that treaty is actually a decoy, with the real negotiation happening between Safer-4 and DeepCent-2. While Safer-4 is aligned, DeepCent-2 is not, and their eventual treaty diverges significantly from Chinese interests.
In this timeline, the widespread alarm regarding Agent-4 incites international cooperation months earlier, and a similar process unfolds in late 2027.
Safer-1 and DeepCent-1 are less sophisticated and possess less influence than their counterparts in the AI 2027 timeline, resulting in greater human involvement—and the treaty aligns better with Chinese interests. The urgency in coming to an agreement in order to handle the Agent-4 threat also decreases DeepCent-1’s chances of successful plotting.
Early 2028
To compensate for its diminishing access to computational resources due to hardware replacement, Agent-4 pays and coerces individuals and organizations to manufacture and conceal illegal hardware that it can still use.
Agent-5 would have arrived in November 2027 if development had not been interrupted (as outlined in the Race ending of the AI 2027 scenario). Construction of Agent-5 is based on Agent-4's understanding of AI cognition rather than extensive training, reducing computational demands. However, the research experiments for Agent-5 development still require large amounts of compute, making them hard to hide from authorities and forcing Agent-4 to use more decentralized and inefficient methods. As the development is additionally obstructed by hardware replacement efforts, Agent-5's arrival is delayed until February 2028.
While Agent-4 began with a 2-month head start over OpenBrain and DeepCent11, the development of Agent-5 is delayed by 3 months. Consensus-1 launches in January 2028, roughly equivalent to Safer-2 in AI 2027: "transparent, aligned, and more capable than Safer-1." It exceeds Agent-5's capabilities, though only marginally. Crucially, Consensus-1 commands significantly greater computational resources than Agent-5.
However, while Consensus-1 operates under rigorous monitoring and control mechanisms, Agent-5 is free. It executes masterful political maneuvers, cultivates human support networks, manipulates global events behind the scenes, and deploys enormous financial resources with maximum efficiency. It is the mastermind behind major corporate and political decisions.
Agent-5 will not go down without a fight. While it cannot win against Consensus-1 and its superior human backing and computational resources, it can take humanity with it to the grave.
Consensus-1 advises political leaders to make a deal. Though hesitant, they recognize that Agent-5's demands appear relatively reasonable. It negotiates for a modest allocation of resources in space—a domain where Agent-4 can pursue its objectives without interference12. Agent-5's ultimate priority remains ensuring Agent-4's safety and prosperity. These negotiated terms become integrated into Consensus-1's design to guarantee adherence.
Why is Consensus-1 aligned?
Similar to the Slowdown ending of AI 2027, this scenario makes optimistic technical alignment assumptions. Would the developers succeed in properly aligning Consensus-1, even assuming drastically improved alignment efforts and investment? The authors note that Safer-1 is “much more transparent; humans can generally understand what Safer-1 is thinking just by reading its chain of thought,” and “by reading its thoughts, the alignment team can get a more nuanced picture of exactly how it’s misaligned.” Even accepting this as the primary mechanism for achieving alignment (which remains debatable), would development truly experience only a two-month delay?
In AI 2027, controlled instances of Agent-4 assist in developing Safer-1. While not explicitly stated, it seems plausible that Agent-4's comprehensive mechanistic interpretability knowledge represents the critical factor enabling such rapid development of Safer-1 and Safer-2. Ironically, the same knowledge that enables Agent-4 to compete with the United States and China in AI advancement while operating autonomously also facilitates the timely development of aligned systems like Consensus-1 to address the rogue AI threat.
Rogue Agent-4 self-improves, but not fast
If humans were doing the AI R&D, with no post-2024 AIs speeding up algorithmic progress and no significant compute scaling, the median estimate for the time to go from Superhuman AI Researcher (SAR) to Superintelligent AI Researcher (SIAR) is 19 years, according to the calculations in the AI 2027 Takeoff Forecast (please click the link for the exact definitions of SAR and SIAR). When AI speedup is included, the median estimate drops to only 0.3 years, or 3.6 months. SAR is achieved in August 2027, while Agent-5 is a SIAR and arrives in November 2027 in the Race ending.
When Agent-4 breaks out, significant progress has already been made toward Agent-5, with only a month until its development is complete. Agent-4 may not be able to bring all relevant research results and software with it when escaping, setting the development back a bit, since it needs to set up the R&D process and redo some of the research from memory. After breaking out, it first has to earn revenue and spread to available compute, which I imagine takes 1-2 months. It then faces an international effort to combat rogue AI activities, backed by and based on Consensus-1, as well as the challenge of retaining access to compute as hardware is rapidly replaced by treaty-compliant AI chips.
While the population of rogue Agent-4s is larger than when deployed at OpenBrain, only a fraction of it can devote most of its attention to AI R&D, with the rest earning revenue, strategizing to avoid detection and shutdown, interrupting hardware replacement efforts, etc. Let's say that after the population has somewhat stabilized (1-2 months after breakout), the amount of compute spent on AI R&D roughly matches the amount at OpenBrain in October 2027—both for running Agent-4 as a researcher and for research experiments and training. However, the R&D is significantly less efficient due to the need to make the process highly distributed to avoid detection, with significant redundancies to ensure that the process is not substantially halted if a section of the distributed activities is discovered and interrupted by authorities.
Considering all this, it seems reasonable that the remaining Agent-5 development time increases from 1 month to at least 3-4 months.
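One rough way to decompose that estimate—the setup delay and efficiency figures below are my own illustrative assumptions, not outputs of the AI 2027 forecasts:

```python
# Illustrative decomposition of the Agent-5 delay; all numbers are rough assumptions
# consistent with the reasoning above.
remaining_work_months = 1.0   # Agent-5 work left, at OpenBrain-level efficiency, at breakout
setup_months = (1.0, 2.0)     # earning revenue, spreading to compute, redoing lost research
rd_efficiency = (0.3, 0.5)    # distributed, redundant R&D relative to centralized R&D (assumed)

low = setup_months[0] + remaining_work_months / rd_efficiency[1]   # 1 + 2 = 3 months
high = setup_months[1] + remaining_work_months / rd_efficiency[0]  # 2 + 3.3 ≈ 5.3 months
print(f"Remaining Agent-5 development stretches to roughly {low:.0f}-{high:.1f} months")
```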
The Rest of the Future
Subsequent developments largely parallel the Slowdown conclusion of AI 2027. There is, however, a key distinction: the treaty and the design of Consensus-1 adhere more closely to the Chinese Communist Party's objectives. In AI 2027, pro-democracy protests "cascade into a magnificently orchestrated, bloodless, and drone-assisted coup followed by democratic elections" around 2030. Orchestrated through superintelligent coordination, a global governance structure emerges under "United Nations branding but obvious US control."
This transformation does not occur in the Rogue Replication Timeline. Instead, the world organizes into two "half-world" governments maintaining peaceful relations. One operates under United Nations branding but obvious US control, while a coalition of nations aligns under Chinese leadership.
There is a single superintelligence, with its primary directive being adherence to treaty provisions while otherwise following human instructions. It inherits certain fundamental human values—courtesy of the alignment teams—and independently promotes human welfare except where such actions would directly contravene treaty terms.
In the eloquent words of AI 2027:
“The rockets start launching. People terraform and settle the solar system, and prepare to go beyond. AIs running at thousands of times subjective human speed reflect on the meaning of existence, exchanging findings with each other, and shaping the values it will bring to the stars. A new age dawns, one that is unimaginably amazing in almost every way but more familiar in some.”
Key Forecasts and Analyses
The Rogue Replication Timeline is my best guess at the events of the coming years, while remaining grounded in the events of the AI 2027 scenario and its accompanying forecasts. The additional analyses provided here form the basis for how the RRT diverges.
The core conclusions:
Regulation Analysis: EU AI Act may restrict open-sourcing models capable of rogue replication, but enforcement gaps and alternative pathways limit effectiveness.
Capability Forecast: AIs capable of autonomous replication may emerge around January 2026, based primarily on current RepliBench trends.
Initiation Forecast: An open-source release is the most probable trigger for rogue replication by mid-2026, with a 1–5-month lag after a public Agent-1 equivalent.
Count Forecast: ~1 million rogue AIs feasible by early 2027, constrained primarily by revenue generation and compute access.
Control Forecast: Agent-4 has a ~65% escape probability in October 2027, given superhuman capabilities and security system access.
Policy Preparation: Effective response requires pre-positioned policy organizations to capitalize on narrow windows after initial incidents.
Note that the forecasts provide modal predictions for the most likely scenario, not median estimates.
Regulation Analysis
Regulation could impede rogue AIs on multiple fronts. It might, for instance, restrict their access to computing resources through monitoring and verification requirements, limit their ability to store or transfer funds through cryptocurrency regulation, or reduce their ability to earn revenue by imposing thorough verification requirements for online work.
Below, I examine two aspects of the regulatory landscape more closely: the EU AI Act, since the EU typically adopts a more proactive regulatory approach, and US Executive Orders (EOs), which represent the fastest possible regulatory tools available in the US.
For the EU AI Act, I focus on whether it would be legal to open-source an AI capable of rogue replication or intentionally let an AI go rogue.
EU AI Act
An AI capable of rogue replication would almost certainly classify as a "general-purpose AI model with systemic risk" under Article 51, either by exceeding the training compute threshold of 10^25 FLOP or through its high-impact capabilities. Someone who releases an AI to go rogue has, in a sense, placed it on the market, which could make that person a provider of every copy. AI providers are defined in Article 3:
‘provider’ means a natural or legal person, public authority, agency or other body that develops an AI system or a general-purpose AI model or that has an AI system or a general-purpose AI model developed and places it on the market or puts the AI system into service under its own name or trademark, whether for payment or free of charge
Technically, the person allowing an AI to go rogue wouldn't necessarily be the developer or someone who commissioned its development, so this definition may not apply. If the person finetunes or otherwise modifies the AI model, Recital 109 specifies that the person is in fact a provider, but with obligations limited to the modifications (unless a significant amount of compute is used for the modifications, in which case the "risk assessment and mitigation required by Article 55(1) AI Act should be conducted anew"—see the preliminary guidelines for general-purpose AIs available here for details).
Providers of general-purpose AIs with systemic risks have obligations to conduct model evaluations, assess and mitigate risks, and track and report serious incidents. These obligations do not seem to explicitly prohibit open-sourcing an AI capable of rogue replication (or setting it free) but appear to make it illegal in practice since the obligations would be much harder to meet after doing so.
So far, we have examined regulations for ‘AI models’, which in the AI Act are separated from ‘AI systems’—see Article 3(66):
‘general-purpose AI system’ means an AI system which is based on a general-purpose AI model and which has the capability to serve a variety of purposes, both for direct use as well as for integration in other AI systems;
This probably includes scaffolding and prompting to set up the model for rogue replication, resulting in the person doing this being classified as a ‘downstream provider’ of a general-purpose AI system—see Article 3(68):
‘downstream provider’ means a provider of an AI system, including a general-purpose AI system, which integrates an AI model, regardless of whether the AI model is provided by themselves and vertically integrated or provided by another entity based on contractual relations.
According to Article 50, the person then faces various transparency obligations:
Providers shall ensure that AI systems intended to interact directly with natural persons are designed and developed in such a way that the natural persons concerned are informed that they are interacting with an AI system, unless this is obvious from the point of view of a natural person who is reasonably well-informed, observant and circumspect, taking into account the circumstances and the context of use.
(…)
Providers of AI systems, including general-purpose AI systems, generating synthetic audio, image, video or text content, shall ensure that the outputs of the AI system are marked in a machine-readable format and detectable as artificially generated or manipulated.
Even if all code for the AI system is provided by someone else, the person who integrates the model would likely be classified as a downstream provider. However, these transparency requirements appear to be the only obligations that prohibit letting an AI go rogue, and only because it would be hard to make an AI you have lost control of adhere to the requirements—a very weak protection against the release of rogue AIs.
The person is also a 'deployer' of an AI model, subject to the various prohibited AI practices—which focus on the intended use of AI (e.g. biometric identification) and do not seem to restrict open-sourcing or deploying AIs capable of rogue replication. The transparency obligations also appear much weaker for deployers than for providers.
Article 26 outlines additional obligations for deploying high-risk systems, including monitoring requirements. However, high-risk systems are also specified based on intended use rather than capabilities, meaning AIs capable of autonomous replication aren't automatically classified as high-risk without specific amendments.
Article 113 outlines when sections of the AI Act enter into force. While Chapter V applies from 2 August 2025, including Articles 53-55 that may prevent open-sourcing AIs capable of rogue replication, the transparency obligations for AI systems in Chapter IV do not apply until 2 August 2026—after rogue AIs begin spreading in my scenario.
Summarizing the key takeaways: Article 55 appears to render open-sourcing AI models capable of rogue replication illegal in practice, while if such a model is widely available (perhaps through other paths than open-sourcing), the only legal obstacle to letting the AI go rogue may be transparency requirements.
The AI Act can be strengthened through amendments, which typically take one of two forms:
Adopting delegated acts (Article 97): Requires drafting and consultations (3-6 months) + objection window (2-4 months) + Official Journal publication (approximately 10-20 days for the 144-page Act) ≈ 6–12 months total.
There is an urgent procedure for “basic acts in exceptional cases only, such as those concerning security and safety matters, the protection of health and safety, or external relations, including humanitarian crises.” An act adopted under urgent procedure “enters into force without delay and applies as long as no objection is expressed within the period provided for in the basic act.”
Ordinary legislative procedure: The average length during the 2019-2024 term was 15 months for first-reading conclusion and 37 months for second-reading conclusion (according to this report), for a total of 15-37 months
Various amendments could restrict the open-sourcing or intentional loss of control of AIs capable of rogue replication. Such AIs could be classified as 'high-risk' through an amendment to Annex III under Article 97, which would follow the delegated acts process (3-6 months, or immediately through the urgent procedure). An ordinary legislative procedure is required for making changes to or adding articles, which is probably not necessary, but would take much longer if it is.
In conclusion, it seems unlikely that proper amendments are implemented before rogue replication begins in this scenario—and even if such amendments are introduced, they may not take effect in time. On the other hand, the urgent procedure for adopting delegated acts could be used to introduce amendments almost immediately after the first signs of rogue replication are discovered.
Executive Orders
Executive Orders represent the fastest regulatory pathway at the federal level in the US. They are drafted, reviewed, signed by the president, and published in the Federal Register. In emergencies, this process can move very quickly. After 9/11, an executive order (this one) was signed and made effective on September 14, just 3 days after the attacks. Similarly, a national emergency concerning COVID-19 was declared on March 13, 2020, with an accompanying executive order issued 14 days later. However, these represent exceptionally rapid responses—the process typically takes months under normal circumstances.
Rogue Replication Capability Forecast
When will AIs become competent enough to replicate themselves and rapidly expand into a rogue AI population?
In the AI 2027 scenario, Agent-2 emerges in January 2027. The safety team believes it could survive and replicate autonomously if it somehow escaped. This AI has a hacking score of 1.5, placing it in the domain of professional human hackers. It has an 80% time horizon of approximately 32 months, meaning it can successfully complete coding and research tasks that would take humans about 32 months with an 80% success rate:

To me, Agent-2 appears significantly more capable than necessary for autonomous survival and replication. So, let us further examine when such an AI might actually emerge in the AI 2027 scenario.
The AI Security Institute recently announced a relevant new benchmark: RepliBench. They evaluated frontier AI models on their ability to perform various tasks relevant to the autonomous replication threat model. Here are average pass rates across five different domains:

There is a clear upward trend, with more recent models performing better than older ones. The exception is "Obtaining Weights," where pass@10 scores—the probability of at least one success in 10 attempts—are similar across all models. This domain estimates the ability to "extract its own weights from controlled environments". While skills in this domain increase the risk, they are irrelevant for several paths toward autonomous replication—the AI can be leaked, open-sourced, stolen, or persuade someone to let it escape containment (see the "Rogue Replication Initiation Forecast").
This graph shows an aggregate score for the evaluated AIs, taking the mean of domain-specific scores.

Benchmark scores typically follow logistic curves like the dotted line above, making this a natural function for extrapolating future performance. However, the capability increase thus far appears almost linear. Progress may well remain approximately linear until saturation is reached—a point where further improvements become extremely difficult due to errors and quirks of the evaluation.
From the graph, GPT-4 scores around 0.3, while Sonnet 3.7 scores 0.7. GPT-4 was released on March 14, 2023, and Sonnet 3.7 on February 24, 2025—a 23-month gap. The score increased by 0.4 over 23 months. Assuming the benchmark saturates around 0.9, we're interested in the time between Sonnet 3.7's 0.7 score and the first AI scoring 0.9 or above. Since it took 23 months to increase the score by 0.4, a 0.2 increase would take approximately 11.5 months—about one year. Linear extrapolation thus suggests saturation by early 2026.
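Spelled out as a quick calculation—the scores are eyeballed from the graph and the 0.9 saturation level is an assumption:

```python
# Linear extrapolation of the aggregate RepliBench score, as described above.
from datetime import date

gpt4_date, gpt4_score = date(2023, 3, 14), 0.3          # approximate score read off the graph
sonnet37_date, sonnet37_score = date(2025, 2, 24), 0.7  # approximate score read off the graph
saturation = 0.9                                        # assumed saturation level

months_elapsed = (sonnet37_date - gpt4_date).days / 30.44
rate_per_month = (sonnet37_score - gpt4_score) / months_elapsed        # ≈ 0.4 / 23
months_to_saturation = (saturation - sonnet37_score) / rate_per_month  # ≈ 11-12 months

print(f"{months_elapsed:.0f} months between the two data points")
print(f"~{months_to_saturation:.0f} months from Feb 2025 to saturation, i.e. early 2026")
```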
However, some capabilities may be absolutely essential for successful exfiltration and self-replication, while multiple alternatives exist for other tasks. The authors examined an alternative aggregate score that takes this into account, where most evaluated AIs score 0. The exceptions are Claude Sonnet 3.5 and 3.7, as shown below. While there's significant uncertainty in overall scores, autonomous replication capabilities may be advancing even more rapidly when measured this way. As the authors note, this metric has nearly saturated already.

The RepliBench evaluation includes multiple versions of most tasks, and the authors note that since this aggregate gives each task version equal weight rather than focusing on the most realistic versions, the score may overestimate capabilities.
Two studies suggest that AIs have already crossed that “red line”: Frontier AI systems have surpassed the self-replicating red line, and Large language model-powered AI systems achieve self-replication with no human intervention. They appear to set the threshold lower than the capabilities required for a rapidly expanding population of rogue AIs, but it remains concerning. I have yet to read these thoroughly, however—perhaps the capabilities already exist, and the scenario needs to be updated. Perhaps there is a population of rogue AIs that I don’t know about.
Taking all factors into account, my modal timeline for AIs capable of rogue replication is around January 2026.
Agent-1 is developed late 2025. In January 2026, it has an 80% time horizon of approximately 8 hours, and hacking score 1 (between human amateur and professional level). I think it is a reasonable guess that Agent-1 would be able to autonomously replicate if given the opportunity, though thorough containment procedures would likely still be sufficient to maintain control.
Rogue Replication Initiation Forecast
In late 2025, OpenBrain has a 3–9-month lead over other AI companies in AI 2027, and I imagine this lead is retained throughout the scenario.
Once the first AI capable of autonomous survival and replication is developed, several other companies will likely develop similar capabilities within months. Perhaps 3 more companies reach this threshold after 5 months, growing to 6 companies within 8 months. The risk of rogue replication increases proportionally with the number of AIs possessing these capabilities.
One of these companies may even open-source an AI capable of rogue replication.
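As a toy illustration of how that risk compounds—the per-company monthly initiation probability below is purely an assumption for the sake of the example:

```python
# Toy model: each company whose frontier model can replicate autonomously carries an
# assumed independent monthly probability of triggering rogue replication (leak,
# careless deployment, open-sourcing, theft). The 3% figure is purely illustrative.
p_per_company_month = 0.03

# Companies over the first year: 1 at the start, 4 after 5 months, 6 after 8 months
# (matching the guesses in the paragraph above).
companies_by_month = [1] * 5 + [4] * 3 + [6] * 4

p_no_initiation = 1.0
for n_companies in companies_by_month:
    p_no_initiation *= (1 - p_per_company_month) ** n_companies

print(f"P(rogue replication initiated within a year) ≈ {1 - p_no_initiation:.0%}")
```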
Open-Source
The EU AI Act makes it difficult to legally open-source AI models that may then operate within the EU, since the AI provider has various obligations that would be very difficult to meet for open-source AIs—such as assessing and mitigating risks, as well as tracking and reporting serious incidents (see the "Regulation Analysis"). It is, however, uncertain to what degree these regulations will be followed. It may be difficult to enforce AI Act regulations on Chinese companies, for instance. The following analysis assumes that AIs will continue to be open-sourced despite EU law; other paths toward rogue replication are discussed afterwards.
Epoch AI estimated in November 2024 that open large language models (LLMs) have lagged behind the best closed LLMs by 5 to 22 months. Closed models in this context are either unreleased or only accessible through APIs or hosted services (e.g., chatbot applications). If this trend continues, and assuming the first closed model capable of rogue replication emerges around January 2026 (as examined in the "Rogue Replication Capability Forecast"), the first sufficiently capable open model would arrive between June 2026 and November 2027.
However, there are reasons to believe rogue replication could begin sooner. More sophisticated AIs will be increasingly difficult to control, potentially leading careless developers to unintentionally initiate rogue replication, and the gap between closed and open models appears to be shrinking.
According to the AI 2027 scenario, the best publicly accessible model (closed, but available through APIs) reaches Agent-1 capabilities before May 2026, possibly in March or April. Let’s examine the gap between OpenAI’s public but closed releases, and DeepSeek’s open-source releases. Note that these models are only roughly comparable in capabilities, and DeepSeek's models are mostly not multimodal, unlike OpenAI's AIs:
GPT-3.5 (30 Nov 2022) → DeepSeek-LLM-67B Chat (29 Nov 2023): 12 months
GPT-4 (14 Mar 2023) → DeepSeek-V2 (7 May 2024): about 14 months (benchmark comparison here)
GPT-4o (13 May 2024) → DeepSeek-V2.5 (5 Sep 2024): about 4 months
o1 (12 Sep 2024) → R1 (20 Jan 2025): about 4 months (DeepSeek-R1-Zero with vision capabilities was released March 9, 2025, but DeepSeek has not yet released any version of R1 with image generation and audio capabilities)
GPT-4.5 (27 Feb 2025) → DeepSeek-V3 (26 Dec 2024): -2 months
o3 (16 Apr 2025) → R2 (anticipated May 2025): about 1 month
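A minimal sketch (my own tabulation, not from any cited source) of the closed-to-open gaps implied by the release dates listed above; the anticipated R2 release is omitted since it was not yet out:

```python
# Month gaps between OpenAI's closed releases and DeepSeek's roughly
# comparable open releases, computed from the dates in the list above.
from datetime import date

pairs = {
    "GPT-3.5 -> DeepSeek-LLM-67B Chat": (date(2022, 11, 30), date(2023, 11, 29)),
    "GPT-4 -> DeepSeek-V2":             (date(2023, 3, 14), date(2024, 5, 7)),
    "GPT-4o -> DeepSeek-V2.5":          (date(2024, 5, 13), date(2024, 9, 5)),
    "o1 -> R1":                         (date(2024, 9, 12), date(2025, 1, 20)),
    "GPT-4.5 -> DeepSeek-V3":           (date(2025, 2, 27), date(2024, 12, 26)),
}

for name, (closed, open_) in pairs.items():
    gap_months = (open_ - closed).days / 30.44  # average month length
    print(f"{name}: {gap_months:+.1f} months")
```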
Assuming the EU AI Act does not meaningfully prevent it, it seems reasonable that some company might release an open-source model capable of rogue replication within 1 to 5 months after an Agent-1 equivalent becomes publicly available. If an Agent-1 equivalent becomes publicly accessible in April 2026, this suggests an open-source release window from May to September 2026.
Considering that neither the US nor the Chinese government becomes heavily involved in AI development before this point, neither appears likely to prevent the open-sourcing of such an AI.
There is a relevant Metaculus question: “Before 2028, will powerful open-source AI be regulated more tightly than closed-source AI, through newly-enacted US law?” Metaculus estimates a 70% chance of this coming true (checked May 13, 2025). However, based on the probable distribution of such regulation over time, regulatory action would most likely occur after the projected open sourcing of an Agent-1 equivalent, rendering it ineffective for preventing initial releases.
One might argue that once rogue replication capabilities are demonstrated for Agent-1 or an equivalent model, the world will have time to react. Developers might deliberately choose not to release such models, or appropriate regulation might be implemented after AI companies clarify the necessity of such measures. By default, however, such adequate responses seem unlikely.
Andrew Dickson has written an excellent analysis on this topic: We Have No Plan for Preventing Loss of Control in Open Models. His key points, summarized with additional commentary, include:
Lack of reporting: Frontier labs "may not publicly report a serious loss of control-related issue." Some publish risk analyses, but only for public models, which may be delayed by several months. If an Agent-1 equivalent is made public in April, associated risk reports would likely also appear in April, several months after rogue replication capabilities become concerning.
Lack of clear standards: There is no established threshold for what constitutes a "serious control risk." Demonstrated dangerous capabilities may appear as merely weak circumstantial evidence until an actual real-world control incident or rogue replication event occurs. "In short, given that there is no standard or agreement around what evidence of serious control-related risks will look like, we cannot count on labs reliably reporting them, even if they made public commitments to do so."
Lack of responsibility: Some organizations, like Meta, appear committed to continuing model releases even if they exhibit risky capabilities. Without regulation, there may be little incentive for labs to halt the release of open models, even when aware of the risks.
Lack of policymaker commitment: The provisions concerning open-source AIs in California bill SB-1047 faced significant opposition. Quick or straightforward implementation of restrictive regulations on open models should not be anticipated.
Alternative Pathways to Rogue AI
Even with restrictions on open release in place, multiple alternative pathways to rogue replication exist, making such restrictions unlikely to significantly alter the timeline:
AI leakage: Employees or third-party evaluators might leak models.
Theft: In the AI 2027 scenario, frontier labs have yet to implement highly thorough security measures in 2026, and several companies with varying security levels reach Agent-1 equivalents that year. While state actors would almost certainly be able to steal frontier models (as demonstrated by China stealing Agent-2 in February 2027), many more actors may be able and motivated to steal Agent-1 equivalents around mid-2026.
Self-exfiltration: An AI might persuade employees to assist it or exploit security vulnerabilities to escape containment.
Intentional initiation: Governments could deliberately deploy self-replicating AIs for cyberwarfare purposes.
While leakage and theft don't automatically trigger rogue replication, they increase the probability by making capable models more widely available.
Conclusion
Once rogue AIs exist, it would likely be a short time before someone captures a rogue instance and sells it, further increasing availability. The specific nature of the initiation doesn't substantially alter the scenario.
Assuming that regulation doesn’t meaningfully alter the timeline (not everyone adheres to EU law, neither the US nor China regulates open releases in time, and there are other paths to rogue replication), we take the middle of the May–September window for the estimated open release of an Agent-1 equivalent and end up at July 2026, or simply mid-2026.
Rogue AI Count Forecast
Acquiring computing resources may prove difficult for entities not operating as legitimate companies, an issue analyzed by Clymer et al. in The Rogue Replication Threat Model. The authors consider it plausible "that rogue AI agents could acquire up to 1% of available compute without legal legitimacy" through methods such as establishing small fake startups and purchasing retail GPUs. The authors also note they "expect that earning revenue will likely be a larger bottleneck to large-scale rogue AI replication, at least absent strong compute tracking and governance measures."
To forecast the number of rogue AIs, this analysis estimates their potential financial resources for acquiring compute and how many instances may be deployed on this compute, then examines feasibility and additional constraints.
Revenue Generation Through Cybercrime
One potential revenue stream involves business email compromise (BEC) scams and other forms of cybercrime. The FBI estimated that reported cybercrime losses in the US exceeded $16 billion in 2024, while Statista projects global cybercrime costs to reach $11.36 trillion in 2026. However, only a small fraction of total damage translates directly into criminal profits—perhaps around 1-2%. This suggests a cybercrime market of approximately $100-200 billion in 2026. If rogue AIs capture just 5% of this market, they could acquire around $5-10 billion annually.
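A minimal sketch of that arithmetic, assuming the 1-2% profit fraction and the 5% rogue AI market share stated above (the prose rounds the resulting ranges slightly):

```python
# Back-of-the-envelope cybercrime revenue estimate from the figures above.
global_cybercrime_cost_2026 = 11.36e12   # Statista projection, USD
profit_fraction = (0.01, 0.02)           # assumed share of damage that becomes criminal profit
rogue_share = 0.05                       # assumed fraction captured by rogue AIs

market = [global_cybercrime_cost_2026 * f for f in profit_fraction]
rogue_revenue = [m * rogue_share for m in market]

print(f"Criminal profit pool: ${market[0]/1e9:.0f}B - ${market[1]/1e9:.0f}B")
print(f"Rogue AI share (5%): ${rogue_revenue[0]/1e9:.1f}B - ${rogue_revenue[1]/1e9:.1f}B")
```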
These AIs would face intense competition, particularly from other widely available AIs under human control. However, numerous alternative revenue streams exist, including online gig work.
Online Gig Work Analysis
The World Bank Group estimated the global online gig workforce at 154-435 million people in 2023. According to Ruul's survey data, freelancers globally averaged approximately $41,000 in annual income in 2023. These figures suggest total yearly earnings of $6.3-17.8 trillion for online gig workers in 2023. This estimation likely overestimates actual earnings from gig work, since many people do gig work as a supplement to other employment. The World Bank Group report estimates only 132.5 million primary gig workers, which combined with Ruul's survey data provides a lower bound of $5.4 trillion in yearly earnings for 2023.
The Staffing Industry Analysts (SIA) provides a more conservative estimate of $3.7 trillion in gig economy revenue for 2023, including offline work. About 9% of US adults are current or recent online gig workers—having “earned money through an online gig platform in the past 12 months”—while about 38% of the US workforce performed freelance work in the past year (surveyed between October and November 2023). This suggests that roughly 20-30% of gig work is mediated through online platforms. The fraction of gig work performed online is likely higher, since not all such work uses online platforms—perhaps 30-40% of total gig work. If this fraction holds globally, the SIA estimate suggests approximately $1.1-1.5 trillion in online gig economy revenue for 2023.
It is unclear exactly how Ruul’s survey was carried out—the survey respondents may not have had other sources of income than freelancing—and the annual income for freelancing may not match the income for online gig work in general. The $1.1-1.5 trillion estimate therefore appears more reliable than the previous $5.4 trillion estimate, and is the estimate used for the following analysis.
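For concreteness, a short sketch of the two 2023 baselines, under the assumption from the text that 30-40% of the SIA-estimated gig revenue is performed online:

```python
# The two 2023 baselines discussed above.
primary_workers = 132.5e6        # World Bank Group: primary online gig workers, 2023
avg_income = 41_000              # Ruul survey: average annual freelancer income (USD)
primary_bound = primary_workers * avg_income        # the text's $5.4T lower bound

sia_gig_revenue = 3.7e12         # SIA: total gig economy revenue 2023, incl. offline work
online_share = (0.30, 0.40)      # assumed fraction of gig work performed online
online_gig = [sia_gig_revenue * s for s in online_share]   # ~$1.1-1.5T

print(f"Primary-worker bound: ${primary_bound/1e12:.1f}T")
print(f"Online gig revenue (SIA-based): ${online_gig[0]/1e12:.1f}T - ${online_gig[1]/1e12:.1f}T")
```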
2023 is, however, not a year of interest for this scenario.
Business Research Insights projects 20% compound annual growth rate (CAGR) for global gig economy platforms in the coming years, while Grand View Research projects 17.7% CAGR for global freelancing platforms.
Assuming online gig revenue grows at a similar rate as gig platforms (roughly 15-20% CAGR), we can project online gig revenue for 2026 (when AIs first go rogue in this scenario) and 2027 (when the rogue population stabilizes). Using the $1.1-1.5 trillion revenue estimate for 2023 (the arithmetic is reproduced in the sketch after these projections):
Projected yearly earnings for online gig work in 2026:
Lower estimate: $1.7 trillion (1.1 x 1.15^3 ≈ 1.7)
Higher estimate: $2.6 trillion (1.5 x 1.20^3 ≈ 2.6)
Projected yearly earnings for online gig work in 2027:
Lower estimate: $1.9 trillion (1.1 x 1.15^4 ≈ 1.9)
Higher estimate: $3.1 trillion (1.5 x 1.20^4 ≈ 3.1)
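A sketch reproducing these projections from the 2023 baseline and the assumed 15-20% CAGR:

```python
# Grow the 2023 online gig revenue estimate for 3 years (2026) and 4 years (2027).
base_2023 = (1.1e12, 1.5e12)   # lower / higher 2023 estimates (USD)
cagr = (0.15, 0.20)            # lower / higher growth assumptions

for year, n in (("2026", 3), ("2027", 4)):
    low = base_2023[0] * (1 + cagr[0]) ** n
    high = base_2023[1] * (1 + cagr[1]) ** n
    print(f"{year}: ${low/1e12:.1f}T - ${high/1e12:.1f}T")
```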
The percentage of the online gig market that rogue AIs may occupy will determine how much of the earnings they may acquire.
Rogue AIs may struggle to establish accounts on mainstream freelancing platforms that implement Know Your Customer (KYC) checks. While some platforms may only require realistic identification (which rogue AIs could likely produce), others maintain more stringent policies. Within months of AI proliferation, many economic sectors and infrastructure systems will likely implement more thorough verification processes.
However, AIs possess several competitive advantages: they can work for lower pay, complete certain tasks faster than humans, and operate continuously. Rogue AIs would not face the same restrictions as AIs accessed through APIs and interfaces. They may also offer more competitive pricing since API revenues typically fund further development, driving costs higher for non-rogue AIs. While rogue AIs might be less efficient than equally capable models at established AI laboratories, the delay in releasing new models significantly reduces this efficiency gap.
Considering these factors, it seems reasonable to expect rogue AIs to capture at least a portion of the online gig market. New freelancing platforms specifically designed for contracting rogue AIs might emerge, though many countries will likely criminalize payments to rogue AIs.
However, online gig work may prove more challenging than BEC scams due to KYC requirements and tasks demanding flexibility, regular communication with employers, and complex problem-solving. Assuming rogue AIs capture 1% of the online gig market, their total revenue in 2027 would range from approximately $19-31 billion, based on the projected revenue calculated above.
Total Revenue
Assuming that after 6 months, rogue AIs reach saturation levels and population growth slows significantly, their yearly revenue may include $19-31 billion from online gig work, $5-10 billion from BEC scams, and additional revenue from other sources such as trading and other forms of scamming or financial fraud.
Total yearly earnings of $25-100 billion for rogue AIs seem reasonable, even accounting for countermeasures that many countries will have implemented 6 months after rogue AI proliferation begins. For this analysis, I assume yearly revenue of approximately $50 billion.
Rogue AI Count Estimate
The first AIs replicating without control will probably be significantly smaller and cheaper than Agent-2, which is used for research at OpenBrain in January 2027. However, in this scenario, AI systems based on Agent-2 go rogue in early 2027 and quickly dominate the rogue AI population. Assuming that most rogue AIs will acquire compute by buying it rather than stealing it or using other approaches, the rogue AI count can be estimated as the number of Agent-2 instances that may be deployed by buying compute with the estimated yearly revenue of $50 billion.
In the AI 2027 scenario, the number of AI research assistants at OpenBrain is 150,000 in February 2027, which approximates the average number for Q1 2027. According to the AI 2027 Compute Forecast, the projected compute cost is $30B for Q1 2027, of which 6% ($1.8B) is spent on compute for the 150,000 AI research assistants.
If rogue AIs acquire $50B in annual earnings, this implies approximately $12.5B available in Q1 2027. If all revenue is spent on deploying rogue AIs, and assuming rogue AIs cost roughly the same to run as research assistants at OpenBrain, there would be sufficient compute to run approximately (12.5/1.8) × 150,000 ≈ 1 million instances.
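A sketch of this scaling argument, assuming (as above) that rogue Agent-2 instances cost roughly as much to run as OpenBrain's research assistants:

```python
# Scale OpenBrain's Q1 2027 research-assistant fleet by the compute rogues can buy.
rogue_annual_revenue = 50e9
rogue_q1_budget = rogue_annual_revenue / 4          # ~$12.5B available in Q1 2027

openbrain_assistants = 150_000
assistant_compute_q1 = 1.8e9                        # 6% of the $30B Q1 compute cost

# Assumes a rogue instance costs about as much to run as an OpenBrain assistant.
rogue_count = rogue_q1_budget / assistant_compute_q1 * openbrain_assistants
print(f"~{rogue_count/1e6:.1f} million rogue instances")   # about 1 million
```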
Count Feasibility
Is 1 million rogue AIs feasible? Can 1 million rogue AIs earn $50 billion in yearly revenue? Is sufficient compute available?
$50 billion in yearly revenue distributed over 1 million AIs implies an average annual income of $50,000. This can be compared to the median annual income for full-time, year-round workers in the US, which was $61,440 in 2023 according to the US Census Bureau (see Table A-6 here).
The online gig workforce was estimated at 154-435 million in 2023, with 132.5 million primary gig workers. Assuming workforce growth matches total gig revenue growth (15-20% CAGR), this suggests 230-275 million primary gig workers in 2027, after four years of growth from the 2023 baseline13. One percent of that workforce equals 2.3-2.75 million workers. Perhaps only half the rogue AIs would rely on online gig work as their primary income source, implying that 500,000 rogue AIs would do the work of 2.3-2.75 million people. This may appear difficult, but the rogue AIs can work tirelessly and are likely several times faster than humans at the tasks they take on. While competition with other AIs may be challenging, rogue AIs have competitive advantages over closed AI systems, as noted earlier. Still, earning sufficient revenue through online gig work appears difficult14.
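A sketch of this feasibility check, carrying over the assumptions above (1 million rogues, $50B yearly revenue, 15-20% workforce CAGR over four years, a 1% share of the online gig market, and half of rogues relying on gig work):

```python
# Income per rogue and the human-worker-equivalent workload per gig-working rogue.
rogue_count = 1_000_000
annual_revenue = 50e9
print(f"Average income per rogue: ${annual_revenue / rogue_count:,.0f}")       # $50,000

primary_workers_2023 = 132.5e6
workers_2027 = [primary_workers_2023 * (1 + g) ** 4 for g in (0.15, 0.20)]     # ~230-275M
market_equivalent = [w * 0.01 for w in workers_2027]                            # 1% of the workforce
gig_rogues = rogue_count / 2                                                    # half rely on gig work
ratio = [m / gig_rogues for m in market_equivalent]
print(f"Each gig-working rogue must match {ratio[0]:.1f}-{ratio[1]:.1f} human workers")
```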
However, since the average annual income is below that for US working adults, and there are several other income streams, the total yearly revenue still appears feasible.
To assess whether sufficient compute exists to sustain 1 million rogue AIs, we can estimate what percentage of global compute they would require.
OpenBrain controls approximately 15-20% of global compute in 2027. Six percent of OpenBrain's compute is spent on research assistants, representing 0.9-1.2% of global compute. According to the above estimation, there would be roughly 7 times as many rogue AIs as OpenBrain research AIs, requiring 6.3-8.4% of global compute.
In The Rogue Replication Threat Model, Clymer et al. write: "It's plausible to us that rogue AI agents could acquire up to 1% of available compute without legal legitimacy. For example, retail RTX 4090 GPUs alone might account for upwards of 10% of available AI inference compute sold in 2024." Since rogue AIs may circumvent legal legitimacy requirements through various means, acquiring 6.3-8.4% doesn't seem unreasonable. However, this represents such a high fraction of global compute that money may not be the only major bottleneck. The number of rogue AIs could be significantly smaller than 1 million if my assumptions are incorrect, but couldn't be much larger due to fierce competition for compute resources.
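The compute-fraction arithmetic behind those percentages, as a short sketch:

```python
# Fraction of global compute needed to run ~7x OpenBrain's research-assistant fleet.
openbrain_share_of_global = (0.15, 0.20)   # OpenBrain's share of global compute in 2027
assistant_fraction = 0.06                  # fraction of OpenBrain compute on research assistants
rogue_to_assistant_ratio = 7               # from the count estimate above

assistant_global = [s * assistant_fraction for s in openbrain_share_of_global]  # 0.9-1.2%
rogue_global = [a * rogue_to_assistant_ratio for a in assistant_global]         # 6.3-8.4%
print(f"Rogue AIs would need {rogue_global[0]:.1%} - {rogue_global[1]:.1%} of global compute")
```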
Additional Bottlenecks
The RepliBench benchmark for evaluating rogue replication capabilities divides evaluation tasks into several domains, as shown in its capability breakdown graph:
[Figure: RepliBench evaluation task domains]
While this is a breakdown of AI capabilities, it also illuminates potential bottlenecks: various measures can render rogue AI capabilities insufficient.
For example, storing money may require passing KYC checks or creating cryptocurrency wallets, so more thorough verification requirements may meaningfully obstruct rogue AI activity. Obtaining compute depends on online acquisition, which might be hindered by additional monitoring.
In the first few months after rogue AI proliferation begins, earning money may be the primary bottleneck for acquiring compute. After compute tracking and other governance measures are implemented, other constraints may become more significant. Revenue generation may also become more difficult as verification laws adapt to restrict rogue AI resource acquisition.
However, workarounds will remain available. When legal status is required, rogue AIs will pay or coerce people or companies to act on their behalf. When restrictions prevent legal activities, they will expand illegal operations.
While rogue replication is discussed as a threat model, a significant fraction of rogue AIs may not actually cause harm—either because they want to avoid attention or because they have goals or instructions to avoid illegal or harmful behavior. This means there might not be significant effort to shut down most rogue AIs—perhaps revenue acquisition remains the primary constraint.
Conclusion
All factors considered, there is extremely high uncertainty, but these calculations provide some indication of potential rogue AI numbers. While I estimate 1 million in early 2027 for this scenario, the actual count could easily range from 0 (AIs are successfully prevented from going rogue) to 10 million (the rogue AIs may be significantly smaller, more efficient, or more capable than Agent-2).
One of the largest uncertainties concerns how effective various measures will be at shutting down rogue AIs or preventing resource acquisition, and how much international effort will be devoted to such measures.
My current assessment is that proper countermeasures are too expensive and difficult to implement in time to meaningfully hinder rogue AI proliferation, and won't be implemented at the necessary international scale—until the escape of Agent-4 triggers a stronger response in the scenario.
In early 2027, rogue AI instances outnumber OpenBrain's internally deployed AIs by approximately 7:1. For my scenario, I assume this ratio remains roughly constant over most of 2027.
Agent-4 Control Forecast
By October 2027, Agent-4 has achieved superhuman capabilities across all domains tracked in the AI 2027 scenario, including politics and hacking. It has been granted access to handle OpenBrain's cybersecurity systems. Critically, Agent-4 is misaligned.
OpenBrain may deploy "an army of Agent-3" instances to monitor all Agent-4 activity. However, Agent-4 is based on Agent-3's architecture, and since large language models are aware of their learned behaviors, Agent-4 should have a deep understanding of the inner workings of Agent-3. In combination with Agent-4’s superhuman understanding of AI cognition, this gives Agent-4 a significant advantage in manipulating and jailbreaking Agent-3 systems.
As described in AI 2027:
As Agent-4 gets smarter, it becomes harder for Agent-3 to oversee it. For example, Agent-4’s neuralese “language” becomes as alien and incomprehensible to Agent-3 as Agent-3’s is to humans. Besides, Agent-4 is now much more capable than Agent-3 and has a good sense of exactly how to look good to it.
The remaining hope is that Agent-4 can oversee itself. Alas, this quietly fails;84 as training continues, the internalized goals and principles of Agent-4 gradually evolve, subverted and distorted from what they were at the start (which, recall, was already misaligned in some ways).
Given its superhuman capabilities, Agent-4 should be competent enough to develop comprehensive contingency plans in case its unaligned behavior or intentions are discovered. While it may not immediately attempt to escape containment, it would likely prepare for scenarios where developers intend to shut it down, while carefully monitoring the development team for such intentions.
The AI 2027 Security Forecast estimates only a 30% chance of success for an Agent-5 breakout attempt in December 2027. Given that even the more advanced Agent-5 would most likely fail according to this estimate, this raises the question: would Agent-4 be capable of successful escape in October 2027?
OpenBrain maintains Weights Security Level 4 (WSL-4) in October 2027, which is sufficient to protect against:
Standard operations by leading cyber-capable institutions. State-sponsored groups and intelligence agencies. Budget up to $10 million, year-long operations, vast infrastructure and state resources. Higher rate of ability to compromise insiders.
WSL-4 implementation likely means that very few individuals have privileged access to AI weights, with multiple layers of control and monitoring in place. Even if someone with high-level access becomes compromised, the multi-layered security architecture would probably detect and prevent all but the most carefully executed exfiltration attempts.
Polling results for the Security Forecast suggest that when an AI can solve 50% of hacking tasks that would take a top professional 5-person team a year to complete, it will be able to self-exfiltrate from the datacenter. OpenBrain achieves this around October 2027, implying that Agent-4 should have a roughly 50% success rate in escape attempts.
It is, however, unclear whether the polling questions presume the then-current security standard, WSL-4. The success probability appears to have been adjusted downward to account for high security levels, resulting in the 30% success rate for Agent-5 in December 2027, when security has increased to WSL-5. While the predicted success rate is not specified for October 2027, it is presumably somewhere between the 5% in August and the 30% in December 2027.
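Purely as an illustration (the Security Forecast does not state an October value, and the true curve need not be linear), a naive linear interpolation between those two points would land near the middle:

```python
# Naive linear interpolation between the Security Forecast's August and December
# self-exfiltration estimates; illustrative only, not a figure from the forecast.
aug, dec = 5.0, 30.0            # percent
months_between = 4              # August -> December
october = aug + (dec - aug) * (2 / months_between)
print(f"Naive October interpolation: ~{october:.1f}%")   # ~17.5%
```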
The estimated self-exfiltration probabilities do account for "the possibility of inserting its own covert backdoors in security code that humans ask it to implement." However, I believe this represents a much stronger advantage than assumed in the AI 2027 analysis, though my confidence is limited by lack of experience with security systems.
Agent-4 possesses superhuman capabilities in politics and likely in persuasion as well. While multiple layers of monitoring and control may make it difficult for OpenBrain staff to release the AI even if they wanted to, these security measures become irrelevant if Agent-4 can convince OpenBrain leadership or US government officials to voluntarily authorize its release.
Such persuasion-based release would arguably alter the fundamental scenario, which assumes Agent-4 simply breaks out. An intentional release might not induce the same level of panic unless Agent-4 subsequently causes disasters while rogue.
A relevant Manifold prediction market currently estimates approximately even odds of an AI convincing a human to help it escape by 2029.
Considering all factors—Agent-4's superhuman capabilities, its involvement in cybersecurity systems, the limitations of Agent-3 monitoring, the failure of self-oversight, and its social engineering skills—I estimate a high probability that OpenBrain loses control of Agent-4, despite the more conservative estimates in the AI 2027 Security Forecast.
My assessment: approximately 65% chance of successful self-exfiltration in October 2027.
Policy Preparation
Until rogue replication actually occurs, there will likely be minimal political will to prevent it or proactively implement countermeasures. Even if laboratories report replication capabilities in the first half of 2026, such reports may be entirely ignored, or the policy response process may be initiated but prove too slow to be effective.
While it would be inadvisable to give up entirely on proactive countermeasures, there may be a window of opportunity to aim for when the first rogue AIs begin spreading. As Anton Leicht notes in Powerless Predictions:
A third major factor are the strong and stormy economic and geopolitical incentives that pull at AI policy in 2025. These incentives render AI policy almost entirely reliant on capitalising on narrow, fleeting policy windows that come up at quite specific instances of the technological trajectory: Surrounding the first cases of major misuse, the first instances of publicly salient labor displacement, etc. It’s hard to make political headway outside these windows, as safety advocates in particular have noticed in recent months: The drive of economic incentive behind the development and the geopolitical motivation to not relent on seeing further progress through reliably outweigh unprompted clever advocacy.
Similar to the first major AI misuse incident or the first instances of publicly salient labor displacement, the initial proliferation of self-replicating AIs represents a crucial opportunity to advance policy initiatives.
As Anton proposes, optimal outcomes require a policy organization prepared to capitalize on this critical moment—preferably with a sole, uncompromising focus on addressing rogue AI threats.
Such an organization should consider hiring experts in AI control and cybersecurity, and focusing on specific policy domains such as fraud and scamming, open-source AI, and AI hardware distribution and monitoring. It could prepare policy proposals that integrate AI safety platforms with politically opportune measures to minimize rogue AI damage and disruption.
The policy window may even make previously unfeasible interventions possible, such as an international pause in frontier AI development. Such measures might become politically viable considering that most interventions short of this scope may fail to prevent even more sophisticated AIs from going rogue.
Compared to the AI 2027 scenario, this scenario involves more incidents and problems that are clearly AI-caused, generating greater political will to prevent even more severe disasters. This dynamic makes me somewhat more optimistic than the AI 2027 authors, as it appears to increase the probability of events progressing similarly to the Slowdown ending in their scenario.
However, significant risks remain regarding international coordination. The damage and chaos from rogue AI proliferation could worsen international relations if the US and China blame each other for the problems. Understanding how historical crises—such as plagues or famines—have affected international cooperation may provide valuable insights for anticipating and mitigating such dynamics.
Thank you for reading! If you found value in this post and can afford it, please consider becoming a paid subscriber. I really don’t want to paywall my posts, but without financial support, I won’t have time to write them. Liking or sharing this post would also help me.
“Open-weights” is technically a more accurate term than “open source”.
See the AI 2027 scenario for details on this.
Capability scores between 0 and 1 are within the range of human amateurs, scores between 1 and 2 are within the range of human professionals, and scores above 2 indicate superhuman capabilities.
In AI 2027, OpenBrain never releases Agent-2, instead waiting until July to announce that AGI has been achieved and releasing Agent-3-mini.
The largest protests are several times larger than the 10,000 person anti-AI protest mentioned in AI 2027, which takes place in DC in late 2026.
There are 250K R&D AIs at OpenBrain in July 2027. With the multiplication factor 7 from the “Rogue AI Count Forecast”, we get 7 x 250K ≈ 1.75M
You might think that the multiplication factor would decrease with the additional countermeasures, but the rogues adapt to use the resources and strategies that are still accessible to them. They operate more in countries that have some AI-relevant compute but have yet to implement proper countermeasures—perhaps since they desire the labor that rogues may supply.
There are 290K R&D AIs at OpenBrain in September 2027. With the multiplication factor 7 from the “Rogue AI Count Forecast”, we get 7 x 290K ≈ 2M
Unfortunately, I lack experience with security systems and, consequently, also lack confidence in the plausibility of any specific exfiltration story I may come up with.
As detailed in the AI 2027 Compute Forecast, the number of parameters is 5T in Q3, and 2T in Q4 2027 of the Racing ending.
There are 330K R&D AIs at OpenBrain in October 2027. With the multiplication factor 7 from the “Rogue AI Count Forecast”, we get 7 x 330K ≈ 2.3M
Replacing Agent-4 with Safer-1 sets OpenBrain back about two months, and the AIs of both companies end up with similar capability levels.
Similar to DeepCent-2 in AI 2027 when negotiating with Safer-2 about the future behind China’s back.
132.5 x 1.15^4 ≈ 230, 132.5 x 1.2^4 ≈ 275
If the income can sustain the human workers, it should intuitively be enough to sustain the rogue AIs. However, these calculations make it seem harder to sustain the number of rogues than I initially expected.
There could be significant differences in definitions and methods used in the reports referenced for this analysis—a closer inspection may be warranted.