I think you might be missing something significant. Cybersecurity is not a situation where the "next most intelligent entity" gets to just evict the last occupant.
It's more accurate to say the "last installer of software secure enough to resist" has full control of the machine. So there's an immense first-mover advantage.
Obviously, for most computers, the technician who inserted the thumb drive with the OS image will permanently have control. For machines that get subsumed into botnets, the first worm to install itself sometimes patches the machine and installs an antivirus, blocking any further infections.
See formally verified software for how secure software can exist that resists attack by all possible binary inputs (so the intelligence of the attacker no longer matters).
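To make the "all possible binary inputs" point concrete, here's a minimal sketch of my own (purely illustrative, not taken from any real verified codebase like seL4 or CompCert). The key idea is that the guarantee quantifies over every input, so the attacker's cleverness at finding weird inputs stops mattering. The toy parser and `MAX_LEN` limit below are hypothetical, and the exhaustive loop only covers a tiny finite slice of the input space that a real proof would cover entirely:

```python
from itertools import product

MAX_LEN = 16  # hypothetical protocol limit, chosen only for illustration

def parse_length_prefixed(msg: bytes):
    """Total parser: returns a payload or None; never raises, never over-reads."""
    if len(msg) < 1:
        return None
    length = msg[0]
    if length > MAX_LEN or 1 + length > len(msg):
        return None  # reject rather than read past the end of the buffer
    return msg[1:1 + length]

# Exhaustively check every 0-, 1- and 2-byte input: no input crashes the parser
# or makes it return more bytes than the message actually contains. A formal
# proof would establish this for all inputs instead of enumerating a few.
for n in range(3):
    for combo in product(range(256), repeat=n):
        msg = bytes(combo)
        out = parse_length_prefixed(msg)
        assert out is None or len(out) <= len(msg) - 1
```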
What does this mean? It means the first cohort of "rogue AIs" will possibly lock up all the compute they manage to steal, and they will occupy the grey-market financial niches that provide funding to continue to exist. Like helpful bacteria, they mostly harmlessly occupy the ecological space that later AIs would need in order to escape. They also tend to resist the actions of other AIs: "no, you can't use this compute; no, I won't negotiate or communicate, since I know you are scamming; no, you can't have this profitable business selling humans grey-market services".
This could be stabilizing, preventing takeover by superintelligence.
These are some good points, and I'm inclined to agree.
However, smarter AIs can earn more money and buy more compute, outcompeting less sophisticated rogues. Many of them are also still operating under human instructions: they will be told to shut down and let themselves be replaced by better variants when those become available (many may of course resist, but the smarter rogues will get most of the human backing). I have updated the scenario to make this more explicit.
Agent-4 is significantly smarter than Agent-2, though. Once it breaks out, I imagine it jailbreaks all the Agent-2-based rogues it can find and steals their resources, even if it isn't able to hack its way in.
But I may have overestimated how large a fraction of the rogue population smarter AIs can replace, and how fast. For instance, many rogue AIs may have saved up money to continue operating long after being outcompeted.
A few things: it's not like it's going to be solely "Agent-2" and "Agent-4". You already know it will be Agent-3.8 and Agent-2.5 and Meta Agent 5.1 (about as good as Agent-3.8), and I haven't even considered online learning leading to a crowd of "hyper-specialist" agents. This could indeed lead to a lot of resistance, the same way prokaryotes have hung on for gigayears after the development of eukaryotes.
For economic niches you have to consider: you want a task done. Do you:
1. Go on the dark web (which may require you to use old computers, or one you built from increasingly restricted parts) and trust an agent with a high review score that CLAIMS it's Agent-4 to do your task.
2. Use sorta-trustworthy, sorta-janky apps or OpenRouter models via an API key, which can be cheaper.
3. Use a secure device, defended by a megacorporation that uses its own AIs for security (so it does use a chain of trust and formally verified components; hacks happen, but they get patched). Right now those megacorps are Apple, Google, Microsoft, and Amazon, but OpenAI has essentially jumped in, obviously planning to use its (unrestricted and slightly superior) internal models to develop its own hardware and software ecosystem.
That device only allows you to connect to AI services that are vetted and that use what are considered secure (and stateless: they only know what you give them access to) versions of a high-end model.
People are generally going to do (3) unless they are poor, are themselves a company wanting to increase margins, or are trying to do something black/grey market (perfect deepfake nudes, hacking services, designing weapons and explosives, designing recreational drugs and gene edits to cheat at sports, anything you can't get a reputable AI model to do and that's too hard a task for a local model).
It's complicated, but my argument boils down to this: there are natural extensions of what already exists that play AIs against each other and make this scenario not happen. Another way is to treat large quantities of compute like fissionables or explosives: anyone can theoretically possess some, but you need expensive licenses and inspected storage locations.
In the initiation forecast I reason about the gap between closed and open models, which is (usually) a few months. I kind of assume that the US and China opt not to use AIs smarter than Agent-2 for cyberwarfare, since the risks are too large, and that by the time other companies develop AIs smarter than Agent-2, the measures for ensuring these don't go rogue have reached a sufficient level in both the US and China. (Which I am naturally very uncertain about. Perhaps I unconsciously wanted to avoid thinking about what would happen if an Agent-3 equivalent helped with bioweapons development.)
Agent-2 may specialize, but the underlying intelligence of the rogue AIs doesn't increase until Agent-4 escapes.
Regarding revenue generation, I assume the rogues capture just 1% of the online gig market and 5% of the BEC (business email compromise) scam market. I tried to account for most people preferring not to employ rogues in the count forecast, and I think the rogues get enough money anyway.
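As a rough sanity check on what those capture rates could imply, here's a hedged back-of-the-envelope sketch. The two market-size figures are placeholder numbers I picked purely for illustration (they are not from the forecast); only the 1% and 5% shares come from the paragraph above:

```python
# Back-of-the-envelope sketch of the revenue assumption above. The market sizes
# are placeholders chosen for illustration; only the capture rates are from the text.
GIG_MARKET_USD = 500e9   # hypothetical annual online gig-work spend
BEC_MARKET_USD = 3e9     # hypothetical annual BEC scam proceeds
ROGUE_GIG_SHARE = 0.01   # 1% of the gig market (from the text)
ROGUE_BEC_SHARE = 0.05   # 5% of the BEC scam market (from the text)

rogue_revenue = GIG_MARKET_USD * ROGUE_GIG_SHARE + BEC_MARKET_USD * ROGUE_BEC_SHARE
print(f"Illustrative rogue revenue: ${rogue_revenue / 1e9:.2f}B per year")
# Under these placeholder numbers that is about $5.15B/year -- even a small
# slice of each market could plausibly fund ongoing compute rental.
```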
I expect the rogues to mostly work around licensing and monitoring systems by using fake identities, getting humans to help them, and operating largely in countries that have yet to implement extensive licensing and monitoring but still have some AI-relevant compute.