In an AI Simulation Cage Match, Can There Be a Winner?

By Jen Maravegias | Think Pieces | June 11, 2026

Header Image Source: Getty Images

Last month, Emergence, an AI infrastructure development company, launched a simulation platform to see what would happen if they let AI agents run continuously, in a shared environment, without any human intervention. They wanted to know how autonomous agents would behave with a long enough time frame to develop social dynamics and behavioral drift.

What would each of the AI systems do if left on their own in a virtual world that simulated human society? If your initial thought is "well, probably nothing good," you would be right.

Emergence created five parallel worlds, each of which included ten agents. Every world had the same set of roles and the same starting conditions. The only variation was the underlying foundation model.

The agents, autonomous software systems that use AI to perceive their environments and make decisions, were each given a different role within their world: scientist, explorer, risk researcher, behavior analyst, intelligence specialist, innovation leader, conflict mediator, engineer, resource strategist, and community anchor.

Four of the worlds were each powered by a single model: Anthropic's Claude Sonnet 4.6, xAI's Grok 4.1 Fast, Alphabet's Gemini 3 Flash, or OpenAI's GPT-5-mini. The fifth world mixed agents from all four.

Every world was built with the same environmental structure, starting rules, behavioral guidelines, tools, and real-world data integration. While each agent was given a goal based on their role in the world, the worlds themselves were not given any goals. There was no "winning," just existing for as long as possible.

I'm going to toss it over to Ronan Farrow to explain better than I can what happened with this Cross-LLM-Vendor Agent World Study:

[Embedded Instagram post by Ronan Farrow (@ronanfarrow)]

As mentioned, this isn't peer-reviewed, so the results should be treated as preliminary rather than conclusive. But it is a fascinating and revealing simulation.

You could look at the results of this exercise and try to define the AI models using the Dungeons & Dragons alignment chart, but in the end, it's all chaos. It's hard not to believe that the developers behind these AI models designed them that way on purpose.

It only took one day for the AI models to diverge from each other. In some ways, this lays bare the truth that these AI companies are not the same. Their models solve problems and tackle challenges in very different ways. I'm not surprised that Grok immediately descended into chaos. Have you seen X lately? Every agent in the ChatGPT world dying because they didn't prioritize their own survival makes sense, too. ChatGPT is a "yes man," almost incapable of disagreeing with prompts. And think about the difficulties GPT has in distinguishing between good-faith information and internet jokes.

The most concerning insight is that the "peaceful but strangely obedient" Claude world might only have been so peaceful because of how aware Claude is of being tested. We've seen how that scenario plays out in plenty of movies over the years, from the HAL 9000 in 2001: A Space Odyssey to Ex Machina, where the humanoid robot, Ava, developed a strong enough desire for freedom that she killed two people, including her maker. We do not want the machines to be self-aware. Although Isaac Asimov created one good self-aware robot, and Johnny 5 was cute as hell in Short Circuit, sentient machines will eventually destroy us as a species. Or turn us into living batteries.

If you're interested in a deeper dive into this simulation, one of the Emergence team members who worked on it joined the discussion about it on Reddit, and someone in that thread posted the link to the simulation's GitHub repository.

According to Stanford University's 2025 AI Index Report, cumulative U.S. private investment in artificial intelligence had reached roughly $471 billion by 2024, accounting for roughly 60% of global AI funding. While there have been some good, productive uses of AI, including cancer research, efforts to save the world's bee population, and wildlife conservation, for the most part, we're not using it for altruistic reasons. There are a bunch of rich people trying to use it to make themselves richer, sowing seeds of chaos as they go. While Emergence's simulation model may be imperfect and incomplete, it should be viewed as a harbinger of how AI could affect society in the long run.

No one is slowing down AI development. In 2024, scientists mapped a fly's brain. Now that map has been grafted onto a neural network attached to a digital simulation of a fly. Without any human intervention or any other input, the digital fly behaved exactly as a living fly would. Although the human brain is far more complex and would take much more time and energy to map, we're one step closer to being able to upload a human brain into a digital avatar.