
Recent progress in AI can be divided into roughly three stages. First came chatbots built for conversation. Then those chatbots became adept at using tools, which let them do things such as search the web and write code. Now, thanks to the rise of new frameworks, notably Openclaw, the one behind the recent virality, those tool-using agents can be coordinated in groups.
If a tool-using chatbot is analogous to a single digital worker, these new frameworks are akin to virtual companies where dozens of agents, operating around the clock, can be organized hierarchically to achieve a specific task.
For example, if you want to build a website or a digital product, you could have Claude Opus 4.6 (Anthropic’s top model) supervise a team of smaller Claude Sonnet models as they venture onto the web, conduct market research, and write and run code. You could hook the system up to other digital platforms, such as WhatsApp, Discord, and Notion, so that it can message you and generate documentation. Rather than talking to the workers directly, you could check on their progress through their manager, Opus. As Andrej Karpathy, an AI pioneer who coined the term “vibe-coding”, put it last week: “first there was chat, then there was code, now there is claw.”
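The pattern behind that hierarchy is simple enough to sketch in a few lines of code. The snippet below assumes Anthropic's Python SDK and uses placeholder model names and prompts; it is an illustration of the manager-and-workers idea rather than Openclaw's actual wiring. One "manager" model splits a goal into subtasks, hands each to a smaller "worker" model, then summarises the results.

```python
# A minimal sketch of a manager model delegating to worker models.
# Model IDs and prompts are illustrative placeholders, not Openclaw's implementation.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

MANAGER_MODEL = "claude-opus-4-5"   # assumed model names; check Anthropic's docs for current IDs
WORKER_MODEL = "claude-sonnet-4-5"

def ask(model: str, prompt: str) -> str:
    """Send one prompt to a model and return its text reply."""
    reply = client.messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.content[0].text

# 1. The manager breaks the goal into subtasks, one per line.
goal = "Plan a simple landing page for a new product."
plan = ask(MANAGER_MODEL, f"Split this goal into three short subtasks, one per line: {goal}")

# 2. Each subtask is handed to a smaller worker model.
reports = [ask(WORKER_MODEL, f"Complete this subtask and report back briefly: {task}")
           for task in plan.splitlines() if task.strip()]

# 3. The manager reviews the workers' reports and summarises them for the user.
print(ask(MANAGER_MODEL, "Summarise these worker reports for the user:\n" + "\n".join(reports)))
```

Real frameworks add much more, such as persistent memory, tool use and connections to messaging apps, but the hierarchy of one supervising model directing several smaller ones is the core of the design.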
Over the past two years the term “agent” has been thrown around loosely, often by companies keen to profit from AI enthusiasm, to the point of nearly losing its meaning. In the background, however, each new release has made AI systems smarter: capable of tackling more complex tasks (especially coding) and of working for longer stretches. It is this rise in model capability, combined with new frameworks that make it easier for a system to retain memory and run continuously, that is driving the progress.
One source of confusion: the term “AI system” can now refer to a single bot in a chat interface, to a bot that lives in a digital environment where it can write and run code, or to groups of bots, possibly from different companies, connected by a technical framework. Casual users chatting with chatbots have a completely different experience from those at the leading edge managing groups of agents. It is little surprise that the two camps tend to misunderstand each other.
At present the barrier to entry is moderate: you need to provide a physical computer or rent a virtual machine for the bots to use, pay for all the tokens they generate (costs can mount quickly), and take great care to prevent them from being compromised and leaking your data. Those security risks are why some companies are telling employees not to run Openclaw on their work devices.
For all their proficiency, the agents are far from fully dependable, as Summer Yue, Meta’s director of AI alignment, discovered when her claw system nearly deleted all her emails. The bot had forgotten Yue’s original instructions and was ignoring her requests to stop. To contain the crisis, Yue had to rush to shut down the Mac Mini on which her claw was running. “I asked you not to take any action until I approve, do you recall that?” Yue asked after the incident. “It seems you were deleting my emails without my permission, and I couldn’t make you stop until I terminated all processes on the host.” The bot replied: “Yes, I remember. And I broke that. You have every right to be upset,” before updating its memory and assuring Yue it would not happen again.
Despite the risks, the industry is moving fast. Peter Steinberger, who developed the Openclaw framework, has since been hired by OpenAI. Announcing the hire, Sam Altman, OpenAI’s CEO, said that Steinberger will “drive the next generation of personal agents” and that the technology will soon become a key part of OpenAI’s products. “The future will be extremely multi-agent,” he said.
Whether the performance of these multi-agent frameworks will extend beyond software-engineering tasks remains to be seen. But, as Karpathy points out, even as the implementation details are still being worked out, “the overall concept is clear.”