
The Agent Runtime Wars Began This Week

The agent runtime is a new layer of the web, and your website will be tested against the runtime, not against any individual model.

That’s a shift most webmasters haven’t made yet. The discussion is still stuck on models. Which model writes better? Which one cites more accurately? Whose API is cheapest this month? Model chatter is loud because new models ship every few weeks, and each release gets its moment in the spotlight.

The interesting story is the one underneath. The foundation is being rebuilt, and this week made that impossible to ignore.

The Runtime Stack That Shipped In April

On April 15, Cloudflare shipped Project Think, a new SDK for agents built on long-lived environments with crash detection, sub-agents running as independent child processes, persistent sessions with tree-based message histories, and sandboxed code execution running on Dynamic Workers. Hours later that same day, OpenAI shipped the next iteration of its Agents SDK with a native sandbox and a native model harness. Two major web infrastructure players posted competing answers to the same question: how does a long-running AI agent actually work in production?

Then on April 16, Cloudflare added five more pieces. AI Platform: a vendor-agnostic layer that routes agent requests across models. AI Search: a vector index and retrieval pipeline shipped as a managed product for agent-side RAG, competing with Pinecone and Algolia rather than with Google’s AI Mode. An email service in public beta, designed to give agents email as a universal channel. PlanetScale Postgres and MySQL on the platform. And the engineering groundwork for hosting very large open LLMs like Kimi K2.5 directly on the Cloudflare network.

Sundar Pichai described the same shift a week earlier. On an April 7 episode of the Cheeky Pint podcast with Stripe founder John Collison, he called Search itself an “agent manager”: “A lot of queries will be running in Search. You’re going to be completing tasks. You’re going to have a lot of threads running.” Multiple threads per query is a runtime description of Search. Google’s CEO is pointing at the same substrate Cloudflare and OpenAI shipped against this week.

If OpenClaw was the web’s first taste of consumer agents (a playable demo, an interesting toy, something to touch), this is the web for adults. Long-lived. Sandboxed. Readable. The kind of infrastructure you can run a business on.

The pattern in all of it is one thing: the runtime. Not the model. Not a consumer chat app. Not a keynote slide. The runtime is the layer where agents are spun up, persisted over hours and days, granted file system access, granted network access, granted memory. The runtime is the layer that determines whether an agent’s session survives a crash, whether its sub-agents can spawn, and whether its code is contained.

The Old Question And The New One

Web professionals have spent the last 18 months asking the wrong question: which AI model should we optimize for? ChatGPT or Claude or Gemini or Perplexity. Whose citations matter most? Whose search index should we target? That conversation made sense when models read your website directly.

They don’t anymore. The model sees only what the runtime hands it. The runtime sits between you and the model. The runtime fetched your page. The runtime executed (or didn’t execute) your JavaScript. The runtime parsed your structured data. The runtime verified your authentication. When a model sees anything on your website, it is seeing the runtime’s interpretation of it.
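A minimal sketch of what that mediation means (hypothetical code, standard library only; no vendor’s actual runtime works exactly this way): the runtime reduces your page to something it can hand the model, and only that reduction reaches the context window. Note that scripts are skipped, not executed.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Crude reduction of a page to plain text, standing in for the
    runtime's interpretation step. Script and style contents are
    dropped entirely, never executed."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = False

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

def runtime_view(html: str) -> str:
    """What a non-JS runtime would pass into the model's context."""
    parser = TextOnly()
    parser.feed(html)
    return " ".join(parser.chunks)

# Content injected by the script never reaches the model.
page = "<h1>Pricing</h1><script>document.title='Ignored'</script><p>From $9/mo</p>"
print(runtime_view(page))  # Pricing From $9/mo
```

If your pricing, availability, or key copy only exists after client-side rendering, a reduction like this returns nothing for it.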

The new question, if you take this week seriously, is: is your website readable by the agent runtime? Three things to check before next week:

  1. Are your key endpoints returning structured, machine-readable responses, or are they only rendered correctly within a full browser session?
  2. Does your authentication allow an agent to hold a session across multiple calls, or does it only support one-shot login?
  3. Does your structured data still say the same thing when a runtime that didn’t use your JavaScript tries to read it?
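The third check can be done mechanically. The sketch below (a hypothetical helper, standard library only) extracts JSON-LD from raw HTML the way a runtime that never runs your JavaScript would see it; structured data injected client-side simply won’t be there.

```python
import json
import re

# Matches <script type="application/ld+json"> blocks in raw markup.
JSONLD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)

def extract_jsonld(html: str) -> list:
    """Return every parseable JSON-LD block present in the static HTML.
    Anything your JavaScript would have injected at render time is absent."""
    blocks = []
    for raw in JSONLD_RE.findall(html):
        try:
            blocks.append(json.loads(raw))
        except json.JSONDecodeError:
            pass  # malformed blocks are invisible to the runtime too
    return blocks

page = '''<html><head>
<script type="application/ld+json">{"@type": "Product", "name": "Widget"}</script>
</head><body></body></html>'''
print(extract_jsonld(page))  # [{'@type': 'Product', 'name': 'Widget'}]
```

Run it against your own pages’ server response and compare the result to what your browser devtools show after rendering; any gap is data the runtime never sees.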

These are runtime questions. The model has nothing to do with them. The runtime determines what makes it into the model’s context window, and the model chooses from whatever the runtime provides.

The web’s plumbing is being rebuilt. Within the next two years, every model will see your website through one of these runtimes, not directly. Your website’s job, starting now, is to be readable at runtime.

The model discussion will continue in conference sessions and keynote slides. The runtime discussion happens in the changelogs of infrastructure companies. The companies shipping the runtimes will determine which websites AI search and AI commerce can reach. Stop asking which model. Start asking which runtime.

This post was originally published on No Hacks.


Featured image: Viktoriia_M/Shutterstock
