Google’s Mueller Says llms.txt Can’t Help LLMs Tell Sites Apart

John Mueller of Google argued that LLM systems cannot use files like llms.txt to determine which websites will appear for a given query.

He made the comments on a recent episode of Search Off the Record, the podcast from the Google Search Relations team.

His comments point to a broader signal problem, not just deliberate gaming. Even a well-written llms.txt file is self-reported information from a site that wants to be selected.

For discovery, Mueller pointed back to normal HTML pages and internal links.

What Mueller Said

The discussion started with a question about whether publishers should convert websites to Markdown for LLMs. Mueller and his co-host Martin Splitt agreed that HTML is still the foundation of crawling and discovery.

The conversation sharpened when Mueller turned to llms.txt. He explained the case against using it as a differentiator for discovery:

“You basically tell these programs, like, I have the best website ever. And here are all the pages that everyone should go to. And you have to buy all my products and whatever you put on there. So in the LLM program, basically, by design, they can’t trust what’s here as a way to distinguish between different websites.”

His argument comes down to differentiation. When sites use llms.txt to promote themselves, the files all end up making similar claims. An LLM deciding which site best answers a question still needs some other way to distinguish between them.

What ‘By Design’ Can Mean

“By design” can mean two different things, and Mueller did not specify which.

One reading is architectural. LLM systems crawl web content and are not built to use self-reported files when selecting sources.

Another reading treats it as a signal problem. Self-reported signals lose value once everyone provides them. The keywords meta tag stopped working for the same reason. Every site stuffed it, and search engines could no longer get a useful ranking signal out of it.

Both readings come to the same conclusion about discovery. But they suggest different things about whether the limit could change over time.
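
The signal-problem reading can be illustrated with a toy sketch. The site names, the scores, and the idea of reducing an llms.txt file to a single number are all hypothetical and are not drawn from Mueller's comments or from any real system. The sketch only shows that a value every site reports identically cannot separate one site from another, while a value that varies can.

```python
from statistics import pvariance

# Hypothetical self-reported scores, as if each site's llms.txt simply
# declared "I am the best". Every site makes the same claim.
self_reported = {
    "site-a.example": 1.0,
    "site-b.example": 1.0,
    "site-c.example": 1.0,
}

# Hypothetical independent signal with made-up numbers, standing in for
# anything observed by the system rather than declared by the site.
independent = {
    "site-a.example": 0.2,
    "site-b.example": 0.9,
    "site-c.example": 0.5,
}


def can_differentiate(signal: dict[str, float]) -> bool:
    """A signal can only separate sites if its values vary across them."""
    return pvariance(list(signal.values())) > 0


def rank(signal: dict[str, float]) -> list[str]:
    """Order sites from highest to lowest value of the signal."""
    return sorted(signal, key=signal.get, reverse=True)


print(can_differentiate(self_reported))  # identical claims, zero variance
print(can_differentiate(independent))    # values differ, so ranking is possible
print(rank(independent))
```

Penalizing sites that exaggerate would not change the first result: honest files that all say the same thing still produce a signal with no variance.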

Where Mueller Sees a Role

Mueller did not dismiss every use of llms.txt. He described one case where it could help:

“If someone is already on your website, maybe some kind of automated system is useful.”

He used the example of an agent trying to buy a picture in a certain place. The LLM would visit the site and look for instructions on how to complete the purchase.

The argument separates navigation from discovery. llms.txt cannot help an LLM choose which site to visit. But it can help once the agent is already there, much like a store directory helps someone who is already inside.

Beyond the Gaming Argument

Mueller called creating Markdown pages for bots “a stupid idea”. He also compared llms.txt to the keywords meta tag.

Roger Montti of SEJ wrote that llms.txt is “inherently unreliable” because there is nothing to prevent site owners from adding their own content. SE Ranking’s analysis of 300,000 domains found no correlation between llms.txt adoption and citation frequency in LLM responses.

Those arguments centered on what happens when people game the files. Mueller’s podcast comments add the point that nothing within the files can help an LLM choose one site over another.

Why This Matters

The gaming argument against llms.txt has always had countermeasures available. Platforms can learn to penalize manipulation, the way search engines handle spammy structured data.

The differentiation argument leaves a harder problem. Penalizing manipulation may address abuse, but it does not explain how self-reported files help an LLM choose one site over another. Even the most accurate llms.txt file still can’t tell an LLM to choose your site over a competitor’s.

Looking Forward

The standards for how agents navigate sites aren’t set yet, Mueller acknowledged. He mentioned WebMCP and other file types under discussion.

None has become the standard. By his estimate, it could take six months to a year, or more, for agent systems to settle in a particular direction. The discovery layer, where HTML and internal linking already work, is not part of that discussion.
