97% of llms.txt Files Have No Applications, Ahrefs Data Shows

Mosegas 17 hours ago

0 0 2 minutes read

97% of llms.txt Files Have No Applications, Ahrefs Data Shows

Ahrefs analyzed logs from 137,000 domains and found 97% of llms.txt files received zero requests. No bots, no humans.

The analysis used Ahrefs data to identify the user agents that downloaded the files. About 28% of the 137,000 domains publish the llms.txt file, but since Ahrefs’ clients are professional, the actual adoption on the wider web is probably low.

Of the nearly 38,000 domains with valid files, only about 1,100 received any traffic.

Of the files with requests, 96% come from bots, mostly non-AIs. AI retrieval bots connected to ChatGPT and Perplexity made 1%.

Who Downloads llms.txt Files

SEO testing tools had 21% of requests, followed by anonymous bots (14%), web crawlers like Googlebot (13%), and technology profiling tools like BuiltWith (11%).

AI bots, across all four categories, made up 19% of requests. AI is the largest segment, but the segmentation differs from what many llms.txt advocates expect.

Coding agents sent 10% of requests, crawlers trained 5%, assistants 2%. Claude-Code and GPTBot were the top individual bots.

Slackbot alone downloaded llms.txt files more often than PerplexityBot did.

The Industry Learns Itself

The report found 12% of requests came from tools that search, scan, or read llms.txt files rather than using them.

GEO and AEO readiness tools sent 5% of applications; dedicated scanners and verifiers posted 3%, more than AI recovery bots and assistants combined. Research bots posted 2%, with the largest research search identifying as a rapid injection test.

The ecosystem developed in terms of scoring and documenting the file format before a significant audience emerged.

No AI Bot Checks for Missing Files

/llms.txt path requests with 404 errors did not draw AI traffic. The people hitting those 404s seem to be people typing the URL into browsers, possibly looking at competitors.

The Chrome Lighthouse llms.txt test, which restarted the llms.txt debate in May, produced about 22 requests across the dataset, about 1 in 1,000.

Why This Matters

The details match what Google’s John Mueller has been saying about llms.txt for over a year. Lily Ray pressed Mueller on the gap between the Google Search defunding and Chrome’s Lighthouse investigation. He said llms.txt was “not intended for search” and called it “a temporary crutch, maybe to save some tokens” for AI coding tools.

The data shows that the audience for the file is coding and training search engines, not AI search and retrieval bots that can generate citations.

We reported on the split between Google Search and Lighthouse listings in May. A previous SE Ranking analysis of 300,000 domains showed no correlation between having llms.txt and AI citation frequency. Ahrefs’ data points to one possible reason: the bots connected directly to the live retrieval AI did not request these files in May.

Looking Forward

Getting a quick injection should be looked at. Ahrefs found the crawler reading llms.txt to be an immediate injection risk, as agents trust ingested content. Sites that automatically generate these files with a CMS should update their content.

All figures in this report are estimates. Ahrefs measured requests, not whether bots actually downloaded them.

Featured Image: sdecoret/Shutterstock

Mosegas 17 hours ago

0 0 2 minutes read