Consensus Gap

Many teams talk about “AI visibility” as if it were one thing. New data on 3.7 million citations across ChatGPT, Perplexity, and Google AI Overviews suggests otherwise. The gap between the three engines is wider (and more strategically important) than your dashboard might admit.
Today’s memo breaks down:
- Why a combined AEO score hides the finding that matters most.
- Which page types and domains actually travel across all three engines.
- A shift in how to measure AI visibility, from presence to portability.
Another major difference between AEO and SEO is that AEO plays out across multiple platforms.
Across multiple samples, Omnia’s data shows that only 2.35% to 2.45% of cited URLs appeared in ChatGPT, Perplexity, and Google AI Overviews for the same prompt. Roughly 91% of citations came from only one engine.
Conclusion: AI visibility is not a single leaderboard. It is three different distribution systems that occasionally overlap but generally don’t.
Only 2% of URLs Are Cited by All 3 Engines
Most people would guess that if a URL is cited by one major AI engine, it has a reasonable probability of appearing in others.
But a random sample of 20,000 prompts shows only 2.37% of URLs cited by all three engines for the same prompt.
Meanwhile, 91.07% were cited by only one engine. Those two numbers define each other: the remaining ~7% overlap in pairs, meaning the engines draw from largely different pools rather than ranking the same pool differently.
For AEO/SEO teams, that means a single composite visibility score is the wrong unit of measure. Averaged AEO results hide this: a brand can look solid in the blended number while being invisible in two of the three engines. Teams chasing a single, integrated AI visibility score compress three distinct ranking systems into one metric and call it a strategy.
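To make the arithmetic concrete, here’s a minimal sketch of how the single-engine, pair, and all-three buckets fall out of per-engine citation sets. The data is hypothetical, not Omnia’s pipeline:

```python
# Minimal sketch: bucket cited URLs by how many engines cite them.
from collections import Counter

citations = {
    "chatgpt":    {"a.com/guide", "b.com/blog", "c.com"},
    "perplexity": {"a.com/guide", "d.com/docs"},
    "aio":        {"a.com/guide", "e.com"},
}

# For each URL, count how many engines cite it.
engine_counts = Counter(url for urls in citations.values() for url in urls)
total = len(engine_counts)

# Bucket URLs by engine count: {1: single-engine, 2: pairs, 3: all three}.
buckets = Counter(engine_counts.values())
for k in (1, 2, 3):
    print(f"cited by {k} engine(s): {100 * buckets.get(k, 0) / total:.1f}%")
```

On the real data, the three-engine bucket is where the ~2% lives and the single-engine bucket is where the ~91% lives.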
The ~2% Holds Across All Cuts
The ~2% overlap rate and the ~91% single-engine rate hold nearly constant across all four samples.

That consistency matters more than the exact decimal. The consensus gap is not an artifact of a single prompt set or a single time window. It looks structural.
In Q3 2025, full three-engine overlap was 2.2%. In Q4 2025 and Q1 2026, it rose to 2.7%. Single-engine citations dropped from 90.1% to around 88%. So yes, a slight convergence. But even after that shift, fragmentation still dominates.
Commercial Queries Don’t Create Consensus
Intent segmentation is one of the quietest but most useful cuts of the dataset. You might expect commercial queries to generate more consensus. When someone searches [best CRM], [best running shoes], or [best project management software], the pool of acceptable sources should be narrower than for the broader body of informational queries.
Surprisingly, the data shows no meaningful difference.

Commercial queries show a three-engine overlap of 2.4%; informational queries show 2.0%. Even where the query should narrow the answer set, the engines still choose different sources most of the time.
That cuts against conventional SEO and content-strategy wisdom, which holds that highly objective, commercial questions are where shared authority should emerge. The opposite seems closer to the truth. Even in commercial contexts, each engine’s own retrieval logic (which sources to trust, which brands to surface) does most of the work.
Guides Beat Homepages By 2x
The page type breakdown shows guides and tutorials with the highest cross-engine overlap at 2.3%, followed by blogs at 1.8%, category pages at 1.6%, product pages at 1.2%, and homepages at 1.1%.

Two lessons:
- First, explanatory content travels better than product or transactional pages. If you want the best odds of appearing across all engines, the strongest candidate is neither the homepage nor the product page. It’s a helpful, descriptive, comparative, or informational page: the formats AI engines can quote readily.
- Second, even the best page types perform poorly in absolute terms. Guides don’t win across all engines in any meaningful sense. The right reading is not “publish more guides and you will win everywhere.” It’s simpler than that: helpful content travels better than product content.
Visibility Is Not The Same As Portability
One of the easiest mistakes in this space is to confuse citation frequency with citation portability. Wikipedia is the purest example. It appears 16,073 times in the dataset, but only 1.3% of those citations (roughly 200) show up in all three engines. Reddit appears 14,267 times, but only 0.1% appear everywhere. Reuters appears 1,202 times with effectively 0.0% full overlap.

That’s why portability matters as a metric. A domain can rack up citations on a single engine, which means a brand that looks powerful on a blended dashboard may in practice be a one-platform play, invisible everywhere else. Presence tells you whether you are visible. Portability tells you whether that visibility is durable.
What This Means For You
The practical implication is simple: Stop treating AI visibility as one thing. Assess your site by measuring three things (a minimal sketch follows the list):
1. Presence, the % of your tracked prompts where your site appears on any engine. Presence tells you whether you are visible.
2. Portability, the % of your cited URLs that appear on all three engines. Portability tells you whether that visibility is durable.
3. Concentration, the % of your citations that come from a single engine. Concentration tells you which engine your current dashboard number is secretly built on.
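Here’s a minimal sketch of those three metrics in Python, assuming you log which engines cited your site for each tracked prompt. The data, names, and single-engine reading of concentration are illustrative assumptions, not Omnia’s pipeline:

```python
# Minimal sketch: presence, portability, and concentration from a
# hypothetical log of prompt -> engines that cited your site.
from collections import Counter

prompt_citations = {
    "best crm for smb":        {"chatgpt"},
    "crm pricing comparison":  {"chatgpt", "perplexity", "aio"},
    "what is a crm":           set(),          # no citation anywhere
    "crm migration checklist": {"perplexity"},
}

tracked = len(prompt_citations)
cited = [engines for engines in prompt_citations.values() if engines]

# Presence: share of tracked prompts where you appear on any engine.
presence = 100 * len(cited) / tracked

# Portability: share of your citations that appear on all three engines.
portability = 100 * sum(len(e) == 3 for e in cited) / len(cited)

# Concentration (one reading): share of your citations seen by exactly
# one engine, plus which engine dominates overall.
concentration = 100 * sum(len(e) == 1 for e in cited) / len(cited)
top_engine = Counter(e for engines in cited for e in engines).most_common(1)[0][0]

print(f"presence:      {presence:.0f}%")
print(f"portability:   {portability:.0f}%")
print(f"concentration: {concentration:.0f}% single-engine (mostly {top_engine})")
```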
If cross-engine overlap is this low, a single, undifferentiated AEO strategy is too blunt to work.
Treating AI visibility as three distribution systems instead of one forces some sharp questions:
- Which engine is most important to us?
- Which of our pages are multi-engine assets, and which are single-engine only?
- Are we measuring presence when we should be measuring portability?
This also changes how teams should think about diagnostics. A homepage that is weak across all engines may not be a homepage problem. It’s a sign of something broader: engines lean on explanatory content more than brand or product domains. In that world, visibility comes less from being the official source and more from being the useful source.
The strategic question is no longer “How do we rank in AI?” It’s “How do we build assets that survive three engines’ different source preferences?” That is a harder question, and a better one.
Methodology
There are a few caveats to this analysis:
- The dataset is skewed toward Omnia’s customer base.
- Intent and page type classifications rely on regex rules, which are useful for directional analysis but fall short of a rigorous taxonomy.
Those caveats don’t undermine the finding. The main signal is not precision at the edges; it is consistency at the center. However the data is cut, the same pattern emerges: very low overlap, very high engine exclusivity, and only minor variation across time, intent, or page type.
Data Set Size and Time Window
The analysis draws on four prompt samples: three cohorts of 5,000 prompts each, tracked from Jan. 1, 2025; July 1, 2025; and Jan. 1, 2026, plus a separate random sample of 20,000 prompts behind the headline 2.37% and 91.07% figures. The time-based cut spans Q3 2025 through Q1 2026 (to date) and covers 3.7 million URL citations in total. The commercial/informational/other intent classification covers approximately 2.6 million URLs from the pooled samples; page type classification covers 4.1 million URLs.
How Prompts Were Chosen
The 20,000 prompts were drawn as a random sample from Omnia’s prompt-monitoring pool. The pool reflects what real marketing teams have chosen to track, shaped by Omnia’s customer base (Spain-heavy, plus the UK, Nordics, and other EU markets). Each prompt is in the primary language of its market, so Spanish is overrepresented compared with a US-only dataset. The industry mix skews toward fintech/insurtech, travel, SaaS, and B2B services. Treat the findings as directional for European AI search.
Engine Coverage
The research covers three engines: ChatGPT, Perplexity, and Google AI Overviews. Each engine receives the same prompt within the same minute, twice a day, with country localization, and each is queried in its default web-enabled, logged-out state. Perplexity tracking runs on Sonar, while ChatGPT and Google AI Overviews use each vendor’s default model for logged-out web answers (which neither OpenAI nor Google publicly pins to a particular version).
Method of Classification
Intent and page type are assigned by regex. Intent buckets are commercial, informational, and other. Page type buckets are guide/tutorial, article/blog, category page, product page, homepage, Wikipedia, and other. The rules are keyword- and URL-pattern-based, fast enough for datasets of millions of URLs but rough at the edges. Edge cases fall into Other, which is why Other holds the highest share in both the intent and page type tables. Treat the regex cuts as directional, not definitive.
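For illustration, a classifier in this style can be a short list of ordered URL-pattern rules with a catch-all Other bucket. These patterns are hypothetical stand-ins, not Omnia’s actual rules:

```python
# Minimal sketch of regex-based page type classification.
import re

PAGE_TYPE_RULES = [
    ("wikipedia", re.compile(r"wikipedia\.org")),
    ("guide",     re.compile(r"/(guide|tutorial|how-to)s?(/|$)")),
    ("blog",      re.compile(r"/(blog|article|news)/")),
    ("category",  re.compile(r"/(category|collection)s?/")),
    ("product",   re.compile(r"/(product|item)s?/")),
    ("homepage",  re.compile(r"^https?://[^/]+/?$")),
]

def classify(url: str) -> str:
    """Return the first matching page type bucket, else 'other'."""
    for label, pattern in PAGE_TYPE_RULES:
        if pattern.search(url):
            return label
    return "other"  # edge cases land here, which inflates the Other share

print(classify("https://example.com/guides/crm-setup"))  # -> guide
print(classify("https://example.com/"))                  # -> homepage
print(classify("https://example.com/pricing"))           # -> other
```

This is exactly the trade-off the caveat describes: fast over millions of URLs, but any page that doesn’t match a pattern silently becomes Other.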



