The Limits of GEO: What the Field Can't Prove Yet

GEO is real and well-supported enough to invest in, but parts of it are genuinely unproven, and honest practitioners say so. No controlled intervention study has been published, attribution to revenue isn't solved, and agentic search is uncharted. Resonate Labs is clear about what the evidence shows, what it doesn't, and why it's still worth acting on now.

Strong enough to act on

Platform divergence, the structured-content advantage, and the conversion premium are well-replicated. The case to invest holds.

But not yet a science

No published study proves a specific GEO intervention caused a specific visibility gain. The field is where SEO was in the mid-2000s.

Honesty is the strategy

Knowing where the evidence runs out is what separates a real partner from one telling you what you want to hear.

Why an honest map matters

Most GEO advice sounds more certain than it is. A confident claim circulates, gets repeated, and turns out to trace back to a vendor study of fifty prompts, a LinkedIn post that went viral, or reasoning by analogy from SEO that was never tested against real citation data. The field rewards confidence, not calibration.

This page does the opposite. The case for GEO is strong, and the rest of this library makes it. But knowing where the evidence runs out matters as much as knowing what it shows, because the gaps below aren't edge cases. They're structural, and any strategy that doesn't account for them is resting on assumptions nobody has tested. Anyone who tells you these questions are settled is either ahead of the published research or telling you what you want to hear.

The biggest gap: no intervention study

The most consequential gap in the field, and the one that matters most for an investment decision, is that nobody has published a rigorous study showing: we made these specific GEO changes for this brand, and its AI visibility moved by this much over this time, controlling for everything else.

What exists instead are vendor case studies, the before-and-after reports that platforms and agencies publish about their own work. They're useful as directional evidence, but they don't control for confounding variables, they don't report the failures, and no one independent verifies them. A case study showing a brand's citations rose after an engagement tells you something happened; it doesn't tell you what caused it, whether it would happen again, or what the brand's visibility would have done without the work.

The published research is cross-sectional: snapshots of what kinds of content get cited, correlations between authority and citation, comparisons of platform behavior at a moment in time. What's missing is the longitudinal before-and-after. We know structured content appears in citations at higher rates than unstructured content. We don't yet have published proof that restructuring an existing page produces the same gain. The programs the field recommends are built on the best available evidence, and that evidence supports them, but the gap between "well-supported by correlational research" and "demonstrably causes the improvement in controlled conditions" is real, and it hasn't been closed.
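
To make the distinction concrete, here is a minimal sketch of the comparison a controlled intervention study would report, set against what a typical case study reports. It assumes hypothetical citation tallies for a group of restructured pages and a comparable untouched group, sampled with the same prompts before and after the change. Every name and number is illustrative; none of it comes from a published study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Citation tally for one group of pages in one period (hypothetical data)."""
    prompts_run: int  # prompts sampled in the period
    cited: int        # prompts whose answer cited a page in the group

    @property
    def rate(self) -> float:
        return self.cited / self.prompts_run


def naive_lift(before: Sample, after: Sample) -> float:
    """Change in citation rate for one group: what a before-and-after case study shows."""
    return after.rate - before.rate


def diff_in_diff(treated_before: Sample, treated_after: Sample,
                 control_before: Sample, control_after: Sample) -> float:
    """Treated change minus control change, removing movement both groups share."""
    return (naive_lift(treated_before, treated_after)
            - naive_lift(control_before, control_after))


# Illustrative numbers only.
treated_before = Sample(prompts_run=200, cited=20)  # 10% cited
treated_after = Sample(prompts_run=200, cited=36)   # 18% cited
control_before = Sample(prompts_run=200, cited=22)  # 11% cited
control_after = Sample(prompts_run=200, cited=32)   # 16% cited

naive = naive_lift(treated_before, treated_after)
adjusted = diff_in_diff(treated_before, treated_after, control_before, control_after)

# Both values are differences in rates, so read them as percentage points.
print(f"naive lift:   {naive:+.1%}")
print(f"diff-in-diff: {adjusted:+.1%}")
```

With these made-up numbers the restructured pages gain 8 percentage points, but the untouched pages gain 5 over the same period, so only about 3 points can plausibly be credited to the change. A real study would also need enough prompts and pages to separate that remainder from noise, which this sketch does not attempt.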

Attribution to revenue is still unsolved

The conversion data is genuinely compelling. Multiple studies find AI-referred traffic converting at several times the rate of standard organic, roughly four to nine times in the credible range, because buyers who arrive from AI research have already done much of their evaluation and are closer to a decision.

But a conversion rate is not attribution. No independent study has traced the full chain from GEO investment to visibility gain to pipeline to revenue. The chain is plausible; it hasn't been proven from the outside. And the measurement environment makes it harder than it sounds: AI Overview clicks show up in analytics as ordinary organic traffic, with no native way to isolate them, and most AI research never produces a click at all, because the buyer reaches a decision inside the interface and arrives later through branded search or a direct URL. The influence that AI created gets laundered into channels that look familiar, and the session that did the convincing is invisible.

So a company that sees pipeline grow alongside its GEO investment usually can't cleanly attribute it. The correlation is real; the causal link is obscured. Building a rigorous attribution model for AI-influenced pipeline is one of the hardest open problems in the field. The fingerprinting approach the better measurement programs use, inferring AI influence from indirect signals, is the current state of the art, but it's a workaround for the absence of a solution, not the solution itself. It's also why a serious program reports AI visibility as the leading indicator rather than promising a clean last-click number. That discipline is the subject of How We Measure GEO Results.

Four more open questions

Beyond those two, four more questions sit unanswered, and each one bears on a real decision.

Cross-platform mechanics, beyond ChatGPT

Most rigorous citation analysis has studied ChatGPT, because it has the largest share and the most active research community. Perplexity, Gemini, Claude, and Copilot have far thinner coverage. We know from overlap studies that the platforms differ sharply in what they cite; we don't know the mechanics driving those differences, that is, what each platform's source selection actually optimizes for. The platform guidance worth following is the best synthesis of what's known, but it rests on rich data for one engine and thin data for the rest, so any brand allocating across platforms is working with real uncertainty about most of them. This gap is covered in depth in Platform Divergence.

Training data versus real-time retrieval

Every major engine blends two sources: knowledge baked in during training, and content pulled from the web at query time. The weighting between them, and how it shifts by model, query, and topic, isn't publicly documented, and it sits at the center of GEO strategy. If training data dominates, a brand with years of authoritative presence has a structural advantage that's hard to displace. If real-time retrieval dominates, the field is more dynamic and current content can move citations faster. A long-established brand with strong AI visibility might be benefiting from training-data presence, or from current optimization, or its investment might be exactly what's defending it against better-structured challengers. The research can't yet separate those explanations.

International GEO

Nearly every finding in circulation is English-language and US-centric: the citation studies, the platform-behavior analyses, the source-composition research. For a global organization that's not a footnote. Regulatory regimes differ, platform availability varies by market, and language-specific training data likely produces entirely different citation landscapes. A result that holds for English-language queries on one engine may not hold in German or Japanese on another. Extrapolating to non-English markets may be the right bet, but it hasn't been validated.

Agentic GEO

The largest potential shift is also the least studied. When an AI agent researches on a buyer's behalf, does it read web content the way an assistant answering a person does? Does it cite at all, or query structured data and APIs directly? The answers reshape everything downstream. If agents use the same citation mechanics, today's programs scale into the agentic layer naturally. If they bypass citation and talk to product feeds and machine-readable specs, then measurement, content priorities, and optimization targets all shift toward data architecture rather than narrative content. The current paradigm assumes a human reading a sourced answer; if the audience becomes another machine, visibility means something different, and the evidence base has to be built again from scratch.

What to do with the uncertainty

The honest response to these gaps isn't doubt; it's calibration. The evidence that does exist is more than enough to justify serious investment: platform divergence is well-replicated, the structured-content advantage is consistent across independent studies, the technical-accessibility requirements aren't optional, and the B2B conversion premium holds across methods. None of that depends on the missing longitudinal data.

The right move is to build the practices that will eventually fill the gaps. Run your program with enough rigor to generate your own before-and-after: track what you publish, when, and what citation looked like on either side of it. Build the attribution fingerprinting that connects AI influence to pipeline even without clean causal proof. Do the cross-platform testing the literature hasn't done at scale. Contribute to the evidence base instead of waiting for someone else to produce it.
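
The first of those practices can start as a simple log. The sketch below assumes a hypothetical record of dated prompt checks for one page (did a given engine cite it or not) plus the date the page was changed. The field names, the engine label, and the 28-day window are illustrative choices, not a standard.

```python
from datetime import date, timedelta
from typing import NamedTuple, Optional


class Check(NamedTuple):
    """One logged prompt check for a page (hypothetical schema)."""
    day: date
    engine: str
    cited: bool


def citation_rate(checks: list[Check], engine: str,
                  start: date, end: date) -> Optional[float]:
    """Share of checks on `engine` in [start, end) that cited the page."""
    window = [c for c in checks if c.engine == engine and start <= c.day < end]
    if not window:
        return None  # no data is not the same as a 0% citation rate
    return sum(c.cited for c in window) / len(window)


def before_after(checks: list[Check], engine: str, changed_on: date,
                 window_days: int = 28) -> tuple[Optional[float], Optional[float]]:
    """Citation rate in equal-length windows on either side of the change date."""
    span = timedelta(days=window_days)
    before = citation_rate(checks, engine, changed_on - span, changed_on)
    after = citation_rate(checks, engine, changed_on, changed_on + span)
    return before, after


# Illustrative log: weekly checks on one placeholder engine, page changed on 1 April.
log = [
    Check(date(2025, 3, 5), "engine_a", False),
    Check(date(2025, 3, 12), "engine_a", False),
    Check(date(2025, 3, 19), "engine_a", True),
    Check(date(2025, 3, 26), "engine_a", False),
    Check(date(2025, 4, 2), "engine_a", True),
    Check(date(2025, 4, 9), "engine_a", False),
    Check(date(2025, 4, 16), "engine_a", True),
    Check(date(2025, 4, 23), "engine_a", True),
]

before, after = before_after(log, "engine_a", changed_on=date(2025, 4, 1))
print(f"before: {before}, after: {after}")
```

Running the same function per engine gives a cross-platform view from the same log. A rise on its own is still only directional: logging a few comparable pages that were left untouched shows how much of the movement would have happened anyway.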

The field is roughly where SEO was in the mid-2000s: clearly important, directionally understood, better guided by evidence than instinct, but not yet the science it will become. The teams that treated early SEO as an informed craft, built their own data, and helped define the methodology became the practitioners who set best practice for a generation. The ones who waited for certainty started out behind everyone else's compounding head start.

Where Resonate Labs fits

This is the part of the pitch most vendors skip, and it's the reason to trust the rest of it. Resonate Labs will tell you what the evidence supports, what it doesn't, and where a claim is a bet rather than a finding. That honesty isn't a hedge; it's the whole point. A partner who pretends the open questions are settled is the partner most likely to sell you a number that doesn't survive contact with your own data.

In practice, the work is building the measurement that turns these field-wide gaps into answers for your specific context: tracking your interventions and their before-and-after, fingerprinting AI influence into your pipeline view, and testing each platform rather than assuming ChatGPT's behavior generalizes. The case for investing now is made in The Business Case for GEO; the measurement that keeps it accountable is described in How We Measure GEO Results. If you want a grounded read on where you actually stand, a free AI Visibility Snapshot is the place to start.

Frequently asked questions

Is GEO real, or is it still too early to invest?

It's real and well-supported, and also honestly mid-stage. The findings that justify investing are replicated across independent studies: platform divergence, the advantage of structured content, the technical requirements, and the higher conversion rate of AI-referred traffic. What's missing is the longitudinal proof that a specific intervention causes a specific gain. That gap is a reason to measure carefully, not a reason to wait: the evidence supports acting now, and the brands building their own data are compounding an advantage while the field catches up.

What can go wrong with a GEO engagement?

The honest failure modes are mostly about overclaiming: treating correlational evidence as if it guaranteed a causal result, promising clean revenue attribution the measurement environment can't deliver, and assuming what's true on ChatGPT holds on every other engine. The other common failure is not building the before-and-after measurement that would tell you whether the work is actually moving anything. A program that names these risks and measures against them is far more likely to deliver than one that promises certainty.

Can anyone prove GEO causes pipeline?

Not yet, at least not from the outside. The conversion premium for AI-referred traffic is real, but no independent study has traced the full chain from GEO investment to visibility to pipeline to revenue, and AI's influence tends to launder into organic and direct traffic where it can't be cleanly isolated. The defensible approach is to treat AI visibility as the leading indicator it is, track it rigorously, and connect it to pipeline through fingerprinting rather than claiming a last-click number that doesn't exist.

Is GEO a one-off audit or an ongoing program?

Ongoing, partly because the field itself is still moving. An audit tells you where you stand today, but what the engines cite changes month to month, the open questions are still being answered, and the only way to know what works in your specific context is to run a program rigorous enough to generate your own before-and-after data. The brands that treat it as an ongoing, measured practice are the ones building an advantage that compounds.

Related resources

Does GEO Actually Work? — the evidence side of the same honesty: what GEO demonstrably moves, and where the proof runs out
The Business Case for GEO — the case for investing now, in the language your board evaluates
How We Measure GEO Results — the measurement that turns these field-wide gaps into your own before-and-after data
GEO for Executives — the executive starting point: is it real, can you prove the ROI, and how to pick a proven partner
Platform Divergence — the "cross-platform is thin beyond ChatGPT" gap, in depth
Request an AI Visibility Snapshot — a free, grounded read on where you actually stand today

Next step

See where you actually stand.

Start with a free AI Visibility Snapshot for a no-commitment read on where you stand. The full AI Visibility Crawl is the grounded, no-overclaim measure of how AI describes your company today, scored across every engine:

Where you're visible, cited, or absent across the four engines
What the evidence supports for your category, and what it doesn't
What the first 30 days would realistically move