Technical GEO Readiness Checklist for B2B Brands

Before AI engines can cite your site, they have to be able to read it. This checklist audits the technical foundation: whether AI crawlers can reach your pages, whether they receive real rendered content, and whether that content is structured, marked up, and current enough to be quoted. Resonate Labs runs this audit as the first step of every GEO engagement.

Read before cited

An AI engine can't quote a page it can't read. The technical layer comes before any content work.

Most failures are rendering

The most common reason good content goes uncited is client-side JavaScript that AI crawlers don't execute. Check it first.

Run it on your own site

Every item has a quick way to check it yourself. No tool required; a browser and a terminal are enough.

How to use this

This is a self-audit for your own site, not a vendor evaluation. Work through the six sections in order, because they stack: a page has to be reachable by the crawler before rendering matters, readable before structure matters, and structured before schema or freshness can help. If a page fails the rendering check in section two, the four sections after it are moot until you fix it.

Every item has a quick way to check it, listed under each section's checklist items. Most take a browser and a terminal. Tick what passes, and treat anything you can't tick as a finding to fix or to put in front of whoever owns your site. When you're done, you'll know whether you have a technical problem, a content-structure problem, or both.

1. Crawler access

Can AI crawlers reach your pages at all? This is the first gate, and the easiest one to fail silently through a setting nobody meant to flip.

Your robots.txt explicitly allows the AI crawlers that matter (GPTBot, ClaudeBot, anthropic-ai, PerplexityBot, Google-Extended, and OAI-SearchBot among them), rather than merely not blocking them.
No CDN, WAF, or bot-management rule is silently blocking AI user-agents, including a "block AI bots" or managed-robots toggle at your CDN.
No login wall or paywall sits in front of the content you want cited.

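How to check: read yoursite.com/robots.txt, then fetch a key page with a bot user-agent (curl -i -A "GPTBot" https://yoursite.com/page/; the -i flag prints the status line and headers) and confirm you get a 200 with real HTML, not a block page. Because that request comes from your own IP address, it only exercises rules keyed to the user-agent string, so also check your CDN's bot-management settings for an AI-crawler rule.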
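
The robots.txt part of this check can be scripted. The sketch below is one way to do it with Python's standard urllib.robotparser, assuming you have already saved the text of your robots.txt. The helper name audit_robots and the example URL are illustrative assumptions, and the crawler list is simply the one from the checklist item above. It separates crawlers that are named in their own User-agent group from crawlers that only get through on the wildcard rule, which is the "explicitly allows" distinction. It says nothing about CDN or WAF blocks.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens copied from the checklist item above; extend as needed.
AI_CRAWLERS = [
    "GPTBot", "ClaudeBot", "anthropic-ai",
    "PerplexityBot", "Google-Extended", "OAI-SearchBot",
]


def audit_robots(robots_txt, url="https://yoursite.com/page/"):
    """Classify each AI crawler for one URL.

    audit_robots is a hypothetical helper name, not a library function.
    Returns a dict mapping crawler token -> one of
    "blocked", "named and allowed", "allowed by default".
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    # Collect every token that has its own "User-agent:" line.
    named = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.lower().startswith("user-agent:"):
            named.add(line.split(":", 1)[1].strip().lower())

    report = {}
    for bot in AI_CRAWLERS:
        if not parser.can_fetch(bot, url):
            report[bot] = "blocked"
        elif bot.lower() in named:
            report[bot] = "named and allowed"
        else:
            report[bot] = "allowed by default"
    return report
```

Call audit_robots(text) with the contents of your robots.txt and a URL you want cited. Anything reported as "allowed by default" is reachable only because nothing blocks it, so it is worth adding an explicit group for that crawler.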

2. Rendering and readability

Once a crawler reaches the page, does it receive your actual content? Many AI crawlers don't execute JavaScript, so a page that builds itself client-side serves them an empty shell.

The primary content is present in the raw HTML response, not injected after load by client-side JavaScript.
Priority pages are server-side-rendered, pre-rendered, or served as static HTML, so a non-JavaScript crawler gets the same content a visitor does.

How to check: curl the page, or view the page source (or disable JavaScript) in a browser, and confirm your headings and body copy are in the source. If the content disappears, it's being assembled with JavaScript the crawler won't run, and the page is effectively blank to it.
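
If you want to repeat the view-source check across many pages, it can be approximated in a few lines. This is a minimal sketch using only Python's standard html.parser, assuming you pass it the raw HTML a crawler would receive (for example, the output of curl). The function name missing_phrases is an illustrative assumption. It ignores text inside script and style tags, so a phrase that only appears inside a JavaScript bundle counts as missing, which is how a non-JavaScript crawler would experience it.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    SKIPPED = ("script", "style")

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def missing_phrases(raw_html, phrases):
    """Return the phrases that do NOT appear in the page's non-script text.

    missing_phrases is a hypothetical helper name. Matching is
    case-insensitive and whitespace is collapsed before comparing.
    """
    extractor = _TextExtractor()
    extractor.feed(raw_html)
    extractor.close()
    text = " ".join(" ".join(extractor.chunks).split()).lower()
    return [p for p in phrases if " ".join(p.split()).lower() not in text]
```

Pass in a heading and a sentence or two of body copy from each priority page. An empty list means the content is in the raw response; anything returned is content the crawler never sees.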

3. Extractable structure

A readable page still has to be structured so a model can lift a clean, self-contained passage out of it. Walls of text that only make sense in context are hard to cite.

One <h1> per page, with a logical <h2> and <h3> hierarchy beneath it.
A direct, self-contained answer to the page's main question near the top, not buried below the fold.
Key facts in lists or tables, not stranded in the middle of long paragraphs.
Headings phrased like the questions your buyers actually ask.

How to check: read one section out of context. If it stands on its own as an answer, it will extract cleanly; if it only makes sense after the paragraphs above it, rewrite it to be self-contained.
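
Reading a section out of context is a human judgment, but the heading item at the top of this list is mechanical and can be checked in code. The sketch below, again using only Python's standard html.parser, flags pages that don't have exactly one <h1> or that skip a heading level (an <h3> directly under an <h1>, for instance). The name heading_issues and the wording of the messages are illustrative assumptions, and "no skipped levels" is one reasonable reading of "logical hierarchy", not a formal rule.

```python
from html.parser import HTMLParser


class _HeadingCollector(HTMLParser):
    """Record heading levels (1-6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_issues(html):
    """Return a list of human-readable problems with the heading outline.

    heading_issues is a hypothetical helper name. An empty list means
    one <h1> and no skipped levels.
    """
    collector = _HeadingCollector()
    collector.feed(html)
    collector.close()

    issues = []
    h1_count = collector.levels.count(1)
    if h1_count != 1:
        issues.append(f"expected exactly one <h1>, found {h1_count}")

    previous = 0  # 0 means "no heading seen yet"
    for level in collector.levels:
        if level > previous + 1:
            if previous == 0:
                issues.append(f"outline starts at <h{level}> instead of <h1>")
            else:
                issues.append(f"<h{level}> follows <h{previous}>, skipping a level")
        previous = level
    return issues
```

Run it on the same raw HTML you used for the rendering check, so you are auditing the outline a crawler receives, not the one your browser builds.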

4. Structured data and schema

Schema helps an engine understand what a page is and which entities it covers. It's worth doing, but it's necessary rather than sufficient: markup won't rescue a page that fails the three sections above.

Relevant schema types are present and valid: Organization, Article or WebPage, BreadcrumbList, FAQPage where you have an FAQ, and Product or Service where they apply.
The markup matches the visible content, with no schema describing things that aren't actually on the page.

How to check: run the page through a schema or rich-results validator and confirm the types parse and describe what's really on the page. Don't expect schema alone to move citation, and don't expect classic authority signals to do it either: in Profound's analysis of more than 50,000 prompts, organic traffic explained only about 5% of AI citation and backlinks under 4%. Being readable and extractable comes first.
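
Before you reach for a validator, a quick inventory of what a page declares can be useful. The following is a rough sketch, not a validator: it pulls JSON-LD blocks out of raw HTML with Python's standard library, counts the ones that fail to parse as JSON, and lists every @type it finds, including types nested under @graph. The name jsonld_types is an illustrative assumption. It does not read microdata or RDFa, and it does not check the markup against schema.org or against the visible content.

```python
import json
from html.parser import HTMLParser


class _JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data


def _collect_types(node, found):
    """Walk parsed JSON and add every @type string to the found set."""
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            found.add(declared)
        elif isinstance(declared, list):
            found.update(t for t in declared if isinstance(t, str))
        for value in node.values():
            _collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            _collect_types(item, found)


def jsonld_types(html):
    """Return (set of @type names, number of JSON-LD blocks that failed to parse).

    jsonld_types is a hypothetical helper name.
    """
    collector = _JsonLdCollector()
    collector.feed(html)
    collector.close()

    found, invalid = set(), 0
    for block in collector.blocks:
        try:
            _collect_types(json.loads(block), found)
        except json.JSONDecodeError:
            invalid += 1
    return found, invalid
```

Compare the returned set against the types you expect for that page, and treat any non-zero invalid count as a finding. The second half of the check, whether the markup matches what is visibly on the page, still needs a person.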

5. Freshness

AI citation favors content that's kept current, and stale pages get displaced over time. Freshness is a maintenance discipline, not a one-time fix, and the signal has to be honest.

A visible "last updated" date on the page, matched by an accurate dateModified in the schema.
Sitemap lastmod values reflect real edits, rather than every page claiming today's date.
A maintenance cadence keeps your priority pages current.

How to check: spot-check a page's dateModified against the last real content change. If everything claims to have been updated today, the freshness signal is noise, and crawlers learn to discount it.
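
The "everything claims today's date" pattern is easy to spot in a sitemap. Here is a small sketch, assuming a standard urlset sitemap in the sitemaps.org namespace and Python's standard xml.etree.ElementTree. The names lastmod_counts and dominant_date_share are illustrative assumptions. It tallies lastmod values by calendar date and reports what share of URLs carry the single most common date. It cannot tell you whether any individual date is truthful; that still takes the spot-check described above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def lastmod_counts(sitemap_xml):
    """Count URLs per lastmod calendar date (YYYY-MM-DD).

    URLs without a lastmod value are counted under "missing".
    lastmod_counts is a hypothetical helper name.
    """
    root = ET.fromstring(sitemap_xml)
    counts = Counter()
    for url in root.findall("sm:url", SITEMAP_NS):
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS)
        # Keep only the date part of values like 2025-01-15T08:00:00+00:00.
        counts[(lastmod or "").strip()[:10] or "missing"] += 1
    return counts


def dominant_date_share(counts):
    """Return (most common date, share of URLs stamped with it)."""
    total = sum(counts.values())
    if total == 0:
        return None, 0.0
    date, n = counts.most_common(1)[0]
    return date, n / total
```

A share close to 1.0 on a site with more than a handful of pages suggests the sitemap is restamping everything on each build. What threshold counts as suspicious is a judgment call that depends on how often you really publish.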

6. Discoverability and hygiene

Finally, the basics that let crawlers find and trust your pages. These are the easiest to get right and the easiest to break in a migration.

A clean XML sitemap listing your canonical URLs.
Canonical consistency: one URL form per page, with trailing-slash discipline so you aren't splitting signals across two versions of the same page.
No accidental noindex on pages you want cited.
Fast, stable responses to crawlers, without aggressive rate-limiting or timeouts that turn bots away.

How to check: load /sitemap.xml and confirm it lists the right URLs, check that canonical tags are consistent, and grep your templates for a stray noindex. If you're handing these requirements to a vendor, the GEO vendor RFP and scorecard turns this checklist into criteria you can grade them against.
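
Grepping templates catches a noindex in source, but not one that only shows up in the rendered page. As a complement, the canonical and noindex checks can be run against the HTML a page actually serves. This is a minimal sketch using Python's standard html.parser; the name hygiene_issues and the message wording are illustrative assumptions. It looks only at the robots meta tag and the canonical link in the HTML, so a noindex sent in an X-Robots-Tag response header is outside its view.

```python
from html.parser import HTMLParser


class _HeadSignals(HTMLParser):
    """Collect canonical link hrefs and robots meta directives."""

    def __init__(self):
        super().__init__()
        self.canonicals = []
        self.robots = []

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonicals.append(attr.get("href", "").strip())
        elif tag == "meta" and attr.get("name", "").strip().lower() == "robots":
            self.robots.append(attr.get("content", "").lower())


def hygiene_issues(html, url):
    """Return problems found for a page served at `url`.

    hygiene_issues is a hypothetical helper name. An empty list means
    no noindex and exactly one canonical tag that matches `url`.
    """
    signals = _HeadSignals()
    signals.feed(html)
    signals.close()

    issues = []
    if any("noindex" in directive for directive in signals.robots):
        issues.append("robots meta tag contains noindex")

    if len(signals.canonicals) != 1:
        issues.append(f"expected one canonical tag, found {len(signals.canonicals)}")
    elif signals.canonicals[0] != url:
        canonical = signals.canonicals[0]
        if canonical.rstrip("/") == url.rstrip("/"):
            issues.append(f"canonical differs only by trailing slash: {canonical}")
        else:
            issues.append(f"canonical points to a different URL: {canonical}")
    return issues
```

A canonical that points elsewhere is not always wrong, since some pages are deliberately consolidated, so treat that message as a prompt to review and not as a failure.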

Where Resonate Labs fits

Resonate Labs runs this audit as the first step of a GEO engagement. Before any content work, we make sure AI crawlers can reach your pages, that they receive real rendered content, and that the content is structured and current enough to be cited. The monthly AI visibility audit then tracks whether those changes are moving citation across ChatGPT, Claude, Gemini, and Perplexity. The reasoning behind the checklist is in why a site that ranks on Google can be invisible to AI.

It's a standard we hold ourselves to. This site passes its own checklist: it's served as static, server-rendered HTML with an explicit AI-crawler allow-list in robots.txt, schema on every page, and a sitemap that only restamps pages that actually changed. If you'd rather not run the audit yourself, an AI Visibility Snapshot shows what AI engines can and can't see on your site today.

Frequently asked questions

What technical requirements does a B2B site need to be cited by AI?

Six things, and they stack in order. First, crawler access: your robots.txt and CDN have to let AI crawlers like GPTBot, ClaudeBot, and PerplexityBot reach your pages. Second, rendering: those crawlers have to receive real HTML, not a JavaScript shell they can't execute. Third, extractable structure: clean headings and self-contained passages a model can lift into an answer. Fourth, structured data and schema, which help an engine understand the page but don't rescue one that fails the first three. Fifth, freshness: current pages with accurate update dates. Sixth, discoverability: a clean sitemap, consistent canonicals, and no accidental noindex. Being readable and extractable matters more than schema or domain authority.

How do I check whether AI crawlers can actually read my site?

Fetch one of your key pages with a bot user-agent and read what comes back. From a terminal, request the page as GPTBot or ClaudeBot and confirm you get a 200 with your real headings and body copy in the raw HTML, not a block page and not an empty app shell. You can do the same in a browser by viewing the page source or disabling JavaScript: if your content disappears, it's being assembled client-side with JavaScript that many AI crawlers don't execute, which means they see an empty page no matter how well it ranks on Google. This single check catches the most common reason good content goes uncited.

Is schema markup enough to get cited by AI?

No. Schema is necessary but not sufficient. It helps an engine understand what a page is and which entities it covers, but it can't rescue a page that a crawler can't read or can't extract a clean passage from. The classic authority signals don't carry the day either: in Profound's analysis of more than 50,000 prompts, organic traffic explained only about 5% of AI citation and backlinks under 4%. What moves citation most is being readable, then extractable, then supported by appropriate schema and kept current. Add schema after the page is readable and well-structured, not instead of it.

Should we run this audit ourselves or hire a GEO vendor?

Run the checklist yourself first. Most items take a browser and a terminal to verify, and the results tell you whether you have a technical problem, a content-structure problem, or both. If you decide to bring in a vendor, turn these requirements into the criteria you hold them to: the GEO vendor RFP and scorecard does exactly that. Resonate Labs runs this audit as the first step of a GEO engagement, fixing crawler access, rendering, and structure before any content work, because content can't earn citations on a site AI engines can't read.

Related resources

Why a Site That Ranks on Google Can Be Invisible to AI — the reasoning behind this checklist: rendering, structure, and freshness, with a render-fix case study
GEO for Digital Marketing Leads — the role starting point for the hands-on technical lead: crawlability, what gets cited, tools versus agencies
GEO Vendor RFP & Scorecard — take these requirements to a vendor: an RFP template, scorecard, and vetting questions
How We Measure GEO Results — the four metrics, scored every 30 days across Gemini, Claude, ChatGPT, and Perplexity
What is GEO? — the fundamentals: what Generative Engine Optimization is and how it differs from SEO
Request an AI Visibility Snapshot — a free, fast diagnostic of how AI engines describe your company today, and the first step if you want to go further

Next step

Find out what AI can actually see.

Start with a free AI Visibility Snapshot for a no-commitment read on where you stand. The full AI Visibility Crawl shows whether AI crawlers can read your site at all and where that leaves you, scored across every engine:

Whether AI crawlers can read your pages today
Where you're visible, cited, or absent across the four engines
What the first 30 days would move