How to Get Your Business Cited by ChatGPT, Claude and Perplexity: The 2026 GEO Playbook

A Prague consultant emailed us last week. She had asked ChatGPT for "the best web development agency in Prague." Three other agencies came back. We didn't.

We have been a Prague web agency since 2016. We have 100+ projects shipped, public pricing, real reviews, and our work runs on the front page of Czech tech in three languages. ChatGPT had never heard of us.

That is how we got serious about Generative Engine Optimization. Six weeks later, our composite GEO score moved from 60 to a projected 82. Here is exactly what we did, in case you are staring down the same problem.

Generative Engine Optimization (GEO) is the practice of making your website findable and citable by AI assistants - ChatGPT, Claude, Perplexity, Gemini, Bing Copilot, and Google's AI Overviews. Unlike SEO, which optimizes for ranking in the blue links, GEO optimizes for being quoted directly inside an AI answer. In 2026 the two are converging, but the signals are not identical. Here is what changes, what to fix first, and what we did to our own site.

What GEO actually is

For twenty years SEO was a game played on one board: rank higher in Google, win more clicks. The searcher leaves Google for your site. They convert there.

That board now has a second layer on top of it. When someone asks ChatGPT "best web agency in Prague," the model writes a sentence or two and lists 2 or 3 names with citations. The searcher reads the answer; many never click through. The conversion happens inside the AI assistant - or it does not happen at all.

The job is no longer to be on the first page of Google. The job is to be the source the AI quotes. That is Generative Engine Optimization.

Dimension	Traditional SEO	Generative Engine Optimization
Goal	Rank on page 1	Be cited in the AI's answer
Primary signal	Backlinks + on-page content	Citable passages + brand entity + structured data
Optimization unit	The URL	The passage (a paragraph, a list, a table)
Success metric	CTR + conversions from organic	Citation share across ChatGPT, Claude, Perplexity, Gemini

Both still matter in 2026. AI assistants are not yet most people's primary search interface, but the curve is steep. Pew, SimilarWeb and Adobe all measured roughly 3-4x growth in AI search usage over the last twelve months. For anyone whose ideal customer is technically literate - which describes most B2B buyers - AI search is already a meaningful chunk of intent.

How AI assistants index and cite businesses

Two pipelines decide whether an AI knows your business.

The first is training-data crawl. Bots like GPTBot, ClaudeBot and CCBot crawl the open web, and what they collect becomes part of what the underlying model "knows." (Google-Extended is not a crawler of its own; it is a robots.txt token that controls whether pages Google has already crawled may be used for Gemini.) These crawls happen on the model's own schedule - months between updates is normal. Anything you publish today will not be inside the model until the next training run.

The second is retrieval-time grounding. When you ask ChatGPT a question, it does not just rely on its weights - it can also fetch live web results through a sister bot. ChatGPT uses OAI-SearchBot and ChatGPT-User. Perplexity uses PerplexityBot and Perplexity-User. Claude uses Claude-User and Claude-SearchBot for live fetches and web search (ClaudeBot is Anthropic's training crawler). Google's AI Overviews use Googlebot itself. These bots fetch pages in seconds when a real user asks a real question.

Both pipelines need access to your site. If your robots.txt blocks them, you are invisible. The same goes if it says nothing and a strict parser treats silence as "do not crawl," even though RFC 9309 reads silence as "allowed."

User-agent	Owner	Purpose
GPTBot	OpenAI	Training crawl
ChatGPT-User	OpenAI	Live fetch when user pastes a URL
OAI-SearchBot	OpenAI	ChatGPT web search grounding
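ClaudeBot	Anthropic	Training crawl (live fetch and search use Claude-User and Claude-SearchBot)
PerplexityBot	Perplexity	Search index crawl
Perplexity-User	Perplexity	Live retrieval
Google-Extended	Google	Gemini training and grounding (robots.txt token, not a separate crawler)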
CCBot	Common Crawl	Feeds most open-source LLMs
Applebot-Extended	Apple	Apple Intelligence training
Meta-ExternalAgent	Meta	Llama training
Amazonbot	Amazon	Alexa, Rufus
Bytespider	ByteDance	Doubao training
MistralAI-User	Mistral AI	Live fetch
YouBot	You.com	Training + retrieval

All fourteen belong in your robots.txt - explicitly. The audit we ran on our own site found three of them silently blocked through inheritance, which we could not see until we tested with each user-agent header by hand. If your developer says "they fall under User-agent: *," check with a fresh curl before believing it.

The 9-step GEO playbook

1. Are you letting the right AI bots crawl your site?

Open your robots.txt and confirm explicit Allow groups for the fourteen bots above. Under RFC 9309 a crawler obeys the most specific group that names it and falls back to User-agent: * only when no group does, but not every crawler handles that fallback the same way (OAI-SearchBot is one we treat as strict). Be explicit.
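A minimal sketch of generating those explicit groups in Python, in case you would rather build the file than hand-edit it. The function name build_robots_txt and the sitemap URL are placeholders we made up for illustration; the bot list is the table above.

```python
AI_BOTS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot",
    "PerplexityBot", "Perplexity-User", "Google-Extended", "CCBot",
    "Applebot-Extended", "Meta-ExternalAgent", "Amazonbot",
    "Bytespider", "MistralAI-User", "YouBot",
]


def build_robots_txt(bots, sitemap_url):
    """Return robots.txt text with one explicit Allow group per bot."""
    groups = [f"User-agent: {bot}\nAllow: /\n" for bot in bots]
    # Wildcard fallback for every crawler not named above.
    groups.append("User-agent: *\nAllow: /\n")
    groups.append(f"Sitemap: {sitemap_url}\n")
    return "\n".join(groups)


if __name__ == "__main__":
    # Hypothetical sitemap location; replace with your own.
    print(build_robots_txt(AI_BOTS, "https://yoursite.com/sitemap.xml"))
```

This allows everything for every bot. If you have paths to keep private, add Disallow lines inside each group rather than relying on the wildcard group, since a named group replaces the wildcard for that bot.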

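Test by hand: curl -A "GPTBot" https://yoursite.com/robots.txt. Then do the same for ClaudeBot and PerplexityBot. Each request should return a 200 and a file with a clean Allow group for that bot; a 403 or a challenge page means your firewall or CDN is rejecting the bot before robots.txt even matters. Google-Extended is a robots.txt token rather than a user-agent header, so check its group in the file itself. A Cloudflare-managed default "ai-train=no" signal can silently contradict your overrides - fix that in the Cloudflare dashboard if you see it.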
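Once you have the file, you can also check the rules themselves offline. The sketch below uses Python's standard urllib.robotparser against a sample file; SAMPLE_ROBOTS and allowed_bots are names invented for this example. Treat the result as one parser's reading, not as proof of how any particular AI crawler behaves, and note that it says nothing about firewall blocks.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file: GPTBot is named explicitly, everyone else is disallowed.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /
"""


def allowed_bots(robots_text, bots, url="https://yoursite.com/"):
    """Map each bot name to True/False for fetching `url` under robots_text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}


if __name__ == "__main__":
    for bot, ok in allowed_bots(SAMPLE_ROBOTS, ["GPTBot", "ClaudeBot"]).items():
        print(f"{bot}: {'allowed' if ok else 'BLOCKED'}")
```

In this sample GPTBot is allowed and ClaudeBot falls through to the wildcard group and is blocked, which is exactly the kind of inheritance surprise worth catching before a crawler does.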

2. Do you have an llms.txt - and an llms-full.txt?

llms.txt is an emerging convention (llmstxt.org). It is a short, structured summary of your site, written for AI ingestion. llms-full.txt is the longer companion: full inline content of your top pages, sectioned by URL. Both live in your site's static root (/public/ in frameworks such as Nuxt or Next.js) and are served at yoursite.com/llms.txt and yoursite.com/llms-full.txt.

Ours: kosmoweb.cz/llms.txt. It took ninety minutes to write and is the single highest-ROI file on the site.
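For a sense of the shape, here is a small Python sketch that renders the basic llms.txt layout described at llmstxt.org: an H1 name, a blockquote summary, then H2 sections of links. The function render_llms_txt and the example agency, URL and notes are all hypothetical, and this is not how our own file was produced.

```python
def render_llms_txt(name, summary, sections):
    """Render llms.txt text: H1 name, blockquote summary, H2 link lists.

    sections: {heading: [(link_title, url, note), ...]}
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "Example Agency",
        "Fixed-price web agency in Prague.",
        {"Services": [
            ("Web development", "https://example.com/web", "Scope and pricing"),
        ]},
    ))
```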

3. Is your structured data complete enough for AI grounding?

AI models trust Schema.org. Eight types do most of the work:

Organization - with @id, founder reference, hasOfferCatalog, sameAs, areaServed, knowsAbout
Person - the founder or main author, referenced by @id across every post. One source of truth for E-E-A-T.
FAQPage - on every page that has a FAQ block in the DOM
HowTo - on any guide-style post
Service - on every service page, with areaServed
Article + Speakable - on every blog post
BreadcrumbList - on every page
WebSite - with a publisher reference to the Organization @id

Validate each against the Schema.org validator and Google's Rich Results Test before shipping. A schema with errors is worse than no schema.
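The @id references are the part people most often get wrong, so here is a minimal sketch in Python of an Organization, Person and WebSite that point at each other by @id. The domain, names, sameAs URL and the helper to_script_tag are placeholders for illustration; the property names are standard Schema.org, and the output should still go through the validators mentioned above.

```python
import json

BASE = "https://example.com"  # hypothetical domain
ORG_ID = f"{BASE}/#organization"
FOUNDER_ID = f"{BASE}/#founder"

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Organization",
            "@id": ORG_ID,
            "name": "Example Agency",
            "url": BASE,
            "founder": {"@id": FOUNDER_ID},
            "areaServed": "Prague",
            "knowsAbout": ["Web development", "Generative Engine Optimization"],
            "sameAs": ["https://www.linkedin.com/company/example-agency"],
        },
        {
            "@type": "Person",
            "@id": FOUNDER_ID,
            "name": "Jane Doe",
            "worksFor": {"@id": ORG_ID},
        },
        {
            "@type": "WebSite",
            "@id": f"{BASE}/#website",
            "url": BASE,
            # Reference the Organization by @id instead of repeating it.
            "publisher": {"@id": ORG_ID},
        },
    ],
}


def to_script_tag(data):
    """Serialize a JSON-LD dict into the script tag that goes in <head>."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    print(to_script_tag(graph))
```

Every blog post can then reference the same Person @id as its author, which is what keeps the entity consistent across the site.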

4. Is your content actually quotable?

AI assistants do not quote hero taglines. They quote:

Direct-answer paragraphs immediately under each H2 (40-60 words)
Specific numbers. "12,000 CZK" beats "affordable"
Comparison tables
Definition boxes
Numbered lists with concrete steps
The "honest admission" - content that argues against its own commercial interest. AI assistants seem to be getting better at weighting it as more trustworthy.

This post itself is engineered that way. The list above is more likely to end up in a Perplexity answer than the paragraph that introduces it.

5. Does the AI know who you are as an entity?

Brand entity = Wikidata Q-item plus consistent sameAs links plus consistent NAP (name, address, phone). Without a Wikidata entity, AI assistants have trouble linking your domain to a stable identity, which limits how confidently they will cite you.

Build the entity by: creating a Wikidata Q-item for your business (free, ~30 minutes); claiming profiles on Clutch.co, DesignRush, TechBehemoths, Sortlist, GoodFirms; getting mentions (not just links) on Reddit, HN, YouTube and industry press; expanding your Organization schema sameAs array as each platform is claimed.

This is the biggest lever on AI citation share. Most businesses skip it because it looks like marketing, not engineering.

6. Are your H2s phrased like questions people ask?

Passage-level retrieval - what Perplexity does - pulls an H2 plus the next 1-3 paragraphs and matches them against the user's query. A statement H2 ("Pricing") has nothing to match. A question H2 ("How much does a website cost?") matches the entire passage to anyone asking that question.

Audit your top five pages. Rewrite half of the statement H2s into question form. Under each, add a 40-60-word direct answer. This is the reason we have a blog post called "How much does a website cost in 2026?" instead of a page titled "Pricing."
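The audit is easy to script once you have each page's H2s and the paragraph that follows them. This Python sketch assumes you have already extracted those pairs; audit_sections and its field names are invented for the example, and the 40-60 word range is the one suggested above.

```python
def audit_sections(sections, min_words=40, max_words=60):
    """Report, per H2, whether it is a question and whether the first
    paragraph under it falls inside the direct-answer word range.

    sections: list of (h2_text, first_paragraph_text) pairs.
    """
    report = []
    for heading, first_paragraph in sections:
        words = len(first_paragraph.split())
        report.append({
            "heading": heading,
            "is_question": heading.strip().endswith("?"),
            "answer_words": words,
            "answer_in_range": min_words <= words <= max_words,
        })
    return report


if __name__ == "__main__":
    pages = [
        ("Pricing", "From 12,000 CZK."),
        ("How much does a website cost?", " ".join(["word"] * 45)),
    ]
    for row in audit_sections(pages):
        print(row)
```

A trailing question mark is a crude test, but it is enough to surface the statement H2s worth rewriting.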

7. Is your content fresh in the way AI checks?

AI assistants weight freshness - but they check three signals, not one:

<lastmod> in your sitemap (per page)
dateModified in Article schema
Visible "Updated" date on the page itself

All three must agree. A common bug we see: the sitemap shows a lastmod from six months ago because the build pins it, while the post says "Updated last week." A crawler that sees the mismatch has no reason to trust the newer date.
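A rough way to catch that mismatch before a crawler does, sketched in Python with only the standard library. It assumes you can hand it the sitemap XML, the page's Article JSON-LD as a string, and the visible date in ISO form; the function names and example URL are hypothetical.

```python
import json
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_lastmod(sitemap_xml, page_url):
    """Return the <lastmod> date for page_url, or None if it is missing."""
    root = ET.fromstring(sitemap_xml)
    for url in root.findall("sm:url", SITEMAP_NS):
        if url.findtext("sm:loc", namespaces=SITEMAP_NS) == page_url:
            lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
            # Keep only the YYYY-MM-DD part of a W3C datetime.
            return date.fromisoformat(lastmod[:10]) if lastmod else None
    return None


def freshness_dates(sitemap_xml, page_url, article_json_ld, visible_date):
    """Collect the three dates; they agree when all values are equal."""
    modified = json.loads(article_json_ld)["dateModified"]
    return {
        "sitemap": sitemap_lastmod(sitemap_xml, page_url),
        "schema": date.fromisoformat(modified[:10]),
        "visible": date.fromisoformat(visible_date),
    }


def dates_agree(dates):
    """True when the sitemap, schema and visible dates are identical."""
    return len(set(dates.values())) == 1
```

Run it in your build or CI against a handful of recently edited posts; a False from dates_agree usually points at a pinned lastmod.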

Bonus: enable IndexNow on Cloudflare. It is a one-click toggle in the Cloudflare dashboard (the Crawler Hints setting) that pushes updates to Bing, Yandex and other participating search engines within seconds of publish.

8. How do you measure GEO?

Free first: build a manual prompt panel. Pick 10 target queries ("best web agency in Prague," "how much does an MVP cost," "what is generative engine optimization"). Ask them weekly across ChatGPT, Claude, Perplexity, Gemini and Copilot. Record which sources each cites. Twenty minutes a week.
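If you log the panel in a spreadsheet, the citation share per platform is a short calculation. A minimal Python sketch, assuming each row records the platform, the query and the domains the answer cited; citation_share and the sample data are made up for illustration.

```python
from collections import defaultdict


def citation_share(observations, domain):
    """Per platform, the fraction of panel queries whose answer cited `domain`.

    observations: iterable of (platform, query, cited_domains) tuples.
    """
    asked = defaultdict(int)
    cited = defaultdict(int)
    for platform, _query, cited_domains in observations:
        asked[platform] += 1
        if domain in cited_domains:
            cited[platform] += 1
    return {platform: cited[platform] / asked[platform] for platform in asked}


if __name__ == "__main__":
    panel = [
        ("Perplexity", "best web agency in Prague", ["example.com", "other.com"]),
        ("Perplexity", "how much does an MVP cost", ["other.com"]),
        ("ChatGPT", "best web agency in Prague", []),
    ]
    print(citation_share(panel, "example.com"))
```

Tracking the same ten queries every week matters more than the tooling; the number is only comparable if the panel stays fixed.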

Tooling for scale: Profound, Otterly.AI, AI Brand Lens, Peec AI - all in the 100-500 USD/month range, all give citation share metrics per platform. We compared all four head-to-head in our 2026 AI tracking tools comparison - including when the free DIY method beats every paid option.

Indirect signals: referral traffic from chatgpt.com (formerly chat.openai.com), perplexity.ai and copilot.microsoft.com in your analytics. These referrers are real and growing.
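If your analytics tool does not group these for you, a few lines of Python over exported Referer values will. The hostname map below covers the referrers named above; chatgpt.com is ChatGPT's current domain, and www.perplexity.ai is a variant we are assuming you may see. count_ai_referrals is a name invented for this sketch.

```python
from collections import Counter
from urllib.parse import urlparse

# Hostname -> assistant. Extend this as new referrers show up in your data.
AI_REFERRER_HOSTS = {
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "www.perplexity.ai": "Perplexity",
    "copilot.microsoft.com": "Copilot",
}


def count_ai_referrals(referrer_urls):
    """Count visits per AI assistant from a list of Referer header values."""
    counts = Counter()
    for ref in referrer_urls:
        host = (urlparse(ref).hostname or "").lower()
        assistant = AI_REFERRER_HOSTS.get(host)
        if assistant:
            counts[assistant] += 1
    return counts
```

Expect undercounting: many AI assistant visits arrive with no referrer at all, so read this as a floor, not a total.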

9. Are you treating ChatGPT, Perplexity, Gemini and Copilot as separate channels?

Each weights signals differently:

ChatGPT favors Wikipedia, news sites and official sources. Strong brand entities win.
Perplexity is the most citation-heavy - it lists 3-7 sources per answer. Direct-answer paragraphs plus question-form H2s win.
Gemini draws from Google's index - traditional SEO plus Google-Extended access matters more here.
Copilot leans on Bing. Bing Webmaster Tools verification plus IndexNow are unusually important.

You cannot optimize one platform without the others, but you can lose one badly through neglect. The most common own-goal we see: blocking Google-Extended in robots.txt. Google documents that token as controlling whether your content is used to train Gemini and to ground Gemini answers, so the block cuts you out of Gemini while doing nothing for privacy (Googlebot still crawls you for Search).

Case study: Kosmoweb's own GEO transformation

Honesty hour. When we ran the audit on our own site in May 2026, we scored 60/100. Composite breakdown:

AI Citability: 76 - strong
Technical GEO: 84 - strong
Brand Authority: 18 - critical weakness
Content E-E-A-T: 66
Schema completeness: 62
Platform readiness: 58

We had strong content and a technically clean site. We had near-zero brand entity outside the Czech Republic.

The fix list, with hours invested and projected GEO point gains:

Fix	Effort	Projected delta
Remove self-collected AggregateRating from schema (manual-action risk)	30 min	No score change; removes one risk
Fix invalid postalCode "10000" → "184 00"	5 min	+2 (Knowledge Graph grounding)
Add Allow groups for CCBot, Applebot-Extended, Meta-ExternalAgent, Amazonbot, Bytespider, MistralAI-User, YouBot	20 min	+4 (Platform readiness)
Add llms-full.txt inlining top 5 pages	45 min	+3
Add FAQPage schema everywhere FAQ DOM exists	2 hrs	+5
Add hasOfferCatalog to Organization	1 hr	+4 (pricing intent queries)
Promote inline blog author to referenced Person @id	30 min	+3 (E-E-A-T consolidation)
Add Speakable to Article schema	10 min	+1
Dynamic sitemap lastmod	15 min	+2 (freshness)
Dual-type Organization as [Organization, ProfessionalService]	5 min	+1
Founder Person schema with knowsAbout + sameAs + knowsLanguage	20 min	+3 (entity grounding)

Projected new composite: 82. We will publish the actually measured delta at the 30-day mark, including the parts that did not go as planned.

What we deliberately deferred to phase two:

Wikidata entity (4 hours, single biggest move on Brand Authority - scheduled for next week)
Clutch / DesignRush / TechBehemoths / Sortlist claims (3 hours total, ~6 weeks for moderation)
YouTube channel with three case-study walkthroughs (one production week)
Author archive page (/autor/david-azarian): 3 hours dev plus 2 hours bio writing

We publish this transparently because the "honest admission" is one of the most-trusted forms of content for human readers - and AI assistants seem to be learning to weight it the same way.

Five GEO mistakes that get businesses ignored

1. Stuffing keywords into content in the hope of landing in AI answers. AI models are aggressively detecting and downweighting this. The fix is not "fewer keywords" but "more specific facts." Replace "best web agency in Prague" with "fixed-price web agency, 100+ projects shipped, 95% on-time delivery."

2. Self-collected AggregateRating schema. We did this until last week. Since 2019 Google has treated first-party ("self-serving") reviews on LocalBusiness or Organization schema as ineligible for review rich results, and markup that breaks its structured data guidelines can trigger a manual action. Source reviews from a third-party platform (Firmy.cz, Heureka, Google Reviews) and reference the platform via Review.publisher.

3. Hidden-in-React content. If a bot fetches your page and the meaningful content arrives via client-side JavaScript that does not run for crawlers, you are invisible. Googlebot renders JavaScript, but most AI crawlers, as far as we can tell, read only the raw HTML. Server-render or static-generate everything that matters. Nuxt 3, Next.js and Astro all make this easy in 2026.
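A quick way to see your page the way a non-rendering bot does: take the raw HTML the server sends, drop scripts and styles, and check whether your key phrases are still there. The Python sketch below does that with the standard html.parser; VisibleText, missing_without_js and the sample phrases are names and data invented for the example.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text outside <script> and <style>, as a non-JS crawler sees it."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def missing_without_js(raw_html, key_phrases):
    """Return the key phrases that do not appear in the server-sent HTML text."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    # Collapse whitespace so phrases split across tags or lines still match.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [phrase for phrase in key_phrases if phrase.lower() not in text]
```

Feed it the output of a plain HTTP fetch (curl, not a browser "view source" after hydration). Any phrase it returns is content that exists only after JavaScript runs.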

4. Wrong hreflang signals. Common bugs: hreflang on only some pages, x-default missing, language codes with country variants that do not match the actual content, hreflang pointing to redirects. Each silently kills your visibility in the languages you do not fix. Our technical SEO checklist covers the hreflang validation steps in detail.
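Two of those bugs, a missing x-default and alternates that do not link back, can be checked mechanically. This Python sketch assumes you have already crawled each page's hreflang annotations into a dictionary; hreflang_problems is a hypothetical helper, and it does not detect redirects or language/content mismatches.

```python
def hreflang_problems(pages):
    """Check x-default presence, self-reference and reciprocal links.

    pages: {page_url: {lang_code: target_url}} as declared on each page.
    """
    problems = []
    for url, alternates in pages.items():
        if "x-default" not in alternates:
            problems.append(f"{url}: missing x-default")
        if url not in alternates.values():
            problems.append(f"{url}: no self-referencing hreflang")
        for lang, target in alternates.items():
            if target == url or lang == "x-default":
                continue
            back_links = pages.get(target)
            if back_links is None:
                problems.append(f"{url}: {lang} target {target} not in crawl set")
                continue
            # x-default on the target does not count as a return link.
            returned = {t for code, t in back_links.items() if code != "x-default"}
            if url not in returned:
                problems.append(f"{url}: {target} does not link back")
    return problems
```

An empty list means the set is internally consistent, which is necessary but not sufficient; you still need to confirm every target returns a 200 rather than a redirect.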

5. Blocking Google-Extended. The most common opt-out we see. Blocking Google-Extended does not help your privacy - Googlebot still crawls you. It only stops Gemini (formerly Bard) from training on your content and grounding answers in it. For a B2B company hoping AI assistants will recommend it, this is self-sabotage.

Frequently asked questions

What is the difference between SEO and GEO?

SEO optimizes for ranking in Google's blue-link results. GEO (Generative Engine Optimization) optimizes for being quoted inside an AI assistant's answer. SEO measures clicks; GEO measures citations. In 2026 they overlap heavily and the same site usually wins both, but the signals differ. GEO weights structured data, brand entity strength and passage-level citability more than raw backlink quantity.

How long until GEO shows results?

For training-data citations (the AI "knowing" your business): 3-6 months, because models retrain on that cadence. For live-fetch citations (Perplexity, ChatGPT web search, AI Overviews): days to weeks. Most clients see Perplexity and AI Overview citations within 14 days of fixing the high-leverage technical issues: robots.txt for AI bots, llms.txt, FAQPage and Organization schema.

Should I let AI bots crawl my site?

Yes - unless you have a specific reason not to (proprietary content, paywalled archives). Blocking GPTBot, ClaudeBot and Google-Extended keeps your business out of the next training cycle, and blocking PerplexityBot keeps you out of Perplexity's search index. The "opt out to protect IP" argument trades short-term protection for long-term invisibility. For a service business hoping AI will recommend you, blocking AI crawlers is competitively suicidal.

How do I know if ChatGPT can see my site?

Three tests: (1) ask ChatGPT to summarize your homepage URL - it should describe the actual content, not refuse or hallucinate; (2) ask ChatGPT a query you should win and check whether your site appears in citations; (3) check your server logs for OAI-SearchBot, ChatGPT-User and PerplexityBot user-agents over the last 30 days. If they are absent, your robots.txt or hosting layer is likely the problem.
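The third test is a few lines of Python if your server writes the common "combined" access-log format, where the user-agent is the last quoted field. That format is an assumption about your setup, count_ai_hits is a name made up for this sketch, and the 30-day window is left to whichever log files you feed in.

```python
import re
from collections import Counter

AI_FETCHERS = ("OAI-SearchBot", "ChatGPT-User", "PerplexityBot")
# Combined log format: the user-agent is the last double-quoted field.
LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')


def count_ai_hits(log_lines, bots=AI_FETCHERS):
    """Count access-log lines whose user-agent contains each bot token."""
    hits = Counter({bot: 0 for bot in bots})
    for line in log_lines:
        match = LAST_QUOTED.search(line)
        if not match:
            continue
        agent = match.group(1).lower()
        for bot in bots:
            if bot.lower() in agent:
                hits[bot] += 1
    return hits
```

User-agent strings can be spoofed, so a non-zero count is a good sign rather than proof; a zero count across a full month is the signal worth acting on.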

Does GEO replace SEO?

No. Both will matter for years. Most of the highest-leverage technical work - structured data, fast server-side rendering, clean robots.txt, valid hreflang - wins on both fronts simultaneously. The main divergence is content shape: GEO rewards question-form H2s with direct-answer paragraphs more than SEO does.

How much does a GEO audit cost?

Self-service tools (Profound, Otterly, Peec AI) start around 100 USD/month. A one-time professional GEO audit for the Czech market runs 15,000-50,000 CZK depending on scope. We run a free 15-minute GEO check for any business that tells us its goals - we will tell you the top three issues and a rough effort estimate, no proposal attached.

The bottom line

GEO is the same game as SEO with the rules rewritten for a world where the searcher might never see a Google results page. The technical foundation overlaps almost entirely with good SEO - robots.txt hygiene, structured data, fast SSR, clean hreflang. What changes is what you are optimizing for. Not the click. The quote.

If you want to know whether AI can see your business right now, we run free 15-minute GEO checks. We will show you the three things to fix first, in your specific case. No proposal attached - that is a separate conversation if you want it.