How to Get Your Website Cited by ChatGPT, Claude, and Perplexity
Adeyinka Adefila
Founder, Distro · May 28, 2026
Getting cited by ChatGPT, Claude, and Perplexity is becoming as important as ranking on Google. When someone asks an AI search engine "what tools help startups get customers," you want your website in the answer. But AI citation does not work like traditional SEO. AI models decide what to cite based on crawler access, structured data, content structure, and technical signals that most websites get wrong. This guide covers the 18 specific checks that determine whether AI search engines can find, understand, and cite your website.
The shift is real. A growing share of buyers now ask an AI assistant before they ever open Google. If your site isn't readable and citable by those models, you're invisible to that traffic, no matter how well you rank on traditional search.
Key Takeaways
- AI citation depends on crawler access, structured data, and content structure
- Block AI crawlers by accident and you can't be cited, period
- Definition-first paragraphs are the easiest thing for a model to extract and quote
- Most AI signals also help Google, so you rarely have to choose
- An llms.txt file gives AI models a clean summary of what you do
What Generative Engine Optimization Is and Why It Matters
Generative engine optimization, or GEO, is the practice of structuring your website so AI models can find, understand, and cite it in their answers. Traditional SEO optimizes for a ranked list of blue links. GEO optimizes for being the source a model quotes when it writes a direct answer.
It matters because the interface is changing. When an AI answers a question in one paragraph and names three sources, those three sources get the trust and the clicks. Everyone else gets nothing. Being one of the cited sources is the new page-one ranking.
How AI Search Engines Decide What to Cite
Models pull from what they can crawl, what they can parse, and what they can trust. They favor pages with clear structure, named entities, and direct answers, because those are easy to extract without ambiguity. They also lean on signals of credibility: consistent information about who you are, what you do, and whether other trusted sources reference you.
Think of it as making the model's job easy. If your page states a clear definition, backs it with specifics, and marks everything up cleanly, the model can lift an accurate answer and cite you with confidence. If your content is vague and unstructured, the model skips you for a source that's easier to quote.
Checks 1 to 6: Crawler Access
Start here, because if AI crawlers can't reach your site, nothing else matters. Confirm your robots.txt allows GPTBot, ClaudeBot, and PerplexityBot. Many sites block these by default or through a security plugin without realizing it. Check that your pages return clean 200 responses, that content renders without requiring JavaScript the crawler won't run, and that you have an llms.txt file at your root describing your site.
Run through it as a checklist: GPTBot allowed, ClaudeBot allowed, PerplexityBot allowed, robots.txt not over-blocking, server-rendered content available, and llms.txt present. Get these six right and you're in the running. Miss one and you may be uncitable.
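The first four items on that checklist can be tested against the text of a robots.txt file. Below is a minimal sketch using Python's standard urllib.robotparser module. The sample robots.txt contents, the function name check_ai_crawler_access, and the test path /blog/post are illustrative assumptions, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# The three crawler user agents named in the checklist above.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def check_ai_crawler_access(robots_txt: str, url: str = "/") -> dict[str, bool]:
    """Report whether each AI crawler may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_CRAWLERS}


# Hypothetical robots.txt that blocks GPTBot outright, the kind of rule
# a security plugin might add without the site owner noticing.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

results = check_ai_crawler_access(SAMPLE_ROBOTS_TXT, "/blog/post")
for agent, allowed in results.items():
    print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

To run the same check against a live site, fetch its robots.txt first and pass the text in. This sketch only covers the robots.txt rules. It does not confirm 200 responses, server-rendered content, or the presence of llms.txt.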
Checks 7 to 12: Structured Data
Structured data tells models exactly what your content is. Add Organization schema so the model knows who you are. Add Article schema on blog posts so it understands authorship and dates. Add FAQPage schema on pages with questions, because that maps directly to how people query AI. Set accurate meta titles and descriptions, keep your Open Graph tags clean, and use canonical URLs so the model doesn't get confused by duplicate versions of a page.
These six checks (Organization schema, Article schema, FAQPage schema, meta tags, canonical URLs, and clean Open Graph data) give the model unambiguous metadata. Ambiguity is what gets you skipped. Structured data removes it.
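Schema markup is usually embedded as JSON-LD inside a script tag. Here is a minimal sketch that builds Organization and FAQPage markup with schema.org vocabulary. The company name, URL, and helper function names are placeholders chosen for illustration, so swap in your own details.

```python
import json


def organization_schema(name: str, url: str, description: str) -> dict:
    """Build schema.org Organization markup as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
    }


def faq_page_schema(questions_and_answers: list[tuple[str, str]]) -> dict:
    """Build schema.org FAQPage markup from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in questions_and_answers
        ],
    }


def to_script_tag(schema: dict) -> str:
    """Wrap a schema dict in the JSON-LD script tag that goes in the page HTML."""
    # Escape "</" so text inside the JSON cannot close the script tag early.
    payload = json.dumps(schema, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder organization details (hypothetical).
org = organization_schema(
    "Example Co",
    "https://www.example.com",
    "Example Co helps startups get customers.",
)
faq = faq_page_schema(
    [("What is llms.txt?", "A text file at your site root that describes your site for AI models.")]
)

org_tag = to_script_tag(org)
faq_tag = to_script_tag(faq)
print(org_tag)
print(faq_tag)
```

Article schema follows the same pattern with "@type": "Article" plus headline, author, and date fields.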
Checks 13 to 18: Content Structure
This is where most sites lose. Write definition-first paragraphs: answer the question in the first two sentences, then expand. Use a clean heading hierarchy so the model can follow the document. Include a real FAQ section. Name your entities explicitly instead of relying on "it" and "this." Keep paragraphs short. And make sure every claim is specific enough to quote.
The pattern to internalize: lead with the answer. A page that opens "Founder-led marketing is when the founder personally handles customer acquisition" can be cited verbatim. A page that opens "In today's competitive landscape, marketing has never been more important" gives the model nothing to lift.
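Of the six content checks, heading hierarchy is the easiest to verify mechanically. The sketch below uses Python's standard html.parser to flag skipped heading levels. The "exactly one h1" rule and the sample HTML are assumptions made for the example, not requirements stated by any AI search engine.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Record the level of every h1 to h6 tag in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.levels: list[int] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H2" arrives as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_issues(html: str) -> list[str]:
    """Return human-readable problems with the page's heading hierarchy."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels

    issues = []
    # Assumed rule for this sketch: a page should have exactly one h1.
    if levels.count(1) != 1:
        issues.append(f"expected exactly one h1, found {levels.count(1)}")
    # A heading may go at most one level deeper than the heading before it.
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            issues.append(f"h{previous} is followed by h{current}, skipping a level")
    return issues


# Hypothetical page that jumps from h1 straight to h3.
SAMPLE_HTML = """
<h1>What Is Founder-Led Marketing?</h1>
<p>Founder-led marketing is when the founder personally handles customer acquisition.</p>
<h3>Frequently Asked Questions</h3>
"""

issues = heading_issues(SAMPLE_HTML)
print(issues)
```

An empty list means the heading structure passed both rules. Definition-first writing and claim specificity still need a human read.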
How to Create an llms.txt File
An llms.txt file is a plain text file at your site root that describes your content for AI models. Create a file named llms.txt, write a short product description, list your core features and audience, add your pricing, and link your key pages. Keep it clean and factual. Place it next to robots.txt so crawlers find it where they expect to.
It takes about ten minutes and gives any AI model that reads the file a tidy, authoritative summary of what you do. We cover the full template in the llms.txt guide.
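The steps above can be sketched as a small generator. The layout (a title heading, a quoted one-line summary, then short sections with link lists) loosely follows the markdown style of the public llms.txt proposal. The section names, the build_llms_txt function, and all of the Example Co details are assumptions for illustration and are not the template from the guide mentioned above.

```python
def build_llms_txt(
    name: str,
    summary: str,
    features: list[str],
    audience: str,
    pricing: str,
    key_pages: list[tuple[str, str]],
) -> str:
    """Render llms.txt content: description, features, audience, pricing, key pages."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    lines += ["## Features", ""]
    lines += [f"- {feature}" for feature in features]
    lines += ["", "## Audience", "", audience, ""]
    lines += ["## Pricing", "", pricing, ""]
    lines += ["## Key pages", ""]
    lines += [f"- [{title}]({url})" for title, url in key_pages]
    return "\n".join(lines) + "\n"


# Placeholder values (hypothetical company and URLs).
llms_txt = build_llms_txt(
    name="Example Co",
    summary="Example Co helps startups get their first customers.",
    features=["Lead discovery", "Outreach templates"],
    audience="Early-stage founders without a marketing team.",
    pricing="Free plan available. Paid plans are listed on the pricing page.",
    key_pages=[
        ("Pricing", "https://www.example.com/pricing"),
        ("Blog", "https://www.example.com/blog"),
    ],
)
print(llms_txt)
```

Save the output as llms.txt in the same directory that serves robots.txt, so it is reachable at the root of your domain.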
How to Check Your AI Citation Readiness
Rather than guess across 18 checks, scan your site. The AI citation checker runs each signal and gives you a readiness score with specific fixes. It's the fastest way to find which of the 18 you're failing.
Google SEO Versus AI Citation
The good news is the overlap. Clean structure, fast loading, and structured data help both. The differences for AI are narrow: allow AI crawlers explicitly, add an llms.txt file, and write definition-first so models can extract answers. You're not choosing between Google and AI. You're adding a thin layer on top of good SEO.
For the deeper playbook, read the generative engine optimization guide. The underlying concept is defined under AI citation.
Frequently Asked Questions
What is llms.txt?
llms.txt is a text file placed at the root of your website (like robots.txt) that describes your site's content in a format optimized for large language models. It tells AI crawlers what your product does, who it is for, and what content to prioritize when generating answers.
Can I rank in both Google and AI search?
Yes. Most AI citation signals (structured data, clean content, fast loading) also help Google rankings. The main additions for AI are llms.txt, allowing AI crawlers in robots.txt, and writing definition-first paragraphs that AI can extract and cite directly.
How do I know if AI is citing my website?
Ask ChatGPT, Claude, and Perplexity questions related to your product and see if your site appears in the sources. Or use an AI citation checker tool to scan your site for the 18 technical signals AI models look for.
Check your AI citation readiness for free at www.usedistro.com/tools/ai-citation-checker.