AEO • commercial intent
How to Get Your Website Recommended by ChatGPT, Claude, and Gemini
A data-driven guide to making your website visible in AI-generated answers. Covers the signals AI models use to decide which sites to recommend, with specific fixes you can apply today.
How AI models decide which websites to recommend
When someone asks ChatGPT 'what's the best project management tool for small teams?' the model doesn't search Google. It draws from two sources: its training data (everything it learned during pre-training) and real-time retrieval (web pages it fetches when browsing or using search plugins). The sites that appear in answers are the ones that score well on both axes.
Training data favors sites with clear, well-structured content that has been widely referenced across the web. Real-time retrieval favors sites that are crawlable, load fast, and present information in machine-readable formats. A site can be well-known but blocked from retrieval, or crawlable but with content too thin to be useful.
The same principles apply across Claude, Gemini, and Perplexity, though each platform weights signals slightly differently. Perplexity leans heavily on real-time retrieval. Claude relies more on training data. Gemini blends Google Search index with AI inference. The common denominator: structured, authoritative, accessible content wins everywhere.
The five signals that drive AI recommendations
Based on analyzing thousands of AI-generated responses across platforms, five factors consistently determine which sites get recommended and which get ignored.
- Crawl access — Can the AI's crawler (GPTBot, ClaudeBot, PerplexityBot) actually fetch your pages? A single robots.txt rule or CDN firewall rule can silently block all AI visibility. This is the most common fixable issue.
- Content authority — Does your site have genuinely useful, specific content about your topic? AI models strongly prefer pages that go beyond surface-level descriptions. Product pages with detailed specifications, comparison tables, and honest trade-off discussions get cited more than marketing copy.
- Structured data — JSON-LD schema (Product, Organization, FAQPage, HowTo) gives AI models machine-readable facts they can extract with confidence. A product page with structured pricing, availability, and reviews is dramatically more citable than one where this information only exists in unstructured HTML.
- Reference signals — How many other credible sources mention or link to your site? AI training data inherits the link graph of the web. Sites that are frequently referenced in authoritative contexts appear more often in training data and therefore in model responses.
- Freshness — How recently was your content updated? Retrieval-based systems (Perplexity, ChatGPT Browse) prioritize recent content. Stale pages with outdated information get deprioritized even if the domain is authoritative.
Step 1: Remove crawler blocks
Check your robots.txt right now. Open yourdomain.com/robots.txt and look for any rule that blocks GPTBot, ClaudeBot, or PerplexityBot, or a blanket wildcard disallow. If you see 'User-agent: *' followed by 'Disallow: /' and no dedicated groups for the AI crawlers (each bot obeys the most specific group that matches it), you are invisible to every AI platform.
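As a sketch, a robots.txt that keeps a private area off-limits to general scrapers while explicitly allowing the major AI crawlers might look like this (the paths are hypothetical):

```
# Dedicated group for AI crawlers; overrides the wildcard group below
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /

# Everyone else
User-agent: *
Disallow: /private/
```

Grouping multiple User-agent lines over one set of rules is valid robots.txt syntax, and it keeps the allow list easy to extend as new AI crawlers appear.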
Next, check your CDN. Cloudflare's Bot Fight Mode and similar features on AWS CloudFront and Vercel can block AI crawlers at the network level, even if your robots.txt allows them. You need to create explicit firewall allow rules for AI crawler user agent strings.
This single step — removing blocks — is responsible for the largest visibility gains we see. Sites that were completely invisible become citable within days of allowing crawler access.
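Before and after changing robots.txt, you can sanity-check which bots can fetch a page with Python's standard-library parser. The rules and URL below are hypothetical; in practice you would load the live file from yourdomain.com/robots.txt:

```python
from urllib import robotparser

# Hypothetical robots.txt: only GPTBot has its own group,
# so every other bot falls through to the wildcard block.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for bot in ("GPTBot", "ClaudeBot", "PerplexityBot"):
    verdict = "allowed" if parser.can_fetch(bot, "https://example.com/pricing") else "BLOCKED"
    print(f"{bot}: {verdict}")
```

Run against this example, only GPTBot is allowed; ClaudeBot and PerplexityBot are silently blocked by the wildcard rule, which is exactly the kind of regression this check catches.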
Step 2: Publish llms.txt
llms.txt is a proposed standard (similar in spirit to robots.txt) that tells AI models what your site is about and which pages are most important. Think of it as a curated table of contents for machines.
Place it at yourdomain.com/llms.txt and include your site name, a one-paragraph description, and links to your most important pages organized by category. Unlike a sitemap (which lists every URL), llms.txt is a priority signal — it tells AI models 'start here' when trying to understand your site.
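The format is plain markdown. A minimal llms.txt for a hypothetical SaaS site (all names and URLs are made up) might look like:

```
# Acme PM

> Acme PM is project management software for small teams: Gantt charts,
> time tracking, and integrations with Slack and Jira.

## Product
- [Pricing](https://acme.example/pricing): plans and per-seat costs
- [Features](https://acme.example/features): full feature breakdown

## Docs
- [Getting started](https://acme.example/docs/start): setup in 10 minutes
- [FAQ](https://acme.example/faq): common customer questions
```

The one-line descriptions after each link matter: they are what a retrieval system reads first when deciding which page answers a query.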
Sites with llms.txt see measurably higher retrieval rates in Perplexity and ChatGPT Browse, because the file gives these systems a reliable entry point into your content hierarchy.
Step 3: Add structured data to key pages
JSON-LD structured data is the fastest way to make your content machine-extractable. AI models use structured data as high-confidence facts — a Product schema with a price of $49/month is more reliably cited than the same price buried in a paragraph of marketing text.
Focus on the pages that matter most for your business. For SaaS companies: Organization schema on the homepage, SoftwareApplication with Offers on the pricing page, FAQPage on the FAQ or docs page. For ecommerce: Product schema on every product page, BreadcrumbList for navigation context, FAQ on category pages.
Don't try to add every schema type. Adding Organization, your primary product/service schema, and FAQPage to 3-5 key pages will cover 80% of the AI citation value.
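As an illustration, a SoftwareApplication schema with an Offer for a hypothetical SaaS pricing page might look like this (the product name and price are made up):

```
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "Acme PM",
  "applicationCategory": "BusinessApplication",
  "operatingSystem": "Web",
  "offers": {
    "@type": "Offer",
    "price": "12.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
</script>
```

Place the script tag in the page head or body, and validate it with the Schema.org validator before shipping; a malformed JSON-LD block is ignored entirely rather than partially parsed.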
Step 4: Create content that AI models can cite
AI models cite specific, factual content — not vague marketing language. A page that says 'our platform is the best solution for growing teams' gives an AI model nothing citable. A page that says 'supports up to 50 team members, includes Gantt charts and time tracking, integrates with Slack and Jira, starts at $12/user/month' gives the model concrete facts to include in recommendations.
Write comparison content honestly. When a user asks an AI 'what's the difference between Tool A and Tool B?', models pull from pages that explicitly compare features, pricing, and trade-offs. If your comparison page acknowledges competitor strengths while clearly articulating your differentiation, it becomes a primary citation source.
FAQ pages are disproportionately valuable for AI visibility because they map directly to the question-answer format that AI models use. Every genuine question your customers ask should be on your site with a clear, specific answer.
Step 5: Monitor and measure
AI visibility is not a set-and-forget optimization. Models are retrained, retrieval indexes are refreshed, and new competitors enter the recommendation pool. You need ongoing monitoring to maintain and grow your position.
Track which AI platforms mention your brand, how your visibility score changes over time, and whether AI referral traffic converts. Platforms like Perplexity and ChatGPT now send identifiable referral traffic that you can track in analytics.
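If your analytics tool doesn't segment AI referrals out of the box, a simple host check works on any referrer export. The domain list below covers commonly observed AI platform referrers and should be extended as new ones appear; the sample referrers are hypothetical:

```python
from urllib.parse import urlparse

# Referrer hostnames commonly seen from AI platforms; extend as needed.
AI_REFERRER_DOMAINS = (
    "chatgpt.com",
    "chat.openai.com",
    "perplexity.ai",
    "claude.ai",
    "gemini.google.com",
)

def is_ai_referral(referrer: str) -> bool:
    """True if the referrer URL's host belongs to a known AI platform."""
    host = urlparse(referrer).netloc.lower()
    return any(host == d or host.endswith("." + d) for d in AI_REFERRER_DOMAINS)

# Hypothetical referrers pulled from an analytics export.
referrers = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=best+pm+tool",
    "https://www.google.com/search?q=pm+tool",
]
ai_hits = [r for r in referrers if is_ai_referral(r)]
```

The `endswith` check catches subdomains like www.perplexity.ai without matching lookalike domains, so the segment stays clean as traffic grows.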
Run regular audits to catch regressions — a theme update that breaks your schema, a CDN configuration change that blocks a bot, or a new product page that wasn't added to llms.txt. Weekly is ideal; monthly is the minimum.
Execution Checklist
- Verify robots.txt allows GPTBot, ClaudeBot, PerplexityBot, ChatGPT-User, and Google-Extended.
- Check CDN/WAF firewall rules are not blocking AI crawler user agents.
- Publish llms.txt at your domain root with your most important pages.
- Add JSON-LD schema (Organization, Product/SoftwareApplication, FAQPage) to your top 3-5 pages.
- Rewrite key pages to include specific, citable facts rather than vague marketing language.
- Create or expand your FAQ page with genuine customer questions and specific answers.
- Set up tracking for AI referral traffic (Perplexity, ChatGPT, Claude).
FAQ
How long does it take to start appearing in ChatGPT answers?
For retrieval-based answers (when ChatGPT Browse or Perplexity fetches your page), changes can take effect within days of allowing crawler access. For training-based knowledge (when the model answers from memory), it depends on when OpenAI, Anthropic, or Google next retrains the model — typically on a cadence of weeks to months.
Can I pay to appear in AI recommendations?
Not directly. Unlike Google Ads or sponsored search results, major AI platforms do not currently sell placement in their organically generated answers. Your visibility is determined by content quality, accessibility, and authority — which makes the organic optimization approach outlined above the only reliable path.
Does SEO help with AI visibility?
Partially. Good SEO practices (clean site architecture, fast page speed, quality content) help with AI visibility too. But AI models use additional signals that traditional SEO doesn't address: llms.txt, AI-specific crawler access, and structured data optimized for extraction rather than just rich snippets. Think of AI visibility as a superset of SEO — everything good for SEO helps, but you need extra layers for AI platforms.
My competitors appear in ChatGPT answers but I don't. Why?
The most common reasons: (1) your robots.txt or CDN blocks AI crawlers while theirs doesn't, (2) they have structured data that makes their content machine-extractable, (3) they are referenced more frequently by other authoritative sites and therefore appear more often in AI training data, or (4) they have more specific, factual content that AI models can confidently cite. Run an AI visibility audit to identify which gaps apply to your site.