AI Visibility Checker: How to Audit Your Site for AI Search in 5 Minutes
What an AI visibility checker actually tests, how to run one on your own site, and the signals that separate 'technically accessible' from 'actually cited'. Includes a DIY checklist and when to use a dedicated tool.
What 'AI visibility' actually means
AI visibility is a composite of two things: whether AI crawlers can reach your content, and whether AI models choose to cite it when a relevant question is asked. Those are different problems with different fixes. A site can have zero visibility because GPTBot is blocked at the CDN, or because it's perfectly accessible but the content isn't specific enough to be worth citing. An AI visibility checker needs to distinguish between the two.
The naive version of a checker just counts accessibility — can I fetch your robots.txt, is GPTBot allowed, is there a llms.txt file. This catches the most common failure mode (you're blocked and didn't know it) but it misses everything downstream. A site with perfect accessibility and no structured data, no factual content, and no authority references will still be invisible in AI answers.
A good checker tests both layers: the infrastructure layer (access, crawlability, response codes, structured data validity) and the content layer (specificity, citation-worthiness, factual density, authority signals). Either layer failing is enough to make you invisible, so you need to check both to know where you actually stand.
The five checks you can run manually in 5 minutes
Before reaching for a tool, run these five checks yourself. They catch the majority of fixable issues and give you a concrete picture of your current state.
- robots.txt audit — Open yourdomain.com/robots.txt and search for GPTBot, ClaudeBot, PerplexityBot, ChatGPT-User, and Google-Extended. Each should be either explicitly allowed or not mentioned (which defaults to allow). A 'Disallow: /' under any of these names is a hard block that removes you from that AI platform entirely.
- Live crawler test — Run curl with each AI bot's user agent string against a content page (not just the homepage). A 200 response with your actual HTML is passing. Anything else (403, 429, redirect to a challenge page, or 200 with a bot-detection page) means your CDN or firewall is blocking despite what robots.txt says.
- llms.txt presence — Check yourdomain.com/llms.txt. Missing is OK (most sites don't have one yet), but if present it should contain your site name, description, and links to your most important pages. A malformed or empty llms.txt is worse than no file at all.
- Structured data spot check — Paste one of your key pages into Google's Rich Results Test or Schema.org's validator. Look for at least one valid Organization or Product/SoftwareApplication schema on high-value pages, plus FAQPage schema on any Q&A content. Valid structured data is the single biggest lever for citation quality.
- AI search test — Actually ask ChatGPT, Claude, and Perplexity a question your brand should answer. Something like 'best [your category] for [your audience]' or 'how does [your product type] work'. If you don't show up, take note of which competitors do — that tells you which content and authority signals you're missing.
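The first of these checks can be scripted. Below is a minimal Python sketch of the robots.txt audit using only the standard library; the bot tokens are the ones listed above, and `audit_robots` is a hypothetical helper name, not a real tool. It parses a robots.txt body and reports whether each AI crawler is allowed to fetch a given URL.

```python
from urllib.robotparser import RobotFileParser

# Bot tokens from the checklist above; extend as new crawlers appear.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "ChatGPT-User", "Google-Extended"]

def audit_robots(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Return {bot: allowed?} for each AI crawler token against one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_BOTS}

# Example: GPTBot is hard-blocked, everyone else falls through to the * rules.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""
print(audit_robots(sample))
# {'GPTBot': False, 'ClaudeBot': True, 'PerplexityBot': True,
#  'ChatGPT-User': True, 'Google-Extended': True}
```

In practice you would fetch `yourdomain.com/robots.txt` first, then follow up with the live crawler test from check two, since robots.txt only tells you policy, not whether the CDN actually serves the bot.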
What a dedicated AI visibility tool adds
The manual checks above are good enough to find the big problems. A dedicated tool is worth it when you need three things the manual process can't give you: scale (checking dozens of pages at once), monitoring (detecting when something regresses), and attribution (tying visibility changes to traffic and revenue outcomes).
Scale matters because AI visibility is not a single-page problem. A site might have a clean homepage but structured data missing on every product page. A DIY audit on the homepage looks fine; the actual problem is invisible until you sample across your catalog. A good tool crawls every important page and reports issues as a distribution, not a single snapshot.
Monitoring matters because AI visibility regresses. A plugin update, a theme change, a CDN configuration sync — any of these can silently break a previously-working setup. Without automated monitoring, you find out months later when your Perplexity citations have dropped off. With monitoring, you get an alert the day something breaks.
Attribution matters because visibility is only valuable if it leads to outcomes. A tool that measures visibility AND tracks AI referral traffic AND ties both to conversion events is more useful than a tool that just grades your robots.txt. The question is never 'is my site accessible?' — it's 'is my AI investment driving revenue?'.
The signals a good checker tests
Beyond the basics, an AI visibility checker should test at least the following signal categories on every page it audits. This is the checklist we use internally when building our own scanner, and it's the bar we recommend measuring any tool against.
- Crawler access per bot — GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, PerplexityBot, Google-Extended, and Applebot-Extended all tested independently, with real HTTP fetches not just robots.txt parsing.
- Response quality — Not just status codes but response time, HTML completeness, and whether the page renders without JavaScript (many AI crawlers don't execute JS, so a client-side rendered site is invisible to them).
- Structured data validity — Not just presence but validity against Schema.org definitions and the specific requirements of each schema type. Invalid schema is sometimes worse than missing schema because it signals a broken implementation.
- Content specificity — Heuristics for whether pages contain citable facts versus generic marketing copy. Dense factual content gets cited; vague descriptions don't.
- llms.txt quality — Presence, format correctness, and whether the file actually helps AI models navigate the site (vs. an empty or malformed placeholder).
- Citation presence — Actual sampling of AI platforms to check whether the site is currently being cited for relevant queries, and how that changes over time.
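The structured-data check in this list is the most mechanical to automate. Here is a hedged sketch, using only the Python standard library, of how a scanner might extract JSON-LD blocks from a page and flag the distinction made above: a block that fails to parse is reported separately from one that is simply absent. The helper names are illustrative, not a real API.

```python
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> blocks."""
    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks.append(data)

def schema_types(html: str) -> list:
    """Return the @type of every JSON-LD block; 'INVALID' for unparseable ones."""
    extractor = JSONLDExtractor()
    extractor.feed(html)
    types = []
    for block in extractor.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            types.append("INVALID")  # broken schema: flagged, not ignored
            continue
        items = data if isinstance(data, list) else [data]
        types.extend(item.get("@type", "UNKNOWN") for item in items)
    return types

page = ('<html><head><script type="application/ld+json">'
        '{"@context":"https://schema.org","@type":"Organization","name":"Example"}'
        '</script></head></html>')
print(schema_types(page))  # ['Organization']
```

A real checker would validate each block against Schema.org type definitions rather than just reading `@type`, but even this shallow pass separates 'missing', 'present', and 'broken', which is the triage that matters.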
How to interpret the score
Most AI visibility tools produce a score — somewhere between 0 and 100, sometimes split into categories. A score is useful as a trendline but misleading as an absolute value. A site at 72 isn't 'good enough'; it's 'good relative to the tool's weighting'. The real question is whether the score is moving in the right direction and whether the unresolved issues are the ones that matter for your business.
A common pattern: a site scores 85 but has zero ChatGPT citations, while a competitor scores 65 and gets cited constantly. The 85 site is passing technical checks but losing on content and authority. The 65 site has weaker infrastructure but has invested in citation-worthy content. The score tells you what the tool measured; the outcome tells you what actually works.
The correct interpretation: use the score to find specific fixable issues, not as a verdict. Fix the issues that are cheap and high-impact first (robots.txt, basic structured data). Then invest in the expensive ones (content quality, authority building). Don't optimize for the score itself — optimize for being the best answer to questions your customers ask AI.
Try the free scanner
We built AgentSurge's free AI visibility scanner to run all of the above checks in under a minute. Paste your domain, get a breakdown of which AI crawlers can reach you, which signals you're missing, and which specific fixes would move the needle most for your industry.
The scanner is free for one-off checks. For ongoing monitoring, automatic alerts when something regresses, and AI citation tracking across ChatGPT, Claude, Perplexity, and Gemini, a paid plan adds the continuous layer of everything covered above.
If you'd rather run everything yourself with the manual checklist below, that's a perfectly reasonable starting point. The goal is not to sell you a tool — it's to get you to a state where AI platforms can actually find and cite your content.
Execution Checklist
- Open robots.txt and confirm GPTBot, ClaudeBot, PerplexityBot, ChatGPT-User, and Google-Extended are not blocked.
- Curl each AI crawler user agent against a real content page and verify 200 responses.
- Check whether llms.txt exists and, if so, that it's well-formed and includes your key pages.
- Validate structured data on your top 5 pages with Schema.org's validator.
- Ask ChatGPT, Claude, and Perplexity 3 questions your brand should answer — note who shows up instead of you.
- Run a full audit tool for the scale, monitoring, and attribution you can't get manually.
- Focus fixes on the highest-impact issues first, not the score itself.
FAQ
Is AgentSurge's free checker comprehensive or limited?
The free scan covers robots.txt, crawler accessibility across all major AI bots, structured data detection, llms.txt, and a sample of citation queries. It's designed to catch the 80% of issues that most sites have, in under a minute. The paid plans add continuous monitoring, deeper crawling across all your pages, AI citation tracking over time, and attribution from AI referrals to conversion events.
How often should I check my AI visibility?
Weekly is ideal, monthly is the minimum. AI visibility changes faster than traditional SEO because AI platforms update their crawlers, index freshness, and model behavior on shorter cycles. A monthly check is enough to catch CDN regressions and plugin updates that break your setup; weekly gives you a tighter feedback loop on content changes. For most sites, automated monitoring (which runs in the background and only alerts you on issues) is more practical than manual weekly audits.
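The alert-on-regression logic described here is simple to sketch. Assuming you store each run's per-bot accessibility results (for example, the output of a robots.txt or live-fetch check), a monitor only needs to diff the current run against the previous one. This is an illustrative sketch, not AgentSurge's actual implementation.

```python
def find_regressions(previous: dict, current: dict) -> list:
    """Bots that were reachable last run but are blocked now."""
    return [bot for bot, ok in current.items()
            if previous.get(bot, False) and not ok]

# Example: a CDN change silently blocked ClaudeBot between runs.
last_week = {"GPTBot": True, "ClaudeBot": True, "PerplexityBot": True}
today = {"GPTBot": True, "ClaudeBot": False, "PerplexityBot": True}
print(find_regressions(last_week, today))  # ['ClaudeBot']
```

Run on a schedule (cron, CI, or a hosted monitor) and alert only when this list is non-empty; that gives you the 'quiet until something breaks' behavior that makes weekly checking practical.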
Can an AI visibility checker predict whether I'll be cited?
Partially. Accessibility and structured data are necessary conditions — if they fail, you won't be cited. But meeting those conditions doesn't guarantee citation; content quality, specificity, and authority determine whether AI models choose your content over alternatives. A good checker tells you the floor (whether AI can reach you) and samples current citation rates, but it can't predict whether a new piece of content will get picked up before the AI actually crawls it.
What should I fix first if my score is low?
Start with crawler access — unblocking GPTBot, ClaudeBot, and PerplexityBot is the single highest-impact fix and usually takes under 30 minutes. Second, add basic structured data (Organization schema sitewide, Product or SoftwareApplication on your key commercial pages). Third, publish an llms.txt file. Fourth, audit your most important 5-10 pages for citable specificity — replace vague marketing copy with concrete facts, pricing, feature lists, and comparisons.
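For the third step, here is what a minimal well-formed llms.txt might look like, following the llms.txt proposal's markdown format (H1 site name, blockquote description, sections of links). The URLs and page names are placeholders, not real AgentSurge paths.

```text
# Example Co

> Example Co makes inventory software for mid-size retailers. Key facts:
> founded 2019, 4,000+ customers, plans from $49/month.

## Key pages

- [Pricing](https://example.com/pricing): plan tiers and feature comparison
- [How it works](https://example.com/how-it-works): product overview
- [Docs](https://example.com/docs): setup guides and API reference
```

Keep it short and factual; an llms.txt stuffed with marketing copy fails the same specificity test as the pages it points to.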