
What Is llms.txt? The New Standard for Making Your Site AI-Readable

llms.txt is an emerging web standard that tells AI models what your site is about and which pages matter most. Learn how it works, how to create one, and why it improves your visibility in ChatGPT, Claude, and Perplexity.

Feb 26, 2026 · 10 min read · For web developers, site owners, and SEO professionals

The problem llms.txt solves

robots.txt tells crawlers where they can and cannot go. sitemap.xml tells crawlers which URLs exist. But neither file tells an AI model what your site actually does, what content is most important, or how to interpret your pages in context. An AI crawler visiting your site for the first time has to guess which pages matter — and it often guesses wrong.

llms.txt fills this gap. It's a plain text file at your domain root (yourdomain.com/llms.txt) that provides a machine-readable brief about your site: what it is, what it offers, and which pages an AI model should prioritize when building its understanding of your content.

Think of it like a README file for AI. Just as a GitHub README helps a developer understand a repository quickly, llms.txt helps an AI model understand your website quickly.

How llms.txt works

The format is intentionally simple — plain Markdown with a structured heading hierarchy. The file starts with your site name as an H1, followed by a brief description (a blockquote), then organized sections of links with optional descriptions.

Here's the basic structure: a site title, a one-paragraph summary, then categorized lists of your most important URLs. Categories might include 'Products', 'Documentation', 'Policies', or whatever makes sense for your site. Each link can include a brief inline description.
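A minimal skeleton, using placeholder names and URLs, might look like this:

```markdown
# Example Site

> One-paragraph summary of what the site does and who it serves.

## Documentation

- [Getting Started](https://example.com/docs/start): How to set up an account
- [API Reference](https://example.com/docs/api): Endpoints and authentication

## Policies

- [Privacy Policy](https://example.com/privacy): How user data is handled
```

The H1, blockquote, H2 sections, and linked list items are the core of the format; everything else is optional.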

There is also an extended format called llms-full.txt that includes the actual content of linked pages inline, so AI models can ingest your key content in a single request without following links. This is useful for sites with complex product information or documentation that benefits from being consumed as a unit.

A real-world llms.txt example

For a SaaS company, a well-structured llms.txt might look like this: the site name and description at the top, followed by sections for core product pages (pricing, features, integrations), documentation (API docs, getting started guide), and policies (privacy, terms, security). Each entry is a Markdown link with a one-line description of what the page contains.
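Fleshed out, that SaaS structure could look like the sketch below (the company, product, and URLs are all hypothetical):

```markdown
# Acme Analytics

> Acme Analytics is a product-analytics platform that helps SaaS teams
> understand user behavior, measure feature adoption, and reduce churn.

## Product

- [Pricing](https://acme.example/pricing): Plans, tiers, and billing details
- [Features](https://acme.example/features): Overview of core capabilities
- [Integrations](https://acme.example/integrations): Supported third-party tools

## Documentation

- [API Docs](https://acme.example/docs/api): REST API reference and auth
- [Getting Started](https://acme.example/docs/start): Setup guide for new users

## Policies

- [Privacy](https://acme.example/privacy): Data handling and retention
- [Security](https://acme.example/security): Compliance and infrastructure
```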

For an ecommerce store, categories might include top product collections, shipping and returns policies, size guides, and customer support pages. The key principle is prioritization — llms.txt is not your sitemap. It should contain 10-30 of your most important URLs, not thousands.

Avoid including URLs that are behind authentication, dynamically generated with session tokens, or likely to change frequently. The file should be stable and every link should resolve to a publicly accessible page.

llms.txt vs robots.txt vs sitemap.xml

These three files serve complementary purposes, and you should have all three.

robots.txt controls access — which crawlers can visit which paths. It's a gatekeeping mechanism. Without proper robots.txt configuration, AI crawlers might be blocked from your site entirely.
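For example, a robots.txt that explicitly allows common AI crawlers might include entries like the following. The user-agent strings shown are the publicly documented ones for each vendor, but they change over time, so verify them against each vendor's current documentation:

```
# Allow major AI crawlers site-wide
User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /
```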

sitemap.xml provides discovery — a complete list of all indexable URLs on your site. It helps crawlers find pages they might miss through link following alone. Sitemaps are comprehensive but provide no context about what each page is or how important it is.

llms.txt provides context and priority — a curated selection of your most important pages with descriptions of what they contain and how they relate to your site's purpose. It helps AI models understand your site holistically rather than as a collection of disconnected pages.

Placement and hosting

llms.txt should be placed at the root of your domain: yourdomain.com/llms.txt. For consistency with other web standards, you can also serve it at yourdomain.com/.well-known/llms.txt — serving both locations is ideal.

The file should be served with a text/plain or text/markdown content type. Ensure it's not blocked by robots.txt (which would be self-defeating). The file should be accessible without authentication and should not require JavaScript to render.
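As one illustration, an nginx server could serve the file from both paths with an explicit content type. This is a sketch; the web root path is an assumption:

```nginx
# Serve /llms.txt as plain text from the web root
location = /llms.txt {
    root /var/www/html;
    default_type text/plain;
}

# Alias the .well-known path to the same file
location = /.well-known/llms.txt {
    alias /var/www/html/llms.txt;
    default_type text/plain;
}
```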

For frameworks like Next.js, you can generate llms.txt from your route structure at build time. For WordPress, a static file in your web root works. For Shopify, you'll need to use a proxy page or an app that serves the file at the correct path.
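In Next.js with the App Router, one common pattern is a route handler at `app/llms.txt/route.ts` that returns the file with an explicit content type. This is a sketch; the file content is a placeholder, and in practice you might assemble it from your route structure at build time:

```typescript
// app/llms.txt/route.ts (placeholder content)
const body = `# Example Site

> A short description of the site.

## Documentation

- [Getting Started](https://example.com/docs/start): Setup guide
`;

export function GET(): Response {
  // Serve as plain text so crawlers receive Markdown, not HTML
  return new Response(body, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```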

How AI platforms use llms.txt

Perplexity has been the most public about using llms.txt — their crawler actively looks for the file and uses it to prioritize which pages to index and cite in search results. Sites with llms.txt see higher retrieval rates in Perplexity answers because the file gives their crawler a reliable content map.

ChatGPT's Browse feature and OpenAI's SearchGPT check for llms.txt when evaluating a site's content structure. While OpenAI hasn't published detailed documentation about how they weight the file, empirical testing shows that sites with llms.txt appear more consistently in browsing-mode answers.

The standard is still early, but adoption is accelerating. As more AI platforms formalize their crawling and retrieval pipelines, llms.txt is becoming the de facto standard for AI-readable site descriptions — similar to how robots.txt became universal for search engines in the 1990s.

Execution Checklist

  • Create an llms.txt file with your site name, description, and 10-30 most important URLs.
  • Serve llms.txt at both /llms.txt and /.well-known/llms.txt.
  • Verify every URL in llms.txt is publicly accessible and returns a 200 status code.
  • Ensure robots.txt allows crawler access to /llms.txt itself.
  • Update llms.txt whenever you add, remove, or restructure major pages.
  • Consider creating llms-full.txt with inline content for key pages.
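The link-verification step in the checklist above can be automated. The sketch below extracts the URLs from a llms.txt file's Markdown links and checks that each resolves with a 200; the function names are my own, and `fetch` assumes Node 18+:

```typescript
// Extract URLs from Markdown links of the form [label](https://...)
export function extractUrls(llmsTxt: string): string[] {
  const matches = llmsTxt.matchAll(/\[[^\]]*\]\((https?:\/\/[^)\s]+)\)/g);
  return [...matches].map((m) => m[1]);
}

// Check each URL resolves with HTTP 200; returns the broken ones
export async function auditLinks(llmsTxt: string): Promise<string[]> {
  const broken: string[] = [];
  for (const url of extractUrls(llmsTxt)) {
    const res = await fetch(url, { method: "HEAD" });
    if (res.status !== 200) broken.push(url);
  }
  return broken;
}
```

Running this after each deploy catches links that moved or went behind authentication.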

FAQ

Is llms.txt an official web standard?

llms.txt is a community-driven proposal, not a W3C or IETF standard. However, it has gained practical adoption across multiple AI platforms and is rapidly becoming a de facto standard, similar to how robots.txt was widely adopted before formal standardization.

How is llms.txt different from a sitemap?

A sitemap lists all indexable URLs on your site (often thousands). llms.txt is a curated priority list of your 10-30 most important pages with descriptions and context. A sitemap helps with discovery; llms.txt helps with understanding.

Do I still need llms.txt if I have good structured data?

Yes. Structured data (JSON-LD) helps AI models understand individual pages. llms.txt helps them understand your site as a whole — what it does, how pages relate to each other, and which content to prioritize. They serve different purposes and work best together.
