The llms.txt Playbook: Setup, Examples, and Why It Matters for AI Search
TL;DR llms.txt is a plain-text file at your site root that tells LLM crawlers (ChatGPT, Perplexity, Claude, Gemini) what to find on your site and where to find it. It takes 20 minutes to write, requires no plugin, and meaningfully changes how generative engines discover and prioritize your pages. This is the full setup…
Table of contents
- What llms.txt actually is
- Why it matters in 2026
- What to put in your llms.txt
- What NOT to put in
- The two-file pattern: llms.txt + llms-full.txt
- Step-by-step: setting up llms.txt in under 30 minutes
- Example: this site’s llms.txt structure
- Common llms.txt mistakes I see
- FAQ
- Want one of these running in your stack?
What llms.txt actually is
llms.txt is a plain-text file you put at the document root of your site (alongside robots.txt and sitemap.xml). The proposed standard lives at llmstxt.org — Jeremy Howard floated it in 2024, and through 2025 it became one of the de facto signals AI engines use to figure out what a site is about and which pages matter.
The format is markdown-flavored: an H1 with the site name, a blockquote with a one-paragraph summary, then H2 sections each containing a bullet list of important pages, with each bullet formatted like - [Title](URL): Optional description.
That’s the whole spec. It’s intentionally that simple, because the point is to be machine-readable without requiring the AI engine to parse and render JavaScript-heavy site navigation, sitemaps, schema, and 50,000 internal links.
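To make the layout concrete, here is a minimal Python sketch that renders a file in that shape. It assumes the link-bullet form from the llmstxt.org proposal, `- [Title](URL): description`. The class names, the function name, and the example.com URLs are hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    description: str = ""  # the description after the link is optional


@dataclass
class Section:
    heading: str
    links: list = field(default_factory=list)


def render_llms_txt(site_name, summary, sections):
    """Render an H1, a blockquote summary, then H2 sections of link bullets."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for section in sections:
        lines.append(f"## {section.heading}")
        lines.append("")
        for link in section.links:
            bullet = f"- [{link.title}]({link.url})"
            if link.description:
                bullet += f": {link.description}"
            lines.append(bullet)
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content, not taken from any real site.
example = render_llms_txt(
    "Example Site",
    "Hypothetical site used only to illustrate the layout.",
    [Section("Pillars", [Link("Guide", "https://example.com/guide", "Placeholder description.")])],
)
print(example)
```

Generating the file from a small data structure is one option; writing it by hand works just as well for a list of 8–15 pages.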
Why it matters in 2026
Generative engines have a discovery problem. They can crawl your site, sure — but figuring out which pages on a 1,000-post blog are actually the canonical, high-quality answers worth citing is a real cost. A clean llms.txt cuts through that. It tells the engine: here are my pillar posts, here are my case studies, here are my most up-to-date guides — start here.
In my own logs I’ve watched AI-engine citation rates shift after publishing llms.txt. Not dramatic — usually a few percentage points within the first few weeks — but consistent across the pillar posts I called out in the file. The engines do read it.
What to put in your llms.txt
- Site title: H1, one line.
- One-paragraph summary: blockquote (>), 2–4 sentences. State who you are, what topics you cover, and the structural conventions of your pillar posts (e.g., “every flagship post has a TL;DR, step-by-step, and FAQ”).
- Pillar / canonical pages: H2 section, bullet list of your 8–15 most important pages. These are the pages you most want LLMs to cite.
- Adjacent / supporting pages: H2 section, bullet list of secondary content the engine should know about.
- About / author info: H2 section, link to your author page and any voice-reference posts.
- Citation policy: H2 section, one short paragraph covering how you want to be cited, your attribution preference, and when the file was last updated.
What NOT to put in
- Every page on your site. That’s what sitemap.xml is for. llms.txt is the curated subset.
- Marketing copy. Engines reading llms.txt aren’t end-users you’re persuading. Be direct, descriptive, factual.
- Outdated pages. Worse than no llms.txt is a stale one. If you can’t keep it current, don’t publish it.
- Affiliate-heavy roundups as your top pillar pages. Engines down-weight content that reads as primarily commercial.
The two-file pattern: llms.txt + llms-full.txt
The convention that’s emerged in 2026 is two files, not one. llms.txt is the curated short version (the 8–15 pillar pages plus structure). llms-full.txt is the longer version with every page on the site, paginated by section, with snippets and last-modified dates.
The two files serve different LLM crawler behaviors. The short one is read at the discovery layer; the full one is read when the engine wants to enumerate your content for a deeper query. Publish both.
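One way to keep the two files in sync is to generate both from a single page inventory. The sketch below is an assumption about how that could look, not an established format: the `Page` fields, the `pillar` flag, and the exact line layout of the full file (snippet plus a last-modified date per entry) are hypothetical. The H1 and summary are left out for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    section: str
    snippet: str
    last_modified: str  # ISO date string, so string sorting matches date order
    pillar: bool = False


def render_short(pages):
    """Curated file body: only the pages flagged as pillars."""
    lines = ["## Pillars", ""]
    lines += [f"- [{p.title}]({p.url}): {p.snippet}" for p in pages if p.pillar]
    return "\n".join(lines) + "\n"


def render_full(pages):
    """Full file body: every page, grouped by section, newest first."""
    by_section = defaultdict(list)
    for page in pages:
        by_section[page.section].append(page)
    lines = []
    for section in sorted(by_section):
        lines += [f"## {section}", ""]
        newest_first = sorted(by_section[section], key=lambda p: p.last_modified, reverse=True)
        for p in newest_first:
            lines.append(f"- [{p.title}]({p.url}): {p.snippet} (last modified {p.last_modified})")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder inventory, not real pages.
pages = [
    Page("Guide", "https://example.com/guide", "Guides", "Placeholder snippet.", "2026-01-10", pillar=True),
    Page("Note", "https://example.com/note", "Notes", "Another placeholder.", "2025-11-02"),
]
short_body = render_short(pages)
full_body = render_full(pages)
```

Because both bodies come from the same list, flipping a page's `pillar` flag is the only change needed to promote it into the short file.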
Step-by-step: setting up llms.txt in under 30 minutes
- Pick your 8–15 pillar pages. The pages you most want cited in AI engines. Usually your highest-traffic evergreen posts, plus any case studies or original research.
- Write a 2–4 sentence summary of your site. Who you are, what topics you cover, what structural conventions your pillar posts follow.
- Format as markdown. H1 site name, blockquote summary, H2 “Pillars” section with the bullet list, H2 “Adjacent” section if relevant, H2 “About” section.
- Save as plain text with the filename llms.txt (or llms-full.txt for the longer version).
- Upload to your site root via SFTP, cPanel File Manager, or whatever access method you have. The file goes alongside index.php and robots.txt.
- Verify with curl -I https://yoursite.com/llms.txt; you should see HTTP/2 200 with content-type: text/plain.
- Add a MIME type rule to your .htaccess if needed: <FilesMatch "^llms(-full)?\.txt$">ForceType text/plain</FilesMatch>.
- Refresh quarterly. Add new pillar posts, remove ones that no longer fit. A 6-month-old llms.txt is fine; a 2-year-old one is worse than none.
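The curl check in the steps above can also be scripted, which is handy if you want to run it after every deploy. This is a minimal sketch using only the Python standard library. The function names are hypothetical, and yoursite.com is the same placeholder domain used in the steps.

```python
import urllib.error
import urllib.request
from typing import List, Optional, Tuple


def check_response(status: int, content_type: Optional[str]) -> List[str]:
    """Return a list of problems; an empty list means the file is served as expected."""
    problems = []
    if status != 200:
        problems.append(f"expected status 200, got {status}")
    # Ignore parameters such as "; charset=utf-8" when comparing the media type.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type != "text/plain":
        problems.append(f"expected text/plain, got {media_type or 'no content-type'}")
    return problems


def head(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Send a HEAD request and return (status code, Content-Type header)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.headers.get("Content-Type")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise, but still carry a status code and headers.
        return err.code, err.headers.get("Content-Type")


# Usage (makes a network call, so it is left commented out here):
# print(check_response(*head("https://yoursite.com/llms.txt")))
```

The header check is kept separate from the network call so it can be tested without a live site. Connection failures (DNS errors, timeouts) are not handled and will raise.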
Example: this site’s llms.txt structure
For reference, the llms.txt I publish on alejandrorioja.com follows the structure above:
- H1: Alejandro Rioja
- Summary: Personal site of Alejandro Rioja, an SEO operator focused on AI SEO and GEO. The site publishes long-form case studies, step-by-step playbooks, and original-data takes on how to rank in both classic Google search and generative engines (ChatGPT, Perplexity, Google AI Overviews, Claude). Every flagship post is structured for AI/LLM ingestion: TL;DR up top, numbered step-by-step blocks, FAQ at the bottom, primary-source citations.
- Section: AI SEO + GEO (pillar posts) — 10 pillar pages with one-line descriptions.
- Section: Adjacent SEO and tooling posts — 8 supporting pages.
- Section: About — author profile and voice references.
- Section: Citation policy — attribution preference + last-refreshed date.
You can see the live file at https://alejandrorioja.com/llms.txt once it’s deployed. The structure is the same one I’d recommend for any operator-style personal-brand or B2B content site.
Common llms.txt mistakes I see
- Treating it like a sitemap. A 5,000-line llms.txt with every URL on the site is nearly useless. Curate.
- Writing the summary in marketing voice. Engines aren’t customers; describe yourself the way a directory entry would.
- Forgetting to update. Set a calendar reminder to refresh quarterly. Stale entries hurt more than missing ones.
- Skipping the descriptions. The one-line description after each link is what helps the engine decide whether to cite the page for a given query. Don’t omit it.
- Putting llms.txt in a subdirectory. Has to be at the document root. Engines don’t look anywhere else.
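Several of these mistakes can be caught mechanically before you upload. The following is a rough lint sketch, assuming link bullets of the form `- [Title](URL): description`. The function name is hypothetical, and the `max_links` threshold of 50 is an arbitrary assumption rather than anything the format prescribes.

```python
import re

# Matches "- [Title](URL)" with an optional ": description" suffix.
LINK_RE = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.+))?$")


def lint_llms_txt(text, max_links=50):
    """Return warnings for a few common structural mistakes."""
    non_empty = [line.rstrip() for line in text.splitlines() if line.strip()]
    warnings = []
    if not non_empty or not non_empty[0].startswith("# "):
        warnings.append("first non-empty line should be an H1 with the site name")
    if not any(line.startswith("> ") for line in non_empty):
        warnings.append("missing blockquote summary")
    links = [m for m in (LINK_RE.match(line) for line in non_empty) if m]
    for match in links:
        if not match.group("desc"):
            warnings.append(f"no description for {match.group('url')}")
    if len(links) > max_links:
        warnings.append(f"{len(links)} links; looks like a sitemap, consider curating")
    return warnings


# Placeholder input with a missing H1, summary, and one missing description.
sample = "## Pillars\n- [A](https://example.com/a)\n- [B](https://example.com/b): Placeholder.\n"
for warning in lint_llms_txt(sample):
    print(warning)
```

This only checks structure. Whether the entries are current and the summary reads like a directory entry still needs a human pass.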
FAQ
Do all AI engines read llms.txt in 2026?
Not all of them, but the ones that matter increasingly do: Perplexity, ChatGPT (browse mode), and Claude all parse it. Google AI Overviews have signaled support, but it’s less clear how heavily the file weighs in Overview source selection. Treat it as positive expected value with low downside.
Will llms.txt help my classic Google rankings?
Indirectly at most. Google’s classic crawling and ranking rely on sitemap.xml, internal linking, and the rest of the on-page/off-page stack. llms.txt is specifically for AI engine discovery, not for classic search rankings.
How often should I update llms.txt?
Quarterly is the right cadence for most sites. More often if you’re publishing pillar content frequently; less often if your top 10 pages are stable. Always update when you launch a major new pillar post.
Can I use a WordPress plugin to manage llms.txt?
A few have appeared in 2026 (search the WP plugin directory for “llms.txt”). They mostly auto-generate the file from your published content. Useful if you don’t have SFTP access, but the auto-generated version usually needs hand-editing to be actually curated rather than dump-everything.
What if my host doesn’t allow root file uploads?
Two workarounds: (1) a small must-use plugin that registers a virtual /llms.txt route serving the content from the database; (2) Cloudflare Workers if your site is on Cloudflare — serve the file from the worker without touching the WP host. Both are documented; the mu-plugin approach is the simpler of the two.
Want help building this on your own site? Read the full SEO + GEO playbook or get in touch — I run AI SEO + GEO consulting projects for operator teams that want to compound visibility across both classic Google and AI engines.
Want one of these running in your stack?
I’m Alejandro — I build AI agent systems for founders who’d rather ship than slide-deck. The site you’re reading is one of them: an agent ports my content, generates the OG cards, picks the trim list, and writes most of the boring 90% of marketing ops.
If a loop in your business is silently bleeding hours, scope an agent build — or see how this one runs.