The AEO Field Guide — How to Get Cited by Claude, ChatGPT, Perplexity & Gemini in 2026
Eight chapters covering what Answer Engine Optimization actually is, the five-layer stack you ship, the new robots ecosystem, citation-worthy content patterns, what you can measure, and a 90-day rollout for service businesses.
Last updated 22 May 2026 by Waseem Nasir, founder of SkynetLabs (Bali).
Direct answer
Answer Engine Optimization (AEO) is the practice of structuring website content so generative search interfaces — Claude, ChatGPT, Perplexity, Google's AI Overview, and Gemini — cite your page inside the natural-language answer they show users. AEO ranks claims inside documents, not whole documents like classic SEO. The five-layer stack to ship is: schema markup, citation-worthy content (definitions, comparison tables, decision trees), named-author authority, distribution on Reddit and Substack, and freshness (dateModified plus quarterly refreshes).
The complete AEO stack: schema, content, authority, distribution, freshness, shipped in that order. (Source: SkynetLabs, 2026)
Typical time from publishing AEO-tuned content to first citation in Claude, ChatGPT, Perplexity, or Gemini. (Source: SkynetLabs client tracking, 2026)
Maximum length for the direct-answer paragraph at the top of any AEO page if you want it cited. (Source: SkynetLabs AEO Field Guide)
LLM user-agents to know in 2026: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot, Bytespider. (Source: verified robots.txt logs, 2026)
1. What AEO actually is (and isn't)
Answer Engine Optimization is the practice of structuring content so generative search interfaces — Claude with web search, ChatGPT search, Perplexity, Google's AI Overview, Gemini, Bing Copilot — cite your page inside the natural-language answer they hand to the user. It's the post-2024 evolution of SEO, shaped by the reality that the user no longer always clicks through to ten blue links.
AEO is not a rebrand of SEO with a fresh logo. The mechanics differ at the foundation. SEO ranks documents; AEO ranks claims inside documents. A page can be ranked third on Google and never be cited by Perplexity, because Perplexity needs a clean, directly-answered claim, not a long-form essay buried under a recipe story. The new unit of optimization is the paragraph that answers the question in one breath.
AEO is also not separate infrastructure. Same site, same CMS, same hosting. What changes is the way you write, what you mark up with structured data, where you publish, and how you reason about freshness. Most teams that try to "do AEO as a side project" without rethinking content patterns end up shipping the same blog posts they were shipping in 2022 and wondering why the citations don't come.
2. Why your SEO playbook stops working in LLM search
The old playbook — pick a keyword, write 1,800 words, build backlinks, wait — was tuned for an algorithm that read your page and scored it. The new playbook has to account for a model that reads your page, summarizes it in its own words, and decides whether your claim is the one to cite. Different game, different rules.
| Dimension | SEO (2015–2023) | AEO (2024–present) |
|---|---|---|
| Unit of ranking | Page (URL) | Claim inside a page |
| Primary signal | Backlinks | Entity mentions + structured data + authorship |
| Optimal length | 1,500–2,500 words | Layered — short answer up top, depth below |
| Keyword strategy | Target volume, write thin matches | Cover the entity graph, write definitions |
| Content format | Long-form essays | Definitions, comparison tables, decision trees, lists |
| Freshness signal | Recent publish date helps | dateModified + visible "Last updated" is mandatory |
| Authority source | High-DA referring domains | Reddit, Wikipedia, niche forums, named expert authorship |
| Click model | SERP click-through to your page | Brand mention in synthesized answer, no click required |
| Measurement window | 3–6 months to first ranking | 1–8 weeks to first citation |
| Crawler ecosystem | Googlebot, Bingbot | GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot, plus the originals |
The rightmost column is the playbook for the next five years. The SEO column will keep working for navigational queries ("nike shoes") and high-intent commercial keywords, but for informational queries, which are most of what brings traffic, the answer engines are eating the share.
3. The 5-layer AEO stack
When we ship an AEO engagement for a client, we run through five layers in order. Skipping a layer is the most common reason citations don't materialize. Each layer compounds on the one below it.
Layer 1 — Schema
Every page needs JSON-LD that describes what it is at the entity level. Article for posts, FAQPage for Q/A blocks, HowTo for procedures, Product for commerce, Organization plus Person for brand pages, DefinedTermSet for glossaries. Without this, an LLM has to guess what your page is. With it, the model gets a labeled handle it can use to disambiguate you from competitors. Validate every schema block in Google's Rich Results Test before deploying. One broken schema block can disqualify the whole page from rich-result eligibility.
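As a rough illustration of the schema layer, the sketch below builds an Article JSON-LD object and wraps it in the script tag that would sit in the page head. The helper names, the author URL, and the publish date are hypothetical placeholders, not values from a real deployment; the output still needs to go through a validator before shipping.

```python
import json


def article_jsonld(headline, url, author_name, author_url, published, modified):
    """Build a schema.org Article object as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "mainEntityOfPage": url,
        "author": {"@type": "Person", "name": author_name, "url": author_url},
        "datePublished": published,
        "dateModified": modified,
    }


def as_script_tag(data):
    """Serialize the dict into a JSON-LD <script> tag for the page head."""
    # Escape "</" so a string value can never close the script element early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


doc = article_jsonld(
    headline="The AEO Field Guide",
    url="https://yourdomain.com/aeo-guide",
    author_name="Waseem Nasir",
    author_url="https://yourdomain.com/about/waseem-nasir",  # hypothetical URL
    published="2026-01-15",  # hypothetical publish date
    modified="2026-05-22",
)
print(as_script_tag(doc))
```

Building the object in code rather than hand-editing templates makes it harder to ship a block with a missing brace or a stale date.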
Layer 2 — Content
The content layer is where AEO is won or lost. The rule we apply: every page must answer at least one specific question in the first 80 words, then expand. The answer goes in plain prose, not in a "What is X" heading followed by a feature list. If the topic warrants depth, the depth lives below the answer, not above it. Long-form is fine as long as the citable claim is at the top. Definitions, comparison tables, decision trees, and case studies with named outcomes get cited more often than essays.
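One way to enforce the 80-word rule during editing is a small lint step. The sketch below is a heuristic only: it assumes the page body is available as plain text with blank lines between paragraphs, and it treats the first paragraph as the direct answer. The function names are illustrative.

```python
import re

MAX_ANSWER_WORDS = 80  # the limit named in the content rule above


def first_paragraph(text):
    """Return the first non-empty paragraph of a plain-text page body."""
    for block in re.split(r"\n\s*\n", text.strip()):
        if block.strip():
            return block.strip()
    return ""


def answer_word_count(text):
    """Count whitespace-separated words in the opening paragraph."""
    return len(first_paragraph(text).split())


def fits_direct_answer(text, limit=MAX_ANSWER_WORDS):
    """True when the opening paragraph exists and stays within the limit."""
    count = answer_word_count(text)
    return 0 < count <= limit


page = (
    "AEO is the practice of structuring content so answer engines cite it.\n\n"
    "The longer explanation, examples, and caveats live below the answer."
)
print(answer_word_count(page), fits_direct_answer(page))
```

A word count cannot tell whether the paragraph actually answers the question, so this complements an editorial read rather than replacing it.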
Layer 3 — Authority
LLMs weigh authorship signals heavily because they were trained on the open web where most spam is anonymous and most expertise has a name attached. Every page needs an author byline that links to a real Person page with credentials, social profiles, and writing history. Schema.org Person with sameAs links to LinkedIn, GitHub, Twitter, and other places the same human posts is the spine of authority. A glossary by "the editorial team" gets cited at half the rate of a glossary by "Waseem Nasir, founder of SkynetLabs, Bali."
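The Person markup described above might look like the following sketch. The profile URLs and the author page URL are placeholders invented for illustration; swap in the real profiles the author actually posts from.

```python
import json


def person_jsonld(name, job_title, org_name, page_url, profiles):
    """Build a schema.org Person object with sameAs links to other profiles."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "jobTitle": job_title,
        "worksFor": {"@type": "Organization", "name": org_name},
        "url": page_url,
        # dict.fromkeys drops duplicate URLs while keeping the original order.
        "sameAs": list(dict.fromkeys(profiles)),
    }


author = person_jsonld(
    name="Waseem Nasir",
    job_title="Founder",
    org_name="SkynetLabs",
    page_url="https://yourdomain.com/about/waseem-nasir",  # hypothetical URL
    profiles=[
        "https://www.linkedin.com/in/example-profile",  # placeholder
        "https://github.com/example-profile",  # placeholder
        "https://www.linkedin.com/in/example-profile",  # duplicate, removed
    ],
)
print(json.dumps(author, indent=2))
```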
Layer 4 — Distribution
Citations follow mentions. If your brand is named in three Reddit threads, two Substack posts, a GitHub README, and a Hacker News comment thread, an LLM trained or augmented on that corpus has many independent signals that you exist and are in the topic. Distribution work is unglamorous — answering questions on niche subreddits, writing on Substack, getting linked from someone else's tool comparison — but it compounds. Treat it like real publishing, not link-building.
Layer 5 — Freshness
Answer engines prefer recently-updated content. The dateModified field in JSON-LD and a visible "Last updated [date]" line near the top of the article are both signals. Beyond that, rotate updates: refresh your glossary quarterly, your comparison pages every six months, your case studies annually. A glossary written in 2024 that hasn't been touched since loses citation share to a glossary refreshed in 2026 — even if the 2024 version is technically better written. Engines treat staleness as a quality decay signal.
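The refresh rotation above can be turned into a simple staleness report. This is a minimal sketch: the page inventory, the type labels, and the day counts (91, 182, and 365 as stand-ins for quarterly, six-monthly, and annual) are assumptions for illustration.

```python
from datetime import date, timedelta

# Approximate day counts for the quarterly / six-month / annual rotation.
REFRESH_DAYS = {"glossary": 91, "comparison": 182, "case_study": 365}


def is_stale(page_type, date_modified, today):
    """True when the page is older than the refresh window for its type."""
    return today - date_modified > timedelta(days=REFRESH_DAYS[page_type])


def stale_pages(pages, today):
    """Return the URLs of pages that are due for a refresh, in input order."""
    return [
        page["url"]
        for page in pages
        if is_stale(page["type"], date.fromisoformat(page["dateModified"]), today)
    ]


# Hypothetical inventory; dateModified mirrors the JSON-LD field of each page.
inventory = [
    {"url": "https://yourdomain.com/glossary", "type": "glossary", "dateModified": "2026-01-10"},
    {"url": "https://yourdomain.com/n8n-vs-zapier", "type": "comparison", "dateModified": "2026-02-01"},
    {"url": "https://yourdomain.com/case-studies", "type": "case_study", "dateModified": "2025-03-01"},
]
due = stale_pages(inventory, today=date(2026, 5, 22))
print(due)
```

Passing `today` explicitly keeps the check deterministic, which makes it easy to run in a scheduled job or a test.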
4. llms.txt and the new robots ecosystem
The crawler ecosystem in 2026 is broader than it was three years ago. Beyond Googlebot and Bingbot, there are now half a dozen LLM-specific user-agents you should know by name. The decision of whom to allow or block goes in robots.txt at the root of your domain, with an optional llms.txt at the root for finer-grained guidance.
| User-agent | Operator | What it does | Recommendation |
|---|---|---|---|
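| GPTBot | OpenAI | Crawls for OpenAI model training (ChatGPT search and browsing use the separate OAI-SearchBot and ChatGPT-User agents) | Allow (most cases) |
| ClaudeBot | Anthropic | Crawls for Claude training (search and user-requested fetches use the separate Claude-SearchBot and Claude-User agents) | Allow |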
| PerplexityBot | Perplexity | Crawls source pages cited in answers | Allow — fastest payoff |
| Google-Extended | Google | Separate token for Gemini training data | Allow for AEO; block if paywalled |
| CCBot | Common Crawl | Public dataset used by many LLMs | Allow |
| Bytespider | ByteDance | TikTok / Doubao training crawler | Block unless targeting that market |
A reasonable robots.txt for a service business looks like this:
```
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Allow: /

User-agent: Bytespider
Disallow: /

Sitemap: https://yourdomain.com/sitemap.xml
```
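Before deploying, it can help to confirm that the rules resolve the way you expect. The sketch below feeds the same robots.txt to Python's standard-library parser and asks it about each user-agent. The parser is a stand-in for how a compliant crawler would read the file, not a guarantee of how any specific crawler behaves.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Allow: /

User-agent: Bytespider
Disallow: /

Sitemap: https://yourdomain.com/sitemap.xml
"""

AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot", "Bytespider"]


def allowed_agents(robots_txt, agents, url):
    """Map each user-agent to whether the rules let it fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


report = allowed_agents(ROBOTS_TXT, AGENTS, "https://yourdomain.com/glossary")
for agent, allowed in report.items():
    print(f"{agent}: {'allow' if allowed else 'block'}")
```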
Then add an llms.txt at the root listing your canonical citable pages in markdown:
```markdown
# SkynetLabs

> AI automation, n8n workflows, AEO/SEO, chatbot development. Bali-based, ships globally.

## Core pages

- [Services](https://yourdomain.com/services): What we build and how
- [Case studies](https://yourdomain.com/case-studies): Anonymized client outcomes
- [Glossary](https://yourdomain.com/glossary): 50 defined terms
- [FAQs](https://yourdomain.com/faqs): 30 founder questions answered

## Reference

- [AEO Field Guide](https://yourdomain.com/aeo-guide): How to get cited by LLMs
- [n8n vs Zapier](https://yourdomain.com/n8n-vs-zapier): Honest comparison
```
llms.txt is not enforced by every engine yet but Claude, Perplexity, and a handful of newer answer engines have started honoring it. The cost to add is 30 minutes; the upside is real for any site with more than 20 pages.
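For sites where the page list changes often, the file can be generated rather than hand-maintained. The following is a minimal sketch that renders the same layout shown above (title, one-line summary, sectioned link lists); the function name and the input structure are assumptions for illustration.

```python
def render_llms_txt(title, summary, sections):
    """Render an llms.txt body: title, blockquote summary, sectioned link lists."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


llms_txt = render_llms_txt(
    title="SkynetLabs",
    summary="AI automation, n8n workflows, AEO/SEO, chatbot development.",
    sections={
        "Core pages": [
            ("Services", "https://yourdomain.com/services", "What we build and how"),
            ("Glossary", "https://yourdomain.com/glossary", "50 defined terms"),
        ],
        "Reference": [
            ("AEO Field Guide", "https://yourdomain.com/aeo-guide", "How to get cited by LLMs"),
        ],
    },
)
print(llms_txt)
```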
5. Citation-worthy content patterns
Across the hundreds of citations we audited on the four major answer engines, seven content patterns appear repeatedly. Build pages in these shapes and you stack the deck in your favor.
- How-to with numbered steps. "How to migrate from Zapier to n8n in 7 days" with 7 steps, each with a screenshot and a code block. Answer engines pluck individual steps as citations for procedural queries. Mark up with HowTo schema.
- Glossary with defined terms. Alphabetical list of 30–100 terms with 2–3 sentence definitions. Each term gets its own anchor ID, each definition is self-contained. Use DefinedTermSet + DefinedTerm schema. Gets cited heavily for "what is X" queries.
- Comparison tables. "n8n vs Zapier" with 15+ rows of clearly differentiated criteria. The table itself is what gets cited, often verbatim, because it's the densest answer to a comparative question.
- Decision tree. "Should I use n8n or Zapier?" with 5–7 yes/no nodes that funnel to a recommendation. Decision trees rarely get pasted verbatim but the recommendations at the leaves get cited.
- Named case study with metrics. "How we cut Grand Mercer Dental's no-show rate from 18% to 6% in 90 days." Concrete name, concrete metric, concrete time window. Case studies without names read like fiction to a model and don't get cited.
- Data table with sources. A table of pricing tiers, error rates, latency benchmarks, etc., with footnoted source links. Models love data tables because they're factual scaffolding for an answer.
- Definition list with examples. Pattern: term, one-line definition, example, counter-example. Repeats well across 20–50 entries and is structurally easy for a model to slice into citations.
One pattern that gets ignored: the 2,500-word "Ultimate Guide" with 14 H2 sections and a recipe-style introduction. Models will index it but rarely cite from it because the citable claims are buried.
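To make the glossary pattern concrete, here is a sketch that turns a term-to-definition mapping into DefinedTermSet markup, giving each term an anchor-style identifier. The slug rule and the two sample entries are illustrative assumptions; the anchor IDs in the HTML would need to match whatever the generator emits.

```python
import json
import re


def slugify(term):
    """Lowercase the term and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", term.lower()).strip("-")


def glossary_jsonld(name, page_url, terms):
    """Build a DefinedTermSet whose terms point at per-term anchors on the page."""
    ordered = sorted(terms.items(), key=lambda item: item[0].lower())
    return {
        "@context": "https://schema.org",
        "@type": "DefinedTermSet",
        "@id": page_url,
        "name": name,
        "hasDefinedTerm": [
            {
                "@type": "DefinedTerm",
                "@id": f"{page_url}#{slugify(term)}",
                "name": term,
                "description": definition,
                "inDefinedTermSet": page_url,
            }
            for term, definition in ordered
        ],
    }


glossary = glossary_jsonld(
    name="AEO Glossary",  # hypothetical glossary title
    page_url="https://yourdomain.com/glossary",
    terms={
        "llms.txt": "An optional root-level markdown file listing a site's canonical citable pages.",
        "Answer Engine Optimization": "Structuring content so generative search interfaces cite it in their answers.",
    },
)
print(json.dumps(glossary, indent=2))
```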
6. Measuring AEO: what you can and can't track
The measurement problem is real: there is no Google Search Console for LLM citations. You cannot pull a definitive list of every place ChatGPT mentioned your brand last week. What you can do is approximate, triangulate, and trend.
What you can track:
- Perplexity Pages — Perplexity exposes a public page for many of its searches; if your domain appears as a citation, you can find it via brand-name searches and watch frequency over time.
- Brand mention queries — Manually query Claude, ChatGPT, Perplexity, and Gemini weekly with a fixed set of 10–20 questions in your domain, then log whether your brand is cited. Spreadsheet, not magic.
- Referrer logs — Look for chatgpt.com (formerly chat.openai.com), perplexity.ai, claude.ai, and gemini.google.com in your analytics referrer report. Volume is small but the trend is signal.
- Brand search volume — A delayed indicator: if AEO is working, branded search ("skynetlabs n8n") rises in Google Search Console even when direct citations are hard to count.
- Citation tracking tools — Tools like citelift.app (one we built — tracks brand mentions across the four major engines on a daily cron with a weekly markdown report) and a handful of competitors automate the manual-query process.
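The referrer-log check in the list above can be scripted against an analytics export. This sketch assumes the export is a list of referrer URLs; the host table includes chatgpt.com as an added assumption, on the basis that ChatGPT traffic now arrives from that host alongside the older chat.openai.com.

```python
from collections import Counter
from urllib.parse import urlparse

ANSWER_ENGINE_HOSTS = {
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",  # assumption: current ChatGPT host
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
}


def engine_for(referrer):
    """Return the answer engine behind a referrer URL, or None if unknown."""
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return ANSWER_ENGINE_HOSTS.get(host)


def count_engine_referrals(referrers):
    """Tally visits per answer engine, ignoring every other referrer."""
    return Counter(engine for engine in map(engine_for, referrers) if engine)


# Hypothetical rows from a referrer report.
sample = [
    "https://www.perplexity.ai/search?q=n8n+vs+zapier",
    "https://chatgpt.com/",
    "https://chat.openai.com/",
    "https://www.google.com/",
    "https://claude.ai/chat/abc",
    "-",
]
counts = count_engine_referrals(sample)
print(dict(counts))
```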
What you can't yet track:
- Total citations across all four engines for an arbitrary query universe (no engine publishes this)
- Per-citation click-through (citations don't always link, and when they do, the click-through is collapsed with regular referral traffic)
- The exact reason a model picked your source over a competitor's — no transparency into the ranking signals
The honest framing for clients is that AEO measurement in 2026 is where SEO measurement was in 2006: directional, manual, and dependent on the discipline to track weekly. The teams that win are the teams that bother.
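The weekly manual-query log can stay a spreadsheet and still produce a trend line. The sketch below assumes each row records the week, the engine, the question asked, and whether the brand was cited, then computes a citation rate per engine per week. The row layout and sample data are hypothetical.

```python
from collections import defaultdict


def citation_rates(log):
    """Map (week, engine) to the share of tracked questions that cited the brand."""
    asked = defaultdict(int)
    cited = defaultdict(int)
    for week, engine, _question, was_cited in log:
        asked[(week, engine)] += 1
        cited[(week, engine)] += int(was_cited)
    return {key: cited[key] / asked[key] for key in asked}


# Hypothetical log rows: (ISO week, engine, question, brand cited?)
log = [
    ("2026-W20", "Perplexity", "what is AEO", True),
    ("2026-W20", "Perplexity", "n8n vs zapier", False),
    ("2026-W20", "Claude", "what is AEO", False),
    ("2026-W21", "Perplexity", "what is AEO", True),
    ("2026-W21", "Perplexity", "n8n vs zapier", True),
    ("2026-W21", "Claude", "what is AEO", False),
]
rates = citation_rates(log)
for (week, engine), rate in sorted(rates.items()):
    print(f"{week} {engine}: {rate:.0%}")
```

Keeping the question set fixed from week to week is what makes the rate comparable over time.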
7. 30/60/90-day AEO rollout for a service business
For a service business with an existing site (10–50 pages) that wants to start getting cited inside 90 days, this is the order we ship. Adjust by 1–2 weeks if your CMS is slow to work with; everything else stays the same.
This is sequential by design. Schema before content because schema makes the content findable. Content before authority because there's no point boosting authority of empty pages. Distribution last because it amplifies what already exists.
8. AEO mistakes that kill citations