Waseem Nasir

SkynetLabs · Field Notes · 2026

The AEO Field Guide — How to Get Cited by Claude, ChatGPT, Perplexity & Gemini in 2026

Eight chapters covering what Answer Engine Optimization actually is, the five-layer stack you ship, the new robots ecosystem, citation-worthy content patterns, what you can measure, and a 90-day rollout for service businesses.

Last updated 22 May 2026 by Waseem Nasir, founder of SkynetLabs (Bali).

Direct answer

Answer Engine Optimization (AEO) is the practice of structuring website content so that generative search interfaces (Claude, ChatGPT, Perplexity, Google's AI Overviews, and Gemini) cite your page inside the natural-language answer they show users. Classic SEO ranks whole documents; AEO ranks claims inside documents. The five-layer stack to ship is: schema markup, citation-worthy content (definitions, comparison tables, decision trees), named-author authority, distribution on Reddit and Substack, and freshness (dateModified plus quarterly refreshes).

5 layers

The complete AEO stack: schema, content, authority, distribution, freshness — shipped in that order.

SkynetLabs, 2026

2–12 weeks

Typical time from publishing AEO-tuned content to first citation in Claude, ChatGPT, Perplexity, or Gemini.

SkynetLabs client tracking, 2026

≤80 words

Maximum length for the direct-answer paragraph at the top of any AEO page if you want it cited.

SkynetLabs AEO Field Guide

6 crawlers

LLM user-agents to know in 2026: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot, Bytespider.

Verified robots.txt logs, 2026

1. What AEO actually is (and isn't)

Answer Engine Optimization is the practice of structuring content so that generative search interfaces (Claude with web search, ChatGPT search, Perplexity, Google's AI Overviews, Gemini, and Bing Copilot) cite your page inside the natural-language answer they hand to the user. It's the post-2024 evolution of SEO, shaped by the reality that users no longer always click through from a page of ten blue links.

AEO is not a rebrand of SEO with a fresh logo. The mechanics differ at the foundation. SEO ranks documents; AEO ranks claims inside documents. A page can rank third on Google and never be cited by Perplexity, because Perplexity needs a clean, directly answered claim, not a claim buried in a long-form essay under a recipe-style story. The new unit of optimization is the paragraph that answers the question in one breath.

AEO is also not separate infrastructure. Same site, same CMS, same hosting. What changes is the way you write, what you mark up with structured data, where you publish, and how you reason about freshness. Most teams that try to "do AEO as a side project" without rethinking content patterns end up shipping the same blog posts they were shipping in 2022 and wondering why the citations don't come.

2. Why your SEO playbook stops working in LLM search

The old playbook — pick a keyword, write 1,800 words, build backlinks, wait — was tuned for an algorithm that read your page and scored it. The new playbook has to account for a model that reads your page, summarizes it in its own words, and decides whether your claim is the one to cite. Different game, different rules.

Dimension	SEO (2015–2023)	AEO (2024–present)
Unit of ranking	Page (URL)	Claim inside a page
Primary signal	Backlinks	Entity mentions + structured data + authorship
Optimal length	1,500–2,500 words	Layered — short answer up top, depth below
Keyword strategy	Target volume, write thin matches	Cover the entity graph, write definitions
Content format	Long-form essays	Definitions, comparison tables, decision trees, lists
Freshness signal	Recent publish date helps	dateModified + visible "Last updated" is mandatory
Authority source	High-DA referring domains	Reddit, Wikipedia, niche forums, named expert authorship
Click model	SERP click-through to your page	Brand mention in synthesized answer, no click required
Measurement window	3–6 months to first ranking	2–12 weeks to first citation
Crawler ecosystem	Googlebot, Bingbot	GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot, plus the originals

The AEO column is the playbook for the next five years. The SEO column will keep working for navigational queries ("nike shoes") and high-intent commercial keywords, but for informational queries, which bring in most traffic, the answer engines are eating the share.

3. The 5-layer AEO stack

When we ship an AEO engagement for a client, we run through five layers in order. Skipping a layer is the most common reason citations don't materialize. Each layer compounds on the one below it.

Layer 1 — Schema: Machine-readable foundations

Layer 2 — Content: Citation-worthy answers, definitions, comparisons

Layer 3 — Authority: Named authorship + external entity mentions

Layer 4 — Distribution: Reddit, Substack, LinkedIn, GitHub README mentions

Layer 5 — Freshness: dateModified, recurring updates, version flags

Layer 1 — Schema

Every page needs JSON-LD that describes what it is at the entity level. Article for posts, FAQPage for Q/A blocks, HowTo for procedures, Product for commerce, Organization plus Person for brand pages, DefinedTermSet for glossaries. Without this, an LLM has to guess what your page is. With it, the model gets a labeled handle it can use to disambiguate you from competitors. Validate every schema block before deploying: Google's Rich Results Test covers the types Google renders as rich results, and the Schema Markup Validator at validator.schema.org checks the rest, such as DefinedTermSet. One broken schema block can disqualify the whole page from rich-result eligibility.
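As a rough illustration of the schema layer, here is a minimal Python sketch that builds an Article block and wraps it in the script tag a template would emit. The function names, the yourdomain.com URL, and the publish date are assumptions made up for the example; a real CMS plugin or template would normally generate this for you.

```python
import json
from datetime import date


def article_jsonld(headline: str, url: str, author_name: str,
                   published: date, modified: date) -> dict:
    """Build a schema.org Article object as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }


def as_script_tag(data: dict) -> str:
    """Wrap JSON-LD in the script tag that goes in the page head."""
    # Escape "</" so a stray "</script>" inside a string cannot close the tag.
    body = json.dumps(data, indent=2, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


article = article_jsonld(
    headline="The AEO Field Guide",
    url="https://yourdomain.com/aeo-guide",  # placeholder domain
    author_name="Waseem Nasir",
    published=date(2026, 1, 15),             # hypothetical publish date
    modified=date(2026, 5, 22),
)
print(as_script_tag(article))
```

The output still needs to go through a validator; the sketch only guarantees well-formed JSON, not that the properties satisfy any engine's requirements.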

Layer 2 — Content

The content layer is where AEO is won or lost. The rule we apply: every page must answer at least one specific question in the first 80 words, then expand. The answer goes in plain prose, not in a "What is X" heading followed by a feature list. If the topic warrants depth, the depth lives below the answer, not above it. Long-form is fine as long as the citable claim is at the top. Definitions, comparison tables, decision trees, and case studies with named outcomes get cited more often than essays.
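The 80-word rule is easy to lint for. The sketch below assumes the direct answer is the first paragraph of the page text and that paragraphs are separated by blank lines; both are simplifications, and the function names are hypothetical.

```python
import re

MAX_ANSWER_WORDS = 80  # limit taken from the direct-answer rule in the text


def first_paragraph(text: str) -> str:
    """Return the first non-empty paragraph, with whitespace collapsed."""
    for block in re.split(r"\n\s*\n", text.strip()):
        if block.strip():
            return " ".join(block.split())
    return ""


def answer_word_count(text: str) -> int:
    """Count whitespace-separated words in the opening paragraph."""
    return len(first_paragraph(text).split())


def answers_up_front(text: str, limit: int = MAX_ANSWER_WORDS) -> bool:
    """True if the page opens with a non-empty paragraph within the limit."""
    return 0 < answer_word_count(text) <= limit


page = """Answer Engine Optimization is the practice of structuring content so answer engines cite it.

The depth lives below the answer: history, comparisons, case studies.
"""
print(answer_word_count(page), answers_up_front(page))
```

A check like this only measures length and position. Whether the paragraph actually answers the question still needs a human read.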

Layer 3 — Authority

LLMs weigh authorship signals heavily because they were trained on the open web where most spam is anonymous and most expertise has a name attached. Every page needs an author byline that links to a real Person page with credentials, social profiles, and writing history. Schema.org Person with sameAs links to LinkedIn, GitHub, Twitter, and other places the same human posts is the spine of authority. A glossary by "the editorial team" gets cited at half the rate of a glossary by "Waseem Nasir, founder of SkynetLabs, Bali."
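A Person block with sameAs links might be assembled like this. It is a sketch only: the profile URLs and the /author/your-name path are placeholders, and `person_jsonld` is a hypothetical helper, not part of any library.

```python
import json


def person_jsonld(name: str, job_title: str, org: str,
                  page_url: str, same_as: list) -> dict:
    """Build a schema.org Person object for an author page."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "jobTitle": job_title,
        "worksFor": {"@type": "Organization", "name": org},
        "url": page_url,
        # sameAs ties this page to the other profiles of the same human.
        "sameAs": sorted(set(same_as)),
    }


person = person_jsonld(
    name="Waseem Nasir",
    job_title="Founder",
    org="SkynetLabs",
    page_url="https://yourdomain.com/author/your-name",  # placeholder
    same_as=[
        "https://www.linkedin.com/in/your-handle",       # placeholder
        "https://github.com/your-handle",                # placeholder
        "https://github.com/your-handle",                # duplicate, dropped
    ],
)
print(json.dumps(person, indent=2))
```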

Layer 4 — Distribution

Citations follow mentions. If your brand is named in three Reddit threads, two Substack posts, a GitHub README, and a Hacker News comment thread, an LLM trained or augmented on that corpus has many independent signals that you exist and are in the topic. Distribution work is unglamorous — answering questions on niche subreddits, writing on Substack, getting linked from someone else's tool comparison — but it compounds. Treat it like real publishing, not link-building.

Layer 5 — Freshness

Answer engines prefer recently updated content. The dateModified field in JSON-LD and a visible "Last updated [date]" line near the top of the article are both signals. Beyond that, rotate updates: refresh your glossary quarterly, your comparison pages every six months, your case studies annually. A glossary written in 2024 that hasn't been touched since loses citation share to a glossary refreshed in 2026, even if the 2024 version is technically better written. Engines treat staleness as a quality decay signal.
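The refresh rotation can be turned into a simple due-list. The sketch below approximates the three cadences as day counts (91, 182, 365), which is an assumption; the page list and its dates are invented for illustration.

```python
from datetime import date, timedelta

# Cadences from the text, approximated in days:
# glossary quarterly, comparison pages every six months, case studies annually.
REFRESH_DAYS = {"glossary": 91, "comparison": 182, "case_study": 365}


def is_stale(kind: str, last_modified: date, today: date) -> bool:
    """True if the page has gone longer than its cadence without an update."""
    return today - last_modified > timedelta(days=REFRESH_DAYS[kind])


def stale_pages(pages, today: date) -> list:
    """pages holds (url, kind, last_modified) tuples; returns URLs due a refresh."""
    return [url for url, kind, modified in pages if is_stale(kind, modified, today)]


# Hypothetical inventory; in practice this would come from the CMS or sitemap.
PAGES = [
    ("/glossary", "glossary", date(2026, 1, 10)),
    ("/n8n-vs-zapier", "comparison", date(2026, 2, 1)),
    ("/case-studies/clinic", "case_study", date(2025, 4, 1)),
]

due = stale_pages(PAGES, today=date(2026, 5, 22))
print(due)
```

The list says which pages are due, nothing more. The dateModified value should only change when the content does.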

4. llms.txt and the new robots ecosystem

The crawler ecosystem in 2026 is broader than it was three years ago. Beyond Googlebot and Bingbot, there are now half a dozen LLM-specific user-agents you should know by name. The decision of whom to allow or block goes in robots.txt at the root of your domain, with an optional llms.txt at the root for finer-grained guidance.

User-agent	Operator	What it does	Recommendation
GPTBot	OpenAI	Crawls for OpenAI model training (ChatGPT search and browsing use the separate OAI-SearchBot and ChatGPT-User agents)	Allow (most cases)
ClaudeBot	Anthropic	Crawls for Claude training + web search	Allow
PerplexityBot	Perplexity	Crawls source pages cited in answers	Allow — fastest payoff
Google-Extended	Google	Separate token for Gemini training data	Allow for AEO; block if paywalled
CCBot	Common Crawl	Public dataset used by many LLMs	Allow
Bytespider	ByteDance	TikTok / Doubao training crawler	Block unless targeting that market

A reasonable robots.txt for a service business looks like this:

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Allow: /

User-agent: Bytespider
Disallow: /

Sitemap: https://yourdomain.com/sitemap.xml
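One way to sanity-check a file like the one above before deploying is Python's standard-library `urllib.robotparser`. The sketch inlines the rules so it runs offline; the test URL is a placeholder, and Python's parser may not match every crawler's own interpretation of edge cases.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the text, inlined so the check needs no network access.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: CCBot
Allow: /

User-agent: Bytespider
Disallow: /

Sitemap: https://yourdomain.com/sitemap.xml
"""

AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot",
          "Google-Extended", "CCBot", "Bytespider"]
TEST_URL = "https://yourdomain.com/aeo-guide"  # placeholder page


def crawler_access(robots_txt: str, agents, url: str) -> dict:
    """Map each user-agent to True (may fetch url) or False (blocked)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


access = crawler_access(ROBOTS_TXT, AGENTS, TEST_URL)
for agent, allowed in access.items():
    print(f"{agent:16} {'allow' if allowed else 'block'}")
```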

Then add an llms.txt at the root listing your canonical citable pages in markdown:

# SkynetLabs

> AI automation, n8n workflows, AEO/SEO, chatbot development. Bali-based, ships globally.

## Core pages
- [Services](https://yourdomain.com/services): What we build and how
- [Case studies](https://yourdomain.com/case-studies): Anonymized client outcomes
- [Glossary](https://yourdomain.com/glossary): 50 defined terms
- [FAQs](https://yourdomain.com/faqs): 30 founder questions answered

## Reference
- [AEO Field Guide](https://yourdomain.com/aeo-guide): How to get cited by LLMs
- [n8n vs Zapier](https://yourdomain.com/n8n-vs-zapier): Honest comparison

llms.txt is not enforced by every engine yet, but Claude, Perplexity, and a handful of newer answer engines have started honoring it. Adding one takes about 30 minutes; the upside is real for any site with more than 20 pages.
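If the page list already lives in a CMS or a config file, the llms.txt can be generated rather than hand-written. This is a minimal sketch that reproduces the layout shown above; `render_llms_txt` and the shortened page data are hypothetical.

```python
def render_llms_txt(site_name: str, summary: str, sections: dict) -> str:
    """Render an llms.txt body: title, blockquote summary, then link sections.

    sections maps a heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        for title, url, note in pages:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"


llms_txt = render_llms_txt(
    site_name="SkynetLabs",
    summary="AI automation, n8n workflows, AEO/SEO, chatbot development.",
    sections={
        "Core pages": [
            ("Services", "https://yourdomain.com/services", "What we build and how"),
            ("Glossary", "https://yourdomain.com/glossary", "50 defined terms"),
        ],
        "Reference": [
            ("AEO Field Guide", "https://yourdomain.com/aeo-guide",
             "How to get cited by LLMs"),
        ],
    },
)
print(llms_txt)
```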

5. Citation-worthy content patterns

After auditing hundreds of citations across the four major answer engines, we found seven content patterns that appear repeatedly. If you build pages in these shapes, you stack the deck in your favor.

How-to with numbered steps. "How to migrate from Zapier to n8n in 7 days" with 7 steps, each with a screenshot and a code block. Answer engines pluck individual steps as citations for procedural queries. Mark up with HowTo schema.
Glossary with defined terms. Alphabetical list of 30–100 terms with 2–3 sentence definitions. Each term gets its own anchor ID, each definition is self-contained. Use DefinedTermSet + DefinedTerm schema. Gets cited heavily for "what is X" queries.
Comparison tables. "n8n vs Zapier" with 15+ rows of clearly differentiated criteria. The table itself is what gets cited, often verbatim, because it's the densest answer to a comparative question.
Decision tree. "Should I use n8n or Zapier?" with 5–7 yes/no nodes that funnel to a recommendation. Decision trees rarely get pasted verbatim but the recommendations at the leaves get cited.
Named case study with metrics. Pattern: "How we cut [client name]'s no-show rate from [before] to [after] in [window]." Concrete name, concrete metric, concrete time window — and only ever numbers you can produce evidence for. Case studies without names read like fiction to a model and don't get cited.
Data table with sources. A table of pricing tiers, error rates, latency benchmarks, etc., with footnoted source links. Models love data tables because they're factual scaffolding for an answer.
Definition list with examples. Pattern: term, one-line definition, example, counter-example. Repeats well across 20–50 entries and is structurally easy for a model to slice into citations.

One pattern that gets ignored: the 2,500-word "Ultimate Guide" with 14 H2 sections and a recipe-style introduction. Models will index it but rarely cite from it because the citable claims are buried.
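For the glossary pattern, the per-term anchor IDs and the DefinedTermSet markup can come from the same source data. The sketch below is one possible shape, assuming a simple term-to-definition mapping; the slug rule and the `#glossary` identifier are choices made for the example.

```python
import json
import re


def slugify(term: str) -> str:
    """Turn a term into an anchor ID, e.g. 'llms.txt' becomes 'llms-txt'."""
    return re.sub(r"[^a-z0-9]+", "-", term.lower()).strip("-")


def glossary_jsonld(page_url: str, name: str, terms: dict) -> dict:
    """Build a DefinedTermSet whose terms point at their on-page anchors."""
    set_id = f"{page_url}#glossary"
    ordered = sorted(terms.items(), key=lambda item: item[0].lower())
    return {
        "@context": "https://schema.org",
        "@type": "DefinedTermSet",
        "@id": set_id,
        "name": name,
        "hasDefinedTerm": [
            {
                "@type": "DefinedTerm",
                "@id": f"{page_url}#{slugify(term)}",
                "name": term,
                "description": definition,
                "inDefinedTermSet": set_id,
            }
            for term, definition in ordered
        ],
    }


glossary = glossary_jsonld(
    page_url="https://yourdomain.com/glossary",  # placeholder domain
    name="AEO glossary",
    terms={
        "llms.txt": "A markdown file at the site root listing canonical citable pages.",
        "Answer Engine Optimization": "Structuring content so answer engines cite it.",
    },
)
print(json.dumps(glossary, indent=2))
```

Using the same slug for the HTML anchor and the `@id` keeps the markup and the visible page in step.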

6. Measuring AEO: what you can and can't track

The measurement problem is real: there is no Google Search Console for LLM citations. You cannot pull a definitive list of every place ChatGPT mentioned your brand last week. What you can do is approximate, triangulate, and trend.

What you can track:

Perplexity Pages — Perplexity exposes a public page for many of its searches; if your domain appears as a citation, you can find it via brand-name searches and watch frequency over time.
Brand mention queries — Manually query Claude, ChatGPT, Perplexity, and Gemini weekly with a fixed set of 10–20 questions from your subject area, then log whether your brand is cited. Spreadsheet, not magic.
Referrer logs — Look for chatgpt.com (and the older chat.openai.com), perplexity.ai, claude.ai, and gemini.google.com in your analytics referrer report. Volume is small but the trend is signal.
Brand search volume — A delayed indicator: if AEO is working, branded search ("skynetlabs n8n") rises in Google Search Console even when direct citations are hard to count.
Citation tracking tools — Tools like citelift.app (one we built — tracks brand mentions across the four major engines on a daily cron with a weekly markdown report) and a handful of competitors automate the manual-query process.
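The referrer-log check can be scripted against an exported list of referrer URLs. This sketch follows the host list in the referrer-log item above and includes chatgpt.com, the domain ChatGPT now serves from; the sample referrers are made up, and the function expects full URLs with a scheme.

```python
from collections import Counter
from urllib.parse import urlparse

ANSWER_ENGINE_HOSTS = {
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
}


def engine_for(referrer: str):
    """Return the engine name for a referrer URL, or None if it is not one."""
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return ANSWER_ENGINE_HOSTS.get(host)


def count_engine_referrals(referrers) -> Counter:
    """Tally visits per answer engine, ignoring every other referrer."""
    return Counter(name for name in map(engine_for, referrers) if name)


# Hypothetical rows from an analytics export.
SAMPLE = [
    "https://www.perplexity.ai/search?q=what+is+aeo",
    "https://chatgpt.com/",
    "https://chat.openai.com/",
    "https://www.google.com/",
    "",
]

print(count_engine_referrals(SAMPLE))
```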

What you can't yet track:

Total citations across all four engines for an arbitrary query universe (no engine publishes this)
Per-citation click-through (citations don't always link, and when they do, the click-through is collapsed with regular referral traffic)
The exact reason a model picked your source over a competitor's — no transparency into the ranking signals

The honest framing for clients is that AEO measurement in 2026 is where SEO measurement was in 2006: directional, manual, and dependent on the discipline to track weekly. The teams that win are the teams that bother.
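The weekly manual-query log reduces to a small aggregation. The sketch below assumes each check is recorded as a (week, engine, query, cited) row; the rows shown are invented to illustrate the shape, not real results.

```python
from collections import defaultdict


def citation_rates(log) -> dict:
    """Aggregate a manual query log into per-week, per-engine citation rates."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [cited, asked]
    for week, engine, _query, cited in log:
        cell = counts[week][engine]
        cell[0] += int(cited)
        cell[1] += 1
    return {
        week: {engine: hits / asked for engine, (hits, asked) in engines.items()}
        for week, engines in sorted(counts.items())
    }


# Hypothetical log entries; real ones come from the weekly spreadsheet.
LOG = [
    ("2026-W20", "Perplexity", "what is aeo", True),
    ("2026-W20", "Perplexity", "n8n vs zapier", False),
    ("2026-W20", "ChatGPT", "what is aeo", False),
    ("2026-W21", "Perplexity", "what is aeo", True),
    ("2026-W21", "Perplexity", "n8n vs zapier", True),
    ("2026-W21", "ChatGPT", "what is aeo", False),
]

rates = citation_rates(LOG)
for week, engines in rates.items():
    print(week, engines)
```

Keeping the query set fixed from week to week is what makes the rates comparable. Changing the questions resets the trend.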

7. 30/60/90-day AEO rollout for a service business

For a service business with an existing site (10–50 pages) that wants to start getting cited inside 90 days, this is the order we ship. Adjust by 1–2 weeks if your CMS is slow to publish changes; everything else stays the same.

Week 1 — Schema foundation: Audit existing JSON-LD across all pages. Add Organization + Person schema to /about. Add Article schema to every blog post. Add FAQPage schema where applicable. Validate every block in Google's Rich Results Test. Deploy.

Week 2 — robots.txt + llms.txt: Write a robots.txt that allows GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and CCBot. Block Bytespider unless targeting that market. Write an llms.txt listing 10–20 canonical pages with one-line descriptions. Deploy.

Weeks 3–4 — Glossary + FAQs: Build a 40–60-term glossary using DefinedTermSet schema. Build a 25–35-question FAQ page using FAQPage schema. Both should answer the actual questions clients send by email, not made-up ones.

Weeks 5–6 — Comparison + decision content: Identify the two or three "X vs Y" queries your prospects actually ask. Build a deep comparison page for each with a table of 15+ rows, 5+ real-world scenarios, and a decision tree. Use Article + Table schema.

Weeks 7–8 — Author authority: Build an /author/[your-name] page with full Person schema and sameAs links to LinkedIn, Twitter, GitHub, and any other public profiles. Add an author byline to every post, linked to the author page, backfilling older posts as needed.

Weeks 9–10 — Case studies: Publish 3–5 named case studies with concrete metrics and time windows. Anonymize where required but keep specifics (industry, size, problem) intact. Mark up with Article schema; include CreativeWork if applicable.

Weeks 11–12 — Distribution + measurement: Plant 5–10 authentic mentions in niche subreddits, Substack posts, GitHub READMEs of relevant tools, and Hacker News comment threads. Set up a weekly citation-tracking spreadsheet covering 15+ queries across the four major engines. First trend data appears around weeks 12–14.

This is sequential by design. Schema before content because schema makes the content findable. Content before authority because there's no point boosting authority of empty pages. Distribution last because it amplifies what already exists.
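To put the plan on a calendar, the week blocks can be mapped to dates from a chosen start day. This is a small sketch; the start date is hypothetical, and the phase names simply mirror the rollout above.

```python
from datetime import date, timedelta

# (first week, last week, phase), following the rollout order above.
ROLLOUT = [
    (1, 1, "Schema foundation"),
    (2, 2, "robots.txt + llms.txt"),
    (3, 4, "Glossary + FAQs"),
    (5, 6, "Comparison + decision content"),
    (7, 8, "Author authority"),
    (9, 10, "Case studies"),
    (11, 12, "Distribution + measurement"),
]


def schedule(start: date) -> list:
    """Return (phase, first_day, last_day) for each phase, counting from start."""
    plan = []
    for first_week, last_week, phase in ROLLOUT:
        begin = start + timedelta(weeks=first_week - 1)
        end = start + timedelta(weeks=last_week) - timedelta(days=1)
        plan.append((phase, begin, end))
    return plan


plan = schedule(date(2026, 6, 1))  # hypothetical kickoff date
for phase, begin, end in plan:
    print(f"{begin.isoformat()} to {end.isoformat()}  {phase}")
```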

8. AEO mistakes that kill citations

1. Burying the answer below 600 words of preamble. The recipe-blog format ("My grandmother used to say…" before the recipe) is poison. Answer in the first 80 words, expand below.

2. Anonymous content. "By the team" or no byline at all. Models discount unsigned sources. Real human, real page, real credentials.

3. Blocking LLM crawlers in robots.txt by default. Some agencies block GPTBot and ClaudeBot to "protect content", then wonder why nobody cites their site. Allow them.

4. Schema markup that doesn't match visible content. Generating FAQPage schema with questions that aren't actually on the page. Google now penalizes this and LLMs notice the inconsistency.

5. Stale dateModified. Last updated 2022 on a topic that moves quarterly. Refresh content and refresh the timestamp. Don't refresh the timestamp without changing content — engines catch that.

6. Generic claims with no numbers. "We scale well" instead of "n8n's queue mode plus Redis handles 10K webhooks per hour on a small Hetzner VPS." Numbers get cited; vibes don't.

7. Walled gardens. Putting your best content behind email-gate forms or paywalls that the crawler can't read. The citation is worth more than the email.

8. Keyword stuffing reborn as entity stuffing. Repeating "AI automation" 47 times in a 1,000-word post doesn't help. Cover the entity graph (related concepts) once and well.

9. No internal linking. Glossary terms that don't link to the FAQ that doesn't link to the case study that doesn't link to the service page. Build the graph.

10. Set-and-forget. Shipping the AEO stack once and never updating. Engines weight freshness. Quarterly refreshes are the price of admission.
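Mistake 4 in the list above, schema that doesn't match the visible content, is one of the few that can be caught automatically. The sketch below flags FAQPage questions whose text never appears on the page. The regex tag stripping is deliberately crude and is an assumption of the example; a production check would use a real HTML parser.

```python
import re


def _normalize(text: str) -> str:
    """Strip tags, lowercase, and collapse whitespace for a loose comparison."""
    return " ".join(re.sub(r"<[^>]+>", " ", text).lower().split())


def missing_faq_questions(faq_jsonld: dict, visible_html: str) -> list:
    """Return FAQPage questions that do not appear in the visible page text."""
    page_text = _normalize(visible_html)
    questions = [item.get("name", "") for item in faq_jsonld.get("mainEntity", [])]
    return [q for q in questions if _normalize(q) not in page_text]


# Hypothetical page and markup, for illustration only.
HTML = "<h2>What is AEO?</h2><p>Answer Engine Optimization.</p>"
FAQ = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {"@type": "Question", "name": "What is AEO?",
         "acceptedAnswer": {"@type": "Answer", "text": "Answer Engine Optimization."}},
        {"@type": "Question", "name": "How much does AEO cost?",
         "acceptedAnswer": {"@type": "Answer", "text": "It depends."}},
    ],
}

print(missing_faq_questions(FAQ, HTML))
```

Anything the function returns is a question that exists only in the markup and should either be added to the page or removed from the schema.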

Want SkynetLabs to ship your AEO stack?

Schema, content, authority, distribution, freshness — all five layers. 90-day rollout, fixed price, no retainer required.
