Search results don’t look the way they did two years ago. Google now opens with AI Overviews, ChatGPT and Claude pull live web results into their answers, Perplexity built an entire product around it, and Gemini sits one tap away inside every Google surface. The page is no longer the destination. The page is a source the model is reading on your behalf.
Two acronyms have shown up to describe the work of being visible inside those answers. AEO stands for Answer Engine Optimization, which is the work of being the source an “answer engine” uses when it returns a direct answer instead of a list of links. GEO stands for Generative Engine Optimization, which is the same idea framed around generative AI specifically: appearing inside answers a model writes from scratch using your page as a reference.
Google’s own AI optimization guide treats both as variations of regular SEO. From their perspective, “optimizing for generative AI search is optimizing for the search experience, and thus still SEO.” The ranking and quality systems that decide what shows up in a list of blue links are the same systems that decide what shows up inside an AI Overview. Improving for one improves the other.
One reason this matters: each AI surface pulls from a different web index, but most of those indexes are downstream of the same crawl, rendering, and quality work.
Your page
    │
    ▼
Search engine indexes (where the crawl lands)
┌───────────────────────────────────────────────────────────┐
│ Google index      ──▶ Google Search, AI Overviews, Gemini │
│ Bing index        ──▶ Bing, Microsoft Copilot             │
│ OpenAI index      ──▶ ChatGPT Search                      │
│ Anthropic + Brave ──▶ Claude web search                   │
│ Perplexity + Bing ──▶ Perplexity answers                  │
└───────────────────────────────────────────────────────────┘
    │
    ▼
The AI surface reads from the index, cites your page

The practical question is what you do to your site. The rest of this post walks through it, with sources for every recommendation.
Eligibility comes before everything else
Before any of the content work matters, the page has to be allowed to appear in AI features at all. Google’s guide is explicit: a page is only eligible for AI features if it’s eligible to appear as a regular search snippet (source). That means the URL needs to be indexed, the page needs to be crawlable in robots.txt, snippets need to be allowed (no nosnippet, no max-snippet:0), and the content has to load without requiring the crawler to execute heavy JavaScript first.
Open Google Search Console and run a URL inspection on a page you care about. The “Test live URL” view shows you what Google sees, including the rendered HTML after JavaScript has executed. If the article body is missing from that rendered HTML, fix it before doing anything else. Google’s JavaScript SEO basics covers the patterns that work and the ones that break crawling. Server rendering and static generation are the safest bets.
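A rough local approximation of that rendering check, as a sketch rather than a substitute for Search Console: take the HTML your server returns before any JavaScript runs and see whether a sentence from the article body is already in the visible text. The function name `body_present_without_js` and the sample markup are assumptions made up for illustration.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text that sits outside <script>, <style>, and <noscript>."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.chunks.append(data)


def body_present_without_js(raw_html: str, phrase: str) -> bool:
    """True if `phrase` appears in the visible text of the un-rendered HTML."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    # Collapse whitespace so line wraps in the markup don't hide a match.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return " ".join(phrase.split()).lower() in text


# Hypothetical responses: one server-rendered, one client-only shell.
server_rendered = "<article><h1>Title</h1><p>We migrated 240 routes last week.</p></article>"
spa_shell = '<div id="root"></div><script>window.post = "We migrated 240 routes last week."</script>'

print(body_present_without_js(server_rendered, "migrated 240 routes"))
print(body_present_without_js(spa_shell, "migrated 240 routes"))
```

Feed it the body of a plain `curl` response for the page. If the phrase only shows up after JavaScript runs, the check fails, which is the case worth investigating in the "Test live URL" view.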
A 30-second sanity check from your terminal, one per major AI crawler:
# Google Search (feeds AI Overviews and Gemini grounding)
curl -A "Googlebot/2.1 (+http://www.google.com/bot.html)" -I https://0xinsider.com/article

# Bing (feeds Microsoft Copilot)
curl -A "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" -I https://0xinsider.com/article

# OpenAI search indexer (ChatGPT Search)
curl -A "OAI-SearchBot/1.3; +https://openai.com/searchbot" -I https://0xinsider.com/article

# Anthropic search indexer (Claude web search)
curl -A "Claude-SearchBot/1.0 (+https://www.anthropic.com)" -I https://0xinsider.com/article
# Perplexity indexer
curl -A "PerplexityBot/1.0 (+https://perplexity.ai/perplexitybot)" -I https://0xinsider.com/article

A 200 OK from a spoofed user agent isn’t proof that the real crawler can reach the page. Firewalls and CDNs often judge a request by its source IP as well as its user agent, so a request sent from your machine can be treated differently from one sent by the real bot. The authoritative check is to verify crawler requests against published IP ranges or reverse-DNS records. Google documents its crawler verification process, and OpenAI, Anthropic, and Perplexity all publish IP ranges in their bot docs. Use the curl test to catch obvious blocks (a 403, 503, or login-page redirect that suggests Cloudflare’s bot-fight rule or a misconfigured WAF), then confirm against the official IP list for the bots that matter to you.
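For crawlers that support reverse-DNS verification, the check can be sketched as a forward-confirmed lookup: resolve the IP from your access log to a hostname, confirm the hostname belongs to the operator, then resolve the hostname back and confirm it returns the same IP. The function name and the injectable lookup parameters are my own, and the hostname suffixes you pass in are an assumption to confirm against each operator's documentation.

```python
import socket
from typing import Callable, Iterable


def _reverse(ip: str) -> str:
    """PTR lookup: IP address to hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(host: str) -> set:
    """Forward lookup: hostname to the set of IP addresses it resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def is_verified_crawler(
    ip: str,
    allowed_suffixes: Iterable[str],
    reverse: Callable[[str], str] = _reverse,
    forward: Callable[[str], set] = _forward,
) -> bool:
    """Forward-confirmed reverse DNS check for a single log entry."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # no PTR record, or the lookup failed
        return False
    # Require an exact domain or a real subdomain, so "fakegooglebot.com" fails.
    if not any(host == s or host.endswith("." + s) for s in allowed_suffixes):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False
```

Call it with an IP from your logs and the operator's suffixes, for example `is_verified_crawler(ip, ["googlebot.com", "google.com"])` for Googlebot (check Google's verification page for the current list). The `reverse` and `forward` parameters exist only so the logic can be exercised without touching the network.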
The full list of bots worth thinking about, with what they do and the canonical reference:
Bot user agent       Purpose                                   Reference
──────────────────────────────────────────────────────────────────────────
Googlebot            Google Search index                       developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers
Google-Extended      Gemini Apps + Vertex AI Grounding +       developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers
                     AI training (separate from Search)
Bingbot              Bing index (feeds Copilot)                bing.com/webmasters/help/which-crawlers-does-bing-use-8c184ec0
GPTBot/1.3           OpenAI training                           developers.openai.com/api/docs/bots
OAI-SearchBot/1.3    ChatGPT Search live index                 developers.openai.com/api/docs/bots
ChatGPT-User         ChatGPT user-initiated fetch              developers.openai.com/api/docs/bots
ClaudeBot            Anthropic training                        support.claude.com/en/articles/8896518
Claude-SearchBot     Anthropic search index (Claude search)    support.claude.com/en/articles/8896518
Claude-User          Claude user-initiated fetch               support.claude.com/en/articles/8896518
PerplexityBot        Perplexity search index                   docs.perplexity.ai/docs/resources/perplexity-crawlers
Perplexity-User      Perplexity user-initiated fetch           docs.perplexity.ai/docs/resources/perplexity-crawlers
Applebot-Extended    Apple Intelligence training               support.apple.com/en-us/119829
Meta-ExternalAgent   Meta AI training                          developers.facebook.com/docs/sharing/webmasters/web-crawlers
CCBot                Common Crawl (feeds many models)          commoncrawl.org/ccbot

A robots.txt you can copy that allows AI search visibility (the surfaces that cite your page) while opting out of training (the bots that scrape for model training data):
# Allow indexing for search and AI search surfaces
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: Claude-User
Allow: /

# Block AI training crawlers (does not affect Google Search inclusion)
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

User-agent: CCBot
Disallow: /

# Fallback
User-agent: *
Allow: /

Sitemap: https://0xinsider.com/sitemap.xml

The distinction matters. GPTBot and ClaudeBot are training crawlers, and blocking them does not affect search inclusion. Google-Extended is broader: it controls AI training and grounding inside Gemini Apps and Vertex AI Grounding, but does not affect Google Search ranking or AI Overview eligibility (Google source). The bots that determine whether your page can show up inside an AI answer are the search indexers: Googlebot, Bingbot, OAI-SearchBot, Claude-SearchBot, and PerplexityBot (OpenAI source, Anthropic source, Perplexity source). Many sites accidentally block one of those and tank their visibility.
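To catch an accidental block before it ships, you can run a robots.txt through Python's standard-library parser and list which bots may fetch a given path. This is a sketch: `audit_robots` and the two bot lists are my own names, and `urllib.robotparser` applies the first matching rule in file order, while Google uses the most specific matching rule, so treat the result as a smoke test, not a guarantee of how each crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# The search indexers and training crawlers named in this post.
SEARCH_BOTS = ["Googlebot", "Bingbot", "OAI-SearchBot", "Claude-SearchBot", "PerplexityBot"]
TRAINING_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended", "Applebot-Extended",
                 "Meta-ExternalAgent", "CCBot"]


def audit_robots(robots_txt: str, path: str = "/article") -> dict:
    """Map each bot name to True (may fetch `path`) or False (blocked)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in SEARCH_BOTS + TRAINING_BOTS}


def blocked_search_bots(robots_txt: str, path: str = "/article") -> list:
    """Search indexers that this robots.txt would keep away from `path`."""
    result = audit_robots(robots_txt, path)
    return [bot for bot in SEARCH_BOTS if not result[bot]]


# A deliberately broken example: the wildcard group blocks every bot
# that has no group of its own, including the search indexers.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""

print(blocked_search_bots(sample))
```

An empty list from `blocked_search_bots` is the outcome you want. Anything else names a search indexer that can't reach the page.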
Meta robots tags are the other lever, page-level rather than site-level:
<!-- Eligible for AI Overviews and AI search citations -->
<meta name="robots" content="index, follow">

<!-- Indexed, but excluded from snippets and AI Overviews -->
<meta name="robots" content="index, follow, nosnippet">

<!-- Indexed, but snippet capped too short for AI answers to use -->
<meta name="robots" content="index, follow, max-snippet:50">

To opt out of Google-Extended (Gemini Apps and Vertex AI Grounding), use the robots.txt product token shown earlier. Google does not document Google-Extended as a robots meta tag, only as a robots.txt token (source). The snippet directives above are documented in Google’s meta tags reference.
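The three cases above can be told apart mechanically from the `content` attribute. A small sketch, with a function name and return strings of my own choosing, that reads the directives the way Google's meta tags reference describes them (`max-snippet:0` behaves like `nosnippet`, `max-snippet:-1` means no limit, `none` means `noindex, nofollow`):

```python
def snippet_status(content: str) -> str:
    """Classify a robots meta `content` value by its effect on snippets."""
    directives = [d.strip().lower() for d in content.split(",") if d.strip()]
    if "noindex" in directives or "none" in directives:
        return "not indexed"
    if "nosnippet" in directives:
        return "indexed, no snippet"
    for directive in directives:
        if directive.startswith("max-snippet:"):
            try:
                limit = int(directive.split(":", 1)[1].strip())
            except ValueError:
                continue  # malformed value: ignore it rather than guess
            if limit == 0:
                return "indexed, no snippet"
            if limit > 0:
                return f"indexed, snippet capped at {limit} characters"
    return "indexed, snippet allowed"


for value in ("index, follow", "index, follow, nosnippet", "index, follow, max-snippet:50"):
    print(value, "->", snippet_status(value))
```

It only looks at the generic `robots` tag. Bot-specific tags such as `<meta name="googlebot" ...>` and the `X-Robots-Tag` HTTP header can carry the same directives and would need the same treatment.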
Page exists
    │
    ▼
robots.txt allows crawl?          ── no ──▶  invisible
    ▼
page renders without JS errors?   ── no ──▶  invisible
    ▼
indexed in Search Console?        ── no ──▶  invisible
    ▼
snippet allowed (no nosnippet)?   ── no ──▶  regular search only
    ▼
quality + originality signals?    ── weak ─▶ ranked, rarely cited
    │
    ▼
Eligible to appear in AI Overviews and related surfaces

Every layer is a gate. The fancier optimization work only matters once all the gates are open.
What gets cited is what a model can’t write from training data alone
Generative search rewards specificity. Models can summarize generic information without quoting anyone, so the pages that get cited are the ones that say something the model can’t synthesize on its own. Google’s guide tells creators to focus on “unique, valuable, people-first content” rather than commodity content that re-states what every other page on the topic already says (source). The deeper version of this advice lives in Google’s helpful content guidance, which goes into how to demonstrate firsthand experience, real expertise, and original perspective.
Here are two real versions of the same paragraph for an article about migrating to Next.js 16. Same topic, wildly different odds of being cited:
Commodity version:
"Next.js 16 introduces async params, making route parameters
asynchronous. This is a breaking change you should plan for
when upgrading from Next.js 15. Make sure to await your params
in dynamic routes."

Distinctive version:
"We migrated a 240-route Next.js 15 app to 16 last week. The
async params change broke 47 pages in CI on the first run.
The mechanical fix: wrap every `params.slug` access in
`await params`. The catch we hit: dynamic API routes that
destructure params in the function signature need the
signature itself marked async, not just the body. Took
3 hours, almost all of it search-replace."

A model can produce the commodity version from training data alone, so it won’t cite the source. There’s nothing in there it couldn’t write itself. The distinctive version has a number (47 broken pages), a specific catch (the function signature subtlety), and a time estimate (3 hours), none of which the model can generate without quoting the source. Even one of those details is often enough to flip a page from “training data summary” to “cited reference”.
What the model sees about your topic
    │
    ├──▶ Commodity content
    │    "Same overview 50 other pages have"
    │        │
    │        ▼
    │    Model synthesizes from training data
    │        │
    │        ▼
    │    Not cited
    │
    └──▶ Distinctive content
         "Specific data, screenshot, opinion,
          result you tested in production"
             │
             ▼
         Model can't synthesize, must quote
             │
             ▼
         Cited in the answer

Clean technical structure helps the crawler and the model
Semantic HTML matters. Use real heading levels in a sensible hierarchy, put the answer to the question the page is about near the top, and avoid burying content under preamble. A real before/after on the same blog post:
<!-- Bad: divs and classes carry no semantic weight -->
<div class="title">How to migrate to Next.js 16</div>
<div class="subtitle">A practical guide</div>
<div class="body">We migrated 240 routes last week...</div>

<!-- Good: explicit semantics the crawler and model understand -->
<article>
  <h1>How to migrate to Next.js 16</h1>
  <p class="lede">A practical guide to async params, Turbopack defaults, and the gotchas we hit.</p>
  <section>
    <h2>Async params, in practice</h2>
    <p>We migrated 240 routes last week...</p>
  </section>
</article>

The second version gives the crawler clear structure (article, h1, section, h2) and the model clean boundaries for what’s heading, lede, and body.
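A quick way to spot the first pattern across a site is to pull out the heading outline and flag pages where it is missing or skips levels. The sketch below uses only the standard library. The two rules it applies (exactly one `<h1>`, no jump of more than one level) are a working convention of mine, not requirements Google documents.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Records the level of every <h1>..<h6> in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def outline_problems(html: str) -> list:
    """Return human-readable problems with the page's heading hierarchy."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    levels = parser.levels
    problems = []
    if levels.count(1) != 1:
        problems.append(f"expected exactly one <h1>, found {levels.count(1)}")
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            problems.append(f"heading jumps from h{previous} to h{current}")
    return problems


div_soup = '<div class="title">How to migrate to Next.js 16</div><div class="body">...</div>'
semantic = "<article><h1>How to migrate to Next.js 16</h1><section><h2>Async params</h2></section></article>"

print(outline_problems(div_soup))
print(outline_problems(semantic))
```

The div-only markup reports a missing `<h1>`, and the semantic version reports nothing.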
Google’s documentation on page experience explains how Core Web Vitals feed into ranking, which feeds directly into AI feature eligibility. The thresholds Google publishes (source):
Metric                            Good        Poor
─────────────────────────────────────────────────────
LCP   Largest Contentful Paint    ≤ 2.5s      > 4.0s
INP   Interaction to Next Paint   ≤ 200ms     > 500ms
CLS   Cumulative Layout Shift     ≤ 0.1       > 0.25

The numbers ranking algorithms look at are the 28-day field data from real Chrome users (CrUX), not a Lighthouse run on your laptop. Read them from web-vitals in JavaScript to align local testing with what Google’s systems see:
import { onLCP, onINP, onCLS } from 'web-vitals';

onLCP(metric => console.log('LCP', metric.value, metric.rating));
onINP(metric => console.log('INP', metric.value, metric.rating));
onCLS(metric => console.log('CLS', metric.value, metric.rating));

The AI optimization guide also pushes back on several “optimization hacks” circulating online. Adding an llms.txt file is not a ranking signal and isn’t used by Google’s AI features (source). Chunking content into tiny sections or rewriting every heading as a question is unnecessary, because models read context across the whole page. The guide also says structured data is useful where it powers a documented rich result, but it isn’t required for AI feature visibility. Spend the time on real content quality and rendering instead.
What the crawler fetches
─────────────────────────────────────────────────────────
Server-rendered HTML            Client-only SPA shell
─────────────────────           ─────────────────────
<h1>Title</h1>                  <div id="root"></div>
<p>Real content...</p>          <script src="app.js">
<h2>Section</h2>                (renders later)
        │                               │
        ▼                               ▼
Crawler reads it now            Crawler runs JS, may stall
        │                               │
        ▼                               ▼
Indexed, AI-eligible            May never reach the content

Visuals, schema, and commerce data are the structured pipelines
AI Overviews pull images and video directly when they’re high quality. Real screenshots, real diagrams, and short video walkthroughs are more useful than stock photos. Apply the same image SEO basics Google has always recommended in its image best practices: descriptive alt text, meaningful filenames, and captions where they help the reader. Google’s video best practices cover the equivalent for video.
A real alt-text before/after for an article about Next.js performance:
<!-- Bad: alt tells the crawler nothing -->
<img src="chart.png" alt="chart">
<img src="screenshot.png" alt="">

<!-- Good: alt describes what the image conveys -->
<img src="chart.png" alt="Next.js 16 vs 15 build time: 4.2s vs 6.8s on a 240-route app">
<img src="screenshot.png" alt="Search Console URL inspection showing the page is indexed">

The second pair is the kind that gets pulled into an AI Overview’s image carousel, because the alt text is descriptive enough for the model to understand what the image proves.
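To find the first kind of alt text at scale, a small audit can walk a page's `<img>` tags and flag the ones worth rewriting. This is a sketch under my own assumptions: the list of generic words and the three-word minimum are arbitrary thresholds, and an empty `alt` is legitimate for purely decorative images, so treat every finding as something to review rather than an error.

```python
from html.parser import HTMLParser

# Assumed list of alt values that describe nothing; extend it for your site.
GENERIC_ALTS = {"chart", "image", "screenshot", "photo", "picture", "graph"}


class ImgAltAudit(HTMLParser):
    """Collects (src, problem) pairs for images with weak alt text."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or "(no src)"
        alt = (attributes.get("alt") or "").strip()
        if not alt:
            self.findings.append((src, "missing or empty alt"))
        elif alt.lower() in GENERIC_ALTS or len(alt.split()) < 3:
            self.findings.append((src, "alt too generic"))


def audit_alt_text(html: str) -> list:
    parser = ImgAltAudit()
    parser.feed(html)
    parser.close()
    return parser.findings


bad = '<img src="chart.png" alt="chart"><img src="screenshot.png" alt="">'
good = ('<img src="chart.png" alt="Next.js 16 vs 15 build time: 4.2s vs 6.8s on a 240-route app">'
        '<img src="screenshot.png" alt="Search Console URL inspection showing the page is indexed">')

print(audit_alt_text(bad))
print(audit_alt_text(good))
```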
Structured data is worth adding where it powers a specific rich result. Recipe schema, product schema, FAQ schema, event schema, and Article schema all have documented effects in regular search and feed into the same understanding layer the AI features use. Google’s rich results gallery lists every supported type. A working Article schema for a blog post:
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "How to migrate to Next.js 16",
  "datePublished": "2026-05-17",
  "dateModified": "2026-05-17",
  "author": {
    "@type": "Person",
    "name": "Trevor Lasn",
    "url": "https://www.trevorlasn.com"
  },
  "publisher": {
    "@type": "Organization",
    "name": "trevorlasn.com"
  },
  "image": "https://www.trevorlasn.com/article-cover.webp",
  "description": "A practical guide to async params and Turbopack defaults."
}
</script>

Test it inside Google’s Rich Results Test before deploying. The tool will tell you exactly which required fields are missing or malformed.
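If your posts are built from front matter or a CMS record, generating the block is less error-prone than hand-editing JSON per article. A minimal sketch: `article_jsonld` and the keys of the `meta` dict are names I made up, and the field set simply mirrors the example above rather than any official list of required properties.

```python
import json


def article_jsonld(meta: dict) -> str:
    """Render an Article JSON-LD <script> tag from post metadata."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": meta["headline"],
        "datePublished": meta["published"],
        # Fall back to the publish date when the post was never updated.
        "dateModified": meta.get("modified", meta["published"]),
        "author": {"@type": "Person", "name": meta["author"], "url": meta["author_url"]},
        "publisher": {"@type": "Organization", "name": meta["publisher"]},
        "image": meta["image"],
        "description": meta["description"],
    }
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # "\/" is a valid JSON escape; this keeps a "</script>" inside a string
    # value from closing the tag early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


post = {
    "headline": "How to migrate to Next.js 16",
    "published": "2026-05-17",
    "author": "Trevor Lasn",
    "author_url": "https://www.trevorlasn.com",
    "publisher": "trevorlasn.com",
    "image": "https://www.trevorlasn.com/article-cover.webp",
    "description": "A practical guide to async params and Turbopack defaults.",
}

print(article_jsonld(post))
```

The output still needs a pass through the Rich Results Test, since generating valid JSON says nothing about whether Google accepts the values.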
If you run a local business or sell products, two unrelated surfaces matter more than schema. A verified Google Business Profile feeds local AI answers with your hours, location, services, and reviews. A Merchant Center feed is what AI Overviews pull product information from. The AI optimization guide names both explicitly as the primary inputs for business and commerce results (source).
Type of result          Source feed                         Where it shows
────────────────────────────────────────────────────────────────────
Local business     ◀──  Google Business Profile       ──▶   Maps, local panel,
                        (hours, location, reviews)          local AI answers

Products           ◀──  Merchant Center feed          ──▶   Shopping cards,
                        (price, stock, variants)            product AI answers

Recipes, FAQs,     ◀──  Schema.org JSON-LD            ──▶   Rich results,
events, articles        (on-page structured data)           AI understanding

Agentic experiences are the next surface
The newer wrinkle is autonomous agents browsing on the user’s behalf (Claude with computer use, ChatGPT Operator, Perplexity’s assistant). Google’s AI optimization guide recommends sites consider how agents interpret their DOM, controls, and content (source). Sites with confusing markup, hidden controls, or essential information rendered only as images are hard for agents to operate. The accessibility work you’d already do for screen readers covers most of the same ground.
A real before/after of an interactive control on a booking page:
<!-- Agent-hostile: div pretending to be a button, no label -->
<div class="btn-primary" onclick="submitBooking()">
  <svg viewBox="0 0 24 24"><!-- check icon --></svg>
</div>

<!-- Agent-friendly: real button, explicit label, real semantics -->
<button type="submit" aria-label="Confirm booking for 7:00 PM on May 17">
  <svg viewBox="0 0 24 24" aria-hidden="true">
    <!-- check icon -->
  </svg>
  Confirm booking
</button>

The second version tells an agent three things: it’s a submit button, the action is “Confirm booking”, and the icon is decorative. The first version tells it nothing. An agent that can’t identify the booking confirmation gives up and picks a site it can operate.
Form fields work the same way. An agent reads name, id, aria-label, and the surrounding <label> element:
<!-- Agent-hostile: placeholder is the only hint, no semantic link -->
<input type="text" placeholder="When?">

<!-- Agent-friendly: explicit label, real input type, real name -->
<label for="reservation-time">Reservation time</label>
<input type="datetime-local" id="reservation-time" name="reservation_time" required>

Switching to type="datetime-local" is a tiny change that gives both browsers and agents a native datetime picker with structured value handling. No agent has to guess what format you want.
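Both agent-hostile patterns, the clickable `<div>` and the unlabeled input, are easy to detect in static markup. The sketch below is a rough lint with names of my own choosing, not an accessibility audit: it only sees `onclick` attributes written in the HTML (not listeners attached from JavaScript) and only recognizes labels linked with `for`, `aria-label`, or `aria-labelledby`, so an input wrapped inside its `<label>` would be reported as a false positive.

```python
from html.parser import HTMLParser


class ControlAudit(HTMLParser):
    """Collects clickable non-buttons, form inputs, and <label for> targets."""

    def __init__(self):
        super().__init__()
        self.label_targets = set()
        self.inputs = []
        self.fake_buttons = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "label" and a.get("for"):
            self.label_targets.add(a["for"])
        elif tag == "input" and a.get("type") not in ("hidden", "submit", "button"):
            self.inputs.append(a)
        elif tag in ("div", "span") and "onclick" in a and a.get("role") != "button":
            self.fake_buttons.append(a.get("class") or "(no class)")


def audit_controls(html: str) -> list:
    parser = ControlAudit()
    parser.feed(html)
    parser.close()
    problems = [f"clickable element without button semantics: {name}"
                for name in parser.fake_buttons]
    # Labels are matched after parsing, so a <label> may come after its input.
    for a in parser.inputs:
        labelled = (a.get("id") in parser.label_targets
                    or a.get("aria-label") or a.get("aria-labelledby"))
        if not labelled:
            field = a.get("name") or a.get("id") or a.get("placeholder") or "?"
            problems.append(f"input without a label: {field}")
    return problems


hostile = ('<div class="btn-primary" onclick="submitBooking()"></div>'
           '<input type="text" placeholder="When?">')
friendly = ('<button type="submit">Confirm booking</button>'
            '<label for="reservation-time">Reservation time</label>'
            '<input type="datetime-local" id="reservation-time" name="reservation_time" required>')

print(audit_controls(hostile))
print(audit_controls(friendly))
```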
User intent: "Book me a table for 7pm tonight"
                    │
                    ▼
          Agent opens your site
                    │
                    ▼
┌───────────────────────────────────────────────┐
│ Can it find the booking widget?               │
│ Can it read the available time slots?         │
│ Are the buttons labeled, not just icons?      │
│ Does the form submit without a 3s JS stall?   │
└───────────────────────────────────────────────┘
                    │
         ┌──────────┴──────────┐
         ▼                     ▼
Task completes on       Agent gives up,
your site               picks a competitor

Measure what you can, and don’t chase what you can’t
Search Console is still the source of truth for Google-side data. AI Overviews and AI Mode traffic is rolled into the standard Web performance report (Google source), so impressions and clicks for the pages you care about are the right place to look. Bing Webmaster Tools provides the equivalent for Bing and Copilot.
One inference you can draw, carefully: filter Performance by Query containing a conversational starter (how, what, why, is, can). These long-tail queries are the kind AI Overviews trigger on, and a noticeable shift in impressions vs clicks on those queries is consistent with the page being summarized inside an AI answer rather than visited. It is not proof. Layout changes, ranking shifts, query mix changes, and seasonality can all produce similar patterns. Use it as a hypothesis to investigate, not a verdict.
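If you export the Performance report, the same filter can be applied offline. A small sketch, assuming you have already reduced the export to `(query, clicks, impressions)` rows. The rows below are invented for illustration, and the pattern matches queries that start with one of the listed words, which is narrower than a "contains" filter.

```python
import re

# The conversational starters mentioned above, anchored to the start of the query.
CONVERSATIONAL = re.compile(r"^(how|what|why|is|can)\b", re.IGNORECASE)


def conversational_ctr(rows) -> float:
    """Click-through rate across queries that open with a conversational starter."""
    clicks = impressions = 0
    for query, query_clicks, query_impressions in rows:
        if CONVERSATIONAL.match(query.strip()):
            clicks += query_clicks
            impressions += query_impressions
    return clicks / impressions if impressions else 0.0


# Hypothetical rows, not real Search Console data.
rows = [
    ("how to migrate to next.js 16", 10, 200),
    ("next.js 16 release date", 50, 500),
    ("why async params", 0, 50),
    ("island hopping guide", 5, 40),  # starts with "is" but is not a match
]

print(conversational_ctr(rows))
```

Compare the figure for two date ranges. A drop is a prompt to look closer, for the reasons listed above, and not evidence of an AI Overview on its own.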
A direct way to test whether models cite you: open each surface and ask a question your content should answer. Concrete, copy-paste tests:
ChatGPT (Search mode): "What's a practical way to migrate a Next.js 15 app to Next.js 16?"
Claude (with web search): "Find me a recent first-hand account of migrating to Next.js 16 on a large app. I want specifics, not generic advice."
Perplexity: "Real-world Next.js 16 migration: what broke, how it was fixed."
Gemini (or Google with AI Overview): "How do you handle async params when migrating to Next.js 16?"

If your domain shows up in the inline source list or the answer cites it, you’re being retrieved. Repeat across the major surfaces every few weeks for the topics that matter to your business. Track the count of cite-events the same way you’d track backlinks.
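Counting cite events by hand gets tedious after a few rounds. One low-tech approach, sketched here with made-up names and example URLs: paste the source links from each answer into a list and let a helper pick out the ones on your domain, treating `www.` and other subdomains as the same site.

```python
from urllib.parse import urlparse


def _host(url: str) -> str:
    """Lowercased hostname with a leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def cited_urls(source_urls, domain: str) -> list:
    """Return the source URLs that belong to `domain` or one of its subdomains."""
    target = domain.lower()
    if target.startswith("www."):
        target = target[4:]
    return [url for url in source_urls
            if _host(url) == target or _host(url).endswith("." + target)]


# Hypothetical source list copied from one answer.
sources = [
    "https://www.trevorlasn.com/blog/nextjs-16-migration",
    "https://nextjs.org/docs/app/guides/upgrading",
    "https://nottrevorlasn.com/",
]

print(cited_urls(sources, "trevorlasn.com"))
```

Log the length of that list per surface and per prompt each time you repeat the test, and you have a cite-event count you can chart over time.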
What to track         Where it lives               Signal it gives
────────────────────────────────────────────────────────────────────
Impressions      ◀──  Search Console          ──▶  Visibility growth
Clicks           ◀──  Search Console          ──▶  Selection rate
Conversions      ◀──  Your own analytics      ──▶  Business outcome
Cite events      ◀──  Ask ChatGPT / Claude    ──▶  Whether models cite

What to skip: "AI Overview rank trackers"          no reliable public methodology yet

Doing the work above covers everything Google’s AI optimization guide recommends and everything the other AI search surfaces reward. AEO and GEO aren’t separate disciplines from SEO. They’re the same work, applied with sharper attention to content originality, rendering, and the structured pipelines that feed every AI surface on the web.