AEO and GEO for AI Overviews, ChatGPT, Claude, Gemini, and Perplexity

What Answer Engine Optimization and Generative Engine Optimization mean, and how to get your site cited by AI Overviews, ChatGPT, Claude, Perplexity, and Gemini.

Trevor I. Lasn
· 13 min read
Building 0xinsider.com — see who's winning across prediction markets (Polymarket, Kalshi, and more) — and what they're trading right now.

Search results don’t look the way they did two years ago. Google now opens with AI Overviews, ChatGPT and Claude pull live web results into their answers, Perplexity built an entire product around it, and Gemini sits one tap away inside every Google surface. The page is no longer the destination. The page is a source the model is reading on your behalf.

Two acronyms have shown up to describe the work of being visible inside those answers. AEO stands for Answer Engine Optimization, which is the work of being the source an “answer engine” uses when it returns a direct answer instead of a list of links. GEO stands for Generative Engine Optimization, which is the same idea framed around generative AI specifically: appearing inside answers a model writes from scratch using your page as a reference.

Google’s own AI optimization guide treats both as variations of regular SEO. From their perspective, “optimizing for generative AI search is optimizing for the search experience, and thus still SEO.” The ranking and quality systems that decide what shows up in a list of blue links are the same systems that decide what shows up inside an AI Overview. Improving for one improves the other.

One reason this matters: each AI surface pulls from a different web index, but most of those indexes are downstream of the same crawl, rendering, and quality work.

Your page
    │
    ▼
Search engine indexes (where the crawl lands)
┌────────────────────────────────────────────────────────────┐
│ Google index       ──▶ Google Search, AI Overviews, Gemini │
│ Bing index         ──▶ Bing, Microsoft Copilot             │
│ OpenAI index       ──▶ ChatGPT Search                      │
│ Anthropic + Brave  ──▶ Claude web search                   │
│ Perplexity + Bing  ──▶ Perplexity answers                  │
└────────────────────────────────────────────────────────────┘
    │
    ▼
The AI surface reads from the index, cites your page

The practical question is what you do to your site. The rest of this post walks through it, with sources for every recommendation.

Eligibility comes before everything else

Before any of the content work matters, the page has to be allowed to appear in AI features at all. Google’s guide is explicit: a page is only eligible for AI features if it’s eligible to appear as a regular search snippet (source). That means the URL needs to be indexed, the page needs to be crawlable in robots.txt, snippets need to be allowed (no nosnippet, no max-snippet:0), and the content has to load without requiring the crawler to execute heavy JavaScript first.

Open Google Search Console and run a URL inspection on a page you care about. The “Test live URL” view shows you what Google sees, including the rendered HTML after JavaScript has executed. If the article body is missing from that rendered HTML, fix it before doing anything else. Google’s JavaScript SEO basics covers the patterns that work and the ones that break crawling. Server rendering and static generation are the safest bets.

A 30-second sanity check from your terminal, one per major AI crawler:

# Google Search (feeds AI Overviews and Gemini grounding)
curl -A "Googlebot/2.1 (+http://www.google.com/bot.html)" -I https://0xinsider.com/article
# Bing (feeds Microsoft Copilot)
curl -A "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" -I https://0xinsider.com/article
# OpenAI search indexer (ChatGPT Search)
curl -A "OAI-SearchBot/1.3; +https://openai.com/searchbot" -I https://0xinsider.com/article
# Anthropic search indexer (Claude web search)
curl -A "Claude-SearchBot/1.0 (+https://www.anthropic.com)" -I https://0xinsider.com/article
# Perplexity indexer
curl -A "PerplexityBot/1.0 (+https://perplexity.ai/perplexitybot)" -I https://0xinsider.com/article

A 200 OK from a spoofed user agent isn’t proof that the real crawler can reach the page. Firewalls and CDNs often identify bots by IP address rather than by the user-agent string, so a spoofed request and the genuine crawler can get different responses. The only authoritative check is to verify real crawler requests against published IP ranges or reverse-DNS records. Google documents its crawler verification process, and OpenAI, Anthropic, and Perplexity all publish IP ranges in their bot docs. Use the curl test to catch obvious blocks (a 403, 503, or login-page redirect that suggests Cloudflare’s bot-fight rule or a misconfigured WAF), then confirm against the official IP list for the bots that matter to you.
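The IP-range half of that check is easy to script against your own access logs. What follows is a minimal sketch, assuming IPv4 only and assuming you have already copied the operator's published CIDR blocks into an array. The helper names and the `exampleRanges` values are hypothetical; the ranges are RFC 5737 documentation addresses, not real crawler ranges.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  const parts = ip.split(".").map(Number);
  const valid =
    parts.length === 4 &&
    parts.every((p) => Number.isInteger(p) && p >= 0 && p <= 255);
  if (!valid) throw new Error(`Not an IPv4 address: ${ip}`);
  return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

// True when `ip` falls inside a CIDR block such as "192.0.2.0/24".
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  // A shift by 32 wraps around in JavaScript, so /0 is handled separately.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// True when the request IP belongs to any of the published ranges.
function isFromPublishedRange(ip, ranges) {
  return ranges.some((cidr) => inCidr(ip, cidr));
}

// Placeholder ranges (RFC 5737 documentation blocks), NOT real crawler ranges.
const exampleRanges = ["192.0.2.0/24", "198.51.100.0/25"];

console.log(isFromPublishedRange("192.0.2.77", exampleRanges));
console.log(isFromPublishedRange("203.0.113.5", exampleRanges));
```

Run it over the IPs that claim a crawler user agent in your logs. Anything outside the published ranges is a spoofed request, not the real bot.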

The full list of bots worth thinking about, with what they do and the canonical reference:

Bot user agent      Purpose                                 Reference
──────────────────────────────────────────────────────────────────────────
Googlebot           Google Search index                     developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers
Google-Extended     Gemini Apps + Vertex AI Grounding +     developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers
                    AI training (separate from Search)
Bingbot             Bing index (feeds Copilot)              bing.com/webmasters/help/which-crawlers-does-bing-use-8c184ec0
GPTBot/1.3          OpenAI training                         developers.openai.com/api/docs/bots
OAI-SearchBot/1.3   ChatGPT Search live index               developers.openai.com/api/docs/bots
ChatGPT-User        ChatGPT user-initiated fetch            developers.openai.com/api/docs/bots
ClaudeBot           Anthropic training                      support.claude.com/en/articles/8896518
Claude-SearchBot    Anthropic search index (Claude search)  support.claude.com/en/articles/8896518
Claude-User         Claude user-initiated fetch             support.claude.com/en/articles/8896518
PerplexityBot       Perplexity search index                 docs.perplexity.ai/docs/resources/perplexity-crawlers
Perplexity-User     Perplexity user-initiated fetch         docs.perplexity.ai/docs/resources/perplexity-crawlers
Applebot-Extended   Apple Intelligence training             support.apple.com/en-us/119829
Meta-ExternalAgent  Meta AI training                        developers.facebook.com/docs/sharing/webmasters/web-crawlers
CCBot               Common Crawl (feeds many models)        commoncrawl.org/ccbot

A robots.txt you can copy that allows AI search visibility (the surfaces that cite your page) while opting out of training (the bots that scrape for model training data):

# Allow indexing for search and AI search surfaces
User-agent: Googlebot
Allow: /
User-agent: Bingbot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Perplexity-User
Allow: /
User-agent: Claude-SearchBot
Allow: /
User-agent: Claude-User
Allow: /
# Block AI training crawlers (does not affect Google Search inclusion)
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Applebot-Extended
Disallow: /
User-agent: Meta-ExternalAgent
Disallow: /
User-agent: CCBot
Disallow: /
# Fallback
User-agent: *
Allow: /
Sitemap: https://0xinsider.com/sitemap.xml

The distinction matters. GPTBot and ClaudeBot are training crawlers, and blocking them does not affect search inclusion. Google-Extended is broader: it controls AI training and grounding inside Gemini Apps and Vertex AI Grounding, but does not affect Google Search ranking or AI Overview eligibility (Google source). The bots that determine whether your page can show up inside an AI answer are the search indexers: Googlebot, Bingbot, OAI-SearchBot, Claude-SearchBot, and PerplexityBot (OpenAI source, Anthropic source, Perplexity source). Many sites accidentally block one of those and tank their visibility.
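One way to catch an accidental block is to run your robots.txt through a small checker before deploying it. The sketch below is a simplified reading of the robots.txt rules, not a full parser: it matches user-agent tokens exactly, ignores `*` and `$` wildcards in paths, and picks the longest matching rule. `parseRobots`, `isAllowed`, and `sampleRobots` are hypothetical names for illustration.

```javascript
// Parse robots.txt into groups: { agents: [...], rules: [{ allow, path }] }.
function parseRobots(text) {
  const groups = [];
  let current = null;
  let lastWasAgent = false;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim();
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      // Consecutive User-agent lines share one group of rules.
      if (!lastWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if ((field === "allow" || field === "disallow") && current) {
      current.rules.push({ allow: field === "allow", path: value });
      lastWasAgent = false;
    }
    // Other fields (Sitemap, Crawl-delay) are ignored in this sketch.
  }
  return groups;
}

// True when `userAgent` may fetch `path`. Falls back to the "*" group.
function isAllowed(robotsText, userAgent, path = "/") {
  const groups = parseRobots(robotsText);
  const token = userAgent.toLowerCase();
  const specific = groups.filter((g) => g.agents.includes(token));
  const chosen = specific.length > 0 ? specific : groups.filter((g) => g.agents.includes("*"));
  let best = { allow: true, path: "" }; // no matching rule means allowed
  for (const rule of chosen.flatMap((g) => g.rules)) {
    if (rule.path === "" || !path.startsWith(rule.path)) continue;
    const longer = rule.path.length > best.path.length;
    const tie = rule.path.length === best.path.length && rule.allow;
    if (longer || tie) best = rule;
  }
  return best.allow;
}

const sampleRobots = `
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
User-agent: Claude-SearchBot
Allow: /

User-agent: *
Allow: /
`;

for (const bot of ["Googlebot", "Bingbot", "OAI-SearchBot", "Claude-SearchBot", "PerplexityBot", "GPTBot"]) {
  console.log(bot, isAllowed(sampleRobots, bot) ? "allowed" : "blocked");
}
```

If any of the five search indexers prints "blocked", fix that before anything else in this post.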

Meta robots tags are the other lever, page-level rather than site-level:

<!-- Eligible for AI Overviews and AI search citations -->
<meta name="robots" content="index, follow">
<!-- Indexed, but excluded from snippets and AI Overviews -->
<meta name="robots" content="index, follow, nosnippet">
<!-- Indexed, but snippet capped too short for AI answers to use -->
<meta name="robots" content="index, follow, max-snippet:50">

To opt out of Google-Extended (Gemini Apps and Vertex AI Grounding), use the robots.txt product token shown earlier. Google does not document Google-Extended as a robots meta tag, only as a robots.txt token (source). The snippet directives above are documented in Google’s meta tags reference.
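The snippet directives are simple enough to lint in a build step. The helper below is a sketch that reads the `content` value of a robots meta tag and reports whether snippets are still allowed. `snippetEligibility` is a hypothetical name, and it only covers the directives discussed here (`noindex`, `none`, `nosnippet`, `max-snippet`).

```javascript
// Interpret the content attribute of <meta name="robots">.
function snippetEligibility(content) {
  const directives = content
    .toLowerCase()
    .split(",")
    .map((d) => d.trim())
    .filter(Boolean);
  const has = (name) => directives.includes(name);

  // max-snippet:N caps snippet length; 0 disables snippets entirely.
  const maxSnippet = directives.find((d) => d.startsWith("max-snippet:"));
  const limit = maxSnippet === undefined ? null : Number(maxSnippet.split(":")[1]);

  if (has("noindex") || has("none")) {
    return { indexable: false, snippetAllowed: false, limit };
  }
  if (has("nosnippet") || limit === 0) {
    return { indexable: true, snippetAllowed: false, limit };
  }
  return { indexable: true, snippetAllowed: true, limit };
}

console.log(snippetEligibility("index, follow"));
console.log(snippetEligibility("index, follow, nosnippet"));
console.log(snippetEligibility("index, follow, max-snippet:50"));
```

A non-null `limit` on a page you want cited is worth a second look, even when `snippetAllowed` is true.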

Page exists
    │
    ▼ robots.txt allows crawl?         ── no ──▶ invisible
    │
    ▼ page renders without JS errors?  ── no ──▶ invisible
    │
    ▼ indexed in Search Console?       ── no ──▶ invisible
    │
    ▼ snippet allowed (no nosnippet)?  ── no ──▶ regular search only
    │
    ▼ quality + originality signals?   ── weak ─▶ ranked, rarely cited
    │
    ▼
Eligible to appear in AI Overviews and related surfaces

Every layer is a gate. The fancier optimization work only matters once all the gates are open.

What gets cited is what a model can’t write from training data alone

Generative search rewards specificity. Models can summarize generic information without quoting anyone, so the pages that get cited are the ones that say something the model can’t synthesize on its own. Google’s guide tells creators to focus on “unique, valuable, people-first content” rather than commodity content that re-states what every other page on the topic already says (source). The deeper version of this advice lives in Google’s helpful content guidance, which goes into how to demonstrate firsthand experience, real expertise, and original perspective.

Here are two versions of the same paragraph for an article about migrating to Next.js 16. Same topic, wildly different odds of being cited:

Commodity version:
"Next.js 16 introduces async params, making route parameters
asynchronous. This is a breaking change you should plan for
when upgrading from Next.js 15. Make sure to await your params
in dynamic routes."
Distinctive version:
"We migrated a 240-route Next.js 15 app to 16 last week. The
async params change broke 47 pages in CI on the first run.
The mechanical fix: wrap every `params.slug` access in
`await params`. The catch we hit: dynamic API routes that
destructure params in the function signature need the
signature itself marked async, not just the body. Took
3 hours, almost all of it search-replace."

A model can produce the commodity version from training data alone, so it won’t cite the source. There’s nothing in there it couldn’t write itself. The distinctive version has a number (47 broken pages), a specific catch (the function signature subtlety), and a time estimate (3 hours), none of which the model can generate without quoting the source. Even one of those details is often enough to flip a page from “training data summary” to “cited reference”.

What the model sees about your topic
├──▶ Commodity content
│    "Same overview 50 other pages have"
│         │
│         ▼
│    Model synthesizes from training data
│         │
│         ▼
│    Not cited
└──▶ Distinctive content
     "Specific data, screenshot, opinion,
     result you tested in production"
          │
          ▼
     Model can't synthesize, must quote
          │
          ▼
     Cited in the answer

Clean technical structure helps the crawler and the model

Semantic HTML matters. Use real heading levels in a sensible hierarchy, put the answer to the question the page is about near the top, and avoid burying content under preamble. A real before/after on the same blog post:

<!-- Bad: divs and classes carry no semantic weight -->
<div class="title">How to migrate to Next.js 16</div>
<div class="subtitle">A practical guide</div>
<div class="body">We migrated 240 routes last week...</div>
<!-- Good: explicit semantics the crawler and model understand -->
<article>
<h1>How to migrate to Next.js 16</h1>
<p class="lede">A practical guide to async params,
Turbopack defaults, and the gotchas we hit.</p>
<section>
<h2>Async params, in practice</h2>
<p>We migrated 240 routes last week...</p>
</section>
</article>

The second version gives the crawler clear structure (article, h1, section, h2) and the model clean boundaries for what’s heading, lede, and body.
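A quick heading audit can flag the div-only pattern before it ships. This is a rough sketch that scans an HTML string with a regular expression, which is fine for a lint step but not a substitute for a real HTML parser. `headingIssues` is a hypothetical helper, and "no skipped levels" is a readability heuristic rather than a documented ranking rule.

```javascript
// Report structural problems in the heading outline of an HTML string.
function headingIssues(html) {
  // Collect heading levels in document order: <h1> -> 1, <h2> -> 2, ...
  const levels = [...html.matchAll(/<h([1-6])\b[^>]*>/gi)].map((m) => Number(m[1]));
  const issues = [];
  if (levels.length === 0) {
    issues.push("no heading elements found");
    return issues;
  }
  for (let i = 1; i < levels.length; i++) {
    // Going deeper by more than one level (h1 -> h3) skips part of the outline.
    if (levels[i] > levels[i - 1] + 1) {
      issues.push(`<h${levels[i - 1]}> is followed by <h${levels[i]}>, skipping a level`);
    }
  }
  return issues;
}

const divSoup = '<div class="title">How to migrate to Next.js 16</div>';
const semantic =
  "<article><h1>How to migrate to Next.js 16</h1>" +
  "<section><h2>Async params, in practice</h2></section></article>";

console.log(headingIssues(divSoup));
console.log(headingIssues(semantic));
```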

Google’s documentation on page experience explains how Core Web Vitals feed into ranking, and ranking shapes which pages AI features draw on. The thresholds Google publishes (source):

Metric                            Good      Poor
─────────────────────────────────────────────────────
LCP  Largest Contentful Paint     ≤ 2.5s    > 4.0s
INP  Interaction to Next Paint    ≤ 200ms   > 500ms
CLS  Cumulative Layout Shift      ≤ 0.1     > 0.25

The numbers Google’s systems look at are 28-day field data from real Chrome users (CrUX), not a Lighthouse run on your laptop. The web-vitals JavaScript library measures the same metrics inside a real browser session, which keeps your own testing aligned with what Google’s systems see:

import { onLCP, onINP, onCLS } from 'web-vitals';
onLCP(metric => console.log('LCP', metric.value, metric.rating));
onINP(metric => console.log('INP', metric.value, metric.rating));
onCLS(metric => console.log('CLS', metric.value, metric.rating));
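Google assesses each Core Web Vital at the 75th percentile of page loads, so a single console reading tells you little. If you collect the values yourself, a sketch like the one below turns a batch of samples into a rating using the thresholds from the table. The function names are hypothetical, and the percentile uses the simple nearest-rank method.

```javascript
// Thresholds from the table above. LCP and INP are in milliseconds.
const THRESHOLDS = {
  LCP: { good: 2500, poor: 4000 },
  INP: { good: 200, poor: 500 },
  CLS: { good: 0.1, poor: 0.25 },
};

// Map a value to the same rating strings web-vitals reports.
function rate(metric, value) {
  const t = THRESHOLDS[metric];
  if (!t) throw new Error(`Unknown metric: ${metric}`);
  if (value <= t.good) return "good";
  if (value > t.poor) return "poor";
  return "needs-improvement";
}

// 75th percentile using the nearest-rank method.
function p75(samples) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil(0.75 * sorted.length);
  return sorted[rank - 1];
}

// Hypothetical LCP samples (ms) collected from real sessions.
const lcpSamples = [1000, 2000, 3000, 5000];
const lcpP75 = p75(lcpSamples);
console.log("LCP p75:", lcpP75, rate("LCP", lcpP75));
```

In this made-up sample, two of the four page loads are fast, yet the page still rates "needs-improvement" because the 75th percentile lands at 3000 ms.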

The AI optimization guide also pushes back on several “optimization hacks” circulating online. Adding an llms.txt file is not a ranking signal and isn’t used by Google’s AI features (source). Chunking content into tiny sections or rewriting every heading as a question is unnecessary, because models read context across the whole page. The guide also says structured data is useful where it powers a documented rich result, but it isn’t required for AI feature visibility. Spend the time on real content quality and rendering instead.

What the crawler fetches
─────────────────────────────────────────────────────────
Server-rendered HTML            Client-only SPA shell
─────────────────────           ─────────────────────
<h1>Title</h1>                  <div id="root"></div>
<p>Real content...</p>          <script src="app.js">
<h2>Section</h2>                (renders later)
        │                               │
        ▼                               ▼
Crawler reads it now            Crawler runs JS, may stall
        │                               │
        ▼                               ▼
Indexed, AI-eligible            May never reach the content
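You can approximate the left-versus-right distinction from the raw HTML response, before any JavaScript runs. The sketch below strips scripts, styles, and tags, then counts the text that is left. The 200-character cutoff is an arbitrary assumption, and both function names are hypothetical.

```javascript
// Count the characters of visible text in raw (unrendered) HTML.
function visibleTextLength(html) {
  return html
    .replace(/<script\b[\s\S]*?<\/script>/gi, " ") // drop inline and external scripts
    .replace(/<style\b[\s\S]*?<\/style>/gi, " ") // drop stylesheets
    .replace(/<[^>]+>/g, " ") // drop remaining tags
    .replace(/\s+/g, " ")
    .trim().length;
}

// Heuristic: a response with almost no text is probably a client-only shell.
function looksLikeEmptyShell(html, minChars = 200) {
  return visibleTextLength(html) < minChars;
}

const spaShell = '<div id="root"></div><script src="app.js"></script>';
const serverRendered = "<h1>Title</h1><p>" + "Real content. ".repeat(30) + "</p>";

console.log(looksLikeEmptyShell(spaShell));
console.log(looksLikeEmptyShell(serverRendered));
```

Feed it the body of a plain HTTP fetch of your article URL. A shell-like result means the crawler has to execute your JavaScript to see anything.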

Visuals, schema, and commerce data are the structured pipelines

AI Overviews pull images and video directly when they’re high quality. Real screenshots, real diagrams, and short video walkthroughs are more useful than stock photos. Apply the same image SEO basics Google has always recommended in its image best practices: descriptive alt text, meaningful filenames, captions where they help the reader, and the same for video best practices.

A real alt-text before/after for an article about Next.js performance:

<!-- Bad: alt tells the crawler nothing -->
<img src="chart.png" alt="chart">
<img src="screenshot.png" alt="">
<!-- Good: alt describes what the image conveys -->
<img src="chart.png"
alt="Next.js 16 vs 15 build time: 4.2s vs 6.8s on a 240-route app">
<img src="screenshot.png"
alt="Search Console URL inspection showing the page is indexed">

The second pair is what gets pulled into an AI Overview’s image carousel because the alt text is descriptive enough for the model to understand what the image proves.
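Weak alt text is easy to find mechanically. The following is a sketch, assuming double-quoted attributes and a hand-picked list of generic words; `weakAltImages` and `GENERIC_ALT` are hypothetical names. An empty alt is correct for purely decorative images, so treat the output as a review list rather than a list of errors.

```javascript
// Alt values that say nothing about what the image shows (assumed list).
const GENERIC_ALT = new Set(["", "image", "chart", "photo", "picture", "screenshot", "graphic"]);

// Return the src of every <img> whose alt is missing, empty, or generic.
function weakAltImages(html) {
  const weak = [];
  for (const [tag] of html.matchAll(/<img\b[^>]*>/gi)) {
    const srcMatch = tag.match(/\bsrc\s*=\s*"([^"]*)"/i);
    const altMatch = tag.match(/\balt\s*=\s*"([^"]*)"/i);
    const src = srcMatch ? srcMatch[1] : "(no src)";
    const alt = altMatch ? altMatch[1].trim().toLowerCase() : null;
    if (alt === null || GENERIC_ALT.has(alt)) weak.push(src);
  }
  return weak;
}

const before = '<img src="chart.png" alt="chart"><img src="screenshot.png" alt="">';
const after =
  '<img src="chart.png"\n alt="Next.js 16 vs 15 build time: 4.2s vs 6.8s on a 240-route app">' +
  '<img src="screenshot.png"\n alt="Search Console URL inspection showing the page is indexed">';

console.log(weakAltImages(before));
console.log(weakAltImages(after));
```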

Structured data is worth adding where it powers a specific rich result. Recipe schema, product schema, FAQ schema, event schema, and Article schema all have documented effects in regular search and feed into the same understanding layer the AI features use. Google’s rich results gallery lists every supported type. A working Article schema for a blog post:

<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "How to migrate to Next.js 16",
"datePublished": "2026-05-17",
"dateModified": "2026-05-17",
"author": {
"@type": "Person",
"name": "Trevor Lasn",
"url": "https://www.trevorlasn.com"
},
"publisher": {
"@type": "Organization",
"name": "trevorlasn.com"
},
"image": "https://www.trevorlasn.com/article-cover.webp",
"description": "A practical guide to async params and Turbopack defaults."
}
</script>

Test it inside Google’s Rich Results Test before deploying. The tool flags fields that are missing or malformed for the rich result types it detects.
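Hand-written JSON-LD drifts out of date, so it helps to generate the block from the same metadata that renders the post. Here is a minimal sketch, assuming a hypothetical `post` object; the field names (`title`, `publishedAt`, `updatedAt`, and so on) are placeholders for whatever your CMS or frontmatter provides.

```javascript
// Build an Article JSON-LD <script> tag from post metadata.
function articleJsonLd(post) {
  const data = {
    "@context": "https://schema.org",
    "@type": "Article",
    headline: post.title,
    datePublished: post.publishedAt,
    // Fall back to the publish date when the post was never updated.
    dateModified: post.updatedAt ?? post.publishedAt,
    author: { "@type": "Person", name: post.authorName, url: post.authorUrl },
    image: post.coverImage,
    description: post.summary,
  };
  // Escape "<" so a title containing "</script>" cannot end the tag early.
  const json = JSON.stringify(data, null, 2).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">\n${json}\n</script>`;
}

console.log(
  articleJsonLd({
    title: "How to migrate to Next.js 16",
    publishedAt: "2026-05-17",
    authorName: "Trevor Lasn",
    authorUrl: "https://www.trevorlasn.com",
    coverImage: "https://www.trevorlasn.com/article-cover.webp",
    summary: "A practical guide to async params and Turbopack defaults.",
  })
);
```

The output still belongs in the Rich Results Test. Generating it only guarantees that the dates and headline match the page.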

If you run a local business or sell products, two unrelated surfaces matter more than schema. A verified Google Business Profile feeds local AI answers with your hours, location, services, and reviews. A Merchant Center feed is what AI Overviews pull product information from. The AI optimization guide names both explicitly as the primary input for business and commerce results (source).

Type of result         Source feed                      Where it shows
────────────────────────────────────────────────────────────────────
Local business    ◀──  Google Business Profile     ──▶  Maps, local panel,
                       (hours, location, reviews)       local AI answers
Products          ◀──  Merchant Center feed        ──▶  Shopping cards,
                       (price, stock, variants)         product AI answers
Recipes, FAQs,    ◀──  Schema.org JSON-LD          ──▶  Rich results,
events, articles       (on-page structured data)        AI understanding

Agentic experiences are the next surface

The newer wrinkle is autonomous agents browsing on the user’s behalf (Claude with computer use, ChatGPT Operator, Perplexity’s assistant). Google’s AI optimization guide recommends sites consider how agents interpret their DOM, controls, and content (source). Sites with confusing markup, hidden controls, or essential information rendered only as images are hard for agents to operate. The accessibility work you’d already do for screen readers covers most of the same ground.

A real before/after of an interactive control on a booking page:

<!-- Agent-hostile: div pretending to be a button, no label -->
<div class="btn-primary" onclick="submitBooking()">
<svg viewBox="0 0 24 24"><!-- check icon --></svg>
</div>
<!-- Agent-friendly: real button, explicit label, real semantics -->
<button type="submit"
aria-label="Confirm booking for 7:00 PM on May 17">
<svg viewBox="0 0 24 24" aria-hidden="true">
<!-- check icon -->
</svg>
Confirm booking
</button>

The second version tells an agent three things: it’s a submit button, the action is “Confirm booking”, and the icon is decorative. The first version tells it nothing. An agent that can’t identify the booking confirmation gives up and picks a site it can operate.

Form fields work the same way. An agent reads name, id, aria-label, and the surrounding <label> element:

<!-- Agent-hostile: placeholder is the only hint, no semantic link -->
<input type="text" placeholder="When?">
<!-- Agent-friendly: explicit label, real input type, real name -->
<label for="reservation-time">Reservation time</label>
<input type="datetime-local"
id="reservation-time"
name="reservation_time"
required>

Switching to type="datetime-local" is a tiny change that gives both browsers and agents a native datetime picker with structured value handling. No agent has to guess what format you want.
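The same lint-step idea works for forms. The sketch below lists every input an agent would have to guess at, assuming double-quoted attributes. It only recognizes `<label for>`, `aria-label`, and `aria-labelledby`, so inputs wrapped inside a `<label>` element will show up as false positives. `unlabeledInputs` is a hypothetical helper.

```javascript
// Return the <input> tags that have no programmatic label.
function unlabeledInputs(html) {
  // Every id that some <label for="..."> points at.
  const labelled = new Set(
    [...html.matchAll(/<label\b[^>]*\bfor\s*=\s*"([^"]+)"/gi)].map((m) => m[1])
  );
  const problems = [];
  for (const [tag] of html.matchAll(/<input\b[^>]*>/gi)) {
    const attr = (name) => {
      const m = tag.match(new RegExp(`\\b${name}\\s*=\\s*"([^"]*)"`, "i"));
      return m ? m[1] : undefined;
    };
    if (attr("type") === "hidden") continue; // hidden fields need no label
    const hasLabel =
      labelled.has(attr("id")) ||
      Boolean(attr("aria-label")) ||
      Boolean(attr("aria-labelledby"));
    if (!hasLabel) problems.push(tag);
  }
  return problems;
}

const hostile = '<input type="text" placeholder="When?">';
const friendly =
  '<label for="reservation-time">Reservation time</label>\n' +
  '<input type="datetime-local" id="reservation-time" name="reservation_time" required>';

console.log(unlabeledInputs(hostile));
console.log(unlabeledInputs(friendly));
```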

User intent: "Book me a table for 7pm tonight"
                       │
                       ▼
             Agent opens your site
                       │
                       ▼
┌─────────────────────────────────────────────┐
│ Can it find the booking widget?             │
│ Can it read the available time slots?       │
│ Are the buttons labeled, not just icons?    │
│ Does the form submit without a 3s JS stall? │
└─────────────────────────────────────────────┘
        ┌──────────────┴──────────────┐
        ▼                             ▼
Task completes on              Agent gives up,
your site                      picks a competitor
Measure what you can, and don’t chase what you can’t

Search Console is still the source of truth for Google-side data. AI Overviews and AI Mode traffic is rolled into the standard Web performance report (Google source), so impressions and clicks for the pages you care about are the right place to look. Bing Webmaster Tools provides the equivalent for Bing and Copilot.

One inference you can draw, carefully: filter Performance by Query containing a conversational starter (how, what, why, is, can). These long-tail queries are the kind AI Overviews trigger on, and a noticeable shift in impressions vs clicks on those queries is consistent with the page being summarized inside an AI answer rather than visited. It is not proof. Layout changes, ranking shifts, query mix changes, and seasonality can all produce similar patterns. Use it as a hypothesis to investigate, not a verdict.
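If you export the Performance report, the same filter takes a few lines to script. This is a sketch over a hypothetical row shape (`query`, `clicks`, `impressions`), so adapt the field names to whatever your export or API client returns. Like the manual filter, it produces a number to investigate, not a verdict.

```javascript
// Queries that open with a conversational starter word.
const STARTERS = /^(how|what|why|is|can)\b/i;

// Aggregate clicks, impressions, and CTR for conversational queries only.
function conversationalSummary(rows) {
  const matching = rows.filter((r) => STARTERS.test(r.query.trim()));
  const clicks = matching.reduce((sum, r) => sum + r.clicks, 0);
  const impressions = matching.reduce((sum, r) => sum + r.impressions, 0);
  return {
    queries: matching.length,
    clicks,
    impressions,
    ctr: impressions === 0 ? 0 : clicks / impressions,
  };
}

// Made-up rows for illustration only.
const rows = [
  { query: "how to migrate to next.js 16", clicks: 10, impressions: 200 },
  { query: "next.js 16 release notes", clicks: 50, impressions: 300 },
  { query: "Is next.js 16 stable", clicks: 5, impressions: 100 },
  { query: "issue tracker next", clicks: 1, impressions: 10 },
];

console.log(conversationalSummary(rows));
```

Compare the CTR across two date ranges. A drop on these queries while impressions hold steady is the pattern worth a closer look.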

A direct way to test whether models cite you: open each surface and ask a question your content should answer. Concrete, copy-paste tests:

ChatGPT (Search mode):
"What's a practical way to migrate a Next.js 15 app to Next.js 16?"
Claude (with web search):
"Find me a recent first-hand account of migrating to Next.js 16
on a large app. I want specifics, not generic advice."
Perplexity:
"Real-world Next.js 16 migration: what broke, how it was fixed."
Gemini (or Google with AI Overview):
"How do you handle async params when migrating to Next.js 16?"

If your domain shows up in the inline source list or the answer cites it, you’re being retrieved. Repeat across the major surfaces every few weeks for the topics that matter to your business. Track the count of cite-events the same way you’d track backlinks.
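Logging those manual checks in a consistent shape makes the trend visible over time. Below is a minimal sketch, assuming you paste the source URLs from each answer into a record by hand. The `runs` structure and both function names are hypothetical, and nothing here calls any AI product's API.

```javascript
// True when any source URL belongs to `domain` or one of its subdomains.
function citesDomain(sourceUrls, domain) {
  return sourceUrls.some((raw) => {
    let host;
    try {
      host = new URL(raw).hostname.toLowerCase();
    } catch {
      return false; // skip anything that is not a valid URL
    }
    return host === domain || host.endsWith(`.${domain}`);
  });
}

// Tally how often each surface cited the domain across recorded runs.
function citeRate(runs, domain) {
  const tally = {};
  for (const run of runs) {
    const entry = (tally[run.surface] ??= { asked: 0, cited: 0 });
    entry.asked += 1;
    if (citesDomain(run.sources, domain)) entry.cited += 1;
  }
  return tally;
}

// Hand-recorded example runs (made-up data).
const runs = [
  { surface: "ChatGPT", sources: ["https://www.trevorlasn.com/blog/example", "https://nextjs.org/docs"] },
  { surface: "ChatGPT", sources: ["https://nextjs.org/docs"] },
  { surface: "Perplexity", sources: ["https://example.com/post"] },
];

console.log(citeRate(runs, "trevorlasn.com"));
```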

What to track      Where it lives             Signal it gives
────────────────────────────────────────────────────────────────────
Impressions   ◀──  Search Console        ──▶  Visibility growth
Clicks        ◀──  Search Console        ──▶  Selection rate
Conversions   ◀──  Your own analytics    ──▶  Business outcome
Cite events   ◀──  Ask ChatGPT / Claude  ──▶  Whether models cite

What to skip:
"AI Overview rank trackers"   no reliable public methodology yet
The work above covers what Google’s AI optimization guide recommends and what the other AI search surfaces tend to reward. AEO and GEO aren’t separate disciplines from SEO. They’re the same work, applied with sharper attention to content originality, rendering, and the structured pipelines that feed AI surfaces across the web.


Trevor I. Lasn

Product engineer based in Tartu, Estonia, building and shipping for over a decade.





This article was originally published on https://www.trevorlasn.com/blog/aeo-geo-vs-seo-google-ai-optimization. It was written by a human and polished using grammar tools for clarity.