How should robots.txt handle GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, and PerplexityBot?
If you want AI citations, allow them. The default-allow robots.txt for AEO unblocks the user-agent tokens AI engines actually use: GPTBot (OpenAI training crawler), OAI-SearchBot (ChatGPT Search retrieval), ChatGPT-User (on-demand fetch when a user pastes a URL), ClaudeBot and anthropic-ai (Anthropic crawlers), and PerplexityBot plus Perplexity-User (Perplexity retrieval). Blocking any of these is a primary cause of "we're not cited" in Surfaced audits.
A working AEO-friendly robots.txt stanza:
```
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: Bingbot
Allow: /
```
Three nuances: (1) Google-Extended only controls Gemini training. Blocking it does not affect AI Overviews citation eligibility, which depends on regular Googlebot. (2) Applebot-Extended is the opt-out token for Apple Intelligence training; allowing it helps if you care about Apple Intelligence citations. (3) If you must block AI crawlers (paywalled content, licensing concerns), block training crawlers like GPTBot and ClaudeBot but keep retrieval bots like OAI-SearchBot, ChatGPT-User, and PerplexityBot allowed. That way you keep citation eligibility without contributing to training. In either case, keep Bingbot open, because ChatGPT Search retrieval flows through Bing's index.
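
One way to sanity-check the selective policy in nuance (3) before deploying it is to parse it with Python's standard-library `urllib.robotparser` and ask which agents may fetch a page. The sketch below is illustrative only: the `SELECTIVE_POLICY` text, the `audit` helper, and the `https://example.com/pricing` URL are assumptions made for this example, and the standard-library parser is just one interpretation of robots.txt, so real crawlers may match rules slightly differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: opt out of training crawlers, keep retrieval bots
# (and Bingbot) allowed, as described in nuance (3).
SELECTIVE_POLICY = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Bingbot
Allow: /
"""

# Agents to check; extend this list with any other tokens you care about.
AGENTS = [
    "GPTBot",
    "ClaudeBot",
    "OAI-SearchBot",
    "ChatGPT-User",
    "PerplexityBot",
    "Bingbot",
]


def audit(robots_txt: str, url: str = "https://example.com/pricing") -> dict[str, bool]:
    """Return {agent: may_fetch} for each agent, parsing robots_txt locally."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AGENTS}


if __name__ == "__main__":
    for agent, allowed in audit(SELECTIVE_POLICY).items():
        print(f"{agent:15} {'allowed' if allowed else 'BLOCKED'}")
```

Running the same `audit` function against the text of your live robots.txt is a quick way to spot a retrieval bot that has been blocked by accident.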