Block AI training on your content
Audit which AI training crawlers can read your site and get copy-paste robots.txt rules for the ones you want to block. Free, no install, results in two minutes.
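AI training and AI search are controlled by different robots.txt tokens. Training-side tokens (GPTBot, Google-Extended, Applebot-Extended, CCBot, and others) cover pages gathered to feed into model training. Some of them, such as Google-Extended and Applebot-Extended, are control tokens honored by the vendor's existing crawler rather than separate bots. Search and answer crawlers (PerplexityBot, OAI-SearchBot, Bingbot for Copilot, and others) gather pages for live citations. The two groups can be blocked independently in robots.txt.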
If you want to keep showing up in AI answer engines but stop new content from training future models, you allow the search-side bots and disallow the training-side ones. The free checker on this page tells you which of those crawlers are currently allowed, denied, or unaddressed across 33+ named AI tokens, and gives you a copy-paste snippet for the ones you want to block.
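As a rough illustration of the allow-search, block-training setup described above, the sketch below generates one Disallow group per training token and then checks the result with Python's standard-library robots.txt parser. The token lists are only the names mentioned on this page, not a complete registry, and `build_rules` and the example URL are hypothetical names introduced for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Illustrative token lists taken from the names on this page; not exhaustive.
TRAINING_TOKENS = ["GPTBot", "Google-Extended", "Applebot-Extended", "CCBot"]
SEARCH_TOKENS = ["PerplexityBot", "OAI-SearchBot", "Bingbot"]


def build_rules(blocked_tokens):
    """Return robots.txt text with one site-wide Disallow group per token."""
    groups = [f"User-agent: {token}\nDisallow: /" for token in blocked_tokens]
    return "\n\n".join(groups) + "\n"


rules = build_rules(TRAINING_TOKENS)
print(rules)

# Verify the generated rules with the standard-library parser.
parser = RobotFileParser()
parser.parse(rules.splitlines())

page = "https://example.com/article"  # hypothetical URL
for token in TRAINING_TOKENS + SEARCH_TOKENS:
    status = "allowed" if parser.can_fetch(token, page) else "blocked"
    print(f"{token}: {status}")
```

Tokens with no group of their own, such as the search-side ones here, stay allowed because no rule applies to them. A real robots.txt would usually also carry a `User-agent: *` group, which this sketch leaves out.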
Frequently asked questions
- Which crawlers train AI models on web content?
- GPTBot (OpenAI), Google-Extended (Google), Applebot-Extended (Apple Intelligence), CCBot (Common Crawl), Bytespider (ByteDance, TikTok's parent company), and others. The free checker above lists every named training crawler your site is exposed to.
- Does blocking training crawlers affect search rankings?
- No, as long as you block only the training tokens. Training tokens (GPTBot, Google-Extended, Applebot-Extended) are separate from search indexers (Googlebot, Bingbot, Applebot), so blocking the training side leaves traditional SEO untouched.
- How fast do these blocks take effect?
- Most major crawlers re-fetch robots.txt within a few days. Sites with low crawl volume can take a week or two before changes are honored consistently.
- Will blocking these stop my content from appearing in AI answers?
- Not by itself. Blocking training crawlers can keep new content out of future training runs, but it does not remove what models have already been trained on. For visibility in AI answer engines, look at engines that cite live sources (Perplexity, Bing/Copilot search), which use separate crawlers.
How It Works
Enter your domain
We fetch your public robots.txt — no install, no auth, no site changes.
Diff against AI bot registry
Your rules are checked against 33+ AI crawlers and agents (OpenAI, Anthropic, Perplexity, Google, Meta, ...).
Get a strategic report
Per-bot status, posture flags, and downloadable robots.txt presets you can copy-paste.
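The three steps above can be sketched roughly as follows. This is a simplified illustration, not the checker's actual implementation: the function names, the four-token registry, and the status rules are all assumptions. In particular, it treats "unaddressed" as "no group names this token" and counts only a site-wide `Disallow: /` as denied.

```python
import urllib.request

# Hypothetical mini-registry; the real checker covers 33+ tokens.
AI_TOKENS = ["GPTBot", "CCBot", "PerplexityBot", "Google-Extended"]


def fetch_robots(domain):
    """Fetch a site's public robots.txt (defined here but not called below)."""
    with urllib.request.urlopen(f"https://{domain}/robots.txt", timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_groups(robots_txt):
    """Map each lowercased user-agent token to its (field, path) rules."""
    groups, current, seen_rule = {}, [], False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a rule line closed the previous group
                current, seen_rule = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow") and current:
            seen_rule = True
            for token in current:
                groups[token].append((field, value))
    return groups


def classify(groups, token):
    """Return 'denied', 'allowed', or 'unaddressed' for one token."""
    rules = groups.get(token.lower())
    if rules is None:
        return "unaddressed"  # assumption: token is not named in any group
    # Simplification: only a site-wide Disallow counts as denied.
    return "denied" if ("disallow", "/") in rules else "allowed"


SAMPLE = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Allow: /
"""

groups = parse_groups(SAMPLE)
report = {token: classify(groups, token) for token in AI_TOKENS}
for token, status in report.items():
    print(f"{token}: {status}")
```

An unaddressed token still falls under the `User-agent: *` group in practice, so a fuller audit would report the effective wildcard rules for it as well.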