AI Crawlers List
49 bots from 27 companies are crawling the web to train AI models, power search, and fetch content for users. Know who visits your site and what they do with your content.
Want to control which bots access your site?
Use robots.txt rules or run an AlfaUMi analysis to see which crawlers currently access your website.
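As a minimal sketch of what such robots.txt rules can look like, the Python below builds a file that disallows a hand-picked set of bots and allows everyone else, then checks the result with the standard library's `urllib.robotparser`. The choice of blocked bots, the helper names `build_robots_txt` and `is_allowed`, and the `example.com` URL are illustrative assumptions, not recommendations from this list.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical selection of bots to block; pick your own from the table.
BLOCKED_BOTS = ["GPTBot", "CCBot", "Bytespider"]


def build_robots_txt(blocked, disallow_path="/"):
    """Return robots.txt text that blocks each named bot and allows all others."""
    groups = [f"User-agent: {bot}\nDisallow: {disallow_path}" for bot in blocked]
    groups.append("User-agent: *\nAllow: /")
    return "\n\n".join(groups) + "\n"


def is_allowed(robots_txt, user_agent, url="https://example.com/article"):
    """Check whether the given user-agent token may fetch the URL (assumed example URL)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots_txt = build_robots_txt(BLOCKED_BOTS)
print(robots_txt)
print("GPTBot allowed:", is_allowed(robots_txt, "GPTBot"))
print("ClaudeBot allowed:", is_allowed(robots_txt, "ClaudeBot"))
```

Keep in mind that robots.txt is advisory: it only affects crawlers that choose to honor it.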
| Bot Name | Company | Influence | Monthly Users | Description |
|---|---|---|---|---|
| GPTBot | OpenAI | 10/10 | 900M+ | Main OpenAI crawler for GPT model training. Indexes pages for ChatGPT knowledge base. Blocking it means absence from ChatGPT results. |
| ChatGPT-User | OpenAI | 10/10 | 900M+ | Fetches pages on ChatGPT user request (Browse with Bing). Triggered by a human clicking a link. |
| OAI-SearchBot | OpenAI | 9/10 | 900M+ | Crawler for ChatGPT Search (Google competitor). Indexes pages for ChatGPT search results. |
| Google-Extended | Google | 10/10 | 750M+ | Google robots.txt token controlling use of content for Gemini (formerly Bard) training. Separate from Googlebot. Blocking doesn't affect Search indexing. |
| GoogleOther | Google | 8/10 | 750M+ | General Google crawler for AI products (NotebookLM, Vertex AI). Used across various Google AI services. |
| Gemini-Deep-Research | Google | 8/10 | 750M+ | Gemini Deep Research crawler for in-depth page analysis. Fetches content for advanced queries. |
| Google-NotebookLM | Google | 7/10 | ~50M | NotebookLM crawler - Google's document analysis and AI podcast creation tool. |
| GoogleAgent-Mariner | Google | 6/10 | New (2025) | Experimental Google agent for autonomous web browsing. Project Mariner. |
| Google-CloudVertexBot | Google | 6/10 | Enterprise | Google Cloud Vertex AI crawler for enterprise customers. B2B, smaller consumer reach. |
| ClaudeBot | Anthropic | 9/10 | 30M+ | Main Anthropic crawler for source citation in Claude responses. Fetches pages for context. |
| Claude-User | Anthropic | 9/10 | 30M+ | Fetches pages on Claude user request (Browse feature). User-initiated. |
| Claude-Web | Anthropic | 8/10 | 30M+ | Anthropic crawler for page indexing. Builds Claude knowledge base. |
| Claude-SearchBot | Anthropic | 8/10 | 30M+ | Crawler for Claude Search - Anthropic's new search feature. |
| anthropic-ai | Anthropic | 8/10 | 30M+ | General Anthropic crawler identifier. Used for model training. |
| PerplexityBot | Perplexity | 9/10 | 45M+ | Main Perplexity AI crawler. Indexes pages for AI search results. Fastest-growing AI search. |
| Perplexity-User | Perplexity | 8/10 | 45M+ | Fetches pages on Perplexity user request. User-initiated, often ignores robots.txt. |
| Grokbot | xAI | 8/10 | 78M+ | xAI (Elon Musk) crawler for Grok. Integrated with X (Twitter). Rapid growth via premium X. |
| xAI-Grok | xAI | 8/10 | 78M+ | Alternative Grok crawler identifier. Indexes pages for Grok knowledge base. |
| FacebookBot | Meta | 9/10 | 1B+ | Meta crawler for Facebook/Instagram previews and Meta AI. Used for Llama training. |
| Meta-ExternalAgent | Meta | 9/10 | 1B+ | Meta AI crawler that fetches pages for WhatsApp/Messenger/Instagram users. |
| Meta-ExternalFetcher | Meta | 8/10 | 1B+ | Fetches content for Meta AI. Active in WhatsApp (63% of Meta AI traffic). |
| meta-externalagent | Meta | 8/10 | 1B+ | Lowercase variant of Meta-ExternalAgent. Some systems use lowercase. |
| Applebot | Apple | 8/10 | 2B+ | Apple crawler for Siri, Spotlight and Apple News. Used across all Apple devices. |
| Applebot-Extended | Apple | 7/10 | 2B+ | Extended Apple crawler for Apple Intelligence training. New since iOS 18. |
| bingbot | Microsoft | 9/10 | 100M+ | Main Bing crawler. Also used by Copilot and other Microsoft AI products. |
| BingPreview | Microsoft | 7/10 | 100M+ | Generates link previews and thumbnails in Bing/Teams/Outlook. |
| Amazonbot | Amazon | 7/10 | 500M+ | Amazon crawler for Alexa and Amazon Search. Answers voice queries. |
| AmazonBuyForMe | Amazon | 6/10 | New (2026) | New Amazon shopping agent. Autonomously browses pages for purchases. |
| Amzn-SearchBot | Amazon | 6/10 | 500M+ | Amazon crawler for product and information search. |
| DuckAssistBot | DuckDuckGo | 7/10 | 100M+ | DuckDuckGo AI Assistant crawler. Privacy-first, doesn't track users. |
| YouBot | You.com | 6/10 | 10M+ | You.com crawler - AI search engine. Perplexity competitor. |
| PhindBot | Phind | 6/10 | 5M+ | Phind crawler - AI search for developers. Specializes in code. |
| Andibot | Andi | 5/10 | 2M+ | Andi Search crawler - conversational AI search engine. |
| Bravebot | Brave | 6/10 | 70M+ | Brave Search and Leo AI crawler. Privacy-focused browser with AI. |
| iAskBot | iAsk.ai | 5/10 | 3M+ | iAsk.ai crawler - free AI search engine. Popular in education. |
| KomoSearch | Komo | 4/10 | 1M+ | Komo AI Search crawler. Smaller player in the AI search market. |
| DeepSeekBot | DeepSeek | 8/10 | 97M+ | Chinese DeepSeek crawler. Ranked #4 globally, low-cost model, 35% of users from China. |
| MistralAI-User | Mistral | 6/10 | 5M+ | Mistral AI crawler (France). European OpenAI competitor, open-source. |
| cohere-ai | Cohere | 5/10 | Enterprise | Cohere crawler - Canadian AI company. Mainly B2B/enterprise. |
| cohere-training-data-crawler | Cohere | 5/10 | Enterprise | Cohere crawler for collecting training data. |
| Diffbot | Diffbot | 6/10 | Enterprise | Diffbot crawler - structures web data. B2B, knowledge graphs. |
| CCBot | Common Crawl | 7/10 | Research | Common Crawl crawler - nonprofit collecting data for AI research. Used by many AI companies. |
| DataForSeoBot | DataForSEO | 5/10 | B2B | DataForSEO crawler - SEO data for marketers and tools. |
| Bytespider | ByteDance | 8/10 | 1B+ | ByteDance (TikTok, Douyin) crawler. Indexes pages for TikTok search and AI. |
| TikTokSpider | ByteDance | 7/10 | 1B+ | Alternative TikTok crawler. Fetches content for in-app display. |
| PetalBot | Huawei | 6/10 | 500M+ | Huawei Petal Search crawler. Popular in China and on Huawei devices. |
| SemrushBot | Semrush | 5/10 | B2B | Semrush crawler - SEO tool. Analyzes sites for marketers. |
| AhrefsBot | Ahrefs | 5/10 | B2B | Ahrefs crawler - SEO tool. Builds link index. |
| YandexBot | Yandex | 6/10 | 100M+ | Yandex crawler - Russian search engine. Popular in Russia and CIS countries. |
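
To see which of these crawlers actually visit a site, one rough approach is to match the bot names from the table against the User-Agent headers in your access logs. The sketch below assumes you have already extracted the User-Agent strings; the `KNOWN_BOTS` subset, the helper names, and the sample strings are illustrative and not the vendors' exact headers.

```python
from collections import Counter

# A subset of bot names from the table, mapped to their companies.
KNOWN_BOTS = {
    "GPTBot": "OpenAI",
    "ChatGPT-User": "OpenAI",
    "OAI-SearchBot": "OpenAI",
    "ClaudeBot": "Anthropic",
    "PerplexityBot": "Perplexity",
    "Bytespider": "ByteDance",
    "CCBot": "Common Crawl",
}


def identify_bot(user_agent):
    """Return (bot_name, company) for the first known token found, else None."""
    ua = user_agent.lower()
    for name, company in KNOWN_BOTS.items():
        if name.lower() in ua:
            return name, company
    return None


def count_crawlers(user_agents):
    """Count hits per known bot across an iterable of User-Agent strings."""
    counts = Counter()
    for ua in user_agents:
        match = identify_bot(ua)
        if match:
            counts[match[0]] += 1
    return counts


# Hypothetical sample headers, made up for illustration.
sample = [
    "Mozilla/5.0 (compatible; GPTBot/1.1)",
    "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.1)",
]
crawler_counts = count_crawlers(sample)
print(crawler_counts.most_common())
```

Substring matching is deliberately simple. If you add names that overlap (for example Applebot and Applebot-Extended), check the longer name first. User-Agent headers can also be spoofed, so treat the counts as an estimate.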
Is your site ready for AI crawlers?
Run a free AlfaUMi analysis to check your robots.txt, structured data, and AI visibility score.
Last updated: February 2026. Data compiled from public sources. Influence scores reflect estimated impact on content visibility.