AI Crawlers List

49 bots from 27 companies are crawling the web to train AI models, power search, and fetch content for users. Know who visits your site and what they do with your content.

Total crawlers: 49 · Companies: 27 · Critical impact: 10 · High impact: 20

Want to control which bots access your site?

Use robots.txt rules or run an AlfaUMi analysis to see which crawlers currently access your website.
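As a rough illustration of the robots.txt route, the sketch below builds a rule set that blocks a few crawler tokens and then checks it with Python's standard-library parser. The choice of blocked tokens, the helper names build_robots_txt and is_allowed, and the example.com URL are assumptions made for the example, not recommendations. Real crawlers run their own parsers, so treat the result as a sanity check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: user-agent tokens to block site-wide.
BLOCKED_TOKENS = ["GPTBot", "CCBot", "Bytespider"]


def build_robots_txt(blocked_tokens):
    """Return robots.txt text that blocks the given tokens and allows all others."""
    lines = []
    for token in blocked_tokens:
        lines += [f"User-agent: {token}", "Disallow: /", ""]
    # Default group: every other crawler may fetch everything.
    lines += ["User-agent: *", "Allow: /"]
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, user_agent, url="https://example.com/article"):
    """Check one user agent against a robots.txt body with the stdlib parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots_txt = build_robots_txt(BLOCKED_TOKENS)
print(robots_txt)
print("GPTBot allowed:", is_allowed(robots_txt, "GPTBot"))
print("OAI-SearchBot allowed:", is_allowed(robots_txt, "OAI-SearchBot"))
```

Note that Python's parser matches tokens by substring, so a rule for a short token can also catch longer names that contain it. Check each token you care about individually.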

GPTBot (influence score: 10)
OpenAI · 900M+ users

Main OpenAI crawler, used to collect content for GPT model training. Blocking it opts a site out of training crawls; visibility in ChatGPT search is governed by OAI-SearchBot rather than GPTBot.

ChatGPT-User (influence score: 10)
OpenAI · 900M+ users

Fetches pages when a ChatGPT user's request requires it (browsing, formerly "Browse with Bing", and custom GPT actions). User-initiated rather than an automatic crawl.

OAI-SearchBot (influence score: 9)
OpenAI · 900M+ users

Crawler for ChatGPT Search (Google competitor). Indexes pages for ChatGPT search results.

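Google-Extended (influence score: 10)
Google · 750M+ users

Google's robots.txt control token for Gemini (formerly Bard) training and grounding. It is a token, not a separate crawler: it has no user-agent string of its own, and pages are still fetched by Google's existing crawlers. Blocking it doesn't affect Search indexing.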

GoogleOther (influence score: 8)
Google · 750M+ users

General Google crawler for AI products (NotebookLM, Vertex AI). Used across various Google AI services.

Gemini-Deep-Research (influence score: 8)
Google · 750M+ users

Gemini Deep Research crawler for in-depth page analysis. Fetches content for advanced queries.

Google-NotebookLM (influence score: 7)
Google · ~50M users

NotebookLM crawler - Google's document analysis and AI podcast creation tool.

GoogleAgent-Mariner (influence score: 6)
Google · New (2025)

Experimental Google agent for autonomous web browsing. Project Mariner.

Google-CloudVertexBot (influence score: 6)
Google · Enterprise users

Google Cloud Vertex AI crawler for enterprise customers. B2B, smaller consumer reach.

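ClaudeBot (influence score: 9)
Anthropic · 30M+ users

Main Anthropic crawler. Anthropic's documentation describes it as collecting public web content that may be used for model training; user-requested fetches go through Claude-User and search indexing through Claude-SearchBot.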

Claude-User (influence score: 9)
Anthropic · 30M+ users

Fetches pages on Claude user request (Browse feature). User-initiated.

Claude-Web (influence score: 8)
Anthropic · 30M+ users

Anthropic crawler for page indexing. Builds Claude knowledge base.

Claude-SearchBot (influence score: 8)
Anthropic · 30M+ users

Crawler for Claude Search - Anthropic's new search feature.

anthropic-ai (influence score: 8)
Anthropic · 30M+ users

General Anthropic crawler identifier. Used for model training.

PerplexityBot (influence score: 9)
Perplexity · 45M+ users

Main Perplexity AI crawler. Indexes pages for AI search results. Fastest growing AI search.

Perplexity-User (influence score: 8)
Perplexity · 45M+ users

Fetches pages on Perplexity user request. User-initiated, often ignores robots.txt.

Grokbot (influence score: 8)
xAI · 78M+ users

xAI (Elon Musk) crawler for Grok. Integrated with X (formerly Twitter). Rapid growth via X Premium.

xAI-Grok (influence score: 8)
xAI · 78M+ users

Alternative Grok crawler identifier. Indexes pages for Grok knowledge base.

FacebookBot (influence score: 9)
Meta · 1B+ users

Meta crawler for Facebook/Instagram previews and Meta AI. Used for Llama training.

Meta-ExternalAgent (influence score: 9)
Meta · 1B+ users

Meta AI crawler for fetching pages for WhatsApp/Messenger/Instagram users.

Meta-ExternalFetcher (influence score: 8)
Meta · 1B+ users

Fetches content for Meta AI. Active in WhatsApp (63% of Meta AI traffic).

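meta-externalagent (influence score: 8)
Meta · 1B+ users

Lowercase variant of Meta-ExternalAgent, as reported by some systems. robots.txt matching of user-agent tokens is case-insensitive (RFC 9309), so one rule covers both spellings.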
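A quick way to see how a parser treats the two spellings is to feed one rule to Python's standard-library robots.txt parser and query it with differently cased names. The parser here is only a stand-in (each crawler runs its own), and the rule set, the helper name can_fetch, and the example.com URL are assumptions for the sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: one group for the mixed-case token, one default group.
ROBOTS_TXT = """\
User-agent: Meta-ExternalAgent
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent, url="https://example.com/page"):
    """Parse the rules above and report whether this user agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# The same rule applies regardless of how the name is cased.
for ua in ("Meta-ExternalAgent", "meta-externalagent", "META-EXTERNALAGENT"):
    print(ua, "->", "allowed" if can_fetch(ua) else "blocked")
```

In this parser all three spellings fall under the same Disallow group, so a single robots.txt entry is enough.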

Applebot (influence score: 8)
Apple · 2B+ users

Apple crawler for Siri, Spotlight and Apple News. Used across all Apple devices.

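Applebot-Extended (influence score: 7)
Apple · 2B+ users

Apple's robots.txt control token for Apple Intelligence training, introduced in 2024 alongside iOS 18. It does not crawl pages itself; it governs how content fetched by Applebot may be used.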

bingbot (influence score: 9)
Microsoft · 100M+ users

Main Bing crawler. Also used by Copilot and other Microsoft AI products.

BingPreview (influence score: 7)
Microsoft · 100M+ users

Generates link previews in Bing/Teams/Outlook. Preview thumbnails.

Amazonbot (influence score: 7)
Amazon · 500M+ users

Amazon crawler for Alexa and Amazon Search. Answers voice queries.

AmazonBuyForMe (influence score: 6)
Amazon · New (2026)

New Amazon shopping agent. Autonomously browses pages for purchases.

Amzn-SearchBot (influence score: 6)
Amazon · 500M+ users

Amazon crawler for product and information search.

DuckAssistBot (influence score: 7)
DuckDuckGo · 100M+ users

DuckDuckGo AI Assistant crawler. Privacy-first, doesn't track users.

YouBot (influence score: 6)
You.com · 10M+ users

You.com crawler - AI search engine. Perplexity competitor.

PhindBot (influence score: 6)
Phind · 5M+ users

Phind crawler - AI search for developers. Specializes in code.

Andibot (influence score: 5)
Andi · 2M+ users

Andi Search crawler - conversational AI search engine.

Bravebot (influence score: 6)
Brave · 70M+ users

Brave Search and Leo AI crawler. Privacy-focused browser with AI.

iAskBot (influence score: 5)
iAsk.ai · 3M+ users

iAsk.ai crawler - free AI search engine. Popular in education.

KomoSearch (influence score: 4)
Komo · 1M+ users

Komo AI Search crawler. A smaller player in the AI search market.

DeepSeekBot (influence score: 8)
DeepSeek · 97M+ users

Crawler for DeepSeek, a Chinese AI company. Ranked #4 globally, with a low-cost model; 35% of its users are in China.

MistralAI-User (influence score: 6)
Mistral · 5M+ users

Mistral AI crawler (France). European OpenAI competitor known for open-weight models.

cohere-ai (influence score: 5)
Cohere · Enterprise users

Cohere crawler - Canadian AI company. Mainly B2B/enterprise.

cohere-training-data-crawler (influence score: 5)
Cohere · Enterprise users

Cohere crawler for collecting training data.

Diffbot (influence score: 6)
Diffbot · Enterprise users

Diffbot crawler - structures web data. B2B, knowledge graphs.

CCBot (influence score: 7)
Common Crawl · Research users

Common Crawl crawler - nonprofit collecting data for AI research. Used by many AI companies.

DataForSeoBot (influence score: 5)
DataForSEO · B2B users

DataForSEO crawler - SEO data for marketers and tools.

Bytespider (influence score: 8)
ByteDance · 1B+ users

ByteDance (TikTok, Douyin) crawler. Indexes pages for TikTok search and AI.

TikTokSpider (influence score: 7)
ByteDance · 1B+ users

Alternative TikTok crawler. Fetches content for in-app display.

PetalBot (influence score: 6)
Huawei · 500M+ users

Huawei Petal Search crawler. Popular in China and on Huawei devices.

SemrushBot (influence score: 5)
Semrush · B2B users

Semrush crawler - SEO tool. Analyzes sites for marketers.

AhrefsBot (influence score: 5)
Ahrefs · B2B users

Ahrefs crawler - SEO tool. Builds link index.

YandexBot (influence score: 6)
Yandex · 100M+ users

Yandex crawler - Russian search engine. Popular in Russia and CIS countries.
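To get a first impression of which of the crawlers listed above actually reach a site, one option is to scan the User-Agent strings in the server's access log for the listed tokens. The sketch below assumes the User-Agent values have already been extracted from the log. The token subset, the helper names, and the sample strings are all made up for illustration.

```python
from collections import Counter

# A few tokens from the list above; extend as needed.
CRAWLER_TOKENS = ["GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot",
                  "PerplexityBot", "Bytespider", "CCBot"]


def identify_crawler(user_agent):
    """Return the first listed token found in the User-Agent string, or None."""
    ua = user_agent.lower()
    for token in CRAWLER_TOKENS:
        if token.lower() in ua:
            return token
    return None


def count_crawler_hits(user_agents):
    """Tally hits per crawler token across an iterable of User-Agent strings."""
    hits = Counter()
    for ua in user_agents:
        token = identify_crawler(ua)
        if token is not None:
            hits[token] += 1
    return hits


# Made-up User-Agent strings for illustration only.
sample = [
    "Mozilla/5.0 (compatible; GPTBot/1.0; +https://example.com/bot)",
    "Mozilla/5.0 (compatible; ClaudeBot/1.0; +https://example.com/bot)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0",
    "mozilla/5.0 (compatible; gptbot/1.0)",
]
print(count_crawler_hits(sample))
```

User-Agent strings can be spoofed, so counts like these are an estimate. Confirming a crawler's identity usually needs an additional check, such as the operator's published IP ranges.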

Is your site ready for AI crawlers?

Run a free AlfaUMi analysis to check your robots.txt, structured data, and AI visibility score.

Last updated: February 2026. Data compiled from public sources. Influence scores reflect estimated impact on content visibility.
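The page does not say how influence scores map to the "Critical Impact" and "High Impact" counters. The sketch below tests one guess: scores of 9 and above count as critical, and scores of 7 or 8 count as high. Those cut-offs are an assumption inferred from the listed scores, not something the source states. The tier function name is likewise hypothetical.

```python
from collections import Counter

# Influence scores as listed on this page, in page order.
SCORES = {
    "GPTBot": 10, "ChatGPT-User": 10, "OAI-SearchBot": 9,
    "Google-Extended": 10, "GoogleOther": 8, "Gemini-Deep-Research": 8,
    "Google-NotebookLM": 7, "GoogleAgent-Mariner": 6, "Google-CloudVertexBot": 6,
    "ClaudeBot": 9, "Claude-User": 9, "Claude-Web": 8,
    "Claude-SearchBot": 8, "anthropic-ai": 8,
    "PerplexityBot": 9, "Perplexity-User": 8, "Grokbot": 8, "xAI-Grok": 8,
    "FacebookBot": 9, "Meta-ExternalAgent": 9, "Meta-ExternalFetcher": 8,
    "meta-externalagent": 8, "Applebot": 8, "Applebot-Extended": 7,
    "bingbot": 9, "BingPreview": 7,
    "Amazonbot": 7, "AmazonBuyForMe": 6, "Amzn-SearchBot": 6,
    "DuckAssistBot": 7, "YouBot": 6, "PhindBot": 6, "Andibot": 5,
    "Bravebot": 6, "iAskBot": 5, "KomoSearch": 4,
    "DeepSeekBot": 8, "MistralAI-User": 6,
    "cohere-ai": 5, "cohere-training-data-crawler": 5,
    "Diffbot": 6, "CCBot": 7, "DataForSeoBot": 5,
    "Bytespider": 8, "TikTokSpider": 7, "PetalBot": 6,
    "SemrushBot": 5, "AhrefsBot": 5, "YandexBot": 6,
}


def tier(score):
    """Map a score to a tier using assumed cut-offs (not stated by the source)."""
    if score >= 9:
        return "critical"
    if score >= 7:
        return "high"
    return "other"


counts = Counter(tier(score) for score in SCORES.values())
print("crawlers:", len(SCORES))
print("critical:", counts["critical"])
print("high:", counts["high"])
```

With the scores as listed, these cut-offs give 49 crawlers, 10 critical and 20 high, which matches the counters at the top of the page.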