ai.robots.txt/haproxy-block-ai-bots.txt
László Károlyi d1e0a9a757
Add meta-webindexer bot and update Brightbot operator info
- Added 'meta-webindexer' to HAProxy, Nginx, robots.txt blocklists
- Updated Brightbot operator to brightdata.com in robots.json and metrics
- Added Brightbot frequency and disguise tactics documentation link
- Added meta-webindexer entry in robots.json with Meta's official description
- Added meta-webindexer row in table-of-bot-metrics.md with details
2025-09-07 12:12:35 +02:00

AddSearchBot
AI2Bot
Ai2Bot-Dolma
aiHitBot
Amazonbot
Andibot
anthropic-ai
Applebot
Applebot-Extended
Awario
bedrockbot
bigsur.ai
Brightbot 1.0
Bytespider
CCBot
ChatGPT Agent
ChatGPT-User
Claude-SearchBot
Claude-User
Claude-Web
ClaudeBot
CloudVertexBot
cohere-ai
cohere-training-data-crawler
Cotoyogi
Crawlspace
Datenbank Crawler
Devin
Diffbot
DuckAssistBot
Echobot Bot
EchoboxBot
FacebookBot
facebookexternalhit
Factset_spyderbot
FirecrawlAgent
FriendlyCrawler
Gemini-Deep-Research
Google-CloudVertexBot
Google-Extended
Google-Firebase
GoogleAgent-Mariner
GoogleOther
GoogleOther-Image
GoogleOther-Video
GPTBot
iaskspider/2.0
ICC-Crawler
ImagesiftBot
img2dataset
ISSCyberRiskCrawler
Kangaroo Bot
LinerBot
meta-externalagent
Meta-ExternalAgent
meta-externalfetcher
Meta-ExternalFetcher
meta-webindexer
MistralAI-User
MistralAI-User/1.0
MyCentralAIScraperBot
netEstate Imprint Crawler
NovaAct
OAI-SearchBot
omgili
omgilibot
OpenAI
Operator
PanguBot
Panscient
panscient.com
Perplexity-User
PerplexityBot
PetalBot
PhindBot
Poseidon Research Crawler
QualifiedBot
QuillBot
quillbot.com
SBIntuitionsBot
Scrapy
SemrushBot-OCOB
SemrushBot-SWA
ShapBot
Sidetrade indexer bot
Thinkbot
TikTokSpider
Timpibot
VelenPublicWebCrawler
WARDBot
Webzio-Extended
wpbot
YaK
YandexAdditional
YandexAdditionalBot
YouBot