Mirror of https://github.com/ai-robots-txt/ai.robots.txt.git, synced 2025-12-29 12:18:33 +01:00
Merge pull request #205 from fiskhandlarn/fix/editorconfig
Fix/editorconfig
Commit 56010ef913
3 changed files with 8 additions and 8 deletions
@@ -4,3 +4,6 @@ root = true
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true

[{Caddyfile,haproxy-block-ai-bots.txt,nginx-block-ai-bots.conf}]
insert_final_newline = false
@@ -4,7 +4,7 @@
This list contains AI-related crawlers of all types, regardless of purpose. We encourage you to contribute to and implement this list on your own site. See [information about the listed crawlers](./table-of-bot-metrics.md) and the [FAQ](https://github.com/ai-robots-txt/ai.robots.txt/blob/main/FAQ.md).

A number of these crawlers have been sourced from [Dark Visitors](https://darkvisitors.com) and we appreciate the ongoing effort they put in to track these crawlers.

If you'd like to add information about a crawler to the list, please make a pull request with the bot name added to `robots.txt`, `ai.txt`, and any relevant details in `table-of-bot-metrics.md` to help people understand what's crawling.
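For example, a minimal `robots.txt` rule for a newly added bot looks like this (ExampleBot is a placeholder user agent, not a real crawler; match the formatting already used in the repository's files):

```
User-agent: ExampleBot
Disallow: /
```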
@@ -86,8 +86,8 @@ Alternatively, you can also subscribe to new releases with your GitHub account b
## License content with RSL
It is also possible to license your content to AI companies in `robots.txt` using
the [Really Simple Licensing](https://rslstandard.org) standard, with an option of
collective bargaining. A [plugin](https://github.com/Jameswlepage/rsl-wp) currently
implements RSL as well as payment processing for WordPress sites.
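As a rough sketch of what this can look like, assuming RSL's `robots.txt` extension, which references an external RSL licence document through a `License` directive (the URL below is a placeholder; see rslstandard.org for the authoritative syntax):

```
# Assumption: RSL adds a License directive pointing at an RSL XML document.
# The URL is illustrative, not taken from this repository.
License: https://example.com/license.xml

User-agent: *
Allow: /
```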
@@ -103,5 +103,3 @@ But even if you don't use Cloudflare's hard block, their list of [verified bots]
- [Blockin' bots on Netlify](https://www.jeremiak.com/blog/block-bots-netlify-edge-functions/) by Jeremia Kimelman
- [Blocking AI web crawlers](https://underlap.org/blocking-ai-web-crawlers) by Glyn Normington
- [Block AI Bots from Crawling Websites Using Robots.txt](https://originality.ai/ai-bot-blocking) by Jonathan Gillham, Originality.AI
@@ -1,7 +1,7 @@
# Intro
If you're using Traefik as the reverse proxy in your Docker setup, you might also want to use it to centrally serve the `/robots.txt` for all your Traefik-fronted services.

This can be achieved by configuring a single lightweight service to serve static files and defining a high-priority Traefik HTTP router rule.

# Setup

Define a single service to serve the one robots.txt to rule them all. I'm using a lean nginx:alpine Docker image in this example:
@@ -31,7 +31,6 @@ networks:
external: true
```
The Traefik HTTP router rule deliberately does not contain a hostname. Traefik will print a warning about this for the TLS setup, but it will work. The high priority of 3000 should ensure this rule is evaluated first for incoming requests.
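For illustration, such a router can be declared with Docker labels roughly as follows (a sketch with assumed service and router names, not the compose file from the example above):

```
services:
  robotstxt:
    image: nginx:alpine
    volumes:
      - ./static:/usr/share/nginx/html:ro
    labels:
      - "traefik.enable=true"
      # No Host() matcher: the rule matches /robots.txt on every hostname
      - "traefik.http.routers.robotstxt.rule=Path(`/robots.txt`)"
      # High priority so this router is evaluated before the per-service routers
      - "traefik.http.routers.robotstxt.priority=3000"
      - "traefik.http.routers.robotstxt.tls=true"
      - "traefik.http.services.robotstxt.loadbalancer.server.port=80"
```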
Place your robots.txt in the local `./static/` directory and NGINX will serve it for all services behind your Traefik proxy.
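To verify the setup, you could request the file through any hostname routed by this Traefik instance; each request should return the same file (the hostnames below are placeholders):

```
curl -s https://service-a.example.com/robots.txt
curl -s https://service-b.example.com/robots.txt
```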