From 9cc3dbc05f1a905e73f572a80429c536f09b2c17 Mon Sep 17 00:00:00 2001
From: Anshita-18H
Date: Thu, 27 Nov 2025 17:06:59 +0530
Subject: [PATCH 1/7] Add requirements.txt with project dependencies

---
 requirements.txt | 7 +++++++
 1 file changed, 7 insertions(+)
 create mode 100644 requirements.txt

diff --git a/requirements.txt b/requirements.txt
new file mode 100644
index 0000000..a8bf87f
--- /dev/null
+++ b/requirements.txt
@@ -0,0 +1,7 @@
+beautifulsoup4
+
+
+
+requests
+beautifulsoup4
+lxml

From 9ca50339270e410cf6d16b1dc416bbb96729609b Mon Sep 17 00:00:00 2001
From: Anshita-18H
Date: Thu, 27 Nov 2025 17:44:47 +0530
Subject: [PATCH 2/7] Deduplicate requirements.txt and add installation instructions to README

---
 requirements.txt | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/requirements.txt b/requirements.txt
index a8bf87f..b8ee43c 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,7 +1,3 @@
-beautifulsoup4
-
-
-
 requests
 beautifulsoup4
 lxml

From 4302fd1acafd8be96dcc274639b3e21970a91b8f Mon Sep 17 00:00:00 2001
From: Anshita-18H
Date: Thu, 27 Nov 2025 17:56:47 +0530
Subject: [PATCH 3/7] Improve contributing section and fix formatting in README

---
 README.md | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 987cd4a..b1350e0 100644
--- a/README.md
+++ b/README.md
@@ -49,9 +49,17 @@ file on-the-fly.
 
 A note about contributing: updates should be added/made to `robots.json`. A GitHub action will then generate the updated `robots.txt`, `table-of-bot-metrics.md`, `.htaccess` and `nginx-block-ai-bots.conf`.
 
-You can run the tests by [installing](https://www.python.org/about/gettingstarted/) Python 3 and issuing:
+You can run the tests by [installing Python 3 and issuing:
 ```console
-code/tests.py
+python code/tests.py
+
+
+### Installing Dependencies
+
+Before running the tests, install all required Python packages:
+pip install -r requirements.txt
+
+
 ```
 
 ## Releasing
@@ -97,3 +105,5 @@ But even if you don't use Cloudflare's hard block, their list of [verified bots]
 - [Blockin' bots on Netlify](https://www.jeremiak.com/blog/block-bots-netlify-edge-functions/) by Jeremia Kimelman
 - [Blocking AI web crawlers](https://underlap.org/blocking-ai-web-crawlers) by Glyn Normington
 - [Block AI Bots from Crawling Websites Using Robots.txt](https://originality.ai/ai-bot-blocking) by Jonathan Gillham, Originality.AI
+
+

From 7521c3af50f2755ae46d6718addedc3e7720eb02 Mon Sep 17 00:00:00 2001
From: Glyn Normington
Date: Thu, 27 Nov 2025 12:37:07 +0000
Subject: [PATCH 4/7] Fix link

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index b1350e0..5fbf2c6 100644
--- a/README.md
+++ b/README.md
@@ -49,7 +49,7 @@ file on-the-fly.
 
 A note about contributing: updates should be added/made to `robots.json`. A GitHub action will then generate the updated `robots.txt`, `table-of-bot-metrics.md`, `.htaccess` and `nginx-block-ai-bots.conf`.
 
-You can run the tests by [installing Python 3 and issuing:
+You can run the tests by [installing](https://www.python.org/about/gettingstarted/) Python 3, installing the depenendcies, and then issuing:
 ```console
 python code/tests.py
 

From c3f2fe758e99b6f2c1afb0fad54ca7b0106ed227 Mon Sep 17 00:00:00 2001
From: Glyn Normington
Date: Thu, 27 Nov 2025 12:37:21 +0000
Subject: [PATCH 5/7] Whitespace

---
 README.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/README.md b/README.md
index 5fbf2c6..8aa3e3e 100644
--- a/README.md
+++ b/README.md
@@ -53,7 +53,6 @@ You can run the tests by [installing](https://www.python.org/about/gettingstarte
 ```console
 python code/tests.py
 
-
 ### Installing Dependencies
 
 Before running the tests, install all required Python packages:

From f46754d280800f78a891e7e0b834458ce54d39dd Mon Sep 17 00:00:00 2001
From: Glyn Normington
Date: Thu, 27 Nov 2025 12:37:35 +0000
Subject: [PATCH 6/7] Order deps

---
 requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/requirements.txt b/requirements.txt
index b8ee43c..664e86d 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,3 +1,3 @@
-requests
 beautifulsoup4
 lxml
+requests

From b75163e796fcf1beab7865afadaa14950cd56aaa Mon Sep 17 00:00:00 2001
From: Glyn Normington
Date: Thu, 27 Nov 2025 12:38:12 +0000
Subject: [PATCH 7/7] Ensure Python3 is used

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 8aa3e3e..305aed2 100644
--- a/README.md
+++ b/README.md
@@ -51,7 +51,7 @@ A note about contributing: updates should be added/made to `robots.json`. A GitH
 
 You can run the tests by [installing](https://www.python.org/about/gettingstarted/) Python 3, installing the depenendcies, and then issuing:
 ```console
-python code/tests.py
+code/tests.py
 
 ### Installing Dependencies