diff --git a/README.md b/README.md
index 987cd4a..305aed2 100644
--- a/README.md
+++ b/README.md
@@ -49,9 +49,17 @@ file on-the-fly.
 A note about contributing: updates should be added/made to `robots.json`. A
 GitHub action will then generate the updated `robots.txt`,
 `table-of-bot-metrics.md`, `.htaccess` and `nginx-block-ai-bots.conf`.
 
-You can run the tests by [installing](https://www.python.org/about/gettingstarted/) Python 3 and issuing:
+You can run the tests by [installing](https://www.python.org/about/gettingstarted/) Python 3, installing the dependencies, and then issuing:
 
 ```console
 code/tests.py
 ```
+
+### Installing Dependencies
+
+Before running the tests, install all required Python packages:
+
+```console
+pip install -r requirements.txt
+```
 
 ## Releasing
@@ -97,3 +105,5 @@ But even if you don't use Cloudflare's hard block, their list of [verified bots]
 - [Blockin' bots on Netlify](https://www.jeremiak.com/blog/block-bots-netlify-edge-functions/) by Jeremia Kimelman
 - [Blocking AI web crawlers](https://underlap.org/blocking-ai-web-crawlers) by Glyn Normington
 - [Block AI Bots from Crawling Websites Using Robots.txt](https://originality.ai/ai-bot-blocking) by Jonathan Gillham, Originality.AI
+
+
diff --git a/requirements.txt b/requirements.txt
new file mode 100644
index 0000000..664e86d
--- /dev/null
+++ b/requirements.txt
@@ -0,0 +1,3 @@
+beautifulsoup4
+lxml
+requests