diff --git a/README.md b/README.md
index ceba4e9..75f72c3 100644
--- a/README.md
+++ b/README.md
@@ -253,6 +253,16 @@ Could use some help writing this with concrete receipts on environmental, social
 
 LLMs are often trained on, and thus prone to, regurgitate either completely, or in-part, chunks of code that are licensed under terms which have specific legal requirements that a sloperator may not understand or even be aware of when making a contribution. Regardless of this ignorance, it falls to the repo's owner to comply with the terms of any and all licensed code integrated into their project.
 
+Legal, copyright, and ethical problems arise especially with copyleft licenses such as the (A/L)GPL. With the "help" of AI, copyleft code can be "license-washed" very easily.
+
+There are ongoing problems with AI "license-washing" in the FOSS world:
+
+* `chardet` --- switched from the LGPL to the MIT license without asking all
+  contributors (which is itself a violation of the LGPL)
+  * relicensed release:
+  * original author's concerns:
+  * "consumer's" concerns:
+
 ## Stolen Training Data
 
 AI companies use data from across the web for training their models, most often without the website owners' and users' consent. Big tech companies like Google and Meta are scraping data from the users of major FOSS projects, such as Mastodon, WordPress, and other AcitivityPub-powered and self-hosted software.