Here’s Proof You Can Train an AI Model Without Slurping Copyrighted Content

From Condé Nast:

OpenAI has stated that training leading AI models without copyrighted materials is impossible, and the use of such materials has led to lawsuits alleging copyright infringement. Two recent releases challenge that claim. Fairly Trained has certified its first large language model, KL3M, which 273 Ventures developed using a curated training dataset of legal documents. A French-backed group of researchers has also released Common Corpus, which they claim is the largest AI training dataset composed entirely of public domain content. Together, these releases offer more options for infringement-free model training, and the approach may set a new trend in AI model development by providing clean, high-quality data for specialized tasks.


