AI systems may run out of training data, hindering progress for tech giants.
From Fortune Media Group Holdings: 2024-06-06 20:33:24
A new study projects that AI systems could run out of publicly available training data sometime between 2026 and 2032, which would slow further progress. Tech companies such as OpenAI and Google are scrambling to secure data sources. Meta, Facebook's parent company, claims to have trained models on 15 trillion tokens. Experts warn of depleted data resources and of biases encoded in AI systems. As automatically generated content floods the web, the Wikimedia Foundation hopes to maintain incentives for human contributions. AI companies are exploring other ways to train models, including synthetic data, to keep improving technical performance, but concerns about overreliance on such data are growing. Sam Altman, CEO of OpenAI, warns against excessive reliance on synthetic data for model training.
Read more at Fortune Media Group Holdings: OpenAI, Google and Meta could face AI ‘bottleneck’ in less than a decade