OpenAI, Meta Debate Wild Solutions As They Run Out of Data to Train AI

From Business Insider: 2024-04-07 16:08:00

Big Tech firms like Meta, Google, and OpenAI are on track to run out of high-quality data for training their AI models by 2026. To combat this, Google considered using consumer data from Google Docs, Sheets, and Slides, while Meta executives brainstormed options like buying the publisher Simon & Schuster to gain new data sources. OpenAI is exploring synthetic data as an alternative.

Google’s legal department aimed to broaden the company’s use of consumer data from Google Docs, Sheets, and Slides for training AI systems. Meanwhile, Meta executives considered purchasing Simon & Schuster for new data sources, and also weighed cheaper options such as paying $10 per book for full licensing rights to new titles.

OpenAI is considering synthetic data, generated by AI systems, as an alternative source of training data for its models. However, synthetic data has its challenges, including the risk of reinforcing an AI system’s existing limitations and mistakes. OpenAI is working on a process in which one AI system produces data and another judges it, with the aim of improving the training process.
