Apple has released open-source DCLM language models that outperform comparable open models, showcasing the company's AI research capabilities.

From VB Media Inc., 2024-07-19 16:30:45

Apple has released a family of small open-source DCLM models on Hugging Face, including a 7 billion-parameter model and a 1.4 billion-parameter model. The 7B model outperforms Mistral-7B and is competitive with other leading open models. The release is fully open source: model weights, training code, and the pretraining dataset are all available.
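Because the weights are hosted on Hugging Face, the models can in principle be loaded with the standard transformers API. The following is a minimal sketch, assuming a repository identifier of apple/DCLM-7B (verify the actual listing on Hugging Face) and that the checkpoint works with AutoModelForCausalLM:

```python
# Minimal sketch: loading an open checkpoint from Hugging Face and generating text.
# The repo id "apple/DCLM-7B" is an assumption; check huggingface.co for the
# identifier Apple actually published.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/DCLM-7B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # place weights on available GPUs/CPU
    trust_remote_code=True,   # in case the repo ships a custom architecture
)

inputs = tokenizer("The DCLM project is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```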

The DCLM (DataComp for Language Models) project focuses on designing high-quality datasets for training AI models, and involves researchers from Apple, the University of Washington, Tel Aviv University, and the Toyota Research Institute. The dataset used to train the models, DCLM-Baseline, yields strong results: the 7 billion-parameter model achieves 63.7% 5-shot accuracy on MMLU.
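For readers unfamiliar with the metric, "5-shot" means the model is shown five worked question-and-answer pairs before each test question and is scored on the answer it predicts next. A minimal sketch of that prompting protocol, using placeholder questions rather than real MMLU items:

```python
# Sketch of 5-shot prompting, the protocol behind the MMLU figure.
# The example below is illustrative, not an actual MMLU item.
few_shot_examples = [
    ("What is the powerhouse of the cell?\n"
     "(A) Ribosome (B) Mitochondrion (C) Nucleus (D) Golgi body", "B"),
    # ... four more (question, answer) pairs drawn from the dev split ...
]

def build_five_shot_prompt(examples, test_question):
    parts = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    parts.append(f"Question: {test_question}\nAnswer:")
    return "\n\n".join(parts)

prompt = build_five_shot_prompt(
    few_shot_examples,
    "Which planet is known as the Red Planet?\n"
    "(A) Venus (B) Mars (C) Jupiter (D) Saturn",
)
print(prompt)  # the model's next-token prediction is compared to the gold letter
```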

The 7 billion-parameter model was trained on 2.5 trillion tokens and competes with other leading models in the market. A follow-up training extension to an 8K context length further improved its performance on the Core and Extended benchmarks.
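One way a reader can verify a published checkpoint's advertised context window is to inspect its configuration. A short sketch, again assuming the apple/DCLM-7B identifier; the field name shown is the common transformers convention and may differ for custom architectures:

```python
# Check the context window a checkpoint advertises in its config.
# "max_position_embeddings" is the conventional field name in transformers;
# custom architectures may use a different one.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("apple/DCLM-7B", trust_remote_code=True)
print(getattr(config, "max_position_embeddings", None))
```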

A smaller 1.4 billion-parameter version of the DCLM model, trained jointly with the Toyota Research Institute, also performs well, scoring 41.9% on MMLU and outperforming other models in its size class. This smaller model has been released under the Apache 2.0 license, which permits commercial use.

Read more at VB Media Inc.: Apple shows off open AI prowess: new models outperform Mistral and Hugging Face offerings