Meta has introduced the next generation of its Llama family of open source large language models (LLMs). The company describes Llama 3 as "the best open source models of their class."
The Silicon Valley giant has released the first two models of the Llama 3 family – one with 8B parameters and one with 70B. According to Meta, these models are significantly better than the Llama 2 models, offering a much lower false refusal rate, improved alignment, and greater diversity in model responses. Specific capabilities, such as code generation and instruction following, have also been greatly improved.
Llama 3 was pre-trained on more than 15T tokens from publicly available sources, making its training dataset seven times larger than Llama 2's, with four times more code.
According to Meta, the company also developed a new human evaluation set for benchmarking Llama 3, containing 1,800 prompts across 12 use cases. These include asking for advice, classification, answering a closed question, writing code, creative writing, extraction, answering an open question, reasoning, rewriting, summarizing, and more.
On this evaluation set, the 70B parameter model beat Claude Sonnet, Mistral Medium, GPT-3.5, and Llama 2.
Llama 3 is available on AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake. Hardware support will also come from vendors including AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm.
Over the next few months, the company plans to update Llama 3 with new features, longer context windows, and larger model sizes.