Meta’s answer to ChatGPT, Llama is an open-source large language model (LLM). Available free of charge for both research and commercial use, the release includes model weights and starter code for pretrained and fine-tuned models alike. These models range from 7 billion to 70 billion parameters; larger parameter counts give a model more capacity for nuanced understanding and response generation.
The pretrained models were trained on 2 trillion tokens, with a context length double that of their predecessor, Llama 1 (4,096 tokens, up from 2,048). The fine-tuned models, including Llama Chat and Code Llama, were additionally trained with over 1 million human annotations, enhancing their accuracy and reliability.
Llama outperforms other open-source language models on several external benchmarks, which test capabilities such as reasoning, coding, language proficiency, and knowledge retention. These results suggest that Llama can offer more accurate and contextually relevant responses, making it a valuable tool for developers and researchers.
The release includes two main fine-tuned variants: Llama Chat and Code Llama. The former starts from the base model, which was pretrained on publicly available online data, and is further refined with instruction datasets and human annotations; it is geared towards natural language understanding and conversational response.
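The article doesn’t show how a prompt reaches the instruction-tuned model, so as a minimal sketch: the Llama 2 Chat checkpoints expect input wrapped in a specific `[INST]` / `<<SYS>>` template (the template comes from Meta’s Llama 2 release, not from this article, and the helper below is illustrative).

```python
def format_chat_prompt(system_msg: str, user_msg: str) -> str:
    """Wrap a system message and a single user turn in the
    [INST] / <<SYS>> template used by the Llama 2 Chat models."""
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_msg}\n"
        "<</SYS>>\n\n"
        f"{user_msg} [/INST]"
    )

# The resulting string is what gets tokenized and fed to the model.
prompt = format_chat_prompt(
    "You are a helpful assistant.",
    "Explain what a context window is.",
)
print(prompt)
```

Multi-turn conversations repeat the `[INST] … [/INST]` pair for each user turn, with the model’s prior replies placed between the pairs.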
Code Llama, on the other hand, specializes in code generation. It is trained on 500 billion tokens of code and supports common programming languages such as Python, C++, Java, PHP, TypeScript, C#, and Bash, a breadth that underlines its utility in software development and related fields.
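Beyond left-to-right completion, the Code Llama release also supports fill-in-the-middle infilling via sentinel tokens. As a minimal sketch (the `<PRE>`/`<SUF>`/`<MID>` sentinels are from the Code Llama release, not this article, and the helper is illustrative):

```python
def format_infill_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt using Code Llama's
    sentinel tokens; the model generates the code that belongs
    between the given prefix and suffix."""
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"

# Ask the model to fill in the body of a function, given its
# signature (prefix) and its return statement (suffix).
prompt = format_infill_prompt(
    "def add(a, b):\n    ",
    "\n    return result",
)
print(prompt)
```

In practice a tokenizer maps these sentinels to dedicated special-token IDs, so the exact string form should be checked against the tooling in use.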
Finally, Llama is backed by a broad range of global partners from tech, academia, and policy who believe in Meta’s open approach to AI and see the benefits of an open platform for AI development.