Meta-Llama-3-8B-Instruct
| Property | Value |
| --- | --- |
| Parameter Count | 8.03B |
| Context Length | 8K (8,192) tokens |
| License | Meta Llama 3 Community License |
| Training Data | 15T+ tokens |
| Knowledge Cutoff | March 2023 |
What is Meta-Llama-3-8B-Instruct?
Meta-Llama-3-8B-Instruct is part of Meta's latest generation of large language models, specifically optimized for dialogue and instruction-following tasks. This 8B parameter model represents a significant advancement in compact yet powerful language models, featuring improved performance, enhanced safety measures, and reduced false refusals compared to its predecessors.
Implementation Details
The model uses an optimized transformer architecture with Grouped-Query Attention (GQA) for improved inference scalability. It was trained on over 15 trillion tokens of publicly available data and supports a context length of 8,192 tokens. The model uses BF16 precision and is designed for both research and commercial applications.
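GQA shrinks the key/value cache by letting groups of query heads share a smaller set of key/value heads (Llama 3 8B reportedly uses 32 query heads over 8 KV heads). A minimal NumPy sketch of the idea, with illustrative head counts rather than the model's actual configuration:

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    Each group of n_q_heads // n_kv_heads query heads attends
    to the same shared key/value head."""
    n_q_heads = q.shape[0]
    group = n_q_heads // n_kv_heads
    # Broadcast each KV head across its group of query heads
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    # Numerically stable softmax over the key dimension
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)
    return w @ v

# 8 query heads sharing 2 KV heads (groups of 4)
out = grouped_query_attention(
    np.random.randn(8, 5, 16),
    np.random.randn(2, 5, 16),
    np.random.randn(2, 5, 16),
    n_kv_heads=2,
)
```

With `n_kv_heads` equal to the query head count this reduces to standard multi-head attention; with `n_kv_heads=1` it becomes multi-query attention. GQA sits between the two, trading a small quality cost for a much smaller KV cache at inference time.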
- Advanced instruction-tuning optimized for dialogue use cases
- Implements safety measures and content filtering
- Supports both the Transformers library and the original `llama3` codebase
- Features comprehensive evaluation across multiple benchmarks
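When using the Transformers integration, the tokenizer's chat template assembles the instruct prompt from the model's special header tokens. As a rough illustration of what that format looks like, here is a hand-rolled sketch (the function name is ours; in practice, prefer `tokenizer.apply_chat_template` so the exact tokens always match the released tokenizer):

```python
def format_llama3_chat(system: str, user: str) -> str:
    """Sketch of the Llama 3 instruct prompt layout: each turn is wrapped
    in header tokens and terminated with <|eot_id|>, and the prompt ends
    with an open assistant header for the model to complete."""
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_chat(
    "You are a concise assistant.",
    "What is Grouped-Query Attention?",
)
```

Generation should stop on `<|eot_id|>` (alongside the usual end-of-text token), which is why the model card's safety and dialogue tuning assume this exact turn structure.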
Core Capabilities
- Strong performance on MMLU (68.4%) and other key benchmarks
- Excellent code generation capabilities (HumanEval: 62.2%)
- Enhanced mathematical reasoning (GSM-8K: 79.6%)
- Reduced false refusals while maintaining safety
- Multilingual potential through fine-tuning
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its optimal balance between size and performance, featuring significantly improved instruction-following capabilities and reduced false refusals compared to previous versions, while maintaining strong safety measures.
Q: What are the recommended use cases?
The model is particularly well-suited for dialogue applications, coding assistance, and general text generation tasks. It's designed for both commercial and research use in English, with the possibility of fine-tuning for other languages while complying with the Llama 3 Community License.