Llama-2-7b-hf
Property | Value |
---|---|
Parameter Count | 6.74B parameters |
Training Data | 2 trillion tokens |
Context Length | 4,096 tokens (4k) |
License | Custom Meta license (acceptance required) |
Training Period | January 2023 - July 2023 |
What is Llama-2-7b-hf?
Llama-2-7b-hf is Meta's foundational language model from the Llama 2 family, converted to the Hugging Face Transformers format. This 7B-parameter model is the base (non-chat) variant of Meta's openly licensed effort to make capable large language models widely available while maintaining strong performance and efficiency.
Implementation Details
The model uses an optimized transformer architecture and was trained on 2 trillion tokens of publicly available data. The weights are distributed with both FP32 and FP16 tensor support, making the model versatile across deployment scenarios (see the loading sketch after the feature list below). Training consumed 184,320 GPU hours and was conducted on Meta's Research Super Cluster.
- Optimized transformer architecture for efficient processing
- 4,096-token context window
- Trained with a global batch size of 4M tokens
- Carbon footprint of 31.22 tCO2eq (100% offset)
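As a minimal loading sketch, assuming the `transformers` and `accelerate` packages are installed and that you have accepted Meta's license for the gated `meta-llama/Llama-2-7b-hf` repository (the FP16 choice here is one common option, not the only one):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # gated repo: requires accepting Meta's license on Hugging Face

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Loading in FP16 roughly halves weight memory versus FP32 (~13.5 GB vs ~27 GB)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires `accelerate`; places layers on available GPUs/CPU
)

# The loaded config reflects the specs above
print(model.config.max_position_embeddings)  # 4096-token context window
```

FP16 is the usual choice for single-GPU inference; FP32 remains useful when numerical fidelity matters more than memory.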
Core Capabilities
- Strong commonsense reasoning (63.9% average across commonsense benchmarks)
- Effective reading comprehension (61.3% average on reading comprehension benchmarks)
- Basic mathematical reasoning (14.6% on math benchmarks)
- Improved truthfulness over Llama 1 (33.29% on TruthfulQA; a reproduction sketch follows this list)
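As a hedged sketch of how one might check a score like the TruthfulQA number, assuming EleutherAI's lm-evaluation-harness (`pip install lm-eval`, v0.4+) is installed; task names and result keys vary between harness versions, and harness scores may not exactly match the figures above, which come from Meta's own evaluation pipeline:

```python
from lm_eval import simple_evaluate

# Assumes gated-repo access and an available GPU; treat this as a sketch.
results = simple_evaluate(
    model="hf",
    model_args="pretrained=meta-llama/Llama-2-7b-hf,dtype=float16",
    tasks=["truthfulqa_mc2"],  # task name as of harness v0.4
)
print(results["results"])  # per-task metrics dict
```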
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its balance of size and performance, offering strong capabilities while remaining deployable on consumer hardware. It is part of Meta's push toward openly available AI, providing a foundation for both research and commercial applications (subject to the custom license noted above).
Q: What are the recommended use cases?
The model is designed for English-language tasks including text generation, analysis, and completion. It is particularly suitable for research applications and for commercial use cases that require a balance of performance and resource efficiency.
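As a quick, hedged usage sketch (the prompt and sampling settings here are illustrative choices, not recommendations from the model card), again assuming `transformers`, `accelerate`, and gated-repo access:

```python
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Base (non-chat) model: prompt as a continuation, not an instruction
out = generator(
    "The three primary colors are",
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(out[0]["generated_text"])
```

Because this is the base model rather than the chat-tuned variant, completion-style prompting generally works better than instruction-style prompting.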