Llama-2-7b-hf

Maintained by: NousResearch

Property         Value
Parameter Count  6.74B parameters
Training Data    2 trillion tokens
Context Length   4k tokens
License          Custom Meta license required
Training Period  January 2023 - July 2023

What is Llama-2-7b-hf?

Llama-2-7b-hf is Meta's foundational language model, part of the Llama 2 family, converted to the Hugging Face Transformers format. This 7B-parameter model is the base (pretrained, non-chat) variant of Meta's openly released effort to democratize large language models while maintaining strong performance and efficiency.
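Because the weights are already in Hugging Face format, the model can be loaded directly with the transformers library. Below is a minimal loading sketch, assuming the NousResearch/Llama-2-7b-hf repository id (the converted weights this page describes) and a transformers release with Llama 2 support; Meta's official meta-llama repository additionally requires accepting the license on Hugging Face first.

    # Minimal loading sketch (repo id assumed: NousResearch/Llama-2-7b-hf)
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "NousResearch/Llama-2-7b-hf"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)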

Implementation Details

The model uses an optimized transformer architecture and was trained on 2 trillion tokens of publicly available data. The weights ship with both F32 and FP16 tensor support, making the model versatile across deployment scenarios (see the dtype sketch after the list below). Training consumed 184,320 GPU hours and was conducted on Meta's Research Super Cluster.

  • Optimized transformer architecture for efficient processing
  • 4k token context window
  • Trained with a global batch-size of 4M tokens
  • Carbon footprint of 31.22 tCO2eq (100% offset)
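The F32/FP16 dual support mentioned above comes down to a load-time dtype choice. A hedged sketch follows, assuming a CUDA-capable GPU and the accelerate package for device placement; the memory figures in the comments are rough estimates, not measured values.

    # Sketch: picking a tensor dtype at load time. FP16 weights need
    # roughly half the memory of F32 (~13 GB vs ~27 GB for 6.74B
    # parameters), at some cost in numerical precision.
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "NousResearch/Llama-2-7b-hf",  # assumed repo id
        torch_dtype=torch.float16,     # or torch.float32 for full precision
        device_map="auto",             # needs `pip install accelerate`
    )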

Core Capabilities

  • Strong performance in commonsense reasoning (63.9% accuracy)
  • Effective reading comprehension (61.3% on benchmark tests)
  • Basic mathematical reasoning capabilities (14.6% on math benchmarks)
  • Enhanced truthfulness compared to Llama 1 (33.29% on TruthfulQA)

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its balance of size and performance, offering strong capabilities while remaining deployable on consumer hardware. It is part of Meta's effort to make capable models openly available, providing a foundation for both research and commercial applications under Meta's custom license.

Q: What are the recommended use cases?

The model is designed for English-language tasks including text generation, analysis, and completion. As a base (non-instruction-tuned) model, it works best with completion-style prompts or as a starting point for fine-tuning, and it suits research applications and commercial use cases that need a balance of performance and resource efficiency.
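Since the base model has no chat template, prompts should read like text to be continued. A short generation sketch under the same assumptions as above; the prompt and sampling settings are illustrative only.

    # Completion-style generation sketch (base model, no chat template).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "NousResearch/Llama-2-7b-hf"  # assumed repo id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    prompt = "The three primary colors are"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=64,
                                do_sample=True, temperature=0.7)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))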
