Zamba2-2.7B
| Property | Value |
|---|---|
| Parameter Count | 2.69B |
| Model Type | Hybrid SSM-Transformer |
| License | Apache 2.0 |
| Paper | Zamba Architecture Paper |
| Tensor Type | BF16 |
What is Zamba2-2.7B?
Zamba2-2.7B advances hybrid AI architectures by combining state-space (Mamba2) blocks with transformer attention blocks. It was pre-trained on 3T tokens of text and code, then fine-tuned on 100B high-quality tokens, and achieves state-of-the-art performance among models under 3B parameters.
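For reference, a minimal loading-and-generation sketch with Hugging Face transformers might look like the following. The repository id `Zyphra/Zamba2-2.7B` and the availability of Zamba2 support in your transformers version are assumptions here; check the official model card for exact requirements.

```python
# Minimal sketch (not official quick-start code): load the model in BF16 and generate text.
# Assumes the Hugging Face repository id "Zyphra/Zamba2-2.7B" and a transformers
# release that includes Zamba2 support.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Zyphra/Zamba2-2.7B"  # assumed repository id
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 tensor type listed above
).to(device)

prompt = "A hybrid of state-space models and attention is useful because"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```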
Implementation Details
The model features a sophisticated architecture that builds upon the original Zamba design with three major improvements (illustrated in the sketch after this list):
- Mamba2 blocks replace the original Mamba1 blocks
- Two shared attention blocks are interleaved in an ABAB pattern
- LoRA projectors on the shared MLP blocks enable each invocation to specialize by depth
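To make the layer layout concrete, here is an illustrative PyTorch sketch of the pattern described above: a stack of Mamba-style blocks with two shared attention+MLP blocks reused in an ABAB order, each reuse carrying its own small LoRA projector. All class names, dimensions, and block internals are simplified stand-ins, not the actual Zamba2 implementation.

```python
# Illustrative sketch of the Zamba2-style layer layout (not the official implementation):
# Mamba-like backbone blocks interleaved with two shared attention+MLP blocks in an
# ABAB pattern, where each reuse of a shared block adds its own low-rank (LoRA) projector.
import torch
import torch.nn as nn


class SharedBlock(nn.Module):
    """One shared attention + MLP block. The attention and MLP weights are reused at
    every invocation; a per-invocation LoRA term (A @ B) adds depth-specific capacity."""

    def __init__(self, d_model: int, n_heads: int, n_invocations: int, rank: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Linear(d_model, d_model)  # simplified shared MLP
        # One low-rank adapter pair per place this block is invoked in the stack.
        self.lora_a = nn.ParameterList(
            nn.Parameter(torch.zeros(d_model, rank)) for _ in range(n_invocations)
        )
        self.lora_b = nn.ParameterList(
            nn.Parameter(torch.randn(rank, d_model) * 0.01) for _ in range(n_invocations)
        )

    def forward(self, x: torch.Tensor, invocation: int) -> torch.Tensor:
        h, _ = self.attn(x, x, x)
        x = x + h
        # Shared MLP output plus the invocation-specific LoRA correction.
        lora = x @ self.lora_a[invocation] @ self.lora_b[invocation]
        return x + self.mlp(x) + lora


class MambaLikeBlock(nn.Module):
    """Stand-in for a Mamba2 block; a recurrent token mixer keeps the sketch runnable."""

    def __init__(self, d_model: int):
        super().__init__()
        self.mix = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.mix(x)
        return x + out


class AbabBackbone(nn.Module):
    """Interleaves Mamba-like blocks with two shared blocks in an A, B, A, B order."""

    def __init__(self, d_model: int = 256, n_heads: int = 4, n_layers: int = 8):
        super().__init__()
        self.mamba_layers = nn.ModuleList(MambaLikeBlock(d_model) for _ in range(n_layers))
        n_invocations = n_layers // 2  # how many times each shared block is reused
        self.shared_a = SharedBlock(d_model, n_heads, n_invocations)
        self.shared_b = SharedBlock(d_model, n_heads, n_invocations)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for i, mamba in enumerate(self.mamba_layers):
            x = mamba(x)
            shared = self.shared_a if i % 2 == 0 else self.shared_b
            x = shared(x, invocation=i // 2)
        return x


if __name__ == "__main__":
    model = AbabBackbone()
    tokens = torch.randn(1, 16, 256)  # (batch, sequence, d_model)
    print(model(tokens).shape)        # torch.Size([1, 16, 256])
```

The point of the pattern is that only two shared blocks carry the attention and MLP weights, keeping parameter count low, while the per-invocation LoRA projectors let each reuse of a shared block specialize to its depth in the stack.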
Core Capabilities
- State-of-the-art performance for models under 3B parameters
- Extremely low inference latency and rapid generation
- Reduced memory footprint compared to traditional transformer models
- Efficient text generation using the Mistral v0.1 tokenizer
Frequently Asked Questions
Q: What makes this model unique?
Zamba2-2.7B's uniqueness lies in its hybrid architecture, which combines state-space modeling with shared transformer blocks and achieves strong performance with significantly lower compute and memory requirements than comparable transformer-only models.
Q: What are the recommended use cases?
The model is ideal for on-device applications that need efficient text generation under limited computational resources. Note, however, that it is a base model: it has no moderation mechanisms and is not fine-tuned for instruction following or chat applications.