Zamba2-2.7B

Maintained By: Zyphra

Parameter Count: 2.69B
Model Type: Hybrid SSM-Transformer
License: Apache 2.0
Paper: Zamba Architecture Paper
Tensor Type: BF16

What is Zamba2-2.7B?

Zamba2-2.7B is a hybrid architecture that combines state-space (Mamba) blocks with transformer blocks. Pre-trained on 3T tokens of text and code and then fine-tuned on 100B high-quality tokens, it achieves state-of-the-art performance among models under 3B parameters.

Implementation Details

The model features a sophisticated architecture that builds upon the original Zamba design with three major improvements:

  • Mamba2 blocks in place of the original Mamba1 blocks
  • Two shared attention blocks interleaved in an ABAB pattern
  • LoRA projectors on the shared MLP blocks, letting each reuse of the shared block specialize by depth (see the sketch below)
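
The shared-block design can be illustrated with a minimal PyTorch sketch. This is not Zyphra's implementation, and every class and variable name below is hypothetical; it only shows the core idea of a single MLP whose weights are stored once and reused at several depths, with a small per-depth LoRA projector letting each invocation specialize.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRA(nn.Module):
    """Low-rank delta applied alongside a shared linear layer (illustrative only)."""
    def __init__(self, in_dim: int, out_dim: int, rank: int = 8):
        super().__init__()
        self.down = nn.Linear(in_dim, rank, bias=False)
        self.up = nn.Linear(rank, out_dim, bias=False)
        nn.init.zeros_(self.up.weight)  # adapter starts as a no-op

    def forward(self, x):
        return self.up(self.down(x))

class SharedMLP(nn.Module):
    """One MLP whose weights are stored once and reused at every depth."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)
        self.fc2 = nn.Linear(hidden, dim)

    def forward(self, x, lora_fc1: LoRA):
        # Shared projection plus a per-depth low-rank correction: the block is
        # reused, but each call site can still specialize by depth.
        h = F.gelu(self.fc1(x) + lora_fc1(x))
        return self.fc2(h)

dim, hidden, depth = 256, 1024, 4
shared_mlp = SharedMLP(dim, hidden)                              # parameters stored once
loras = nn.ModuleList(LoRA(dim, hidden) for _ in range(depth))   # cheap per-depth adapters

x = torch.randn(2, 16, dim)                                      # (batch, seq, dim)
for lora in loras:                                               # same block at every depth
    x = x + shared_mlp(x, lora)                                  # residual connection
```

Because the shared block's parameters are counted only once, this pattern keeps the parameter count and memory footprint low, while the per-depth LoRA projectors restore some depth-specific capacity.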

Core Capabilities

  • State-of-the-art performance for models under 3B parameters
  • Extremely low inference latency and rapid generation
  • Reduced memory footprint compared to traditional transformer models
  • Efficient text generation using the Mistral v0.1 tokenizer

Frequently Asked Questions

Q: What makes this model unique?

Zamba2-2.7B's uniqueness lies in its hybrid architecture, which interleaves state-space (Mamba2) blocks with shared transformer blocks, achieving high performance with significantly lower compute and memory requirements than comparable transformer-only models.

Q: What are the recommended use cases?

The model is well suited to on-device applications that need efficient text generation, particularly where compute and memory are limited. Note, however, that it is a base model: it has no moderation mechanisms and is not fine-tuned for instruction following or chat.
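
For local experimentation, the sketch below loads the model and generates a plain continuation with Hugging Face transformers. It assumes the checkpoint is published as Zyphra/Zamba2-2.7B on the Hugging Face Hub and that the installed transformers build includes Zamba2 support; treat the model id, dtype, and device settings as assumptions to check against the official model card.

```python
# Minimal generation sketch (assumptions: model id "Zyphra/Zamba2-2.7B",
# a transformers build with Zamba2 support, and enough GPU/CPU memory).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zyphra/Zamba2-2.7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 tensor type listed above
    device_map="auto",
)

# Plain text continuation: this is a base model, not an instruction-tuned chat model.
prompt = "State-space models differ from attention-based models in that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```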
