# CausalLM 34b-beta
| Property | Value |
|---|---|
| Parameter Count | 34.4B |
| Model Type | Text Generation, Conversational |
| License | GPL-3.0 |
| Tensor Type | BF16 |
| MT-Bench Score | 8.5 |
## What is 34b-beta?
CausalLM 34b-beta is a large-scale language model with 34.4 billion parameters, designed for advanced text generation and conversational tasks. It scores 8.5 on MT-Bench and shows a notably lower benchmark-contamination rate than many popular models.
## Implementation Details
The model uses the ChatML prompt format. For now, inference via the Hugging Face Transformers library is recommended over accelerated frameworks such as vLLM, due to precision issues that will be addressed in a future update.
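ChatML frames each conversation turn with `<|im_start|>` / `<|im_end|>` markers and leaves an open assistant turn for the model to complete. A minimal sketch of building such a prompt (the helper name is illustrative, not part of any library):

```python
def build_chatml_prompt(messages):
    """Wrap a list of {"role": ..., "content": ...} turns in ChatML markers."""
    parts = []
    for m in messages:
        # Each turn: <|im_start|>ROLE\nCONTENT<|im_end|>\n
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # End with an open assistant turn for the model to continue.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize BF16 in one sentence."},
])
```

The resulting string can be tokenized and passed to any ChatML-trained model; many tokenizers also expose `apply_chat_template` for the same purpose.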
- Uses BF16 tensor type for efficient computation
- Supports q8_0 quantization for faster inference
- Implements safetensors for model weight storage
- Repetition penalty is not recommended and should be left disabled
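Since the card advises against a repetition penalty, generation settings should leave it at its neutral value of 1.0 (which disables it). A sketch of such settings as Transformers-style `generate()` keyword arguments; the sampling values are illustrative assumptions, not official recommendations:

```python
# Illustrative generation settings: repetition_penalty stays at 1.0 (off),
# per the card's recommendation. Temperature/top_p values are assumptions.
generation_kwargs = {
    "max_new_tokens": 512,
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.9,
    "repetition_penalty": 1.0,  # 1.0 = no penalty applied
}
```

These keyword arguments can be passed directly to `model.generate(**generation_kwargs)` when using the Transformers library.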
## Core Capabilities
- High-quality text generation and conversation
- Strong performance on MT-Bench (8.5 score)
- Low contamination rate (0.38 on MMLU reference)
- Efficient inference with proper quantization
## Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its combination of scale (34.4B parameters) and strong benchmark performance, particularly its MT-Bench score of 8.5. It also shows lower contamination rates than models such as Orca-2-7b and Mistral-7B.
Q: What are the recommended use cases?
The model is best suited for text generation and conversational tasks. For optimal performance, use the Transformers library for inference, or q8_0 quantization with llama.cpp. Avoid vLLM for now due to the precision issues mentioned above.