# CausalLM 34b-beta
| Property | Value |
|---|---|
| Parameter Count | 34.4B |
| Model Type | Text Generation, Conversational |
| License | GPL-3.0 |
| Tensor Type | BF16 |
| MT-Bench Score | 8.5 |
## What is 34b-beta?
CausalLM 34b-beta is a large-scale language model with 34.4 billion parameters, designed for advanced text generation and conversational tasks. It scores 8.5 on MT-Bench and shows a notably lower benchmark-contamination rate than many popular models.
## Implementation Details
The model uses the ChatML prompt format. For now, inference via the Hugging Face Transformers library is recommended over accelerated frameworks such as vLLM, due to precision issues that will be addressed in a future update.
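ChatML frames each conversation turn with `<|im_start|>` / `<|im_end|>` markers and leaves an open assistant turn for the model to complete. A minimal sketch of building such a prompt (the helper name is illustrative, not part of any library):

```python
def build_chatml_prompt(messages):
    """Wrap a list of {"role": ..., "content": ...} turns in ChatML markers."""
    parts = []
    for m in messages:
        # Each turn: <|im_start|>ROLE\nCONTENT<|im_end|>\n
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # End with an open assistant turn for the model to continue.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize BF16 in one sentence."},
])
```

The resulting string can be tokenized and passed to any ChatML-trained model; many tokenizers also expose `apply_chat_template` for the same purpose.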
- Uses BF16 tensor type for efficient computation
- Supports q8_0 quantization for faster inference
- Implements safetensors for model weight storage
- Repetition penalty is not recommended and should be left disabled
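Since the card advises against a repetition penalty, generation settings should leave it at its neutral value of 1.0 (which disables it). A sketch of such settings as Transformers-style `generate()` keyword arguments; the sampling values are illustrative assumptions, not official recommendations:

```python
# Illustrative generation settings: repetition_penalty stays at 1.0 (off),
# per the card's recommendation. Temperature/top_p values are assumptions.
generation_kwargs = {
    "max_new_tokens": 512,
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.9,
    "repetition_penalty": 1.0,  # 1.0 = no penalty applied
}
```

These keyword arguments can be passed directly to `model.generate(**generation_kwargs)` when using the Transformers library.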
## Core Capabilities
- High-quality text generation and conversation
- Strong performance on MT-Bench (8.5 score)
- Low contamination rate (0.38 on MMLU reference)
- Efficient inference with proper quantization
## Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its combination of scale (34.4B parameters) and strong benchmark performance, particularly its MT-Bench score of 8.5. It also shows lower contamination rates than models such as Orca-2-7b and Mistral-7B.
Q: What are the recommended use cases?
The model is best suited for text generation and conversational tasks. For optimal performance, use the Transformers library for inference, or q8_0 quantization with llama.cpp. Avoid vLLM for now due to the precision issues mentioned above.