CausalLM 34b-beta

Maintained By: CausalLM

  • Parameter Count: 34.4B
  • Model Type: Text Generation, Conversational
  • License: GPL-3.0
  • Tensor Type: BF16
  • MT-Bench Score: 8.5

What is 34b-beta?

CausalLM 34b-beta is a large-scale language model featuring 34.4 billion parameters, designed for advanced text generation and conversational tasks. The model demonstrates impressive performance with an MT-Bench score of 8.5 and shows notably low contamination rates compared to other popular models.

Implementation Details

The model uses the ChatML prompt format. For inference, the maintainers currently recommend Hugging Face Transformers rather than accelerated frameworks such as vLLM, owing to precision issues that are expected to be fixed in future updates.
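ChatML wraps each conversation turn in `<|im_start|>{role}` / `<|im_end|>` markers and ends with an open assistant turn for the model to complete. A minimal sketch of assembling such a prompt (the message contents are illustrative, not part of the model card):

```python
def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts into a ChatML prompt string.

    Each turn becomes:
        <|im_start|>{role}\n{content}<|im_end|>
    followed by an open assistant turn for the model to continue.
    """
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    parts.append("<|im_start|>assistant\n")  # model generates from here
    return "\n".join(parts)

# Example conversation (contents are illustrative)
prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```

The resulting string can be passed directly to a Transformers tokenizer and `generate()` call.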

  • Uses BF16 tensor type for efficient computation
  • Supports q8_0 quantization for faster inference
  • Implements safetensors for model weight storage
  • Repetition penalty should be left disabled (the maintainers recommend against using it)
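Given the note above about repetition penalty, a hypothetical set of generation settings for Transformers' `generate()` might look like the following. Only the `repetition_penalty` choice reflects the model card's guidance; the other values are illustrative placeholders, not maintainer recommendations:

```python
# Hypothetical generation settings for use with Transformers' generate().
# In Transformers, repetition_penalty=1.0 means the penalty is disabled,
# which matches the guidance not to use it with this model.
generation_kwargs = {
    "max_new_tokens": 512,      # illustrative
    "do_sample": True,          # illustrative
    "temperature": 0.7,         # illustrative
    "repetition_penalty": 1.0,  # 1.0 = disabled, per the guidance above
}
```

These kwargs would be passed as `model.generate(**inputs, **generation_kwargs)`.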

Core Capabilities

  • High-quality text generation and conversation
  • Strong performance on MT-Bench (8.5 score)
  • Low contamination rate (0.38 on MMLU reference)
  • Efficient inference with proper quantization

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its combination of large-scale parameters (34.4B) and strong benchmark performance, particularly its MT-Bench score of 8.5. It also shows lower contamination rates than models such as Orca-2-7b and Mistral-7B.

Q: What are the recommended use cases?

The model is best suited for text generation and conversational tasks. For optimal performance, use Transformers for inference, or q8_0 quantization with llama.cpp. vLLM should be avoided for now because of the precision issues noted above.
