llama2_7b_chat_uncensored-GPTQ

Maintained By
TheBloke

Llama2 7B Chat Uncensored GPTQ

Property         Value
Parameter Count  7 Billion
Model Type       Llama2
License          Other + Meta Llama 2 License
Quantization     4-bit GPTQ

What is llama2_7b_chat_uncensored-GPTQ?

This is a GPTQ-quantized version of George Sung's Llama2 7B Chat Uncensored model, produced by TheBloke for efficient GPU deployment. The underlying model was fine-tuned using QLoRA on the uncensored Wizard-Vicuna conversation dataset, making it suitable for open-ended dialogue generation without traditional content restrictions.

Implementation Details

The model is published in several GPTQ quantization variants, each targeting a different trade-off between VRAM use and output quality. The variants are built on 4-bit precision and differ in group size (32g, 64g, or 128g) and in whether Act Order is enabled. Smaller group sizes generally give better accuracy at the cost of more VRAM.

  • Multiple branch options with different quantization parameters
  • Compatible with AutoGPTQ, Transformers, and ExLlama
  • Supports sequence lengths up to 4096 tokens
  • Includes optimized configurations for different VRAM requirements
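As a rough illustration of how a variant might be selected and loaded through Transformers, the sketch below builds a branch name and passes it as the `revision`. The repository id and the `gptq-{bits}bit-{group}g-actorder_{flag}` branch pattern are assumptions not stated on this page, so check the repository's branch list before relying on them. Loading GPTQ weights this way also assumes the `optimum` and `auto-gptq` packages are installed alongside `transformers`.

```python
# Assumed Hub repository id; not stated on this page.
REPO_ID = "TheBloke/llama2_7b_chat_uncensored-GPTQ"


def branch_name(bits: int, group_size: int, act_order: bool) -> str:
    """Build a branch name using an assumed naming pattern.

    Example pattern: gptq-4bit-32g-actorder_True
    """
    return f"gptq-{bits}bit-{group_size}g-actorder_{act_order}"


def load_variant(revision: str = "main"):
    """Load tokenizer and model for one quantization branch.

    The import is kept inside the function so this module can be
    inspected without transformers installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(
        REPO_ID,
        revision=revision,   # selects the quantization variant
        device_map="auto",   # place layers on the available GPU(s)
    )
    return tokenizer, model


if __name__ == "__main__":
    # Hypothetical choice: 4-bit, group size 32, Act Order enabled.
    print(branch_name(4, 32, True))
```

Calling `load_variant(branch_name(4, 128, True))` would then download and load that branch, provided it exists in the repository.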

Core Capabilities

  • Efficient GPU inference with reduced memory footprint
  • Reduces model size to ~4GB while largely preserving model quality
  • Supports both direct Transformers integration and pipeline usage
  • Optimized for chat-based applications that use a specific prompt template
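The page does not spell out the prompt template. The sketch below assumes the `### HUMAN:` / `### RESPONSE:` format commonly documented for this model family; treat that format, and the helper name `build_prompt`, as assumptions and confirm them against the original model card.

```python
from typing import List, Tuple


def build_prompt(history: List[Tuple[str, str]], user_message: str) -> str:
    """Format a conversation using the assumed HUMAN/RESPONSE template.

    history: earlier (human, response) pairs, oldest first.
    user_message: the new message the model should answer.
    """
    parts = []
    for human, response in history:
        parts.append(f"### HUMAN:\n{human}\n\n### RESPONSE:\n{response}\n\n")
    # Leave the final RESPONSE section open for the model to complete.
    parts.append(f"### HUMAN:\n{user_message}\n\n### RESPONSE:\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([("Hello", "Hi, how are you?")], "I'm fine."))
```

The resulting string can be passed to a tokenizer or a text-generation pipeline as the input prompt.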

Frequently Asked Questions

Q: What makes this model unique?

This model combines the capabilities of Llama 2 with uncensored training data, while offering multiple quantization options for efficient deployment. The GPTQ quantization maintains model quality while significantly reducing the resource requirements.
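To make the resource reduction concrete, here is a back-of-the-envelope estimate of weight storage. It assumes each quantization group stores one 16-bit scale plus one zero point of the same width as the weights, and it ignores layers left unquantized and all runtime memory (KV cache, activations). Real file sizes will differ somewhat.

```python
def estimate_weight_gb(n_params: float, bits: int, group_size: int) -> float:
    """Approximate quantized weight storage in GB (1 GB = 1e9 bytes).

    Assumption: per group, one 16-bit scale and one `bits`-wide zero point.
    """
    overhead_bits = (16 + bits) / group_size   # amortized per weight
    bits_per_weight = bits + overhead_bits
    return n_params * bits_per_weight / 8 / 1e9


def fp16_weight_gb(n_params: float) -> float:
    """Unquantized 16-bit baseline: 2 bytes per parameter."""
    return n_params * 2 / 1e9


if __name__ == "__main__":
    n = 7e9  # nominal parameter count from the table above
    print(f"fp16 baseline: {fp16_weight_gb(n):.2f} GB")
    for g in (32, 64, 128):
        print(f"4-bit, group size {g}: {estimate_weight_gb(n, 4, g):.2f} GB")
```

Under these assumptions the 4-bit variants come out at roughly a quarter to a third of the 16-bit baseline, with smaller group sizes costing slightly more storage.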

Q: What are the recommended use cases?

The model is best suited for applications requiring open-ended dialogue generation, particularly where traditional content restrictions might be limiting. It's especially useful in scenarios where GPU memory is constrained but model quality needs to be maintained.
