# WizardLM-7B-uncensored-GPTQ
| Property | Value |
|---|---|
| Parameter Count | 1.13B |
| Model Type | LLaMA-based |
| License | Other |
| Quantization | 4-bit GPTQ |
## What is WizardLM-7B-uncensored-GPTQ?
WizardLM-7B-uncensored-GPTQ is a quantized version of Eric Hartford's WizardLM model, specifically designed to provide unrestricted responses without built-in alignment or moral constraints. This GPTQ-quantized variant maintains the core capabilities of the original model while significantly reducing its memory footprint through 4-bit precision.
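To see why 4-bit precision matters for a 7B-parameter model, a back-of-envelope estimate of weight storage is useful. This is a minimal sketch; the 7e9 parameter count and the ~4.5 effective bits per weight (4-bit values plus per-group scales and zero-points) are illustrative assumptions, not measured figures, and the estimate ignores activations and the KV cache.

```python
def approx_weight_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB (weights only)."""
    return n_params * bits_per_param / 8 / 2**30

n = 7e9  # assumed parameter count for a 7B LLaMA-class model
fp16 = approx_weight_gib(n, 16)    # half precision
gptq4 = approx_weight_gib(n, 4.5)  # 4-bit weights + quantization metadata

print(f"fp16:  {fp16:.1f} GiB")
print(f"gptq4: {gptq4:.1f} GiB")
```

Under these assumptions the weights shrink from roughly 13 GiB to under 4 GiB, which is what moves a 7B model into consumer-GPU territory.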
## Implementation Details
The model comes in multiple GPTQ variants, each optimized for different use cases. It uses a group size of 128, with both Act-Order and non-Act-Order versions available. Quantization was calibrated on the WikiText dataset with a sequence length of 2048 tokens.
- Multiple quantization options available across different branches
- Compatible with ExLlama and Hugging Face's Text Generation Inference
- Optimized for both VRAM efficiency and performance
- Includes automatic configuration through quantize_config.json
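The "group size 128" setting above refers to group-wise quantization: each run of 128 consecutive weights shares one scale and zero-point, which is what keeps 4-bit precision close to the original weights. The following is an illustrative sketch of that scheme only; real GPTQ additionally applies error-compensating updates to later weights (and, with Act-Order, a column ordering based on activation statistics).

```python
import random

GROUP_SIZE = 128  # one (scale, zero) pair per 128 weights

def quantize_group(weights):
    """Map a group of floats to 4-bit ints in [0, 15] plus (scale, zero)."""
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / 15 or 1.0  # avoid zero scale for flat groups
    qs = [round((w - w_min) / scale) for w in weights]
    return qs, scale, w_min

def dequantize_group(qs, scale, zero):
    return [q * scale + zero for q in qs]

# Round-trip one group of synthetic weights.
random.seed(0)
group = [random.uniform(-1, 1) for _ in range(GROUP_SIZE)]
qs, scale, zero = quantize_group(group)
restored = dequantize_group(qs, scale, zero)
max_err = max(abs(a - b) for a, b in zip(group, restored))
print(f"max abs error: {max_err:.4f}")
```

The round-trip error is bounded by half the group's scale, which is why smaller groups (at the cost of more metadata) give lower quantization error.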
## Core Capabilities
- Unrestricted text generation without built-in alignment constraints
- Efficient 4-bit precision operation with minimal quality loss
- Support for both direct transformers implementation and pipeline usage
- Customizable inference parameters for temperature, top-p, and top-k sampling
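The sampling controls listed above can be sketched over a toy logit list. This follows the common definitions of temperature, top-k, and top-p (nucleus) sampling rather than any specific library's API; parameter names here are conventional, not tied to this model's tooling.

```python
import math
import random

def sample(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Pick a token index from raw logits using common sampling controls."""
    # Temperature rescales logits before the softmax (lower = sharper).
    scaled = [l / temperature for l in logits]
    # Rank candidates by logit; top_k=0 means "no top-k cutoff".
    idx = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    if top_k > 0:
        idx = idx[:top_k]
    # Numerically stable softmax over the surviving candidates.
    m = max(scaled[i] for i in idx)
    exps = [math.exp(scaled[i] - m) for i in idx]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-p: keep the smallest high-probability prefix with mass >= top_p.
    keep, cum = [], 0.0
    for i, p in zip(idx, probs):
        keep.append((i, p))
        cum += p
        if cum >= top_p:
            break
    # Renormalize and draw.
    total = sum(p for _, p in keep)
    r = rng.random() * total
    for i, p in keep:
        r -= p
        if r <= 0:
            return i
    return keep[-1][0]
```

With `top_k=1` (or a very small `top_p`) this degenerates to greedy decoding, which is a handy way to sanity-check an inference setup.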
## Frequently Asked Questions
**Q: What makes this model unique?**
This model stands out for its uncensored nature and efficient quantization, allowing unrestricted responses while keeping memory requirements low. It is particularly useful for applications that need raw model outputs without built-in filtering.
**Q: What are the recommended use cases?**
The model is suitable for research purposes and applications where unrestricted outputs are desired. Users are responsible for implementing any content filtering or alignment their specific use case requires.
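Since the model ships without built-in filtering, downstream policy enforcement is the integrator's job. A minimal illustrative sketch: a blocklist check applied to generated text before it is returned. The term list and the withheld-response message are placeholders, and a real deployment would use something far more robust than substring matching.

```python
# Placeholder blocklist -- a real deployment would maintain its own policy.
BLOCKLIST = {"example-banned-term"}

def filter_output(text: str) -> str:
    """Return the text unchanged, or a refusal if a blocked term appears."""
    lowered = text.lower()
    if any(term in lowered for term in BLOCKLIST):
        return "[response withheld by downstream policy]"
    return text
```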