# WizardLM-7B-uncensored-GPTQ
| Property | Value |
|---|---|
| Parameter Count | 1.13B |
| Model Type | LLaMA-based |
| License | Other |
| Quantization | 4-bit GPTQ |
## What is WizardLM-7B-uncensored-GPTQ?
WizardLM-7B-uncensored-GPTQ is a quantized version of Eric Hartford's WizardLM model, specifically designed to provide unrestricted responses without built-in alignment or moral constraints. This GPTQ-quantized variant maintains the core capabilities of the original model while significantly reducing its memory footprint through 4-bit precision.
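To see why 4-bit precision matters for a 7B-parameter model, a back-of-envelope estimate of weight storage is useful. This is a minimal sketch; the 7e9 parameter count and the ~4.5 effective bits per weight (4-bit values plus per-group scales and zero-points) are illustrative assumptions, not measured figures, and the estimate ignores activations and the KV cache.

```python
def approx_weight_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB (weights only)."""
    return n_params * bits_per_param / 8 / 2**30

n = 7e9  # assumed parameter count for a 7B LLaMA-class model
fp16 = approx_weight_gib(n, 16)    # half precision
gptq4 = approx_weight_gib(n, 4.5)  # 4-bit weights + quantization metadata

print(f"fp16:  {fp16:.1f} GiB")
print(f"gptq4: {gptq4:.1f} GiB")
```

Under these assumptions the weights shrink from roughly 13 GiB to under 4 GiB, which is what moves a 7B model into consumer-GPU territory.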
## Implementation Details
The model comes in multiple GPTQ variants, each optimized for different use cases. It uses a group size of 128, with both Act-Order and non-Act-Order versions available. Quantization was calibrated on the WikiText dataset with a sequence length of 2048 tokens.
- Multiple quantization options available across different branches
- Compatible with ExLlama and Hugging Face's Text Generation Inference
- Optimized for both VRAM efficiency and performance
- Includes automatic configuration through quantize_config.json
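The "group size 128" setting above refers to group-wise quantization: each run of 128 consecutive weights shares one scale and zero-point, which is what keeps 4-bit precision close to the original weights. The following is an illustrative sketch of that scheme only; real GPTQ additionally applies error-compensating updates to later weights (and, with Act-Order, a column ordering based on activation statistics).

```python
import random

GROUP_SIZE = 128  # one (scale, zero) pair per 128 weights

def quantize_group(weights):
    """Map a group of floats to 4-bit ints in [0, 15] plus (scale, zero)."""
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / 15 or 1.0  # avoid zero scale for flat groups
    qs = [round((w - w_min) / scale) for w in weights]
    return qs, scale, w_min

def dequantize_group(qs, scale, zero):
    return [q * scale + zero for q in qs]

# Round-trip one group of synthetic weights.
random.seed(0)
group = [random.uniform(-1, 1) for _ in range(GROUP_SIZE)]
qs, scale, zero = quantize_group(group)
restored = dequantize_group(qs, scale, zero)
max_err = max(abs(a - b) for a, b in zip(group, restored))
print(f"max abs error: {max_err:.4f}")
```

The round-trip error is bounded by half the group's scale, which is why smaller groups (at the cost of more metadata) give lower quantization error.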
## Core Capabilities
- Unrestricted text generation without built-in alignment constraints
- Efficient 4-bit precision operation with minimal quality loss
- Support for both direct transformers implementation and pipeline usage
- Customizable inference parameters for temperature, top-p, and top-k sampling
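The sampling controls listed above can be sketched over a toy logit list. This follows the common definitions of temperature, top-k, and top-p (nucleus) sampling rather than any specific library's API; parameter names here are conventional, not tied to this model's tooling.

```python
import math
import random

def sample(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Pick a token index from raw logits using common sampling controls."""
    # Temperature rescales logits before the softmax (lower = sharper).
    scaled = [l / temperature for l in logits]
    # Rank candidates by logit; top_k=0 means "no top-k cutoff".
    idx = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    if top_k > 0:
        idx = idx[:top_k]
    # Numerically stable softmax over the surviving candidates.
    m = max(scaled[i] for i in idx)
    exps = [math.exp(scaled[i] - m) for i in idx]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-p: keep the smallest high-probability prefix with mass >= top_p.
    keep, cum = [], 0.0
    for i, p in zip(idx, probs):
        keep.append((i, p))
        cum += p
        if cum >= top_p:
            break
    # Renormalize and draw.
    total = sum(p for _, p in keep)
    r = rng.random() * total
    for i, p in keep:
        r -= p
        if r <= 0:
            return i
    return keep[-1][0]
```

With `top_k=1` (or a very small `top_p`) this degenerates to greedy decoding, which is a handy way to sanity-check an inference setup.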
## Frequently Asked Questions
**Q: What makes this model unique?**
This model stands out for its uncensored nature and efficient quantization, allowing unrestricted responses while keeping memory requirements low. It is particularly useful for applications that need raw model outputs without built-in filtering.
**Q: What are the recommended use cases?**
The model is suitable for research purposes and applications where unrestricted outputs are desired. Users are responsible for implementing any content filtering or alignment their specific use case requires.
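Since the model ships without built-in filtering, downstream policy enforcement is the integrator's job. A minimal illustrative sketch: a blocklist check applied to generated text before it is returned. The term list and the withheld-response message are placeholders, and a real deployment would use something far more robust than substring matching.

```python
# Placeholder blocklist -- a real deployment would maintain its own policy.
BLOCKLIST = {"example-banned-term"}

def filter_output(text: str) -> str:
    """Return the text unchanged, or a refusal if a blocked term appears."""
    lowered = text.lower()
    if any(term in lowered for term in BLOCKLIST):
        return "[response withheld by downstream policy]"
    return text
```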