WizardLM 1.0 Uncensored LLaMA2 13B GGUF
| Property | Value |
|---|---|
| Parameter Count | 13B |
| Base Model | LLaMA2 |
| License | LLaMA2 |
| Creator | Eric Hartford |
| Quantizer | TheBloke |
What is WizardLM-1.0-Uncensored-Llama2-13B-GGUF?
This model is a quantized release of Eric Hartford's WizardLM 1.0 Uncensored LLaMA2 13B, packaged for efficient deployment across a range of hardware. It is based on the original WizardLM-13B-V1.0 but retrained on a dataset filtered to remove refusals, avoidance, and built-in bias, while preserving the original model's performance.
Implementation Details
The model is available in multiple GGUF quantizations ranging from 2-bit to 8-bit precision, each offering a different tradeoff between file size, memory usage, and output quality. GGUF supersedes the older GGML format, adding improved tokenization, proper special-token handling, and extensible metadata.
- Multiple quantization options from 5.43GB (Q2_K) to 13.83GB (Q8_0)
- Supports both CPU and GPU inference with layer offloading (see the loading sketch after this list)
- Uses the Vicuna 1.1 prompt format
- Compatible with various front-ends including text-generation-webui and LM Studio
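To make the deployment options concrete, here is a minimal loading sketch using llama-cpp-python. The .gguf filename and n_gpu_layers value are assumptions; match them to the quant file you downloaded and to your GPU's VRAM.

```python
# Minimal llama-cpp-python sketch; filename and layer count are assumptions.
from llama_cpp import Llama

llm = Llama(
    model_path="wizardlm-1.0-uncensored-llama2-13b.Q4_K_M.gguf",  # assumed filename
    n_ctx=4096,       # LLaMA2's native context length
    n_gpu_layers=35,  # layers to offload to the GPU; set 0 for CPU-only inference
)

# Vicuna 1.1-style prompt, as expected by this model
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: Explain GGUF quantization in one paragraph. ASSISTANT:"
)

output = llm(prompt, max_tokens=256, stop=["USER:"])
print(output["choices"][0]["text"])
```

The same file loads unchanged in text-generation-webui or LM Studio; only the offloading setting is surfaced differently in each front-end.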
Core Capabilities
- Reduced ethical constraints compared to base WizardLM while maintaining LLaMA2's core capabilities
- Flexible deployment options across different hardware configurations
- Support for extended context lengths with automatic RoPE scaling
- Integration with popular frameworks such as LangChain (see the sketch after this list)
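As a sketch of the framework integration and context extension together, the snippet below uses LangChain's community LlamaCpp wrapper. llama.cpp normally reads RoPE parameters from the GGUF metadata automatically; the explicit rope_freq_scale here illustrates overriding it for a longer window, and all values are assumptions to tune.

```python
# LangChain integration sketch; model path and parameter values are assumptions.
from langchain_community.llms import LlamaCpp

llm = LlamaCpp(
    model_path="wizardlm-1.0-uncensored-llama2-13b.Q5_K_M.gguf",  # assumed filename
    n_ctx=8192,           # request a context window beyond the native 4096
    rope_freq_scale=0.5,  # manual RoPE scaling for roughly 2x the native context
    n_gpu_layers=35,      # GPU layer offloading, as with plain llama-cpp-python
    temperature=0.7,
)

print(llm.invoke(
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: Summarize what RoPE scaling does. ASSISTANT:"
))
```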
Frequently Asked Questions
Q: What makes this model unique?
This model pairs WizardLM's strong instruction-following with a reduced tendency to refuse or hedge, and ships in a full range of GGUF quantizations for practical deployment. It retains most of the original model's quality at a fraction of the full-precision file size.
Q: What are the recommended use cases?
The model is particularly suited to applications that call for direct, unfiltered responses while remaining deployable on consumer hardware. The Q4_K_M and Q5_K_M quantizations are the recommended balance of quality and resource use; a download sketch follows.
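For completeness, here is a minimal sketch for fetching one of the recommended quants with the huggingface_hub library. The repository ID and filename are assumptions based on the model name; verify them against the repository's file list.

```python
# Download sketch using huggingface_hub; repo_id and filename are assumptions
# inferred from the model name -- check the actual repository before use.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="TheBloke/WizardLM-1.0-Uncensored-Llama2-13B-GGUF",
    filename="wizardlm-1.0-uncensored-llama2-13b.Q4_K_M.gguf",
)
print(path)  # local cache path; pass this as model_path when loading
```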