WizardLM 1.0 Uncensored LLaMA2 13B GGUF
| Property | Value |
|---|---|
| Parameter Count | 13B |
| Base Model | LLaMA2 |
| License | LLaMA2 |
| Creator | Eric Hartford |
| Quantizer | TheBloke |
What is WizardLM-1.0-Uncensored-Llama2-13B-GGUF?
This model is a quantized release of Eric Hartford's WizardLM 1.0 Uncensored LLaMA2 13B, packaged for efficient deployment across a range of hardware. It is based on the original WizardLM-13B-V1.0 but retrained on a dataset filtered to remove refusals, avoidance, and built-in bias, while preserving the original model's performance.
Implementation Details
The model is available in multiple GGUF quantizations ranging from 2-bit to 8-bit precision, each offering a different tradeoff between file size, memory usage, and output quality. GGUF supersedes the older GGML format, adding improved tokenization, proper special-token handling, and extensible metadata.
- Multiple quantization options from 5.43GB (Q2_K) to 13.83GB (Q8_0)
- Supports both CPU and GPU inference with layer offloading (see the loading sketch after this list)
- Uses the Vicuna 1.1 prompt format
- Compatible with various front-ends including text-generation-webui and LM Studio
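To make the deployment options concrete, here is a minimal loading sketch using llama-cpp-python. The .gguf filename and n_gpu_layers value are assumptions; match them to the quant file you downloaded and to your GPU's VRAM.

```python
# Minimal llama-cpp-python sketch; filename and layer count are assumptions.
from llama_cpp import Llama

llm = Llama(
    model_path="wizardlm-1.0-uncensored-llama2-13b.Q4_K_M.gguf",  # assumed filename
    n_ctx=4096,       # LLaMA2's native context length
    n_gpu_layers=35,  # layers to offload to the GPU; set 0 for CPU-only inference
)

# Vicuna 1.1-style prompt, as expected by this model
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: Explain GGUF quantization in one paragraph. ASSISTANT:"
)

output = llm(prompt, max_tokens=256, stop=["USER:"])
print(output["choices"][0]["text"])
```

The same file loads unchanged in text-generation-webui or LM Studio; only the offloading setting is surfaced differently in each front-end.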
Core Capabilities
- Reduced ethical constraints compared to base WizardLM while maintaining LLaMA2's core capabilities
- Flexible deployment options across different hardware configurations
- Support for extended context lengths with automatic RoPE scaling
- Integration with popular frameworks such as LangChain (see the sketch after this list)
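As a sketch of the framework integration and context extension together, the snippet below uses LangChain's community LlamaCpp wrapper. llama.cpp normally reads RoPE parameters from the GGUF metadata automatically; the explicit rope_freq_scale here illustrates overriding it for a longer window, and all values are assumptions to tune.

```python
# LangChain integration sketch; model path and parameter values are assumptions.
from langchain_community.llms import LlamaCpp

llm = LlamaCpp(
    model_path="wizardlm-1.0-uncensored-llama2-13b.Q5_K_M.gguf",  # assumed filename
    n_ctx=8192,           # request a context window beyond the native 4096
    rope_freq_scale=0.5,  # manual RoPE scaling for roughly 2x the native context
    n_gpu_layers=35,      # GPU layer offloading, as with plain llama-cpp-python
    temperature=0.7,
)

print(llm.invoke(
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: Summarize what RoPE scaling does. ASSISTANT:"
))
```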
Frequently Asked Questions
Q: What makes this model unique?
This model pairs WizardLM's strong instruction-following with a reduced tendency to refuse or hedge, and ships in a full range of GGUF quantizations for practical deployment. It retains most of the original model's quality at a fraction of the full-precision file size.
Q: What are the recommended use cases?
The model is particularly suited to applications that call for direct, unfiltered responses while remaining deployable on consumer hardware. The Q4_K_M and Q5_K_M quantizations are the recommended balance of quality and resource use; a download sketch follows.
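For completeness, here is a minimal sketch for fetching one of the recommended quants with the huggingface_hub library. The repository ID and filename are assumptions based on the model name; verify them against the repository's file list.

```python
# Download sketch using huggingface_hub; repo_id and filename are assumptions
# inferred from the model name -- check the actual repository before use.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="TheBloke/WizardLM-1.0-Uncensored-Llama2-13B-GGUF",
    filename="wizardlm-1.0-uncensored-llama2-13b.Q4_K_M.gguf",
)
print(path)  # local cache path; pass this as model_path when loading
```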