stable-vicuna-13B-GPTQ

Maintained by TheBloke


| Property        | Value                |
|-----------------|----------------------|
| Parameter Count | 13B                  |
| Model Type      | GPTQ-Quantized LLaMA |
| License         | CC-BY-NC-SA-4.0      |
| Paper           | Research Paper       |
| Precision       | 4-bit                |

What is stable-vicuna-13B-GPTQ?

Stable-Vicuna-13B-GPTQ is a 4-bit GPTQ quantization of CarperAI's StableVicuna, a LLaMA-13B model fine-tuned with RLHF (Reinforcement Learning from Human Feedback) via PPO. The quantization delivers the capabilities of a 13B-parameter model in a compressed format small enough to deploy on consumer hardware.

Implementation Details

The model is implemented using GPTQ quantization and ships in two variants: a version without act-order for maximum compatibility, and a version with act-order for better quantization quality. It uses a group size of 128 and retains the underlying LLaMA architecture; a minimal loading sketch follows the feature list below.

  • 4-bit precision quantization for efficient deployment
  • Compatible with text-generation-webui
  • Includes safetensors format for safe, fast weight loading
  • Supports both Triton and CUDA implementations
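As a rough illustration, a GPTQ checkpoint like this can be loaded with the AutoGPTQ library. This is a minimal sketch rather than the maintainer's canonical instructions: the device string, safetensors flag, and `use_triton` choice are assumptions to verify against the repository.

```python
# Minimal loading sketch using AutoGPTQ; the settings below are
# assumptions, not the maintainer's canonical instructions.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "TheBloke/stable-vicuna-13B-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    use_safetensors=True,  # the repo ships safetensors weights
    device="cuda:0",
    use_triton=False,      # set True to use the Triton kernel path instead of CUDA
)
```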

Core Capabilities

  • Advanced conversational AI abilities
  • Instruction-following capabilities (see the prompt sketch after this list)
  • Multi-dataset training incorporating OASST1, GPT4All, and Alpaca
  • Optimized for both performance and memory efficiency
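StableVicuna follows a two-role chat template. The `### Human:` / `### Assistant:` markers below match the upstream model card's prompt format, but treat this as a sketch and confirm against the repository:

```python
# Sketch of the two-role prompt template StableVicuna expects.
def build_prompt(user_message: str) -> str:
    return f"### Human: {user_message}\n### Assistant:"

prompt = build_prompt("Summarize GPTQ quantization in one sentence.")
```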

Frequently Asked Questions

Q: What makes this model unique?

This model combines the benefits of RLHF training with efficient 4-bit quantization, making it particularly suitable for deployment on consumer hardware while maintaining high-quality outputs.

Q: What are the recommended use cases?

The model excels in conversational AI applications, instruction-following tasks, and general text generation. It's particularly well-suited for applications requiring a balance between performance and resource efficiency.
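As a hedged end-to-end example, reusing the model, tokenizer, and prompt from the sketches above (sampling parameters are illustrative defaults, not tuned recommendations):

```python
import torch

# Tokenize the templated prompt and generate a reply; the sampling
# settings below are illustrative, not tuned recommendations.
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```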
