Stable-Vicuna-13B-GPTQ
Property | Value
---|---
Parameter Count | 13B
Model Type | GPTQ-quantized LLaMA
License | CC-BY-NC-SA-4.0
Paper | Research Paper
Precision | 4-bit
What is Stable-Vicuna-13B-GPTQ?
Stable-Vicuna-13B-GPTQ is a quantized version of CarperAI's StableVicuna, a LLaMA-based model fine-tuned with RLHF (Reinforcement Learning from Human Feedback) via PPO. GPTQ quantization compresses the 13B-parameter weights to 4-bit precision, cutting the memory footprint to a fraction of the full-precision model while preserving most of its output quality.
Implementation Details
The model is quantized with GPTQ and is offered in two variants: one built without act-order for maximum compatibility, and one built with act-order for better quantization accuracy. Both use a group size of 128 and retain the underlying LLaMA architecture.
- 4-bit precision quantization for efficient deployment
- Compatible with text-generation-webui
- Safetensors weights for safe, fast loading
- Supports both Triton and CUDA GPTQ kernels (see the loading sketch below)
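As a minimal loading sketch: the repo path below is a placeholder (substitute the checkpoint's actual location), and AutoGPTQ is used here as one common loader for GPTQ checkpoints, not necessarily the one the original release documented.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Placeholder repo id -- substitute the actual model location.
model_id = "path/to/stable-vicuna-13B-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)

# Load the 4-bit GPTQ weights from safetensors onto the first GPU.
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    use_safetensors=True,
    device="cuda:0",
)

prompt = "### Human: What is GPTQ quantization?\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

AutoGPTQ typically reads the group size and act-order settings from the checkpoint's quantize_config.json, so the same call should work for either variant.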
Core Capabilities
- Multi-turn conversational ability (prompt format shown below)
- Instruction following
- Fine-tuned on multiple datasets: OASST1, GPT4All, and Alpaca
- Balances output quality with memory efficiency
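StableVicuna models are typically prompted with alternating `### Human:` and `### Assistant:` turns; check the model card for your checkpoint to confirm. A purely illustrative helper:

```python
def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt in the '### Human: / ### Assistant:'
    style commonly used with StableVicuna (verify against your
    checkpoint's model card)."""
    return f"### Human: {user_message}\n### Assistant:"

print(build_prompt("Summarize GPTQ quantization in one sentence."))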
Frequently Asked Questions
Q: What makes this model unique?
This model combines the benefits of RLHF training with efficient 4-bit quantization, making it particularly suitable for deployment on consumer hardware while maintaining high-quality outputs.
Q: What are the recommended use cases?
The model excels in conversational AI applications, instruction-following tasks, and general text generation. It's particularly well-suited for applications requiring a balance between performance and resource efficiency.
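As a back-of-the-envelope sketch of that resource efficiency (the 5% quantization-metadata overhead below is an assumption, and KV cache and activations are not counted), the 4-bit weights alone come to roughly 6-7 GiB:

```python
# Rough VRAM estimate for 13B parameters at 4-bit precision.
params = 13e9
bits_per_weight = 4
weight_bytes = params * bits_per_weight / 8      # 6.5e9 bytes

# Group-size-128 GPTQ also stores scales/zero-points per group of
# 128 weights; treat that as ~5% overhead (assumed figure).
total_gib = weight_bytes * 1.05 / 2**30
print(f"~{total_gib:.1f} GiB of weights")        # ~6.4 GiB
```

That comfortably fits common consumer GPUs, versus roughly 24 GiB for the same weights in FP16.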