Stable-Vicuna-13B-GPTQ
Property | Value
---|---
Parameter Count | 13B
Model Type | GPTQ-quantized LLaMA
License | CC-BY-NC-SA-4.0
Paper | Research Paper
Precision | 4-bit
What is Stable-Vicuna-13B-GPTQ?
Stable-Vicuna-13B-GPTQ is a quantized version of CarperAI's StableVicuna, a LLaMA-based model fine-tuned with RLHF (Reinforcement Learning from Human Feedback) via PPO. GPTQ quantization compresses the 13B-parameter weights to 4-bit precision, cutting the memory footprint to a fraction of the full-precision model while preserving most of its output quality.
Implementation Details
The model is quantized with GPTQ and is offered in two variants: one built without act-order for maximum compatibility, and one built with act-order for better quantization accuracy. Both use a group size of 128 and retain the underlying LLaMA architecture.
- 4-bit precision quantization for efficient deployment
- Compatible with text-generation-webui
- Safetensors weights for safe, fast loading
- Supports both Triton and CUDA GPTQ kernels (see the loading sketch below)
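As a minimal loading sketch: the repo path below is a placeholder (substitute the checkpoint's actual location), and AutoGPTQ is used here as one common loader for GPTQ checkpoints, not necessarily the one the original release documented.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Placeholder repo id -- substitute the actual model location.
model_id = "path/to/stable-vicuna-13B-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)

# Load the 4-bit GPTQ weights from safetensors onto the first GPU.
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    use_safetensors=True,
    device="cuda:0",
)

prompt = "### Human: What is GPTQ quantization?\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

AutoGPTQ typically reads the group size and act-order settings from the checkpoint's quantize_config.json, so the same call should work for either variant.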
Core Capabilities
- Multi-turn conversational ability (prompt format shown below)
- Instruction following
- Fine-tuned on multiple datasets: OASST1, GPT4All, and Alpaca
- Balances output quality with memory efficiency
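StableVicuna models are typically prompted with alternating `### Human:` and `### Assistant:` turns; check the model card for your checkpoint to confirm. A purely illustrative helper:

```python
def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt in the '### Human: / ### Assistant:'
    style commonly used with StableVicuna (verify against your
    checkpoint's model card)."""
    return f"### Human: {user_message}\n### Assistant:"

print(build_prompt("Summarize GPTQ quantization in one sentence."))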
Frequently Asked Questions
Q: What makes this model unique?
This model combines the benefits of RLHF training with efficient 4-bit quantization, making it particularly suitable for deployment on consumer hardware while maintaining high-quality outputs.
Q: What are the recommended use cases?
The model excels in conversational AI applications, instruction-following tasks, and general text generation. It's particularly well-suited for applications requiring a balance between performance and resource efficiency.
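As a back-of-the-envelope sketch of that resource efficiency (the 5% quantization-metadata overhead below is an assumption, and KV cache and activations are not counted), the 4-bit weights alone come to roughly 6-7 GiB:

```python
# Rough VRAM estimate for 13B parameters at 4-bit precision.
params = 13e9
bits_per_weight = 4
weight_bytes = params * bits_per_weight / 8      # 6.5e9 bytes

# Group-size-128 GPTQ also stores scales/zero-points per group of
# 128 weights; treat that as ~5% overhead (assumed figure).
total_gib = weight_bytes * 1.05 / 2**30
print(f"~{total_gib:.1f} GiB of weights")        # ~6.4 GiB
```

That comfortably fits common consumer GPUs, versus roughly 24 GiB for the same weights in FP16.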