GPT4-X-Alpasta-30b-4bit
| Property | Value |
|---|---|
| Framework | PyTorch |
| Type | Text Generation, Transformers |
| Base Architecture | LLaMA |
What is GPT4-X-Alpasta-30b-4bit?
GPT4-X-Alpasta-30b-4bit is a merged model that combines Chansung's GPT4-Alpaca LoRA and Open Assistant's native fine-tune, designed to enhance instruction-following capabilities while maintaining high-quality prose generation. The model supports both GPU and CPU inference through GPTQ and GGML quantizations, respectively.
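For GPU use, a GPTQ checkpoint can be loaded with the AutoGPTQ library. The sketch below is a minimal example under stated assumptions: the repository id is a placeholder, the safetensors flag depends on how the weights were published, and the Alpaca-style prompt is a common convention for GPT4-Alpaca merges rather than something this card specifies.

```python
# Minimal GPU inference sketch using AutoGPTQ. The repo id and
# prompt template are assumptions, not taken from this card.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

repo_id = "your-namespace/GPT4-X-Alpasta-30b-4bit"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=False)
model = AutoGPTQForCausalLM.from_quantized(
    repo_id,
    device="cuda:0",       # the act-order variant fits full context in 24GB VRAM
    use_safetensors=True,  # assumption: weights published as .safetensors
)

# Alpaca-style prompt, a common convention for GPT4-Alpaca merges.
prompt = (
    "### Instruction:\nExplain 4-bit quantization in one paragraph.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```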
Implementation Details
The model offers multiple quantization options: two GPTQ versions (one with true-sequential and act-order optimizations, another with true-sequential and groupsize 128) and three GGML versions (q4_1, q5_0, and q5_1 quantization levels). Benchmarks show strong performance, with a Wikitext2 perplexity as low as 4.70 for the groupsize-128 version.
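For context, a Wikitext2 figure like 4.70 is a perplexity: the exponentiated average negative log-likelihood over the test split, typically computed in fixed-length windows. The sketch below shows that standard recipe, assuming a causal LM and tokenizer are already loaded on GPU; it is not the exact script behind the published numbers.

```python
# Sketch of the standard Wikitext2 perplexity recipe. Assumes
# `model` (a causal LM) and `tokenizer` are already loaded; this
# is not the exact script behind the published 4.70 figure.
import torch
from datasets import load_dataset

def wikitext2_perplexity(model, tokenizer, seq_len=2048, device="cuda:0"):
    test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
    ids = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids.to(device)

    nlls = []
    # Slide over the corpus in fixed-length windows (tail remainder dropped).
    for i in range(0, ids.size(1) - seq_len, seq_len):
        chunk = ids[:, i : i + seq_len]
        with torch.no_grad():
            # With labels=chunk the model returns the mean token NLL as .loss.
            out = model(chunk, labels=chunk)
        nlls.append(out.loss * seq_len)
    # Perplexity = exp(total NLL / total tokens).
    return torch.exp(torch.stack(nlls).sum() / (len(nlls) * seq_len)).item()
```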
- GPTQ optimization with true-sequential and act-order allows the full context to fit in 24GB of VRAM
- Multiple GGML quantizations for flexible CPU usage (see the CPU inference sketch after this list)
- Compatible with Oobabooga's Text Generation Webui and KoboldAI
- Updated GGML quantizations for compatibility with recent llama.cpp builds
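On CPU, the GGML files can be run with llama.cpp or its Python bindings. A minimal sketch with llama-cpp-python follows; note that GGML predates the GGUF format, so this assumes a binding version old enough to load .bin GGML weights, and the model path is a placeholder for whichever quantization level (q4_1, q5_0, or q5_1) you downloaded.

```python
# Minimal CPU inference sketch with llama-cpp-python. GGML predates
# the GGUF format, so this assumes a binding version old enough to
# load .bin GGML weights. The model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./gpt4-x-alpasta-30b.ggml.q5_1.bin",  # placeholder path
    n_ctx=2048,    # LLaMA's native context window
    n_threads=8,   # tune to your physical core count
)

prompt = (
    "### Instruction:\nSummarize what model quantization does.\n\n"
    "### Response:\n"
)
out = llm(prompt, max_tokens=200, temperature=0.7, stop=["### Instruction:"])
print(out["choices"][0]["text"])
```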
Core Capabilities
- Enhanced instruction-following abilities
- High-quality prose generation
- Flexible deployment options (GPU/CPU)
- Impressive perplexity scores across multiple benchmarks
Frequently Asked Questions
Q: What makes this model unique?
This model uniquely combines the strengths of GPT4-Alpaca and Open Assistant, offering improved instruction-following while maintaining high-quality prose generation. Its multiple quantization options provide flexibility for different hardware configurations.
Q: What are the recommended use cases?
The model is well-suited for text generation tasks requiring both instruction following and natural language generation. It can be deployed on both GPU and CPU systems, making it versatile for various applications from personal projects to production environments.