gemma-3-27b-it-GPTQ-4b-128g
| Property | Value |
|---|---|
| Original Model | Gemma-3-27B-IT |
| Quantization | INT4 (GPTQ) |
| Group Size | 128 |
| Author | ISTA-DASLab |
| Model URL | HuggingFace Repository |
What is gemma-3-27b-it-GPTQ-4b-128g?
This is a quantized version of the Gemma-3-27B-IT model, optimized to reduce its memory and compute footprint while maintaining performance. The model uses GPTQ quantization to compress the weights from 16-bit to 4-bit precision, cutting the disk space and GPU memory required for the quantized weights by roughly 75%.
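The 75% figure follows directly from the bit widths. A rough back-of-the-envelope estimate, assuming the weights dominate memory use and ignoring the components kept at full precision:

```python
# Rough memory estimate for 27B parameters (weights only; the vision
# tower and other full-precision components are ignored here).
params = 27e9

bf16_bytes = params * 2              # 16-bit baseline: 2 bytes per weight
int4_bytes = params * 0.5            # 4-bit packed: 0.5 bytes per weight
scale_bytes = (params / 128) * 2     # one 16-bit scale per group of 128

print(f"bf16 weights:  {bf16_bytes / 1e9:.1f} GB")
print(f"int4 + scales: {(int4_bytes + scale_bytes) / 1e9:.1f} GB")
print(f"reduction:     {1 - (int4_bytes + scale_bytes) / bf16_bytes:.0%}")
```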
Implementation Details
The quantization process targets the linear operators within the language-model transformer blocks while keeping the vision model and multimodal projection components at their original precision. The scheme is symmetric per-group quantization with a group size of 128, with the integer weights chosen by the GPTQ algorithm; a toy sketch of the scheme follows the feature list below. The model checkpoint is stored in the compressed_tensors format for efficient storage and loading.
- Selective quantization of transformer blocks only
- Preservation of vision and multimodal components in original precision
- Symmetric per-group quantization scheme
- 4-bit precision with a group size of 128
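To make "symmetric per-group" concrete, here is a toy round-to-nearest quantizer over the same grid. This is an illustration only: the actual GPTQ algorithm chooses the integers by minimizing layer output error using second-order information, rather than rounding each weight independently.

```python
import numpy as np

def quantize_symmetric_per_group(w, group_size=128, bits=4):
    """Toy round-to-nearest symmetric per-group quantizer.

    Shows the quantization grid only; GPTQ picks the integers
    more carefully than simple rounding.
    """
    qmax = 2 ** (bits - 1) - 1                  # 7 for 4-bit symmetric
    groups = w.reshape(-1, group_size)          # one scale per 128 weights
    scales = np.abs(groups).max(axis=1, keepdims=True) / qmax
    q = np.clip(np.round(groups / scales), -qmax - 1, qmax)
    dequant = (q * scales).reshape(w.shape)     # what the kernel reconstructs
    return q.astype(np.int8), scales, dequant

w = np.random.randn(4096 * 128).astype(np.float32)
q, scales, w_hat = quantize_symmetric_per_group(w)
print("max abs reconstruction error:", np.abs(w - w_hat).max())
```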
Core Capabilities
- Multimodal processing (text and image)
- Reduced memory footprint (~75% reduction for quantized weights)
- Maintained model quality despite compression
- Compatible with the standard transformers library (see the loading sketch after this list)
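A minimal loading sketch, assuming the repo id ISTA-DASLab/gemma-3-27b-it-GPTQ-4b-128g (inferred from the card title, so verify against the actual HuggingFace repository), a recent transformers release with Gemma 3 support, and the compressed-tensors package installed:

```python
import torch
from transformers import AutoProcessor, Gemma3ForConditionalGeneration

# Repo id inferred from the card title; check the actual repository.
model_id = "ISTA-DASLab/gemma-3-27b-it-GPTQ-4b-128g"

processor = AutoProcessor.from_pretrained(model_id)
model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # non-quantized parts (vision tower) run in bf16
    device_map="auto",
)

inputs = processor(text="Explain GPTQ in one sentence.", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```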
Frequently Asked Questions
Q: What makes this model unique?
A: This model stands out for its efficient compression of the Gemma architecture while maintaining multimodal capabilities. The selective quantization approach ensures that critical vision-related components remain at full precision while achieving significant memory savings.
Q: What are the recommended use cases?
A: The model is ideal for deployment scenarios where GPU memory is limited but full multimodal capabilities are required. It's particularly suitable for applications involving both text and image processing, such as image description, visual question answering, and multimodal chat interactions.
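A hypothetical visual-question-answering sketch using the processor's chat template; the image URL is a placeholder and the repo id is the same inferred one as in the loading sketch above:

```python
import torch
from transformers import AutoProcessor, Gemma3ForConditionalGeneration

model_id = "ISTA-DASLab/gemma-3-27b-it-GPTQ-4b-128g"  # inferred repo id
processor = AutoProcessor.from_pretrained(model_id)
model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# One user turn mixing an image and a question, as in a multimodal chat.
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "https://example.com/photo.jpg"},  # placeholder
        {"type": "text", "text": "What is happening in this image?"},
    ],
}]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

out = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(processor.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```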