CodeLlama-34B-Instruct-GPTQ

Maintained By
TheBloke

CodeLlama-34B-Instruct-GPTQ

PropertyValue
Model Size34B parameters
LicenseLlama 2
PaperResearch Paper
AuthorMeta (Original), TheBloke (Quantized)
QuantizationGPTQ (Multiple Options)

What is CodeLlama-34B-Instruct-GPTQ?

CodeLlama-34B-Instruct-GPTQ is a quantized version of Meta's powerful code-focused language model, specifically optimized for instruction-following and code generation tasks. This GPTQ-quantized variant maintains the capabilities of the original model while reducing the computational requirements through various quantization options.

Implementation Details

The model offers multiple quantization configurations, including 4-bit and 8-bit options with different group sizes (32g, 64g, 128g) and Act Order settings. The implementation uses the AutoGPTQ framework and is compatible with modern GPU inference frameworks.

  • Multiple GPTQ parameter options for different hardware configurations
  • Optimized using Evol Instruct Code dataset for quantization
  • Supports 4096 sequence length
  • Compatible with ExLlama for 4-bit variants

Core Capabilities

  • Code completion and generation
  • Instruction-following for coding tasks
  • Multi-language code understanding
  • Optimized for production deployment
  • Flexible quantization options for different hardware configurations

Frequently Asked Questions

Q: What makes this model unique?

This model combines the powerful capabilities of CodeLlama-34B with efficient quantization options, making it more accessible for deployment while maintaining high performance. The various quantization options allow users to balance between model quality and hardware requirements.

Q: What are the recommended use cases?

The model is ideal for code generation, code completion, and instruction-following tasks related to programming. It's particularly well-suited for production environments where efficient resource usage is crucial while maintaining high-quality code generation capabilities.

The first platform built for prompt engineering