CodeQwen1.5-7B

Maintained By
Qwen

  • Parameter Count: 7.25B
  • License: tongyi-qianwen-research
  • Architecture: Transformer decoder-only with GQA
  • Paper: Research Paper
  • Context Length: 64K tokens

What is CodeQwen1.5-7B?

CodeQwen1.5-7B is a specialized code generation model built on the Qwen1.5 architecture and designed specifically for programming tasks. Trained on 3 trillion tokens of code data, it represents a significant advancement in AI-powered code generation and understanding. The model incorporates grouped-query attention (GQA) for efficient inference and supports a context length of 64K tokens.

Implementation Details

Built as a decoder-only transformer, CodeQwen1.5-7B requires transformers>=4.37.0 to load correctly. The weights are published in BF16, and grouped-query attention keeps inference memory-efficient even at long context lengths. A minimal loading sketch follows the feature list below.

  • Decoder-only transformer architecture
  • Grouped-query attention (GQA)
  • Trained on 3 trillion tokens of code data
  • 64K-token context window
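To make the requirements above concrete, here is a minimal loading-and-completion sketch using the transformers API. It assumes the Hugging Face repo id Qwen/CodeQwen1.5-7B and enough accelerator memory for the BF16 weights (roughly 15 GB); adjust device_map and the dtype to your hardware.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/CodeQwen1.5-7B"  # Hugging Face repo id for the base model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the published weights use BF16
    device_map="auto",
)

# Plain left-to-right completion: the base model simply continues the prompt.
prompt = "# Python function that checks whether a string is a palindrome\ndef"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```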

Core Capabilities

  • Support for 92 programming languages
  • Code generation and completion, including fill-in-the-middle infilling (see the sketch after this list)
  • Text-to-SQL conversion
  • Bug fixing
  • Long-context understanding and generation
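Infilling relies on fill-in-the-middle (FIM) special tokens: the model is shown the code before and after a gap and generates what belongs in between. The sketch below assumes StarCoder-style tokens (<fim_prefix>, <fim_suffix>, <fim_middle>); verify the exact names against the model's tokenizer before relying on them. It reuses the model and tokenizer from the loading sketch above.

```python
# Fill-in-the-middle: generate the code that belongs between the
# <fim_prefix> and <fim_suffix> segments. Reuses `model` and `tokenizer`
# from the loading sketch above.
fim_prompt = (
    "<fim_prefix>def average(numbers):\n"
    "    if not numbers:\n"
    "        return 0.0\n"
    "<fim_suffix>\n"
    "    return total / len(numbers)<fim_middle>"
)
inputs = tokenizer(fim_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)

# Only the newly generated tokens form the infilled middle segment.
middle = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(middle)  # expected along the lines of: "    total = sum(numbers)"
```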

Frequently Asked Questions

Q: What makes this model unique?

CodeQwen1.5-7B stands out for its specialized focus on code generation, support for 92 programming languages, and 64K-token context length. Training on 3 trillion tokens of code data makes it particularly effective for programming-related tasks.

Q: What are the recommended use cases?

The model is ideal for code infilling, generation, and bug-fixing tasks. It is not recommended for direct chat use, but it is well suited for fine-tuning and for specialized code applications. For tasks such as text-to-SQL, the base model works best as a completion engine, as in the sketch below.
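A short text-to-SQL sketch, again reusing the model and tokenizer from the loading example: the schema and question are framed as SQL comments so the model can continue the query (both are illustrative, not taken from the model card).

```python
# Frame the task as SQL to be continued; the base model completes, it does not chat.
sql_prompt = (
    "-- Table: orders(id INTEGER, customer TEXT, total REAL)\n"
    "-- Question: total revenue per customer, highest first\n"
    "SELECT"
)
inputs = tokenizer(sql_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```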
