# Granite-3B-Code-Base-2K
| Property | Value |
|---|---|
| Parameter Count | 3.48B |
| License | Apache 2.0 |
| Paper | Link |
| Developer | IBM Research |
| Training Data | 4 trillion tokens, 116 programming languages |
## What is granite-3b-code-base-2k?
Granite-3B-Code-Base-2K is a decoder-only language model designed for code intelligence tasks; the 2K suffix refers to its 2,048-token context window. Developed by IBM Research, it was trained in two phases: an initial phase on 4 trillion tokens spanning 116 programming languages, followed by a second phase on 500 billion tokens drawn from a high-quality mixture of code and natural language data.
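As a base (non-instruct) model, it is driven with completion-style prompts. Below is a minimal usage sketch with the `transformers` library; the Hugging Face checkpoint id `ibm-granite/granite-3b-code-base-2k` is an assumption, and the weights are loaded in BF16 to match the tensor type listed above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-3b-code-base-2k"  # assumed checkpoint id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16 weights, per the spec table
    device_map="auto",
)

# Completion-style prompt: the model continues the code it is given.
prompt = "def fibonacci(n: int) -> int:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Prompt plus generated tokens must fit in the 2,048-token context window.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```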
## Implementation Details
The model's training data went through extensive preprocessing, including exact and fuzzy deduplication, filtering for hate, abuse, and profanity (HAP) content, and redaction of personally identifiable information (PII); a sketch of the fuzzy-deduplication technique follows the list below. Training was conducted on IBM's supercomputing clusters using NVIDIA A100 and H100 GPUs.
- Comprehensive language support across 116 programming languages
- Aggressive deduplication strategy for training data quality
- Advanced security measures including malware scanning using ClamAV
- Tensor type: BF16 for optimal performance
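The exact deduplication parameters aren't given here, but fuzzy deduplication of code corpora is commonly implemented with MinHash signatures plus locality-sensitive hashing. The sketch below uses the `datasketch` library; the tokenization, similarity threshold, and toy corpus are illustrative assumptions, not IBM's actual pipeline.

```python
from datasketch import MinHash, MinHashLSH

def minhash_of(code: str, num_perm: int = 128) -> MinHash:
    """Build a MinHash signature over a file's unique whitespace tokens."""
    sig = MinHash(num_perm=num_perm)
    for token in set(code.split()):
        sig.update(token.encode("utf-8"))
    return sig

# Files whose estimated Jaccard similarity exceeds the threshold are
# treated as near-duplicates; only the first occurrence is kept.
lsh = MinHashLSH(threshold=0.7, num_perm=128)

corpus = {
    "a.py": "def add(a, b):\n    return a + b",
    "b.py": "def add(a, b):\n    return a + b  # sum",
}

kept = []
for path, code in corpus.items():
    sig = minhash_of(code)
    if lsh.query(sig):  # near-duplicate of a file we already kept
        continue
    lsh.insert(path, sig)
    kept.append(path)

print(kept)  # ['a.py'] -- b.py is flagged as a fuzzy duplicate
```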
## Core Capabilities
- Code generation with strong performance across multiple languages
- Code explanation and documentation generation
- Bug fixing and code improvement
- Unit test generation
- Technical debt identification
- Vulnerability detection
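Because this is a base model rather than an instruction-tuned one, these capabilities are typically elicited with completion-style prompts. A hypothetical unit-test-generation prompt is sketched below; the prompt format is illustrative, as the model defines no special instruction template.

```python
import torch
from transformers import pipeline

# Assumed checkpoint id, as in the loading example above.
generator = pipeline(
    "text-generation",
    model="ibm-granite/granite-3b-code-base-2k",
    torch_dtype=torch.bfloat16,
)

# Show the function under test, then start a test class and let the
# model complete it.
prompt = '''def is_palindrome(s: str) -> bool:
    s = s.lower()
    return s == s[::-1]

# Unit tests for is_palindrome
import unittest

class TestIsPalindrome(unittest.TestCase):
    def test_'''

print(generator(prompt, max_new_tokens=120)[0]["generated_text"])
```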
## Frequently Asked Questions
**Q: What makes this model unique?**
The model's distinctive feature is its training approach, which combines a large volume of code with natural language data, making it effective for both code generation and code explanation. On code synthesis tasks it reports pass@1 scores of 36.6% on Python and 40.9% on Java.
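For reference, pass@1 is the fraction of benchmark problems solved by a single sampled completion. The standard unbiased estimator popularized by the HumanEval benchmark computes pass@k from n generated samples of which c pass the tests; a minimal implementation:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: 1 - C(n - c, k) / C(n, k), where n is the number
    of samples generated per problem and c the number that pass the tests."""
    if n - c < k:
        return 1.0  # every size-k draw contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# E.g., 20 samples per problem with 8 passing gives pass@1 = 8/20.
print(pass_at_k(20, 8, 1))  # 0.4
```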
**Q: What are the recommended use cases?**
The model is optimized for enterprise software engineering tasks, including code generation, documentation creation, bug fixing, and code translation. Note, however, that it has not undergone safety alignment, so generated code should be carefully reviewed before production use.