CodeLlama-13B-OASST-SFT-v10
Property | Value |
---|---|
Parameter Count | 13B |
Model Type | Causal decoder-only transformer |
License | LLAMA 2 Community License |
Training Steps | 6123 (BS 64) |
Tensor Type | BF16 |
What is codellama-13b-oasst-sft-v10?
CodeLlama-13B-OASST-SFT-v10 is an Open-Assistant fine-tuned version of Meta's CodeLlama 13B language model. This model represents a significant advancement in code-generation capabilities, incorporating the chatml standard prompt format for better compatibility with chat applications. It's trained on a carefully curated mix of datasets including OpenAssistant/oasst1 and specialized code datasets.
Implementation Details
The model utilizes a new RoPE Theta value of 1e6 (upgraded from 1e4) and requires loading with trust_remote_code=True. It's implemented using the epfLLM/Megatron-LLM trainer and supports multi-turn conversations through a structured prompt template system.
- Trained for 6123 steps with batch size 64
- Implements chatml standard prompt format
- Supports multiple conversation turns between user and assistant
- Includes integrated system message support
Core Capabilities
- Advanced code generation and completion
- Multi-turn conversation support
- Compatibility with chat inference/frontend applications
- Multilingual support with focus on European languages
- Safe and ethical response generation
Frequently Asked Questions
Q: What makes this model unique?
This model combines CodeLlama's powerful code generation capabilities with Open-Assistant's conversational abilities, using the widely-adopted chatml format. The integration of multiple high-quality datasets and careful fine-tuning makes it particularly effective for both coding and general assistance tasks.
Q: What are the recommended use cases?
The model is well-suited for code generation, technical assistance, and general conversation tasks. It's particularly effective in scenarios requiring both programming expertise and natural language understanding, while maintaining ethical guidelines and safety considerations.