CodeLlama-13B-OASST-SFT-v10

Property	Value
Parameter Count	13B
Model Type	Causal decoder-only transformer
License	LLAMA 2 Community License
Training Steps	6123 (BS 64)
Tensor Type	BF16

What is codellama-13b-oasst-sft-v10?

CodeLlama-13B-OASST-SFT-v10 is an Open-Assistant fine-tuned version of Meta's CodeLlama 13B language model. This model represents a significant advancement in code-generation capabilities, incorporating the chatml standard prompt format for better compatibility with chat applications. It's trained on a carefully curated mix of datasets including OpenAssistant/oasst1 and specialized code datasets.

Implementation Details

The model utilizes a new RoPE Theta value of 1e6 (upgraded from 1e4) and requires loading with trust_remote_code=True. It's implemented using the epfLLM/Megatron-LLM trainer and supports multi-turn conversations through a structured prompt template system.

Trained for 6123 steps with batch size 64
Implements chatml standard prompt format
Supports multiple conversation turns between user and assistant
Includes integrated system message support

Core Capabilities

Advanced code generation and completion
Multi-turn conversation support
Compatibility with chat inference/frontend applications
Multilingual support with focus on European languages
Safe and ethical response generation

Frequently Asked Questions

Q: What makes this model unique?

This model combines CodeLlama's powerful code generation capabilities with Open-Assistant's conversational abilities, using the widely-adopted chatml format. The integration of multiple high-quality datasets and careful fine-tuning makes it particularly effective for both coding and general assistance tasks.

Q: What are the recommended use cases?

The model is well-suited for code generation, technical assistance, and general conversation tasks. It's particularly effective in scenarios requiring both programming expertise and natural language understanding, while maintaining ethical guidelines and safety considerations.