ALLaM-7B-Instruct-preview
| Property | Value |
|---|---|
| Parameter Count | 7 Billion |
| Context Length | 4096 tokens |
| Training Tokens | 5.2T (4T English + 1.2T Arabic/English) |
| Developer | National Center for Artificial Intelligence at SDAIA |
| Model Type | Autoregressive Transformer |
| Languages | Arabic, English |
What is ALLaM-7B-Instruct-preview?
ALLaM-7B-Instruct-preview is a bilingual language model developed by the Saudi Data and AI Authority (SDAIA), designed to advance Arabic Language Technology while maintaining strong English capabilities. It was trained in two stages: initial training on English text, followed by continued training on mixed Arabic/English content.
Implementation Details
The model was trained with NVIDIA/MegatronLM using bf16 mixed precision, achieving approximately 42% Model FLOPs Utilization (MFU). It is designed to work without a predefined system prompt, though it also accepts custom system prompts in both Arabic and English.
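Because no system prompt is required, a conversation can be passed as a plain list of messages, with an optional system turn in either language. A minimal sketch of that message structure (the template markers below are illustrative only; in practice the model's own chat template, via the tokenizer's `apply_chat_template`, defines the real format):

```python
# Illustrative formatting only: the <|role|> markers are placeholders,
# not the model's actual chat template.
def format_messages(messages):
    """Render a list of {role, content} dicts into a single prompt string."""
    parts = [f"<|{msg['role']}|>\n{msg['content']}" for msg in messages]
    parts.append("<|assistant|>\n")  # cue the model to respond
    return "\n".join(parts)

# No system prompt is required; a user turn alone is valid.
english_only = format_messages([
    {"role": "user", "content": "Summarize the history of the Arabic language."}
])

# A custom system prompt, in Arabic or English, is also supported.
with_system = format_messages([
    {"role": "system", "content": "أجب باللغة العربية الفصحى."},  # "Answer in Modern Standard Arabic."
    {"role": "user", "content": "ما هي عاصمة المملكة العربية السعودية؟"},
])
```

The same message list can be handed directly to a chat-templated tokenizer; the helper above only makes the structure explicit.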
- Trained on 4T English tokens followed by 1.2T mixed Arabic/English tokens
- Instruction-tuned with 7M instructions and 260K preference pairs
- Supports 4096 token context length
- Built using state-of-the-art autoregressive transformer architecture
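With a 4096-token context window, inputs longer than the window must be truncated before generation. A rough sketch of that bookkeeping, using whitespace splitting as a crude stand-in for the model's real tokenizer (actual token counts would come from `tokenizer.encode`, and the 512-token output reserve is an assumed value, not from the model card):

```python
CONTEXT_LENGTH = 4096  # from the model card

def truncate_to_context(text, max_tokens=CONTEXT_LENGTH, reserve_for_output=512):
    """Keep only the most recent tokens that fit, leaving room for generation.

    Whitespace splitting approximates tokenization here; real code should
    count tokens with the model's tokenizer instead.
    """
    budget = max_tokens - reserve_for_output
    tokens = text.split()
    if len(tokens) <= budget:
        return text
    # Keep the tail of the input, which usually carries the live context.
    return " ".join(tokens[-budget:])
```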
Core Capabilities
- Superior performance on Arabic language tasks, outperforming many existing models on Arabic benchmarks
- Strong bilingual capabilities in both Arabic and English
- Flexible system prompt support for customized interactions
- Competitive performance on various evaluation metrics including MMLU, MT-bench, and Arabic-specific benchmarks
Frequently Asked Questions
Q: What makes this model unique?
ALLaM-7B-Instruct-preview stands out for its specialized focus on Arabic language processing while maintaining strong English capabilities, achieved through its innovative two-stage training process. It demonstrates superior performance on Arabic benchmarks while remaining competitive in English tasks.
Q: What are the recommended use cases?
The model is ideal for research and development in Arabic Language Technology, bilingual applications, and as a component in larger AI systems. It's particularly well-suited for tasks requiring strong understanding of both Arabic and English contexts, though developers should implement appropriate safety measures for production use.
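One such safety measure can be sketched as a thin wrapper around the generation call. Everything here is a placeholder (the blocklist, the refusal message, and the injected `generate` callable are hypothetical); a production deployment would use a proper moderation model or service rather than keyword matching:

```python
# Placeholder blocklist; a real deployment would use a dedicated
# moderation model or service, not keyword matching.
BLOCKED_TERMS = {"example_banned_term"}

REFUSAL = "This request cannot be processed."

def safe_generate(prompt, generate):
    """Call `generate` (any prompt -> str callable) only if the prompt passes the filter."""
    lowered = prompt.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return REFUSAL
    return generate(prompt)

# Usage with a stub standing in for the real model call:
echo = lambda p: f"model output for: {p}"
print(safe_generate("hello", echo))
```

Keeping the filter outside the model call lets the same wrapper sit in front of any backend, whether a local checkpoint or a hosted endpoint.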