SmolLM2-1.7B-Instruct
| Property | Value |
|---|---|
| Parameter Count | 1.7B |
| Training Tokens | 11 Trillion |
| License | Apache 2.0 |
| Architecture | Transformer decoder |
| Precision | BFloat16 |
What is SmolLM2-1.7B-Instruct?
SmolLM2-1.7B-Instruct is the instruction-tuned variant of the compact SmolLM2-1.7B base model. It delivers strong instruction following, knowledge recall, reasoning, and mathematics performance for its size, while remaining small enough for on-device deployment.
Implementation Details
The model was trained using a diverse dataset combination including FineWeb-Edu, DCLM, and The Stack, supplemented with specialized mathematics and coding datasets. The instruction-following capabilities were developed through supervised fine-tuning (SFT) and further refined using Direct Preference Optimization (DPO) with UltraFeedback.
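For illustration only, the snippet below sketches what a DPO refinement stage of this kind could look like with the TRL library. It is not the authors' training code; the dataset id (trl-lib/ultrafeedback_binarized), the hyperparameters, and the recent trl API (DPOConfig / DPOTrainer with processing_class) are assumptions.

```python
# Hypothetical sketch of a DPO preference-tuning stage with TRL.
# Not the official SmolLM2 recipe; dataset and hyperparameters are assumed.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

checkpoint = "HuggingFaceTB/SmolLM2-1.7B"  # in practice, start from the SFT checkpoint
model = AutoModelForCausalLM.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Preference pairs (chosen/rejected responses) derived from UltraFeedback.
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

args = DPOConfig(
    output_dir="smollm2-1.7b-dpo",   # where checkpoints are written
    beta=0.1,                        # strength of the preference regularization
    per_device_train_batch_size=2,
    logging_steps=10,
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,      # older trl releases use tokenizer= instead
)
trainer.train()
```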
- Advanced function calling capabilities, with a 27% score on the BFCL Leaderboard
- Specialized text rewriting and summarization capabilities
- Comprehensive support for chat-based interactions
- Optimized for both CPU and GPU deployment (see the usage sketch below)
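As a quick illustration of chat-based usage on either CPU or GPU, here is a minimal sketch using Hugging Face transformers. The checkpoint id (HuggingFaceTB/SmolLM2-1.7B-Instruct) and the generation settings are assumptions, not part of this card.

```python
# Minimal chat-usage sketch (illustrative; checkpoint id and sampling
# parameters are assumptions). Runs on CPU or GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.bfloat16 if device == "cuda" else torch.float32,
).to(device)

# Build the prompt with the model's chat template.
messages = [{"role": "user", "content": "Rewrite this politely: send me the report now."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.2,
    top_p=0.9,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```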
Core Capabilities
- Strong performance in zero-shot tasks with 66.1% accuracy on HellaSwag
- Impressive mathematical reasoning with 48.2% accuracy on GSM8K (5-shot)
- Enhanced instruction following with a 56.7% average score on IFEval
- Efficient at text rewriting and summarization tasks
Frequently Asked Questions
Q: What makes this model unique?
SmolLM2-1.7B-Instruct stands out for its balance of model size and performance. At 1.7B parameters it remains compact, yet it achieves results competitive with larger models on several benchmarks, making it well suited to resource-constrained environments.
Q: What are the recommended use cases?
The model excels in instruction following, text rewriting, summarization, and function calling tasks. It's particularly well-suited for applications requiring on-device deployment or resource-efficient processing while maintaining high-quality output.
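For example, a summarization request can be issued through the transformers text-generation pipeline using chat-formatted messages. This is only a sketch; the checkpoint id, the input text, and the generation settings are assumptions.

```python
# Illustrative summarization sketch via the text-generation pipeline
# (requires a recent transformers release with chat-message support).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="HuggingFaceTB/SmolLM2-1.7B-Instruct",  # assumed Hub id
)

article = (
    "Small language models are increasingly attractive because they can run "
    "on laptops and phones while still handling everyday assistant tasks."
)
messages = [
    {"role": "user", "content": f"Summarize the following text in one sentence:\n\n{article}"}
]

result = generator(messages, max_new_tokens=64, do_sample=False)
# The pipeline returns the full conversation; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```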