SmolLM2-135M-Instruct

Maintained By
HuggingFaceTB

| Property | Value |
|---|---|
| Parameter Count | 135M |
| Training Tokens | 2 trillion |
| License | Apache 2.0 |
| Architecture | Transformer decoder |
| Precision | BFloat16 |

What is SmolLM2-135M-Instruct?

SmolLM2-135M-Instruct is a compact language model designed for efficient instruction following and general text generation. As part of the SmolLM2 family, it improves substantially on the first-generation SmolLM models, particularly in instruction following, knowledge application, and reasoning. It was trained on a diverse corpus of 2 trillion tokens drawn from sources including FineWeb-Edu, DCLM, and The Stack.

Implementation Details

The model was post-trained with supervised fine-tuning (SFT) followed by Direct Preference Optimization (DPO) on the UltraFeedback dataset, using 64 H100 GPUs and the nanotron framework. This pipeline yields measurable gains over the base model across standard benchmarks.
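The DPO objective used in this post-training stage can be sketched in a few lines. The log-probabilities and `beta` value below are illustrative toy numbers, not SmolLM2's actual training configuration:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair.

    Each argument is the summed log-probability of the chosen or
    rejected response under the policy or the frozen reference model.
    """
    # Implicit reward: how much more the policy prefers a response
    # than the reference model does, scaled by beta.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)): small when the policy clearly prefers
    # the chosen response, large otherwise.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A wider preference margin gives a lower loss.
bad = dpo_loss(-12.0, -10.0, -11.0, -11.0)    # policy leans toward the rejected answer
good = dpo_loss(-10.0, -14.0, -11.0, -11.0)   # policy prefers the chosen answer
```

Minimizing this loss pushes the policy to rank the human-preferred response above the rejected one without an explicit reward model.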

  • Zero-shot performance improvements over predecessor in multiple benchmarks
  • Supports text rewriting and summarization tasks
  • Optimized for efficient on-device deployment
  • Implements chat template for conversational applications
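The chat template mentioned above wraps each conversation turn in role markers before tokenization. A minimal sketch, assuming the ChatML-style `<|im_start|>`/`<|im_end|>` markers commonly used by SmolLM2's tokenizer (in practice, call `tokenizer.apply_chat_template`, which is authoritative):

```python
def apply_chat_template(messages, add_generation_prompt=True):
    """Format a list of {'role', 'content'} dicts into one prompt string.

    The ChatML-style markers here are an assumption for illustration;
    the tokenizer's bundled template defines the real format.
    """
    prompt = ""
    for msg in messages:
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so the model generates the reply.
        prompt += "<|im_start|>assistant\n"
    return prompt

messages = [{"role": "user", "content": "What is the capital of France?"}]
prompt = apply_chat_template(messages)
```

Using the template bundled with the tokenizer, rather than hand-building strings like this, keeps prompts consistent with how the instruct model was fine-tuned.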

Core Capabilities

  • Instruction following with 29.9% average performance on IFEval
  • Strong performance on reasoning tasks (28.2% on BBH 3-shot)
  • Efficient text generation and summarization
  • Lightweight deployment options with ONNX and Transformers.js support

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its exceptional performance-to-size ratio, delivering strong capabilities in instruction following and reasoning tasks despite its compact 135M parameter size. It's specifically optimized for on-device deployment while maintaining competitive performance metrics.

Q: What are the recommended use cases?

The model is well-suited for text generation, summarization, and instruction-following tasks. It's particularly valuable for applications requiring efficient on-device deployment or where computational resources are limited, while still needing reliable language model capabilities.
