SmolLM-1.7B-Instruct

Maintained by: HuggingFaceTB

  • Parameter Count: 1.71B
  • Model Type: Instruction-tuned Language Model
  • License: Apache 2.0
  • Precision: BF16

What is SmolLM-1.7B-Instruct?

SmolLM-1.7B-Instruct is part of the SmolLM series, a family of compact yet capable instruction-tuned language models. It is designed to balance efficiency with capability and was trained on a carefully curated mix of datasets spanning everyday conversations, programming instructions, and general knowledge content.
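
A minimal quick-start sketch with the transformers library is shown below. The checkpoint ID, chat-template usage, and sampling settings are assumptions based on the SmolLM series conventions, not confirmed specifics from this card:

```python
# Hypothetical quick-start; assumes the checkpoint ID
# "HuggingFaceTB/SmolLM-1.7B-Instruct" and a recent transformers release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM-1.7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# BF16 matches the precision listed above; fall back to float32 on
# hardware without BF16 support.
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16)

# Format the conversation with the model's chat template.
messages = [{"role": "user", "content": "What is the capital of France?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

outputs = model.generate(inputs, max_new_tokens=64, do_sample=True, temperature=0.2, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```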

Implementation Details

The model is fine-tuned with the alignment-handbook framework using a learning rate of 1e-3, a cosine learning-rate schedule, a warmup ratio of 0.1, and a global batch size of 262k tokens, drawing on multiple high-quality datasets for instruction tuning.
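
Mapped onto transformers TrainingArguments, those settings would look roughly like the sketch below. This is illustrative only; the actual alignment-handbook run is driven by its own YAML configs, and the batch-size values here are placeholders:

```python
# Illustrative mapping of the stated SFT hyperparameters onto
# transformers.TrainingArguments; not the exact alignment-handbook config.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="smollm-1.7b-instruct-sft",  # hypothetical output path
    learning_rate=1e-3,                     # stated learning rate
    lr_scheduler_type="cosine",             # cosine schedule
    warmup_ratio=0.1,                       # stated warmup ratio
    bf16=True,                              # BF16 precision from the spec list
    # The 262k-token global batch is the product of sequence length,
    # per-device batch size, gradient accumulation, and GPU count;
    # the values below are placeholders, not the published setup.
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
)
```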

  • Trained on four specialized datasets including Magpie-Pro-300K-Filtered and OpenHermes-2.5
  • Optimized for deployment with support for ONNX and Safetensors formats (see the export sketch after this list)
  • Implements efficient BF16 precision for balanced performance and resource usage
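
For the ONNX path, a hedged export sketch using the optimum library's ONNX Runtime integration might look as follows; the checkpoint ID is an assumption, and export behavior should be verified against the optimum documentation:

```python
# Hypothetical ONNX export/inference sketch via optimum's ONNX Runtime backend.
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM-1.7B-Instruct"  # assumed checkpoint ID
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# export=True converts the PyTorch (safetensors) weights to ONNX on the fly.
model = ORTModelForCausalLM.from_pretrained(checkpoint, export=True)

inputs = tokenizer("Write a haiku about small language models.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```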

Core Capabilities

  • General knowledge question answering
  • Creative writing tasks
  • Basic Python programming
  • Everyday conversation handling
  • English language processing
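
As an illustration of the conversation-handling capability, a multi-turn exchange can be run through the chat template, reusing the (assumed) model and tokenizer from the quick-start sketch above:

```python
# Multi-turn conversation sketch; `model` and `tokenizer` come from the
# quick-start example, and the checkpoint ID remains an assumption.
messages = [
    {"role": "user", "content": "Suggest a quick weeknight dinner."},
    {"role": "assistant", "content": "How about a vegetable stir-fry with rice?"},
    {"role": "user", "content": "Sounds good. Which vegetables work best?"},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=96, do_sample=True, temperature=0.6, top_p=0.9)
# Decode only the newly generated tokens, dropping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```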

Frequently Asked Questions

Q: What makes this model unique?

SmolLM-1.7B-Instruct stands out for its efficient architecture and careful tuning for everyday tasks at a relatively small parameter count. The v0.2 release shows significant improvements in staying on topic and responding appropriately to standard prompts.

Q: What are the recommended use cases?

The model is best suited for general knowledge questions, creative writing, basic programming tasks, and everyday conversations. However, it may have limitations with complex arithmetic, detailed editing tasks, and sophisticated reasoning.
