Phi-3.5-mini-instruct

Maintained by: microsoft

| Property | Value |
|---|---|
| Parameter Count | 3.82B |
| Context Length | 128K tokens |
| License | MIT |
| Paper | Technical Report |
| Supported Languages | 23, including English, Chinese, Arabic, and German |

What is Phi-3.5-mini-instruct?

Phi-3.5-mini-instruct is a lightweight, state-of-the-art language model that achieves remarkable performance despite its compact size of 3.82B parameters. Built upon the datasets used for Phi-3, it focuses on high-quality, reasoning-dense data and supports an impressive 128K token context length.

Implementation Details

The model uses a decoder-only Transformer architecture and was refined through supervised fine-tuning, proximal policy optimization (PPO), and direct preference optimization (DPO). For best performance it targets recent NVIDIA GPUs and has been tested on the A100, A6000, and H100.

  • Training involved 3.4T tokens across multiple data sources
  • Supports flash attention for improved performance
  • Implements robust safety measures and instruction adherence
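The flash-attention support noted above is enabled at load time. As a minimal sketch, the helper below assembles the keyword arguments commonly passed to `AutoModelForCausalLM.from_pretrained` in Hugging Face transformers; the exact flags (`attn_implementation`, `trust_remote_code`) should be verified against the official model card before use.

```python
# Sketch: load-time options for Phi-3.5-mini-instruct with transformers.
# These kwargs reflect common transformers usage, not an official recipe.

def build_load_kwargs(use_flash_attention: bool = True) -> dict:
    """Assemble keyword arguments for AutoModelForCausalLM.from_pretrained."""
    kwargs = {
        "torch_dtype": "auto",      # let transformers pick the checkpoint dtype
        "device_map": "auto",       # place layers on available accelerators
        "trust_remote_code": True,  # the Phi-3 family ships custom model code
    }
    if use_flash_attention:
        # Flash attention speeds up the 128K-token context path on supported GPUs
        kwargs["attn_implementation"] = "flash_attention_2"
    return kwargs

# Usage (downloads several GB of weights; flash attention needs a CUDA GPU):
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "microsoft/Phi-3.5-mini-instruct", **build_load_kwargs()
# )
```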

Core Capabilities

  • Multilingual support across 23 languages with competitive performance
  • Strong performance in reasoning tasks, particularly in code, math, and logic
  • Long-context understanding with 128K token support
  • Efficient operation in memory/compute constrained environments
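Instruction-tuned models like this one expect chat-formatted prompts. The sketch below hand-builds a prompt in the Phi-3-style format; the special tokens (`<|system|>`, `<|user|>`, `<|assistant|>`, `<|end|>`) are our reading of that family's template, so in practice prefer `tokenizer.apply_chat_template`, which applies the model's official template.

```python
# Sketch: composing a Phi-3-style chat prompt by hand (illustrative only;
# the model's tokenizer carries the authoritative chat template).

def format_chat(messages: list[dict]) -> str:
    """Render role/content messages into a single prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}<|end|>\n")
    parts.append("<|assistant|>\n")  # cue the model to start its reply
    return "".join(parts)

prompt = format_chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Solve 2x + 3 = 9."},
])
```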

Frequently Asked Questions

Q: What makes this model unique?

The model's ability to achieve performance comparable to much larger models (7B-12B parameters) while maintaining a compact size of 3.82B parameters makes it unique. It also offers extensive multilingual capabilities and long context support, making it versatile for various applications.

Q: What are the recommended use cases?

The model is ideal for scenarios requiring:

  • Memory/compute constrained environments
  • Latency-sensitive applications
  • Strong reasoning capabilities in code and math
  • Multilingual support

It's particularly suitable for commercial and research applications needing efficient language processing.
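For the memory-constrained case, a back-of-envelope estimate of weight storage follows directly from the 3.82B parameter count. The sketch below counts parameter bytes only; the KV cache and activations add more on top, especially near the 128K-token context limit.

```python
# Sketch: rough weight-memory estimate for constrained deployments.
# Counts parameter storage only (no KV cache or activations).

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed to hold the weights, in gigabytes."""
    return num_params * bytes_per_param / 1e9

fp16_gb = weight_memory_gb(3.82e9, 2)    # half precision: ~7.64 GB
int4_gb = weight_memory_gb(3.82e9, 0.5)  # 4-bit quantized: ~1.91 GB
```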
