Orca-2-13b
Property | Value |
---|---|
Author | Microsoft |
License | Microsoft Research License |
Research Paper | arXiv:2311.11045 |
Base Model | LLAMA-2 |
What is Orca-2-13b?
Orca-2-13b is Microsoft's advanced language model specifically designed for research purposes. Built upon the LLAMA-2 architecture, this 13B parameter model specializes in complex reasoning tasks, including data analysis, reading comprehension, mathematical problem-solving, and text summarization. The model represents a significant advancement in teaching Small Language Models (SLMs) new capabilities through synthetic data and complex workflows.
Implementation Details
The model is implemented using the PyTorch framework and leverages the Transformers architecture. It's optimized for single-turn responses and operates without RLHF or DPO training. The model utilizes a special tokenization system and supports both CPU and GPU inference, with particular attention to safe inference through Azure AI Content Safety integration.
- Built on LLAMA-2 architecture with 13B parameters
- Trained on carefully moderated synthetic dataset
- Implements special markup tokens for system and user messages
- Supports integration with Azure AI Content Safety for content moderation
Core Capabilities
- Advanced reasoning over user-provided data
- Sophisticated reading comprehension
- Mathematical problem-solving abilities
- Text summarization
- Single-turn response optimization
- Research-focused applications
Frequently Asked Questions
Q: What makes this model unique?
Orca-2-13b's uniqueness lies in its specialized training approach using synthetic data to enhance reasoning capabilities in smaller language models. It demonstrates that complex capabilities can be effectively transferred to more compact models through careful training design.
Q: What are the recommended use cases?
The model is primarily intended for research purposes and performs best in scenarios requiring complex reasoning, data analysis, and single-turn responses. It's particularly suitable for academic research on model capabilities and development of better frontier models.