EXAONE-3.0-7.8B-Instruct
Property | Value |
---|---|
Parameter Count | 7.8 Billion |
Training Tokens | 8 Trillion |
License | EXAONE AI Model License Agreement 1.1 - NC |
Author | LG AI Research |
Model Type | Instruction-tuned Language Model |
Languages | English and Korean (Bilingual) |
What is EXAONE-3.0-7.8B-Instruct?
EXAONE-3.0-7.8B-Instruct is a state-of-the-art bilingual language model developed by LG AI Research. This model represents a significant advancement in multilingual AI capabilities, featuring 7.8 billion parameters and trained on an impressive 8 trillion curated tokens. The model underwent both pre-training and post-training phases, including supervised fine-tuning and direct preference optimization.
Implementation Details
The model leverages advanced transformer architecture and requires transformers v4.41 or later for optimal performance. It supports both English and Korean inputs, with built-in system prompts for enhanced interaction. The model operates with bfloat16 precision and includes automatic device mapping for efficient resource utilization.
- Pre-trained on 8T curated tokens
- Supports chat template for structured interactions
- Implements advanced instruction-tuning techniques
- Features automatic device mapping and optimization
Core Capabilities
- Achieves 9.01 on MT-Bench (English) and 8.92 on KoMT-Bench (Korean)
- Outperforms similar-sized models like Llama 3.1 8B and Gemma 2 9B
- Excels in multi-turn conversations and complex reasoning tasks
- Demonstrates strong performance in both English and Korean language understanding
Frequently Asked Questions
Q: What makes this model unique?
EXAONE-3.0-7.8B-Instruct stands out for its exceptional bilingual capabilities and state-of-the-art performance metrics, particularly in benchmarks like MT-Bench and Arena-Hard-v0.1. It achieves superior results compared to other models in its size category while maintaining efficient resource usage.
Q: What are the recommended use cases?
The model is particularly well-suited for bilingual applications requiring advanced language understanding in both English and Korean. It excels in tasks such as text generation, instruction following, and complex reasoning, making it ideal for chatbots, content generation, and language assistance applications.