mbart-large-turkish-summarization
| Property | Value |
|---|---|
| Parameter Count | 611M |
| Base Model | facebook/mbart-large-50 |
| Paper | Mukayese: Turkish NLP Strikes Back |
| ROUGE-1 Score | 46.70 |
What is mbart-large-turkish-summarization?
This is a specialized Turkish text summarization model released as part of the Mukayese Turkish NLP benchmark project, built on Facebook's mBART-large-50 architecture. It was fine-tuned on the Turkish portion of the MLSUM dataset (mlsum/tu) to generate high-quality abstractive summaries of Turkish text, primarily news articles.
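Before the training details, here is a minimal usage sketch with the Hugging Face transformers pipeline. The Hub id mukayese/mbart-large-turkish-sum is an assumption inferred from the model and project names, and the input text is a placeholder:

```python
# Minimal usage sketch. The model id below is an assumption inferred from the
# model/project names; substitute the actual Hugging Face Hub id if it differs.
from transformers import pipeline

summarizer = pipeline("summarization", model="mukayese/mbart-large-turkish-sum")

article = "..."  # placeholder: paste a Turkish news article here
result = summarizer(article, max_length=128, min_length=20, do_sample=False)
print(result[0]["summary_text"])
```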
Implementation Details
The model uses the mBART-large-50 sequence-to-sequence transformer architecture (611M parameters) and was trained with mixed precision (Native AMP) across 8 GPUs. Key hyperparameters include a learning rate of 5e-05, an effective batch size of 64, and a label smoothing factor of 0.1; these settings are collected in the configuration sketch after the list below.
- Trained for 10 epochs with the Adam optimizer
- Implements gradient accumulation steps of 4
- Utilizes linear learning rate scheduling
- Achieves strong ROUGE scores on mlsum/tu (ROUGE-1: 46.70, ROUGE-2: 34.01)
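The hyperparameters above can be expressed as Hugging Face Seq2SeqTrainingArguments. This is a sketch, not the authors' training script; the output directory is hypothetical, and the per-device batch size of 2 is inferred from the reported effective batch of 64 (8 GPUs x 2 per device x 4 accumulation steps):

```python
# Configuration sketch mirroring the reported hyperparameters; not the
# authors' actual training script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-large-turkish-sum",  # hypothetical output path
    learning_rate=5e-05,                   # reported learning rate
    per_device_train_batch_size=2,         # inferred: 8 GPUs x 2 x 4 steps = 64
    gradient_accumulation_steps=4,         # reported accumulation steps
    num_train_epochs=10,                   # reported epoch count
    lr_scheduler_type="linear",            # reported linear LR schedule
    label_smoothing_factor=0.1,            # reported label smoothing
    fp16=True,                             # mixed-precision training (Native AMP)
)
```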
Core Capabilities
- Abstractive summarization of Turkish text
- Trained with a multi-GPU, mixed-precision setup
- Deployable through the standard Hugging Face transformers pipeline
- State-of-the-art ROUGE scores on the mlsum/tu benchmark
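To sanity-check the reported numbers, ROUGE can be recomputed on the MLSUM Turkish test split. This sketch assumes the datasets and evaluate libraries and reuses the assumed Hub id from above; the tiny test slice is for illustration only:

```python
# Evaluation sketch: recompute ROUGE on a small slice of mlsum/tu.
from datasets import load_dataset
from transformers import pipeline
import evaluate

dataset = load_dataset("mlsum", "tu", split="test[:16]")  # tiny slice for illustration
summarizer = pipeline("summarization", model="mukayese/mbart-large-turkish-sum")

predictions = [
    out["summary_text"]
    for out in summarizer(dataset["text"], max_length=128, truncation=True)
]
rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=dataset["summary"]))
```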
Frequently Asked Questions
Q: What makes this model unique?
It is specifically optimized for Turkish summarization: the multilingual mBART-large-50 model is fine-tuned on Turkish news articles from mlsum/tu, where it reaches state-of-the-art ROUGE scores.
Q: What are the recommended use cases?
The model is well suited to Turkish text summarization tasks, particularly news article summarization, content condensation, and automated abstract generation for Turkish-language content.
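For abstract generation, finer control over decoding than the pipeline default can help. Below is a lower-level sketch with illustrative beam-search settings (not values from the paper), again using the assumed Hub id:

```python
# Lower-level decoding sketch; beam-search settings are illustrative defaults.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "mukayese/mbart-large-turkish-sum"  # assumed Hub id, as above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "..."  # placeholder for a Turkish news article
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
summary_ids = model.generate(
    **inputs,
    num_beams=4,             # beam search for more fluent output
    max_length=128,
    no_repeat_ngram_size=3,  # discourage repeated phrases
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```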