mbart-large-turkish-summarization
| Property | Value |
|---|---|
| Parameter Count | 611M |
| Base Model | facebook/mbart-large-50 |
| Paper | Mukayese: Turkish NLP Strikes Back |
| ROUGE-1 Score | 46.70 |
What is mbart-large-turkish-summarization?
This is a specialized Turkish text summarization model released as part of the Mukayese Turkish NLP benchmark project, built on Facebook's mBART-large-50 architecture. It was fine-tuned on the Turkish portion of the MLSUM dataset (mlsum/tu) to generate high-quality abstractive summaries of Turkish text, primarily news articles.
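Before the training details, here is a minimal usage sketch with the Hugging Face transformers pipeline. The Hub id mukayese/mbart-large-turkish-sum is an assumption inferred from the model and project names, and the input text is a placeholder:

```python
# Minimal usage sketch. The model id below is an assumption inferred from the
# model/project names; substitute the actual Hugging Face Hub id if it differs.
from transformers import pipeline

summarizer = pipeline("summarization", model="mukayese/mbart-large-turkish-sum")

article = "..."  # placeholder: paste a Turkish news article here
result = summarizer(article, max_length=128, min_length=20, do_sample=False)
print(result[0]["summary_text"])
```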
Implementation Details
The model uses the mBART-large-50 sequence-to-sequence transformer architecture (611M parameters) and was trained with mixed precision (Native AMP) across 8 GPUs. Key hyperparameters include a learning rate of 5e-05, an effective batch size of 64, and a label smoothing factor of 0.1; these settings are collected in the configuration sketch after the list below.
- Trained for 10 epochs with the Adam optimizer
- Implements gradient accumulation steps of 4
- Utilizes linear learning rate scheduling
- Achieves strong ROUGE scores on mlsum/tu (ROUGE-1: 46.70, ROUGE-2: 34.01)
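The hyperparameters above can be expressed as Hugging Face Seq2SeqTrainingArguments. This is a sketch, not the authors' training script; the output directory is hypothetical, and the per-device batch size of 2 is inferred from the reported effective batch of 64 (8 GPUs x 2 per device x 4 accumulation steps):

```python
# Configuration sketch mirroring the reported hyperparameters; not the
# authors' actual training script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-large-turkish-sum",  # hypothetical output path
    learning_rate=5e-05,                   # reported learning rate
    per_device_train_batch_size=2,         # inferred: 8 GPUs x 2 x 4 steps = 64
    gradient_accumulation_steps=4,         # reported accumulation steps
    num_train_epochs=10,                   # reported epoch count
    lr_scheduler_type="linear",            # reported linear LR schedule
    label_smoothing_factor=0.1,            # reported label smoothing
    fp16=True,                             # mixed-precision training (Native AMP)
)
```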
Core Capabilities
- Abstractive summarization of Turkish text
- Trained with a multi-GPU, mixed-precision setup
- Deployable through the standard Hugging Face transformers pipeline
- State-of-the-art ROUGE scores on the mlsum/tu benchmark
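To sanity-check the reported numbers, ROUGE can be recomputed on the MLSUM Turkish test split. This sketch assumes the datasets and evaluate libraries and reuses the assumed Hub id from above; the tiny test slice is for illustration only:

```python
# Evaluation sketch: recompute ROUGE on a small slice of mlsum/tu.
from datasets import load_dataset
from transformers import pipeline
import evaluate

dataset = load_dataset("mlsum", "tu", split="test[:16]")  # tiny slice for illustration
summarizer = pipeline("summarization", model="mukayese/mbart-large-turkish-sum")

predictions = [
    out["summary_text"]
    for out in summarizer(dataset["text"], max_length=128, truncation=True)
]
rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=dataset["summary"]))
```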
Frequently Asked Questions
Q: What makes this model unique?
It is specifically optimized for Turkish summarization: the multilingual mBART-large-50 model is fine-tuned on Turkish news articles from mlsum/tu, where it reaches state-of-the-art ROUGE scores.
Q: What are the recommended use cases?
The model is well suited to Turkish text summarization tasks, particularly news article summarization, content condensation, and automated abstract generation for Turkish-language content.
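For abstract generation, finer control over decoding than the pipeline default can help. Below is a lower-level sketch with illustrative beam-search settings (not values from the paper), again using the assumed Hub id:

```python
# Lower-level decoding sketch; beam-search settings are illustrative defaults.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "mukayese/mbart-large-turkish-sum"  # assumed Hub id, as above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "..."  # placeholder for a Turkish news article
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
summary_ids = model.generate(
    **inputs,
    num_beams=4,             # beam search for more fluent output
    max_length=128,
    no_repeat_ngram_size=3,  # discourage repeated phrases
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```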