mbart-large-turkish-summarization

Maintained by: mukayese

Property         Value
Parameter Count  611M
Base Model       facebook/mbart-large-50
Paper            Mukayese: Turkish NLP Strikes Back
ROUGE-1 Score    46.70

What is mbart-large-turkish-summarization?

This is a Turkish text summarization model developed by Mukayese, built on Facebook's mBART-large-50 architecture. It was fine-tuned on the mlsum/tu dataset (the Turkish portion of the MLSUM news summarization corpus) to generate abstractive summaries of Turkish text.
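
For orientation, here is a minimal inference sketch using the Hugging Face transformers library. The hub id mukayese/mbart-large-turkish-summarization and the decoding settings are assumptions for illustration, not values stated on this page.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "mukayese/mbart-large-turkish-summarization"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "..."  # a Turkish news article to summarize

# Tokenize, generate with beam search, and decode the summary.
inputs = tokenizer(article, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(
    **inputs,
    num_beams=4,        # illustrative decoding settings
    max_length=128,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```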

Implementation Details

The model uses the mBART-large-50 sequence-to-sequence transformer architecture with 611M parameters and was trained with mixed-precision (Native AMP) across 8 GPUs. Key hyperparameters include a learning rate of 5e-05, an effective batch size of 64, and a label smoothing factor of 0.1 (see the configuration sketch after the list below).

  • Trained for 10 epochs using Adam optimizer
  • Implements gradient accumulation steps of 4
  • Utilizes linear learning rate scheduling
  • Achieves ROUGE-1: 46.70 and ROUGE-2: 34.01 on mlsum/tu
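
A sketch of a Seq2SeqTrainingArguments configuration mirroring the hyperparameters above; the output path and the per-device batch size are assumptions (2 per device × 4 accumulation steps × 8 GPUs = effective batch size 64).

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-large-turkish-sum",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=2,   # assumed split: 2 x 4 accum x 8 GPUs = 64
    gradient_accumulation_steps=4,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    label_smoothing_factor=0.1,
    fp16=True,                       # mixed-precision training (Native AMP)
)
```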

Core Capabilities

  • Efficient Turkish text summarization
  • Multi-GPU training support
  • Optimized for production deployment
  • State-of-the-art performance on the mlsum/tu benchmark (see the evaluation sketch below)
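
To reproduce numbers like the ROUGE scores above, an evaluation sketch with the datasets and evaluate libraries could look as follows. The hub id, the sample size, and the choice of test split are assumptions; the page does not state which split the reported scores come from.

```python
import evaluate
from datasets import load_dataset
from transformers import pipeline

# The hub id is an assumption (see the inference sketch above).
summarizer = pipeline("summarization",
                      model="mukayese/mbart-large-turkish-summarization")
rouge = evaluate.load("rouge")

# Small illustrative slice of the Turkish MLSUM test split.
test_set = load_dataset("mlsum", "tu", split="test").select(range(8))
predictions = [out["summary_text"]
               for out in summarizer(test_set["text"], truncation=True)]
print(rouge.compute(predictions=predictions, references=test_set["summary"]))
```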

Frequently Asked Questions

Q: What makes this model unique?

This model is specifically optimized for Turkish language summarization, utilizing the powerful mBART architecture with custom fine-tuning on Turkish news articles, achieving state-of-the-art ROUGE scores.

Q: What are the recommended use cases?

The model is ideal for Turkish text summarization tasks, particularly in news article summarization, content condensation, and automated abstract generation for Turkish language content.
