mbart-large-turkish-summarization

Maintained By
mukayese

Property         Value
---------------  ----------------------------------
Parameter Count  611M
Base Model       facebook/mbart-large-50
Paper            Mukayese: Turkish NLP Strikes Back
ROUGE-1 Score    46.70

What is mbart-large-turkish-summarization?

This is a Turkish text summarization model developed by Mukayese, based on Facebook's mBART-large-50 architecture. It was fine-tuned on the mlsum/tu news dataset to generate abstractive summaries of Turkish text.
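
A typical way to use such a checkpoint is through the Transformers summarization pipeline. This is a minimal sketch, assuming the model is published on the Hugging Face Hub under the ID implied by the card title and maintainer (`mukayese/mbart-large-turkish-summarization`); verify the exact ID on the Hub before use.

```python
from transformers import pipeline


def summarize_turkish(text: str, max_length: int = 128) -> str:
    """Summarize Turkish text with the Mukayese mBART checkpoint.

    The model ID below is assumed from the card's title and maintainer.
    """
    summarizer = pipeline(
        "summarization",
        model="mukayese/mbart-large-turkish-summarization",
    )
    # The pipeline returns a list of dicts, one per input text.
    result = summarizer(text, max_length=max_length, min_length=16)
    return result[0]["summary_text"]
```

The first call downloads the ~611M-parameter checkpoint, so expect a sizable initial download; subsequent calls use the local cache.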

Implementation Details

The model is built on the mBART-large-50 transformer architecture with 611M parameters and was trained with mixed-precision (Native AMP) across 8 GPUs, using a learning rate of 5e-05, a batch size of 64, and a label smoothing factor of 0.1.

  • Trained for 10 epochs using Adam optimizer
  • Implements gradient accumulation steps of 4
  • Utilizes linear learning rate scheduling
  • Achieves ROUGE-1: 46.70 and ROUGE-2: 34.01 on mlsum/tu
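
The reported hyperparameters can be collected into a plain configuration sketch. This is an illustrative summary of the values stated on the card, not the authors' actual training script, and the field names are ad hoc rather than any library's API:

```python
# Training setup as reported on the model card (illustrative field names).
training_config = {
    "base_model": "facebook/mbart-large-50",
    "dataset": "mlsum/tu",
    "learning_rate": 5e-05,
    "batch_size": 64,
    "gradient_accumulation_steps": 4,
    "label_smoothing_factor": 0.1,
    "num_epochs": 10,
    "optimizer": "Adam",
    "lr_scheduler": "linear",
    "mixed_precision": "native_amp",  # fp16 via Native AMP
    "num_gpus": 8,
}
```

These values map naturally onto `Seq2SeqTrainingArguments` in the Transformers `Trainer` API if you want to reproduce a similar fine-tuning run.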

Core Capabilities

  • Efficient Turkish text summarization
  • Fine-tuned with a multi-GPU (8×) training setup
  • Optimized for production deployment
  • State-of-the-art performance on mlsum dataset

Frequently Asked Questions

Q: What makes this model unique?

This model is specifically optimized for Turkish language summarization, combining the multilingual mBART architecture with fine-tuning on Turkish news articles, and achieves state-of-the-art ROUGE scores on mlsum/tu.

Q: What are the recommended use cases?

The model is ideal for Turkish text summarization tasks, particularly in news article summarization, content condensation, and automated abstract generation for Turkish language content.
