# rut5_base_sum_gazeta
| Property | Value |
|---|---|
| Author | IlyaGusev |
| Task | Russian Abstractive Summarization |
| Base Model | RuT5-base |
| Training Data | Gazeta Dataset |
## What is rut5_base_sum_gazeta?
rut5_base_sum_gazeta is a specialized Russian language model for abstractive text summarization. Built on the RuT5-base architecture, it was trained on the Gazeta dataset to generate concise, coherent summaries of Russian news articles. The model delivers competitive results, achieving a ROUGE-1 score of 32.2, comparable to larger multilingual models such as mBART.
## Implementation Details
The model uses a T5-based encoder-decoder architecture optimized for summarization. It processes input texts of up to 600 tokens and generates summaries with a maximum length of 200 tokens. N-gram repetition prevention (no_repeat_ngram_size=4) is applied at generation time to reduce repetitive output.
- Maximum input length: 600 tokens
- Maximum output length: 200 tokens
- No-repeat n-gram size: 4
- Average output length: 330 characters
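The generation settings above can be put together into a minimal inference sketch using the Hugging Face `transformers` library. Note that the exact Hub checkpoint name `IlyaGusev/rut5_base_sum_gazeta` is an assumption inferred from the author and model name in this card:

```python
# Minimal inference sketch for the model described above.
# Assumption: the checkpoint is published on the Hugging Face Hub as
# "IlyaGusev/rut5_base_sum_gazeta" (inferred from author + model name).
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "IlyaGusev/rut5_base_sum_gazeta"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

article = "Текст новости на русском языке."  # placeholder Russian news text

# Truncate the input to the 600-token limit noted above.
input_ids = tokenizer(
    article,
    max_length=600,
    truncation=True,
    return_tensors="pt",
).input_ids

# Generate with the documented constraints: at most 200 output tokens
# and no repeated 4-grams.
output_ids = model.generate(
    input_ids=input_ids,
    max_length=200,
    no_repeat_ngram_size=4,
)
summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(summary)
```

For batch summarization, the same call works with a list of articles and `padding=True` in the tokenizer.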
## Core Capabilities
- Abstractive summarization of Russian texts
- ROUGE-1 F1 score: 32.2
- ROUGE-2 F1 score: 14.4
- ROUGE-L F1 score: 28.1
- METEOR score: 25.7
- BLEU score: 12.3
## Frequently Asked Questions
**Q: What makes this model unique?**

A: This model stands out for its specialized focus on Russian news summarization, offering performance comparable to larger models like mBART while maintaining a more efficient architecture based on RuT5-base. Its optimization for the Gazeta dataset makes it particularly effective for news-related content.
**Q: What are the recommended use cases?**

A: The model is best suited for summarizing Russian news articles and similar journalistic content. It is optimized for texts that fit within the 600-token input window and works best when generating concise summaries of around 330 characters.
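Because the 600-token limit refers to tokens rather than characters, a quick pre-check with the model's tokenizer can tell whether an article will be silently truncated. A small sketch, again assuming the Hub checkpoint name `IlyaGusev/rut5_base_sum_gazeta`:

```python
# Check whether a text exceeds the model's 600-token input window.
# Assumption: checkpoint name inferred from author + model name.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("IlyaGusev/rut5_base_sum_gazeta")

def will_be_truncated(text: str, limit: int = 600) -> bool:
    """Return True if `text` tokenizes to more than `limit` tokens."""
    n_tokens = len(tokenizer(text).input_ids)
    return n_tokens > limit

short_text = "Короткая новость."  # a few tokens, well under the limit
print(will_be_truncated(short_text))
```

Texts that exceed the limit can be split into chunks and summarized chunk by chunk, at some cost to global coherence.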