# rut5_base_sum_gazeta
| Property | Value |
|---|---|
| Author | IlyaGusev |
| Task | Russian Abstractive Summarization |
| Base Model | RuT5-base |
| Training Data | Gazeta Dataset |
## What is rut5_base_sum_gazeta?
rut5_base_sum_gazeta is a specialized Russian language model for abstractive text summarization. Built on the RuT5-base architecture, it was trained on the Gazeta dataset to generate concise, coherent summaries of Russian news articles. The model delivers competitive results, achieving a ROUGE-1 score of 32.2, comparable to larger multilingual models such as mBART.
## Implementation Details
The model uses a T5-based encoder-decoder architecture optimized for summarization. It processes input texts of up to 600 tokens and generates summaries with a maximum length of 200 tokens. N-gram repetition prevention (no_repeat_ngram_size=4) is applied at generation time to reduce repetitive output.
- Maximum input length: 600 tokens
- Maximum output length: 200 tokens
- No-repeat n-gram size: 4
- Average output length: 330 characters
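The generation settings above can be put together into a minimal inference sketch using the Hugging Face `transformers` library. Note that the exact Hub checkpoint name `IlyaGusev/rut5_base_sum_gazeta` is an assumption inferred from the author and model name in this card:

```python
# Minimal inference sketch for the model described above.
# Assumption: the checkpoint is published on the Hugging Face Hub as
# "IlyaGusev/rut5_base_sum_gazeta" (inferred from author + model name).
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "IlyaGusev/rut5_base_sum_gazeta"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

article = "Текст новости на русском языке."  # placeholder Russian news text

# Truncate the input to the 600-token limit noted above.
input_ids = tokenizer(
    article,
    max_length=600,
    truncation=True,
    return_tensors="pt",
).input_ids

# Generate with the documented constraints: at most 200 output tokens
# and no repeated 4-grams.
output_ids = model.generate(
    input_ids=input_ids,
    max_length=200,
    no_repeat_ngram_size=4,
)
summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(summary)
```

For batch summarization, the same call works with a list of articles and `padding=True` in the tokenizer.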
## Core Capabilities
- Abstractive summarization of Russian texts
- ROUGE-1 F1 score: 32.2
- ROUGE-2 F1 score: 14.4
- ROUGE-L F1 score: 28.1
- METEOR score: 25.7
- BLEU score: 12.3
## Frequently Asked Questions
**Q: What makes this model unique?**

A: This model stands out for its specialized focus on Russian news summarization, offering performance comparable to larger models like mBART while maintaining a more efficient architecture based on RuT5-base. Its optimization for the Gazeta dataset makes it particularly effective for news-related content.
**Q: What are the recommended use cases?**

A: The model is best suited for summarizing Russian news articles and similar journalistic content. It is optimized for texts that fit within the 600-token input window and works best when generating concise summaries of around 330 characters.
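Because the 600-token limit refers to tokens rather than characters, a quick pre-check with the model's tokenizer can tell whether an article will be silently truncated. A small sketch, again assuming the Hub checkpoint name `IlyaGusev/rut5_base_sum_gazeta`:

```python
# Check whether a text exceeds the model's 600-token input window.
# Assumption: checkpoint name inferred from author + model name.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("IlyaGusev/rut5_base_sum_gazeta")

def will_be_truncated(text: str, limit: int = 600) -> bool:
    """Return True if `text` tokenizes to more than `limit` tokens."""
    n_tokens = len(tokenizer(text).input_ids)
    return n_tokens > limit

short_text = "Короткая новость."  # a few tokens, well under the limit
print(will_be_truncated(short_text))
```

Texts that exceed the limit can be split into chunks and summarized chunk by chunk, at some cost to global coherence.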