PEGASUS-XSUM Model
Property | Value |
---|---|
Authors | Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu |
Paper | arXiv:1912.08777 |
Downloads | 149,372 |
Task | Abstractive Summarization |
What is pegasus-xsum?
PEGASUS-XSUM is an abstractive summarization model developed by Google Research and fine-tuned for extreme summarization, where an entire article is condensed into a single sentence. It reaches a ROUGE-1 score of 46.86 on the XSUM dataset. The model is pre-trained with an objective called "gap-sentence generation": important sentences are removed from a document and the model learns to generate them from the remaining text, a self-supervised task designed to resemble abstractive summarization.
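To make the gap-sentence idea concrete, here is a minimal sketch of how one training pair could be built from a document. It is an illustration under assumptions, not the authors' pipeline: the `<mask_1>` placeholder, the naive sentence splitter, and the word-overlap score (a simplified stand-in for a ROUGE-based importance score) are all chosen for the example.

```python
import re

MASK_TOKEN = "<mask_1>"  # assumed placeholder for a removed sentence


def split_sentences(text):
    """Naive split on ., ! or ? followed by whitespace (illustration only)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def unigram_overlap(sentence, others):
    """Fraction of the sentence's distinct words that also occur in the other sentences."""
    words = set(sentence.lower().split())
    rest = set(" ".join(others).lower().split())
    return len(words & rest) / len(words) if words else 0.0


def gap_sentence_example(text, num_gaps=1):
    """Mask the `num_gaps` highest-scoring sentences and return (source, target)."""
    sentences = split_sentences(text)
    # Score each sentence against the rest of the document.
    scores = [
        unigram_overlap(s, sentences[:i] + sentences[i + 1:])
        for i, s in enumerate(sentences)
    ]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:num_gaps])  # keep document order in the target
    chosen_set = set(chosen)
    source = " ".join(
        MASK_TOKEN if i in chosen_set else s for i, s in enumerate(sentences)
    )
    target = " ".join(sentences[i] for i in chosen)
    return source, target
```

The model would then be trained to produce `target` given `source`, which mirrors the shape of a summarization task: a long input and a short output that captures its most central content.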
Implementation Details
The model uses a Transformer encoder-decoder architecture with a pre-training objective tailored to summarization. This checkpoint follows a "mixed and stochastic" recipe: it is trained on both the C4 and HugeNews datasets for 1.5M steps, the gap-sentence ratio is sampled dynamically between 15% and 45%, and important sentences are selected after adding 20% uniform noise to their importance scores.
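The stochastic part of this recipe can be sketched as follows. This is a rough illustration only: the function name, the multiplicative form of the noise, and the rounding of the sentence count are assumptions, since the text above gives the ranges but not the exact procedure.

```python
import random


def sample_gap_sentences(importance, rng, ratio_range=(0.15, 0.45), noise=0.20):
    """Pick sentence indices to mask using a random ratio and noisy scores."""
    # Draw a different gap-sentence ratio for every document.
    ratio = rng.uniform(*ratio_range)
    num_gaps = max(1, round(ratio * len(importance)))
    # Perturb each score by up to +/- `noise` of its value. This is one
    # reading of "20% uniform noise"; the exact scheme is an assumption.
    noisy = [s * (1.0 + rng.uniform(-noise, noise)) for s in importance]
    ranked = sorted(range(len(importance)), key=lambda i: noisy[i], reverse=True)
    return sorted(ranked[:num_gaps])


# Usage: 20 sentences with hypothetical importance scores 0..19.
rng = random.Random(0)
picked = sample_gap_sentences([float(i) for i in range(20)], rng)
```

Because the noise reorders sentences with similar scores, the same document can yield different masked sentences on different passes, which is the intended source of variety.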
- Achieves 46.86 ROUGE-1, 24.45 ROUGE-2, and 39.05 ROUGE-L scores on XSUM
- Uses a SentencePiece tokenizer updated to encode newline characters
- Supports both PyTorch and TensorFlow
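For readers unfamiliar with the metric, the ROUGE-1 and ROUGE-2 figures listed above are F-scores over overlapping unigrams and bigrams, reported on a 0 to 100 scale. The sketch below shows the core calculation on a 0 to 1 scale with plain whitespace tokenization. It is a simplified illustration: standard ROUGE tooling applies its own tokenization and optional stemming, and ROUGE-L (based on the longest common subsequence) is not covered here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """F1 of clipped n-gram overlap between a reference and a candidate summary."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Usage: compare a generated summary against a reference summary.
score = rouge_n_f1("the cat sat on the mat", "the cat lay on the mat", n=1)
```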
Core Capabilities
- Extreme summarization of long documents
- Results reported across multiple datasets (XSUM, CNN/DailyMail, NewsRoom)
- Flexible deployment across different domains
- Support for various text lengths and styles
Frequently Asked Questions
Q: What makes this model unique?
What sets the model apart is its gap-sentence generation pre-training objective, combined with a mixed and stochastic training recipe intended to improve generalization across domains and summarization tasks.
Q: What are the recommended use cases?
PEGASUS-XSUM is best suited to extreme summarization: condensing news articles and similar documents into a concise, single-sentence summary. It fits cases where heavy compression is acceptable as long as the key information is retained.
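For these use cases, a minimal inference sketch with the Hugging Face `transformers` library might look like the following. It assumes the checkpoint is published on the Hub as `google/pegasus-xsum` and that `transformers`, `sentencepiece`, and PyTorch are installed. The `summarize` helper is a hypothetical wrapper written for this example, and generation settings are left at the checkpoint's defaults.

```python
MODEL_NAME = "google/pegasus-xsum"  # assumed Hub identifier for this checkpoint


def summarize(texts, model_name=MODEL_NAME):
    """Return one generated summary per input document."""
    # Imported lazily so this module loads without the heavy dependencies.
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name)
    # Truncate over-long inputs and pad the batch to its longest member.
    batch = tokenizer(
        texts, truncation=True, padding="longest", return_tensors="pt"
    )
    summary_ids = model.generate(**batch)
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)
```

Calling `summarize([article_text])[0]` returns the summary for a single article. The first call downloads the model weights, so loading the tokenizer and model once and reusing them is preferable when summarizing many documents.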