pegasus-xsum

Maintained By
google

PEGASUS-XSUM Model

Property     Value
Authors      Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu
Paper        arXiv:1912.08777
Downloads    149,372
Task         Abstractive Summarization

What is pegasus-xsum?

PEGASUS-XSUM is an abstractive summarization model developed by Google Research and regarded as state-of-the-art at the time of its release. It is fine-tuned for extreme summarization, where a whole article is compressed into a single sentence, and reaches a ROUGE-1 score of 46.86 on the XSUM dataset. The model is pre-trained with an objective called "gap-sentence generation": important sentences are removed from a document and the model learns to generate them from the remaining text, a self-supervised task designed to resemble abstractive summarization.

Implementation Details

The model uses a Transformer encoder-decoder architecture pre-trained specifically for summarization. This checkpoint follows the "mixed and stochastic" training recipe: it is trained on both the C4 and HugeNews datasets, with training extended to 1.5M steps. During pre-training, the gap-sentence ratio is sampled uniformly between 15% and 45%, and important sentences are selected after 20% uniform noise is applied to their importance scores.
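The gap-sentence sampling described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, sentence importance is approximated by unigram-overlap F1 against the rest of the document (a simple stand-in for ROUGE-1), the 20% noise is assumed to be multiplicative, and `<mask_1>` is used only as a placeholder mask string.

```python
import random
import re
from collections import Counter


def unigram_f1(candidate, reference):
    """Unigram-overlap F1 between two token lists (a simple ROUGE-1 proxy)."""
    if not candidate or not reference:
        return 0.0
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def select_gap_sentences(sentences, rng, ratio_range=(0.15, 0.45), noise=0.20):
    """Return the sorted indices of sentences to mask as generation targets."""
    tokens = [re.findall(r"\w+", s.lower()) for s in sentences]
    # Dynamic gap-sentence ratio: drawn uniformly from the given range.
    ratio = rng.uniform(*ratio_range)
    num_gaps = max(1, round(ratio * len(sentences)))
    scored = []
    for i, sent in enumerate(tokens):
        # Score each sentence against all other sentences in the document.
        rest = [tok for j, other in enumerate(tokens) if j != i for tok in other]
        importance = unigram_f1(sent, rest)
        # Assumption: the 20% noise is multiplicative, uniform in [0.8, 1.2].
        importance *= 1.0 + rng.uniform(-noise, noise)
        scored.append((importance, i))
    top = sorted(scored, reverse=True)[:num_gaps]
    return sorted(i for _, i in top)


document = [
    "The city council approved a new transit plan on Monday.",
    "The transit plan adds three bus lines across the city.",
    "Officials said the new bus lines will open next year.",
    "Some residents raised concerns about the cost of the plan.",
    "The council will review the cost of the transit plan in June.",
    "Local weather was mild throughout the meeting.",
]
rng = random.Random(0)
gap_indices = select_gap_sentences(document, rng)

# Model input: the document with the selected sentences masked out.
masked = ["<mask_1>" if i in gap_indices else s for i, s in enumerate(document)]
# Model target: the removed sentences, concatenated in document order.
target = " ".join(document[i] for i in gap_indices)
```

Because the ratio is redrawn on every call, the same document can yield a different number of masked sentences from one training pass to the next, and the noise keeps the selection from always landing on the same top-scoring sentences.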

  • Achieves 46.86 ROUGE-1, 24.45 ROUGE-2, and 39.05 ROUGE-L scores on XSUM
  • Uses a SentencePiece tokenizer updated to encode newline characters
  • Available for both PyTorch and TensorFlow
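For readers unfamiliar with the metrics quoted above, the sketch below shows what ROUGE-1, ROUGE-2, and ROUGE-L measure. It is a simplified illustration with hypothetical function names, assuming whitespace tokenization and no stemming; the reported scores come from a standard ROUGE implementation and are on a 0-100 scale, whereas this sketch returns values between 0 and 1.

```python
from collections import Counter


def f1(overlap, candidate_total, reference_total):
    """Harmonic mean of precision and recall from raw overlap counts."""
    if overlap == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1: based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
r1 = rouge_n(candidate, reference, 1)  # 5 of 6 unigrams match
r2 = rouge_n(candidate, reference, 2)  # 3 of 5 bigrams match
rl = rouge_l(candidate, reference)     # LCS is "the cat on the mat"
```

ROUGE-2 is the strictest of the three because a single substituted word breaks two bigrams, which is why it is typically the lowest of the reported scores.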

Core Capabilities

  • Extreme summarization of long documents
  • Multi-dataset performance (XSUM, CNN/DailyMail, NewsRoom)
  • Flexible deployment across different domains
  • Support for various text lengths and styles

Frequently Asked Questions

Q: What makes this model unique?

The model is distinguished by its gap-sentence generation pre-training objective, which closely mirrors the summarization task itself, and by its ability to maintain high performance across diverse summarization tasks. Its mixed and stochastic training recipe is intended to improve generalization across different domains.

Q: What are the recommended use cases?

PEGASUS-XSUM is best suited to extreme summarization: news articles and other documents that must be compressed into a concise, single-sentence summary while keeping the key information.
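For these use cases, a minimal inference sketch with the Hugging Face `transformers` library might look like the following. It assumes that `transformers`, `sentencepiece`, and PyTorch are installed and that the checkpoint is published under the identifier `google/pegasus-xsum` (inferred from the model and maintainer names on this page); the `summarize` helper is hypothetical.

```python
def summarize(text, model_name="google/pegasus-xsum"):
    """Return a short abstractive summary of `text`."""
    # Imported lazily so the module loads even without the heavy dependencies.
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name)

    # Truncate inputs that exceed the tokenizer's maximum input length.
    batch = tokenizer([text], truncation=True, padding="longest", return_tensors="pt")
    summary_ids = model.generate(**batch)
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]
```

Calling `summarize(article_text)` downloads the checkpoint on first use and returns one summary string. For repeated calls, loading the tokenizer and model once outside the function avoids reloading the weights each time.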
