Pegasus-Large
| Property | Value |
|---|---|
| Author | Google Research |
| Paper | arXiv:1912.08777 |
| Task | Abstractive Summarization |
| Framework | PyTorch, TensorFlow |
What is pegasus-large?
Pegasus-large is a Transformer encoder-decoder model for abstractive text summarization, developed by Google Research. It is pre-trained with an objective called "Gap Sentence Generation" (GSG): sentences judged important are removed from the input document, and the model must generate them from the remaining text.
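As a rough illustration of the GSG idea, the sketch below builds one (input, target) training pair from a document. It is a minimal sketch under stated assumptions, not the authors' pipeline: the `<mask_1>` placeholder string, the regex sentence splitter, the unigram-overlap importance score, and the names `make_gsg_example` and `overlap_f1` are all illustrative choices.

```python
import re
from collections import Counter

MASK_TOKEN = "<mask_1>"  # assumption: placeholder string for a removed sentence


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def overlap_f1(sentence, rest):
    """Unigram-overlap F1 between one sentence and the rest of the document."""
    a = Counter(re.findall(r"\w+", sentence.lower()))
    b = Counter(re.findall(r"\w+", rest.lower()))
    overlap = sum((a & b).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(a.values())
    recall = overlap / sum(b.values())
    return 2 * precision * recall / (precision + recall)


def make_gsg_example(text, gap_ratio=0.3):
    """Return (source, target): source has gap sentences masked, target holds them."""
    sentences = split_sentences(text)
    if not sentences:
        raise ValueError("text contains no sentences")
    n_gaps = max(1, round(len(sentences) * gap_ratio))
    # Score each sentence by how well it overlaps with the remaining sentences.
    scores = [
        overlap_f1(s, " ".join(sentences[:i] + sentences[i + 1:]))
        for i, s in enumerate(sentences)
    ]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:n_gaps])
    source = " ".join(MASK_TOKEN if i in chosen else s for i, s in enumerate(sentences))
    target = " ".join(s for i, s in enumerate(sentences) if i in chosen)
    return source, target
```

Calling `make_gsg_example(document, gap_ratio=0.5)` on a four-sentence document masks two sentences in the source and returns those two sentences, in document order, as the target.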
Implementation Details
This checkpoint uses the "mixed & stochastic" training recipe, which combines the C4 and HugeNews datasets. It was trained for 1.5M steps, three times the original 500k, and the gap sentence ratio is sampled between 15% and 45% rather than held fixed. The SentencePiece tokenizer was updated so that it can encode newline characters.
- Trained on combined C4 and HugeNews datasets
- Samples important sentences stochastically, with 20% uniform noise applied to the importance scores
- Samples the gap sentence ratio uniformly between 15% and 45%
- SentencePiece tokenizer updated to encode newline characters
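The two stochastic choices listed above can be sketched as follows. This is an illustrative sketch only: the source does not say how the 20% noise is applied, so multiplying each score by a factor drawn uniformly from [0.8, 1.2] is an assumption, and the function names are hypothetical.

```python
import random


def sample_gap_ratio(rng, low=0.15, high=0.45):
    """Draw the fraction of sentences to mask, uniformly from [low, high]."""
    return rng.uniform(low, high)


def add_uniform_noise(scores, rng, noise=0.2):
    """Perturb importance scores.

    Assumption: the noise is multiplicative, i.e. each score is scaled by a
    factor drawn uniformly from [1 - noise, 1 + noise].
    """
    return [s * (1.0 + rng.uniform(-noise, noise)) for s in scores]


def pick_gap_sentences(scores, rng):
    """Return (ratio, sorted indices of the sentences chosen as gaps)."""
    ratio = sample_gap_ratio(rng)
    n_gaps = max(1, round(len(scores) * ratio))
    perturbed = add_uniform_noise(scores, rng)
    ranked = sorted(range(len(scores)), key=lambda i: perturbed[i], reverse=True)
    return ratio, sorted(ranked[:n_gaps])


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs pick the same gaps
    importance = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.6, 0.4, 0.8, 0.05]
    print(pick_gap_sentences(importance, rng))
```

Because the noise reorders only sentences with similar scores, the top-ranked sentences are still masked most of the time, while the exact selection varies from one pass over the data to the next.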
Core Capabilities
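- Reported state-of-the-art results on multiple summarization benchmarks at the time of publication (XSUM ROUGE-1/ROUGE-2/ROUGE-L: 47.60/24.83/39.64)
- Strong results on news summarization with multi-sentence summaries (CNN/DailyMail ROUGE-1/ROUGE-2/ROUGE-L: 44.16/21.56/41.30)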
- Effective on both scientific (arXiv, PubMed) and general domain content
- Supports multi-document summarization
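The three-part scores in the list above follow the usual ROUGE-1/ROUGE-2/ROUGE-L convention. To make those numbers concrete, here is a simplified sketch of the F1 variants of these metrics. It assumes whitespace tokenization with no stemming, so it will not reproduce published figures exactly; the function names are illustrative.

```python
from collections import Counter


def _ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def _f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall, 0.0 when there is no overlap."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = _ngrams(candidate.lower().split(), n)
    ref = _ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def rouge_l(candidate, reference):
    """ROUGE-L F1: based on the longest common subsequence of tokens."""
    c = candidate.lower().split()
    r = reference.lower().split()
    prev = [0] * (len(r) + 1)  # LCS lengths for the previous candidate token
    for tok in c:
        cur = [0]
        for j, ref_tok in enumerate(r, start=1):
            if tok == ref_tok:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return _f1(prev[-1], len(c), len(r))
```

Benchmark tables usually report these values multiplied by 100, so a ROUGE-1 of 47.60 corresponds to an F1 of 0.476.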
Frequently Asked Questions
Q: What makes this model unique?
The pre-training task, Gap Sentence Generation, asks the model to produce whole missing sentences rather than individual masked tokens, which is close to what summarization itself requires. This checkpoint adds the mixed & stochastic recipe (two corpora, a sampled gap sentence ratio, and noisy sentence selection) and a tokenizer that can encode newlines. The reported results span news, scientific, and general-domain datasets.
Q: What are the recommended use cases?
Pegasus-large is suited to news article summarization, scientific paper abstraction, patent summarization, and multi-document summarization. Reported results are particularly strong on XSUM, CNN/DailyMail, and the arXiv and PubMed scientific paper datasets.
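For any of these use cases, a minimal inference sketch with the Hugging Face `transformers` library might look like the following. It assumes `transformers`, `torch`, and `sentencepiece` are installed and that the checkpoint is published under the hub id `google/pegasus-large`; that id, the `summarize` helper, and the generation settings are assumptions, not taken from this card.

```python
def summarize(text, model_name="google/pegasus-large", max_new_tokens=64):
    """Generate one summary for `text` with a Pegasus checkpoint.

    `model_name` is an assumed hub id; substitute a local path if needed.
    """
    # Imported lazily so the module loads even without the heavy dependencies.
    import torch
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name)
    model.eval()

    # Truncate to the model's maximum input length and return PyTorch tensors.
    batch = tokenizer([text], truncation=True, padding="longest", return_tensors="pt")
    with torch.no_grad():
        summary_ids = model.generate(
            **batch, num_beams=4, max_new_tokens=max_new_tokens
        )
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]


if __name__ == "__main__":
    article = "Replace this string with the document you want to summarize."
    print(summarize(article))
```

The first call downloads the model weights, which are large. For production use on a specific domain, fine-tuning on in-domain data or starting from a checkpoint already fine-tuned for that dataset is likely to give better summaries than the pre-trained weights alone.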