specter2_base

Maintained By
allenai

SPECTER2 Base Model

Property      Value
Model Type    BERT-base-uncased + adapters
License       Apache 2.0
Base Model    allenai/scibert
Paper         SciRepEval Paper

What is specter2_base?

SPECTER2 base is a scientific document embedding model developed by the Allen Institute for AI (AllenAI). It serves as the foundation for generating task-specific embeddings when combined with specialized adapters. The model was trained on over 6 million citation triplets of scientific papers, which makes it well suited to scientific document representation tasks.
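To make the idea of training on citation triplets concrete, here is a minimal sketch of a triplet margin loss over a (query paper, cited paper, non-cited paper) triple. This description does not state the exact loss used for SPECTER2 base, so the Euclidean distance, the margin of 1.0, and all function names below are assumptions chosen for illustration.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_margin_loss(query, positive, negative, margin=1.0):
    """Hinge loss: zero once the positive is closer than the negative by at least `margin`.

    The margin value and the distance metric are illustrative assumptions.
    """
    return max(0.0, l2_distance(query, positive) - l2_distance(query, negative) + margin)

# Toy 2-D embeddings, made up for illustration (not model output).
query = [0.0, 0.0]
cited = [0.0, 1.0]      # a paper the query cites: distance 1 from the query
unrelated = [3.0, 4.0]  # a paper the query does not cite: distance 5 from the query

# Cited paper is already much closer than the unrelated one, so no penalty.
well_separated = triplet_margin_loss(query, cited, unrelated)  # max(0, 1 - 5 + 1) = 0.0

# Swapping the roles violates the ordering and is penalized.
violated = triplet_margin_loss(query, unrelated, cited)        # max(0, 5 - 1 + 1) = 5.0
```

Minimizing a loss of this shape pulls a paper's embedding toward the papers it cites and pushes it away from papers it does not cite.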

Implementation Details

The model is trained in two stages: the base model is first trained on citation triplets, and task-specific adapters are then trained on the SciRepEval training tasks. It uses a BERT-based architecture with adapter modules for different task formats, including classification, regression, proximity, and ad hoc search.

  • Training batch size: 1024 for base model, 256 for adapters
  • Maximum input length: 512 tokens
  • Learning rates: 2e-5 for base model, 1e-4 for adapters
  • Training epochs: 2 for base model, 6 for adapters
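The two training stages and their hyperparameters can be summarized as a small configuration sketch. The class and variable names are hypothetical, and applying the 512-token limit to both stages is an assumption, since the list above gives it without naming a stage. The step estimate treats "over 6 million triplets" as a lower bound of 6,000,000 and assumes one optimizer step per batch of triplets.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class StageConfig:
    """Hyperparameters for one training stage (names are illustrative)."""
    name: str
    batch_size: int
    learning_rate: float
    epochs: int
    max_input_tokens: int = 512  # assumed to apply to both stages

# Values taken from the list above.
BASE_STAGE = StageConfig("base", batch_size=1024, learning_rate=2e-5, epochs=2)
ADAPTER_STAGE = StageConfig("adapters", batch_size=256, learning_rate=1e-4, epochs=6)

# Task formats served by adapters, as named in this section.
TASK_FORMATS = ("classification", "regression", "proximity", "adhoc search")

def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of batches needed to see every example once."""
    return math.ceil(num_examples / batch_size)

# Lower bound on optimizer steps for the base stage, assuming exactly
# 6,000,000 triplets and no gradient accumulation.
base_steps_lower_bound = (
    steps_per_epoch(6_000_000, BASE_STAGE.batch_size) * BASE_STAGE.epochs
)  # 5860 steps per epoch * 2 epochs = 11720
```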

Core Capabilities

  • Generation of scientific document embeddings
  • Support for multiple task formats through adapters
  • State-of-the-art performance on the SciRepEval benchmark
  • Effective encoding of paper titles and abstracts
  • Strong performance on citation recommendation tasks
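As an illustration of encoding titles and abstracts, the sketch below joins the two fields with the tokenizer's separator token and takes the first-token hidden state as the document embedding. Several details are assumptions rather than facts stated here: the Hub ID `allenai/specter2_base`, loading through the plain `transformers` `AutoModel` class (the upstream model card may load the model through an adapter library instead), and first-token pooling. The helper names are hypothetical.

```python
from typing import Dict, List, Optional

MODEL_ID = "allenai/specter2_base"  # assumed Hugging Face Hub ID
MAX_TOKENS = 512                    # maximum input length listed for the model

def format_paper(paper: Dict[str, Optional[str]], sep_token: str = "[SEP]") -> str:
    """Join title and abstract with the separator; a missing field becomes empty."""
    title = paper.get("title") or ""
    abstract = paper.get("abstract") or ""
    return title + sep_token + abstract

def embed_papers(papers: List[Dict[str, Optional[str]]]):
    """Return one embedding per paper, shape (num_papers, hidden_size)."""
    # Imported lazily so format_paper works without these packages installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModel.from_pretrained(MODEL_ID)
    model.eval()

    texts = [format_paper(p, tokenizer.sep_token) for p in papers]
    inputs = tokenizer(
        texts,
        padding=True,
        truncation=True,        # inputs longer than MAX_TOKENS are cut off
        max_length=MAX_TOKENS,
        return_tensors="pt",
    )
    with torch.no_grad():
        outputs = model(**inputs)
    # First-token ([CLS]) hidden state as the document embedding (assumed pooling).
    return outputs.last_hidden_state[:, 0, :]

example = format_paper(
    {"title": "BERT", "abstract": "We introduce a new language representation model."}
)
```

Calling `embed_papers` downloads the model weights on first use. Without an adapter attached, the result is the base model's general-purpose embedding rather than a task-specific one.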

Frequently Asked Questions

Q: What makes this model unique?

SPECTER2 base stands out for its adapter-based architecture, which allows task-specific adaptation while keeping a single, robust base model. It reports state-of-the-art results on scientific document representation tasks, with particularly strong results on out-of-train evaluation tasks (an average score of 73.6).

Q: What are the recommended use cases?

The model is best suited for scientific document representation tasks, especially when combined with specific adapters. Common applications include paper similarity search, citation recommendation, classification of scientific papers, and processing of scientific queries.
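For the paper similarity search use case, a minimal sketch is to rank candidate papers by how close their embeddings are to a query embedding. The vectors below are toy values made up for illustration, not model output, and the choice of cosine similarity is an assumption (Euclidean distance is another reasonable option). Function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, corpus):
    """Return (paper_id, score) pairs sorted from most to least similar."""
    scored = [(pid, cosine_similarity(query_vec, vec)) for pid, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 2-D "embeddings"; real embeddings would come from the model.
corpus = {
    "paper_a": [1.0, 0.0],
    "paper_b": [0.0, 1.0],
    "paper_c": [1.0, 1.0],
}
ranking = rank_by_similarity([1.0, 0.1], corpus)
top_match = ranking[0][0]  # "paper_a" points in nearly the same direction as the query
```

For large collections, an approximate nearest-neighbor index would usually replace this exhaustive scan.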
