roberta-base-squad2

Maintained By
deepset

RoBERTa Base SQuAD2

Property         Value
Parameter Count  124M parameters
License          CC-BY-4.0
Base Model       FacebookAI/roberta-base
Training Data    SQuAD 2.0
F1 Score         82.95%

What is roberta-base-squad2?

roberta-base-squad2 is a version of the RoBERTa base model fine-tuned for extractive question answering. Developed by deepset, it was trained on the SQuAD 2.0 dataset, so it can both answer questions and recognize when a question cannot be answered from the given context.
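As a rough illustration of the answerable/unanswerable behavior described above, the following sketch wraps the Hugging Face Transformers question-answering pipeline. It assumes the model is available on the Hugging Face Hub as deepset/roberta-base-squad2; the helper names load_qa and answer_or_none are hypothetical and exist only for this example.

```python
from typing import Callable, Optional


def load_qa() -> Callable[..., dict]:
    """Build a Transformers QA pipeline (downloads the model on first use)."""
    # Imported lazily so the helper below can be used without the library installed.
    from transformers import pipeline

    # Assumption: the model is published on the Hugging Face Hub under this ID.
    return pipeline("question-answering", model="deepset/roberta-base-squad2")


def answer_or_none(qa: Callable[..., dict], question: str, context: str) -> Optional[str]:
    """Return the extracted answer span, or None when the model predicts 'no answer'."""
    # handle_impossible_answer=True lets the pipeline return an empty answer string
    # instead of always forcing a span out of the context.
    result = qa(question=question, context=context, handle_impossible_answer=True)
    answer = result["answer"].strip()
    return answer or None
```

A call such as answer_or_none(load_qa(), "Who fine-tuned the model?", context) would return a span copied from the context, or None if the model judges the question unanswerable. Passing the pipeline in as an argument keeps the helper easy to exercise with a stand-in callable.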

Implementation Details

The model was trained on 4 Tesla V100 GPUs with a batch size of 96, a learning rate of 3e-5, and a maximum sequence length of 386 tokens. It uses a linear warmup learning-rate schedule and a document stride of 128 to split long texts into overlapping windows.
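To make the interaction between the maximum sequence length and the document stride concrete, here is a minimal sketch of sliding-window chunking. It rests on two assumptions that the text above does not state: that the stride is the step between consecutive window starts (some libraries use "stride" for the overlap instead), and that a RoBERTa question/context pair spends 4 positions on special tokens. The function names are hypothetical.

```python
from typing import List, Tuple


def passage_budget(question_len: int, max_seq_len: int = 386, num_special: int = 4) -> int:
    """Tokens left for the passage after the question and special tokens."""
    # Assumption: a RoBERTa pair is encoded as <s> question </s></s> passage </s>.
    return max_seq_len - question_len - num_special


def sliding_windows(num_tokens: int, window_len: int, doc_stride: int = 128) -> List[Tuple[int, int]]:
    """Return (start, end) token offsets of the windows covering a passage.

    Assumption: doc_stride is the distance between consecutive window starts.
    If doc_stride exceeds window_len, some tokens fall between windows.
    """
    if window_len <= 0 or doc_stride <= 0:
        raise ValueError("window_len and doc_stride must be positive")
    windows = []
    start = 0
    while True:
        end = min(start + window_len, num_tokens)
        windows.append((start, end))
        if end == num_tokens:  # the last window reaches the end of the passage
            break
        start += doc_stride
    return windows


# A 1000-token passage with a 20-token question
budget = passage_budget(question_len=20)
windows = sliding_windows(num_tokens=1000, window_len=budget)
```

Each window is scored separately and the best-scoring span across windows is kept, so an answer near a window boundary still appears whole in a neighboring window.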

  • 79.93% exact match accuracy on SQuAD 2.0
  • 82.95% F1 score on the validation set
  • Handles both answerable and unanswerable questions
  • Trained with a linear warmup learning-rate schedule
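The warmup schedule can be sketched as a simple function of the training step. This is an illustration only: the warmup proportion of 0.2 and the linear decay to zero after warmup are assumptions not stated above, and linear_warmup_lr is a hypothetical name. Only the peak learning rate of 3e-5 comes from the text.

```python
def linear_warmup_lr(
    step: int,
    total_steps: int,
    peak_lr: float = 3e-5,
    warmup_proportion: float = 0.2,  # assumption: not given in the text above
) -> float:
    """Learning rate at a given step: linear ramp up, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_proportion)
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp from 0 to peak_lr over the warmup phase.
        return peak_lr * step / warmup_steps
    # Assumption: decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(total_steps - step, 0)
    return peak_lr * remaining / max(total_steps - warmup_steps, 1)
```

Starting from a small learning rate avoids large, destabilizing updates to the pretrained weights during the first steps of fine-tuning.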

Core Capabilities

  • Extractive Question Answering on English text
  • High performance on out-of-domain datasets (SQuADShifts)
  • Relatively compact at 124M parameters
  • Integration with popular frameworks like Haystack and Transformers
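Extractive question answering means the model scores every token as a possible answer start and answer end, and the answer is the best-scoring span copied from the context. The sketch below shows that selection step on plain lists of scores. It assumes the common SQuAD 2.0 convention that position 0 (the leading special token) represents "no answer"; the function names, the max_answer_len default, and the space-joined detokenization are simplifications for illustration.

```python
from typing import List, Tuple


def best_span(start_logits: List[float], end_logits: List[float],
              max_answer_len: int = 30) -> Tuple[int, int, float]:
    """Return (start, end, score) maximizing start + end score with start <= end."""
    best = (0, 0, float("-inf"))
    for s, s_logit in enumerate(start_logits):
        last = min(s + max_answer_len, len(end_logits))
        for e in range(s, last):
            score = s_logit + end_logits[e]
            if score > best[2]:
                best = (s, e, score)
    return best


def extract_answer(tokens: List[str], start_logits: List[float],
                   end_logits: List[float], max_answer_len: int = 30) -> str:
    """Return the best answer span, or "" if the no-answer score wins."""
    # Assumption: index 0 is the special token whose score stands for "no answer".
    null_score = start_logits[0] + end_logits[0]
    s, e, score = best_span(start_logits[1:], end_logits[1:], max_answer_len)
    if null_score >= score:
        return ""
    # Shift by 1 because the search skipped position 0.
    return " ".join(tokens[s + 1 : e + 2])
```

In practice the Transformers and Haystack integrations perform this step internally, including mapping tokens back to character offsets in the original text.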

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its robust performance on SQuAD 2.0 and its ability to handle unanswerable questions, making it ideal for real-world applications where not all questions have answers in the given context.
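SQuAD 2.0 performance of this kind is usually reported as exact match and token-level F1, with unanswerable questions represented by an empty gold answer. The following sketch follows the normalization used by the official SQuAD evaluation script (lowercasing, stripping punctuation and articles) but is a simplified single-reference version, not the script itself; the function names are chosen for this example.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between a prediction and one gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Unanswerable case: credit only if both sides are empty.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this scheme a model that always extracts some span scores zero on every unanswerable question, which is why the ability to abstain matters for the reported numbers. The full evaluation additionally takes the maximum score over several gold answers per question.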

Q: What are the recommended use cases?

The model is best suited to extractive QA in production environments, particularly when integrated with frameworks like Haystack. It fits applications that need to extract answers from documents and to detect when a document does not contain the answer.
