RoBERTa Base SQuAD2

Property	Value
Parameter Count	124M parameters
License	CC-BY-4.0
Base Model	FacebookAI/roberta-base
Training Data	SQuAD 2.0
F1 Score	82.95%

What is roberta-base-squad2?

roberta-base-squad2 is a version of the RoBERTa base model fine-tuned for extractive question answering. Developed by deepset, it was trained on the SQuAD 2.0 dataset, so it can both extract an answer span from a given context and identify when the context does not contain an answer.

Implementation Details

The model was trained on 4 Tesla V100 GPUs with a batch size of 96, a learning rate of 3e-5, and a maximum sequence length of 386 tokens. It uses a linear warmup learning-rate schedule and a document stride of 128 to split long texts into overlapping windows.
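The windowing implied by the maximum sequence length and document stride can be sketched as follows. This is a minimal illustration, not the training code: it assumes the stride is the step between window starts (as in the original BERT SQuAD recipe), and the question length and the helper name `window_spans` are hypothetical.

```python
from typing import List, Tuple


def window_spans(num_context_tokens: int, window_len: int,
                 doc_stride: int = 128) -> List[Tuple[int, int]]:
    """Return (start, end) token offsets, end exclusive, of overlapping windows.

    Assumption: doc_stride is the distance between consecutive window starts.
    Windows overlap only when doc_stride < window_len.
    """
    if window_len <= 0 or doc_stride <= 0:
        raise ValueError("window_len and doc_stride must be positive")
    spans = []
    start = 0
    while True:
        end = min(start + window_len, num_context_tokens)
        spans.append((start, end))
        if end >= num_context_tokens:
            break  # the last window reaches the end of the context
        start += doc_stride
    return spans


MAX_SEQ_LEN = 386      # from the training setup described above
QUESTION_LEN = 20      # hypothetical question length in tokens
SPECIAL_TOKENS = 4     # RoBERTa pair encoding: <s> question </s></s> context </s>

# Tokens left for the context in each window.
window_len = MAX_SEQ_LEN - QUESTION_LEN - SPECIAL_TOKENS

for span in window_spans(1000, window_len):
    print(span)
```

Under these assumptions, a 1000-token context is covered by six windows, and each window is scored independently before the best answer across windows is selected.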

Exact match of 79.93% on SQuAD 2.0
F1 score of 82.95% on the validation set
Handles both answerable and unanswerable questions
Trained with a linear warmup learning-rate schedule
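For readers unfamiliar with the two metrics, the sketch below shows how exact match and F1 are typically computed for a single prediction, following the normalization used by the public SQuAD evaluation script (lowercasing, stripping punctuation and articles). It is an illustration, not the evaluation code used to produce the numbers above.

```python
import re
import string
from collections import Counter

_PUNCT = set(string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCT)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    if not pred_tokens or not true_tokens:
        # An empty string stands for "no answer"; both sides must be empty to score.
        return float(pred_tokens == true_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower", "eiffel tower."))
print(f1_score("in Paris France", "Paris"))
```

The dataset-level scores are averages of these per-question values, taking the best score over the available reference answers for each question.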

Core Capabilities

Extractive Question Answering on English text
Evaluated on out-of-domain datasets (SQuADShifts)
Compact size of 124M parameters
Integrates with frameworks such as Haystack and Transformers
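A minimal sketch of the Transformers integration is shown below. It assumes the model is published on the Hugging Face Hub as `deepset/roberta-base-squad2` and that the `transformers` package is installed; the helper names `load_qa_pipeline`, `answer_or_none`, and `demo` are hypothetical and exist only for this example.

```python
from typing import Optional

# Assumed Hub identifier for the model described on this page.
MODEL_NAME = "deepset/roberta-base-squad2"


def load_qa_pipeline(model_name: str = MODEL_NAME):
    """Build a question-answering pipeline (downloads weights on first use)."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline
    return pipeline("question-answering", model=model_name, tokenizer=model_name)


def answer_or_none(result: dict) -> Optional[str]:
    """Treat an empty predicted span as 'unanswerable' and return None."""
    answer = result.get("answer", "").strip()
    return answer or None


def demo() -> None:
    """Run one question against a short context and print the outcome."""
    qa = load_qa_pipeline()
    result = qa(
        question="Which dataset was the model fine-tuned on?",
        context="roberta-base-squad2 was fine-tuned by deepset on SQuAD 2.0.",
        # Allow the pipeline to return an empty answer when no span fits.
        handle_impossible_answer=True,
    )
    print(answer_or_none(result), result["score"])
```

Calling `demo()` requires network access the first time, because the model weights are fetched from the Hub. The `handle_impossible_answer=True` argument lets the pipeline return an empty answer, which `answer_or_none` maps to `None`.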

Frequently Asked Questions

Q: What makes this model unique?

Its distinguishing feature is the SQuAD 2.0 training: besides extracting answer spans, it can flag questions that the given context does not answer. This makes it suitable for real-world applications where not every question has an answer in the context.
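One common way to decide between "answer" and "no answer" in SQuAD 2.0 systems is to compare the score of the best span with a null score taken from the first token position. The sketch below illustrates that convention on raw start and end logits. Whether a given inference stack for this model applies exactly this rule is an assumption, and the function names, the `null_threshold` parameter, and the toy logits are hypothetical.

```python
from typing import List, Optional, Tuple


def best_span(start_logits: List[float], end_logits: List[float],
              max_answer_len: int = 30) -> Tuple[float, int, int]:
    """Return (score, start, end) of the highest-scoring non-null span.

    Position 0 is reserved for the null ("no answer") prediction.
    Simplification: every position after 0 is treated as a context token.
    """
    best = (float("-inf"), 0, 0)
    for s in range(1, len(start_logits)):
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best[0]:
                best = (score, s, e)
    return best


def predict_span(start_logits: List[float], end_logits: List[float],
                 null_threshold: float = 0.0) -> Optional[Tuple[int, int]]:
    """Return (start, end) token indices, or None if 'no answer' wins."""
    null_score = start_logits[0] + end_logits[0]
    span_score, s, e = best_span(start_logits, end_logits)
    if null_score - span_score > null_threshold:
        return None
    return (s, e)


# Toy logits: the first case favors a span, the second favors "no answer".
print(predict_span([0.1, 2.0, 0.3, 0.2], [0.1, 0.5, 3.0, 0.1]))
print(predict_span([5.0, 0.1, 0.2], [4.0, 0.3, 0.1]))
```

Raising `null_threshold` makes the system more willing to return a span, while lowering it makes "no answer" predictions more frequent.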

Q: What are the recommended use cases?

The model is best suited for extractive QA tasks in production environments, particularly when integrated with frameworks like Haystack. It's especially effective for applications requiring accurate answer extraction from documents with the ability to identify when answers aren't present.