roberta-large-mnli

Maintained By
FacebookAI

Property         Value
Parameter Count  356M
License          MIT
Paper            RoBERTa: A Robustly Optimized BERT Pretraining Approach
Developer        FacebookAI

What is roberta-large-mnli?

roberta-large-mnli is the RoBERTa-large language model fine-tuned on the Multi-Genre Natural Language Inference (MNLI) corpus. Given a premise and a hypothesis, it predicts whether the premise entails, contradicts, or is neutral toward the hypothesis. Because any candidate label can be phrased as a hypothesis, the model is widely used for zero-shot classification.

Implementation Details

The underlying RoBERTa-large model is a transformer encoder pretrained on about 160GB of text drawn from BookCorpus, English Wikipedia, CC-News, OpenWebText, and Stories. Pretraining used 1024 V100 GPUs for 500K steps with a batch size of 8K sequences and a sequence length of 512 tokens. Fine-tuning on MNLI was a separate, later step.
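
To give a sense of scale, the pretraining figures above can be multiplied out. This is only a back-of-the-envelope sketch: it assumes "8K" means 8192 sequences per batch and that every sequence is filled to the full 512 tokens, neither of which is stated here, so the result is best read as an upper bound.

```python
# Rough token budget for the pretraining run described above.
# Assumptions (not stated in the text): "8K" = 8192 sequences per batch,
# every sequence is packed to the full 512 tokens, no gradient accumulation.
STEPS = 500_000
BATCH_SIZE = 8192
SEQ_LEN = 512
NUM_GPUS = 1024

tokens_per_step = BATCH_SIZE * SEQ_LEN        # tokens processed per optimizer step
total_tokens = STEPS * tokens_per_step        # tokens processed over the whole run
sequences_per_gpu = BATCH_SIZE // NUM_GPUS    # per-GPU share of each batch

print(f"tokens per step:            {tokens_per_step:,}")
print(f"total tokens processed:     {total_tokens:,}")
print(f"sequences per GPU per step: {sequences_per_gpu}")
```

Under these assumptions the run processes roughly 2.1 trillion tokens, which means the 160GB corpus is seen many times over.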

  • Uses byte-level BPE tokenization with a vocabulary of 50,000 tokens
  • Uses dynamic masking during pretraining
  • Achieves 90.2% accuracy on the MNLI dev set
  • Has also been evaluated on the multilingual XNLI benchmark, although its training data is English text
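
Dynamic masking means the masked positions are re-sampled each time a sequence is fed to the model, rather than fixed once during preprocessing. The sketch below illustrates the idea on whitespace-split words. The 15% selection rate and the 80/10/10 split (mask token, random token, unchanged) follow the standard BERT masking recipe and are not stated in this document; `dynamic_mask` and the toy vocabulary are hypothetical names used only for illustration.

```python
import random

MASK = "<mask>"  # RoBERTa's mask token

def dynamic_mask(tokens, vocab, rng, mask_prob=0.15):
    """Return (masked_tokens, target_positions) using a freshly sampled mask."""
    masked = list(tokens)
    targets = []
    for i in range(len(tokens)):
        if rng.random() >= mask_prob:
            continue                       # position not selected for prediction
        targets.append(i)
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK               # 80%: replace with the mask token
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token
    return masked, targets

tokens = "the cat sat on the mat and looked out of the window".split()
vocab = sorted(set(tokens))  # toy vocabulary, for illustration only
rng = random.Random(0)

# Each pass over the same sentence draws a different mask.
for epoch in range(3):
    masked, targets = dynamic_mask(tokens, vocab, rng)
    print(epoch, targets, " ".join(masked))
```

With static masking the three printed lines would be identical; here each pass can select different positions, so the model sees more varied prediction targets for the same text.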

Core Capabilities

  • Zero-shot text classification over arbitrary candidate labels
  • Sentence-pair classification (premise and hypothesis)
  • Natural language inference across multiple genres
  • Cross-lingual transfer as evaluated on XNLI (the model itself is trained on English text)
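
The link between natural language inference and zero-shot classification can be sketched without loading the model. A common recipe is to turn each candidate label into a hypothesis, score how strongly the input text entails each hypothesis, and normalize those scores across labels. In the sketch below, `toy_entailment_logit` and the numbers in `PRETEND_LOGITS` are made-up placeholders standing in for the model's entailment logit; only the surrounding logic is the point.

```python
import math

def build_hypotheses(labels, template="This example is {}."):
    """Phrase each candidate label as an NLI hypothesis."""
    return [template.format(label) for label in labels]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot(premise, labels, entailment_logit, template="This example is {}."):
    """Rank labels by the normalized entailment score of their hypotheses."""
    hypotheses = build_hypotheses(labels, template)
    logits = [entailment_logit(premise, h) for h in hypotheses]
    probs = softmax(logits)
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)

# Placeholder scores, NOT real model output. A real setup would obtain the
# entailment logit for each (premise, hypothesis) pair from the NLI model.
PRETEND_LOGITS = {
    "This text is about sports.": 2.5,
    "This text is about politics.": -1.0,
    "This text is about cooking.": -0.5,
}

def toy_entailment_logit(premise, hypothesis):
    return PRETEND_LOGITS[hypothesis]

ranked = zero_shot(
    "The striker scored twice in the final.",
    ["sports", "politics", "cooking"],
    toy_entailment_logit,
    template="This text is about {}.",
)
for label, prob in ranked:
    print(f"{label}: {prob:.3f}")
```

The hypothesis template matters in practice: a template that reads naturally for the label set tends to give the NLI model an easier judgment to make.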

Frequently Asked Questions

Q: What makes this model unique?

What sets this model apart is the combination of RoBERTa's pretraining recipe (diverse text sources and dynamic masking) with fine-tuning on the MNLI dataset. The NLI objective is what makes it effective for zero-shot classification: labels the model never saw during training can be tested as hypotheses against the input text.

Q: What are the recommended use cases?

The model is best suited to natural language inference and to zero-shot classification when no labeled training data is available. Tasks such as sentiment analysis can be handled the same way by phrasing each class as a candidate label. Cross-lingual use should be validated per language, since the training data is English.
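
For the zero-shot use case, one option is the `zero-shot-classification` pipeline from the Hugging Face `transformers` library. The following is a minimal sketch, assuming `transformers` and a backend such as PyTorch are installed and that the checkpoint is available on the Hub as `FacebookAI/roberta-large-mnli`. The `classify` wrapper is a hypothetical helper, and the first call downloads the model weights.

```python
def classify(text, candidate_labels, model_id="FacebookAI/roberta-large-mnli"):
    """Score `text` against `candidate_labels` with an NLI-based zero-shot pipeline."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=model_id)
    result = classifier(text, candidate_labels)
    # The pipeline returns labels and scores in matching order, best first.
    return dict(zip(result["labels"], result["scores"]))

# Example call (downloads the model on first use):
#   scores = classify("One day I will see the world.",
#                     ["travel", "cooking", "dancing"])
#   for label, score in scores.items():
#       print(f"{label}: {score:.3f}")
```

Building the pipeline inside the function keeps the sketch short. For repeated calls it would be cheaper to construct the pipeline once and reuse it.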
