distilbert-base-uncased-mnli
| Property | Value |
|---|---|
| Parameter Count | 67M |
| Model Type | Zero-Shot Classification |
| Architecture | DistilBERT |
| Training Data | MultiNLI (433k sentence pairs) |
| Accuracy | 82.07% (MNLI matched / mismatched) |
| Hardware Used | AWS EC2 p3.2xlarge |
What is distilbert-base-uncased-mnli?
This is a specialized version of DistilBERT developed by Typeform, fine-tuned for zero-shot classification on the Multi-Genre Natural Language Inference (MNLI) dataset. It's an uncased model, meaning it doesn't differentiate between uppercase and lowercase text: input is lowercased before tokenization, which simplifies preprocessing for most text classification workloads.
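As a quick illustration, here is a minimal usage sketch with the Hugging Face transformers zero-shot pipeline. The input sentence and candidate labels are invented for the example, and the Hub id typeform/distilbert-base-uncased-mnli is assumed to be where this checkpoint is hosted.

```python
from transformers import pipeline

# Load the model into the zero-shot classification pipeline
classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
)

# Classify a sentence against labels the model was never trained on
result = classifier(
    "The new update drains my battery twice as fast.",
    candidate_labels=["hardware", "software", "billing"],
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```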
Implementation Details
The model is built on the DistilBERT architecture and fine-tuned with a learning rate of 2e-5 over 5 epochs, using a maximum sequence length of 128 tokens and a batch size of 16. It reached an evaluation accuracy of 82.07% on both the MNLI matched and mismatched (MNLI-mm) evaluation sets (a loading sketch follows the list below).
- F32 tensor type for computation
- Supports PyTorch and TensorFlow frameworks
- Ships weights in the Safetensors format for safe, fast tensor serialization
- Available through Inference Endpoints
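For lower-level use, here is a hedged sketch of loading the model directly in PyTorch and scoring a single premise/hypothesis pair. The premise and hypothesis strings are invented, and the label mapping is read from the model config rather than hardcoded, since the order of entailment/neutral/contradiction labels varies between MNLI checkpoints.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "typeform/distilbert-base-uncased-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)  # F32 weights by default

premise = "The new phone ships with a 5,000 mAh battery."
hypothesis = "This text is about technology."

# Encode as a premise/hypothesis pair; 128 matches the training sequence length
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits

# Read the label order from the config instead of hardcoding it
probs = logits.softmax(dim=-1)[0]
for idx, label in model.config.id2label.items():
    print(f"{label}: {probs[idx].item():.3f}")
```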
Core Capabilities
- Zero-shot text classification
- Multi-genre text analysis
- Cross-genre generalization
- Case-insensitive text processing
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its efficient architecture (a distilled version of BERT with roughly 40% fewer parameters) while maintaining strong performance on zero-shot classification tasks. Its training on MNLI makes it particularly good at textual entailment and natural language inference across genres. Zero-shot classification builds directly on this: each candidate label is recast as an entailment hypothesis (e.g. "This text is about sports."), and the entailment probability becomes the label's score.
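To make that concrete, here is a sketch of the NLI trick exposed through the pipeline's hypothesis_template parameter. The template string, input sentence, and labels below are illustrative.

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
)

# Each label is slotted into the template to form a hypothesis,
# e.g. "This text is about cooking.", then scored for entailment.
result = classifier(
    "Whisk the eggs and fold in the flour gently.",
    candidate_labels=["cooking", "sports", "finance"],
    hypothesis_template="This text is about {}.",
)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
```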
Q: What are the recommended use cases?
The model is ideal for classifying text into categories for which you have no labeled training data. It's especially useful for applications requiring natural language inference and cross-genre text analysis.
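For use cases where a text can belong to several categories at once, the pipeline's multi_label flag scores each label independently instead of normalizing scores across them. The support ticket and labels here are invented for illustration.

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
)

# multi_label=True scores each label on its own (entailment vs.
# contradiction per label) rather than softmaxing across all labels
result = classifier(
    "I was double-charged and the app crashes on login.",
    candidate_labels=["billing", "bug report", "feature request"],
    multi_label=True,
)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
```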