distilbert-base-uncased-mnli
| Property | Value |
|---|---|
| Parameter Count | 67M |
| Model Type | Zero-Shot Classification |
| Architecture | DistilBERT |
| Training Data | MultiNLI (433k sentence pairs) |
| Accuracy | 82.07% (MNLI matched / mismatched) |
| Hardware Used | AWS EC2 p3.2xlarge |
What is distilbert-base-uncased-mnli?
This is a specialized version of DistilBERT developed by Typeform, fine-tuned for zero-shot classification on the Multi-Genre Natural Language Inference (MNLI) dataset. It's an uncased model, meaning it doesn't differentiate between uppercase and lowercase text: input is lowercased before tokenization, which simplifies preprocessing for most text classification workloads.
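As a quick illustration, here is a minimal usage sketch with the Hugging Face transformers zero-shot pipeline. The input sentence and candidate labels are invented for the example, and the Hub id typeform/distilbert-base-uncased-mnli is assumed to be where this checkpoint is hosted.

```python
from transformers import pipeline

# Load the model into the zero-shot classification pipeline
classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
)

# Classify a sentence against labels the model was never trained on
result = classifier(
    "The new update drains my battery twice as fast.",
    candidate_labels=["hardware", "software", "billing"],
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```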
Implementation Details
The model is built on the DistilBERT architecture and fine-tuned with a learning rate of 2e-5 over 5 epochs, using a maximum sequence length of 128 tokens and a batch size of 16. It reached an evaluation accuracy of 82.07% on both the MNLI matched and mismatched (MNLI-mm) evaluation sets (a loading sketch follows the list below).
- F32 tensor type for computation
- Supports PyTorch and TensorFlow frameworks
- Ships weights in the Safetensors format for safe, fast tensor serialization
- Available through Inference Endpoints
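For lower-level use, here is a hedged sketch of loading the model directly in PyTorch and scoring a single premise/hypothesis pair. The premise and hypothesis strings are invented, and the label mapping is read from the model config rather than hardcoded, since the order of entailment/neutral/contradiction labels varies between MNLI checkpoints.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "typeform/distilbert-base-uncased-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)  # F32 weights by default

premise = "The new phone ships with a 5,000 mAh battery."
hypothesis = "This text is about technology."

# Encode as a premise/hypothesis pair; 128 matches the training sequence length
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits

# Read the label order from the config instead of hardcoding it
probs = logits.softmax(dim=-1)[0]
for idx, label in model.config.id2label.items():
    print(f"{label}: {probs[idx].item():.3f}")
```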
Core Capabilities
- Zero-shot text classification
- Multi-genre text analysis
- Cross-genre generalization
- Case-insensitive text processing
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its efficient architecture (a distilled version of BERT with roughly 40% fewer parameters) while maintaining strong performance on zero-shot classification tasks. Its training on MNLI makes it particularly good at textual entailment and natural language inference across genres. Zero-shot classification builds directly on this: each candidate label is recast as an entailment hypothesis (e.g. "This text is about sports."), and the entailment probability becomes the label's score.
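To make that concrete, here is a sketch of the NLI trick exposed through the pipeline's hypothesis_template parameter. The template string, input sentence, and labels below are illustrative.

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
)

# Each label is slotted into the template to form a hypothesis,
# e.g. "This text is about cooking.", then scored for entailment.
result = classifier(
    "Whisk the eggs and fold in the flour gently.",
    candidate_labels=["cooking", "sports", "finance"],
    hypothesis_template="This text is about {}.",
)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
```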
Q: What are the recommended use cases?
The model is ideal for classifying text into categories for which you have no labeled training data. It's especially useful for applications requiring natural language inference and cross-genre text analysis.
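For use cases where a text can belong to several categories at once, the pipeline's multi_label flag scores each label independently instead of normalizing scores across them. The support ticket and labels here are invented for illustration.

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
)

# multi_label=True scores each label on its own (entailment vs.
# contradiction per label) rather than softmaxing across all labels
result = classifier(
    "I was double-charged and the app crashes on login.",
    candidate_labels=["billing", "bug report", "feature request"],
    multi_label=True,
)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
```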