distilbert-base-uncased-mnli

Maintained By: typeform

Property          Value
Parameter Count   67M
Model Type        Zero-Shot Classification
Architecture      DistilBERT
Training Data     MultiNLI (433k sentence pairs)
Accuracy          82.07%
Hardware Used     AWS EC2 p3.2xlarge

What is distilbert-base-uncased-mnli?

This is a distilled BERT model developed by Typeform and fine-tuned on the Multi-Genre Natural Language Inference (MNLI) dataset, which makes it suitable for zero-shot classification tasks. It is an uncased model, meaning it does not differentiate between uppercase and lowercase text, so inputs are treated the same regardless of capitalization.

Implementation Details

The model is built on the DistilBERT architecture and fine-tuned with a learning rate of 2e-5 over 5 epochs, a maximum sequence length of 128 tokens, and a training batch size of 16. It achieved an evaluation accuracy of 82.07% on both the MNLI matched and mismatched (MNLI-mm) sets.

  • F32 tensor type for computation
  • Supports PyTorch and TensorFlow frameworks
  • Implements Safetensors for secure tensor handling
  • Available through Inference Endpoints
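For context, here is a minimal sketch of how a comparable fine-tuning run could be configured with the Hugging Face Trainer API, using the hyperparameters listed above (learning rate 2e-5, 5 epochs, 128-token sequences, batch size 16). The dataset loading and column names follow the public multi_nli dataset and are assumptions for illustration, not a reproduction of Typeform's exact training script.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)

# Hyperparameters reported on the model card
MODEL_NAME = "distilbert-base-uncased"
MAX_LENGTH = 128
LEARNING_RATE = 2e-5
EPOCHS = 5
BATCH_SIZE = 16

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=3)

# MultiNLI provides premise/hypothesis pairs with three labels
# (entailment, neutral, contradiction).
dataset = load_dataset("multi_nli")

def tokenize(batch):
    return tokenizer(
        batch["premise"],
        batch["hypothesis"],
        truncation=True,
        max_length=MAX_LENGTH,
    )

encoded = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="distilbert-mnli",
    learning_rate=LEARNING_RATE,
    num_train_epochs=EPOCHS,
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=BATCH_SIZE,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation_matched"],
    tokenizer=tokenizer,
)

trainer.train()
```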

Core Capabilities

  • Zero-shot text classification
  • Multi-genre text analysis
  • Cross-genre generalization
  • Case-insensitive text processing

Frequently Asked Questions

Q: What makes this model unique?

The model combines the efficiency of a distilled BERT architecture with strong performance on zero-shot classification tasks. Because it is fine-tuned on MNLI, it is particularly good at textual entailment and natural language inference across a range of genres.
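Zero-shot classification with an NLI model works by recasting each candidate label as a hypothesis and scoring whether the input text entails it. The sketch below illustrates that mechanism; the hypothesis template and example inputs are illustrative conventions rather than something specified by the model card, and the entailment index is read from the model config instead of being hard-coded.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "typeform/distilbert-base-uncased-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "The new update completely broke my billing dashboard."
labels = ["bug report", "feature request", "praise"]

# Look up which output index corresponds to the entailment class.
entail_idx = next(
    i for i, name in model.config.id2label.items()
    if name.lower() == "entailment"
)

scores = {}
with torch.no_grad():
    for label in labels:
        hypothesis = f"This example is about {label}."
        inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
        logits = model(**inputs).logits[0]
        # Probability assigned to entailment vs. the other NLI classes.
        scores[label] = logits.softmax(dim=-1)[entail_idx].item()

print(max(scores, key=scores.get), scores)
```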

Q: What are the recommended use cases?

The model is ideal for text classification tasks, particularly when you need to classify text into categories without specific training data for each category. It's especially useful for applications requiring natural language inference and cross-genre text analysis.
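In practice, the simplest way to use the model for these cases is the transformers zero-shot classification pipeline; the input text and candidate labels below are illustrative.

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
)

result = classifier(
    "Last week I upgraded my iOS version and ever since, my phone overheats.",
    candidate_labels=["mobile", "billing", "website", "account access"],
)

# The highest-scoring label is the predicted category.
print(result["labels"][0], result["scores"][0])
```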
