DeBERTa-v3-base-zeroshot-v1.1-all-33

Property	Value
Parameter Count	184M
Architecture	DeBERTa-v3
License	MIT
Paper	arxiv.org/pdf/2312.17543.pdf
Training Data	33 datasets, 387 classes

What is deberta-v3-base-zeroshot-v1.1-all-33?

This is a specialized zero-shot classification model built on the DeBERTa-v3 architecture. It casts classification as a binary natural language inference (NLI) task: given a text, the model decides whether a hypothesis is "true" or "not true". Because any class can be phrased as a hypothesis, the model adapts to new classification scenarios without task-specific training.

Implementation Details

The model was trained on a mixture of 33 datasets covering 387 classes: 5 NLI datasets (~885k texts) and 28 classification tasks reformatted into the universal NLI format (~51k cleaned texts). It predicts two classes (entailment vs. not_entailment) rather than the traditional three-way NLI labels (entailment, neutral, contradiction).
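As a rough illustration of the reformatting described above, the sketch below expands one labeled classification example into binary NLI rows, one hypothesis per class. The function name, the template string, and the example labels are assumptions made for illustration. This is not the authors' preprocessing code, and whether the real training data paired each text with every other class as a negative is not stated here.

```python
def to_nli_pairs(text, true_label, all_labels, template="This example is about {}"):
    """Expand one classification example into binary NLI rows (illustrative only)."""
    rows = []
    for label in all_labels:
        rows.append({
            "premise": text,
            # Each class name is verbalized into a hypothesis via the template.
            "hypothesis": template.format(label),
            # Only the hypothesis built from the true class counts as entailed.
            "label": "entailment" if label == true_label else "not_entailment",
        })
    return rows


pairs = to_nli_pairs(
    "The central bank raised interest rates again.",
    true_label="economy",
    all_labels=["politics", "economy", "sports"],
)
for row in pairs:
    print(row["hypothesis"], "->", row["label"])
```

Framing every task this way is what lets a single two-class head serve tasks with any number of classes.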

Trained on multiple domains, including sentiment analysis, emotion detection, and topic classification
Uses the FP16 tensor type for efficient computation
Supports both single-label and multi-label classification
Designed for English text; for multilingual use cases, machine-translating inputs to English is recommended
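The difference between single-label and multi-label classification comes down to how per-label entailment scores are normalized. The sketch below shows one common scheme, similar in spirit to what zero-shot NLI pipelines do: in single-label mode the labels compete through a softmax over their entailment logits, while in multi-label mode each label is scored independently against its own not_entailment logit. The logits are made up for illustration, and the function names are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def single_label_scores(logits_per_label):
    """Labels compete: softmax over the entailment logits of all labels."""
    return softmax([entail for entail, _ in logits_per_label])


def multi_label_scores(logits_per_label):
    """Labels are independent: softmax over (entailment, not_entailment) per label."""
    return [softmax([entail, not_entail])[0] for entail, not_entail in logits_per_label]


# Hypothetical (entailment, not_entailment) logits for three candidate labels.
logits = [(2.0, -1.0), (1.5, -0.5), (-2.0, 3.0)]
single = single_label_scores(logits)  # scores sum to 1 across labels
multi = multi_label_scores(logits)    # each score lies in (0, 1) on its own
print(single)
print(multi)
```

In single-label mode exactly one label wins; in multi-label mode any number of labels can pass a chosen threshold.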

Core Capabilities

Zero-shot classification across diverse domains
Binary classification through natural language inference
Flexible hypothesis template customization
High performance on multiple benchmark datasets (70.7% average accuracy in zero-shot scenarios)
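A minimal usage sketch for these capabilities, assuming the Hugging Face transformers library and its zero-shot-classification pipeline. The Hub model ID below is an assumption (it does not appear in this page) and should be verified before use. The helper best_label and the placeholder result dict are hypothetical and only show the shape of the pipeline's output; they are not real model predictions.

```python
def classify(text, labels, template="This example is about {}", multi_label=False):
    """Run zero-shot classification; downloads the model on first use."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    classifier = pipeline(
        "zero-shot-classification",
        # Assumed Hugging Face Hub ID; verify before use.
        model="MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33",
    )
    # hypothesis_template must contain "{}", which is filled with each label.
    return classifier(text, labels, hypothesis_template=template, multi_label=multi_label)


def best_label(result):
    """Return the highest-scoring (label, score) pair from a pipeline result dict."""
    return max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])


# Hand-written placeholder showing the result shape; NOT real model output.
placeholder = {
    "sequence": "some input text",
    "labels": ["label_a", "label_b"],
    "scores": [0.75, 0.25],
}
print(best_label(placeholder))

# Example call (requires transformers, a backend such as torch, and network access):
# result = classify("The new phone has a great camera.",
#                   ["technology", "sports"], template="This text is about {}")
# print(best_label(result))
```

Changing the template wording (for example to a sentiment-oriented phrasing) is the main lever for adapting the model to a new task, so it is worth trying a few variants.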

Frequently Asked Questions

Q: What makes this model unique?

Its universal binary classification approach through NLI lets the model adapt to new classification tasks without task-specific training, while remaining more efficient to run than larger language models.

Q: What are the recommended use cases?

The model is suited to classification tasks such as sentiment analysis, emotion detection, topic classification, and content moderation, particularly when traditional supervised learning is not feasible because labeled data is lacking.