DeBERTa-v3-Large Zero-Shot Classifier v2.0
| Property | Value |
|---|---|
| Parameters | 435M |
| License | MIT |
| Paper | arXiv:2312.17543 |
| Base Model | microsoft/deberta-v3-large |
What is deberta-v3-large-zeroshot-v2.0?
This model is a zero-shot text classifier built on the DeBERTa-v3-large architecture, designed to classify text without any task-specific training data. It works as a universal classifier: any classification task can be reformulated as natural language inference (NLI), where the model judges whether a hypothesis about the input text is true. It was trained on commercially-friendly data, making it suitable for business use.
Implementation Details
The model was trained on a combination of synthetic data generated with Mixtral-8x7B-Instruct-v0.1 and NLI datasets such as MNLI and FEVER-NLI. At inference time, it determines whether a hypothesis is "true" or "not true" given an input text; each candidate label is expressed as such a hypothesis.
- Architecture: DeBERTa-v3-large with 435M parameters
- Input Processing: 512 token context window
- Commercial-friendly training data
- Supports both single-label and multi-label classification
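The NLI reformulation above can be sketched in plain Python. The template string below is illustrative, not the model's prescribed default, and the helper name is hypothetical:

```python
def build_nli_pairs(text, labels, template="This example is about {}."):
    """Turn each candidate label into a (premise, hypothesis) pair.

    A zero-shot NLI classifier scores each pair for entailment;
    the label whose hypothesis is judged most "true" wins.
    """
    return [(text, template.format(label)) for label in labels]


pairs = build_nli_pairs(
    "The quarterly report shows record revenue.",
    ["finance", "sports", "politics"],
)
# One (premise, hypothesis) pair per candidate label, e.g.:
# ("The quarterly report shows record revenue.",
#  "This example is about finance.")
```

Because every label becomes its own hypothesis, the same model handles any label set without retraining.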
Core Capabilities
- Zero-shot classification without training data
- Flexible hypothesis template customization
- Strong performance across 28 different classification tasks
- Average macro-F1 of 0.676 across these tasks, significantly outperforming bart-large-mnli
Frequently Asked Questions
Q: What makes this model unique?
The model combines commercial-friendly training data with state-of-the-art performance, making it especially suitable for business applications while maintaining high accuracy across diverse classification tasks.
Q: What are the recommended use cases?
The model excels in sentiment analysis, topic classification, content moderation, and general text categorization tasks. It's particularly valuable when training data is scarce or when quick deployment is needed without fine-tuning.
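For the use cases above, the hypothesis template can be adapted to each task. The template wordings below are illustrative assumptions, not the model's documented defaults:

```python
# Illustrative task-specific hypothesis templates (wording is an
# assumption, not prescribed by the model card).
TEMPLATES = {
    "sentiment": "The sentiment of this text is {}.",
    "topic": "This text is about {}.",
    "moderation": "This text contains {} content.",
}


def hypotheses_for(task, labels):
    """Build one hypothesis per label using the task's template."""
    template = TEMPLATES[task]
    return [template.format(label) for label in labels]


sentiment_hyps = hypotheses_for("sentiment", ["positive", "negative"])
# ["The sentiment of this text is positive.",
#  "The sentiment of this text is negative."]
```

A template that reads naturally as a sentence about the text tends to align better with the NLI training format than a bare label.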