DeBERTa-v3-Large Zero-Shot Classifier v2.0
| Property | Value |
|---|---|
| Parameters | 435M |
| License | MIT |
| Paper | arXiv:2312.17543 |
| Base Model | microsoft/deberta-v3-large |
What is deberta-v3-large-zeroshot-v2.0?
This model is a zero-shot text classifier built on the DeBERTa-v3-large architecture, designed to classify text without any task-specific training data. It works as a universal classifier: any classification task can be reformulated as natural language inference (NLI), where the model judges whether a hypothesis about the input text is true. It was trained on commercially-friendly data, making it suitable for business use.
Implementation Details
The model was trained on a combination of synthetic data generated with Mixtral-8x7B-Instruct-v0.1 and NLI datasets such as MNLI and FEVER-NLI. At inference time, it determines whether a hypothesis is "true" or "not true" given an input text; each candidate label is expressed as such a hypothesis.
- Architecture: DeBERTa-v3-large with 435M parameters
- Input Processing: 512 token context window
- Commercial-friendly training data
- Supports both single-label and multi-label classification
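The NLI reformulation above can be sketched in plain Python. The template string below is illustrative, not the model's prescribed default, and the helper name is hypothetical:

```python
def build_nli_pairs(text, labels, template="This example is about {}."):
    """Turn each candidate label into a (premise, hypothesis) pair.

    A zero-shot NLI classifier scores each pair for entailment;
    the label whose hypothesis is judged most "true" wins.
    """
    return [(text, template.format(label)) for label in labels]


pairs = build_nli_pairs(
    "The quarterly report shows record revenue.",
    ["finance", "sports", "politics"],
)
# One (premise, hypothesis) pair per candidate label, e.g.:
# ("The quarterly report shows record revenue.",
#  "This example is about finance.")
```

Because every label becomes its own hypothesis, the same model handles any label set without retraining.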
Core Capabilities
- Zero-shot classification without training data
- Flexible hypothesis template customization
- Strong performance across 28 different classification tasks
- Average macro-F1 of 0.676 across these tasks, significantly outperforming bart-large-mnli
Frequently Asked Questions
Q: What makes this model unique?
The model combines commercial-friendly training data with state-of-the-art performance, making it especially suitable for business applications while maintaining high accuracy across diverse classification tasks.
Q: What are the recommended use cases?
The model excels in sentiment analysis, topic classification, content moderation, and general text categorization tasks. It's particularly valuable when training data is scarce or when quick deployment is needed without fine-tuning.
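For the use cases above, the hypothesis template can be adapted to each task. The template wordings below are illustrative assumptions, not the model's documented defaults:

```python
# Illustrative task-specific hypothesis templates (wording is an
# assumption, not prescribed by the model card).
TEMPLATES = {
    "sentiment": "The sentiment of this text is {}.",
    "topic": "This text is about {}.",
    "moderation": "This text contains {} content.",
}


def hypotheses_for(task, labels):
    """Build one hypothesis per label using the task's template."""
    template = TEMPLATES[task]
    return [template.format(label) for label in labels]


sentiment_hyps = hypotheses_for("sentiment", ["positive", "negative"])
# ["The sentiment of this text is positive.",
#  "The sentiment of this text is negative."]
```

A template that reads naturally as a sentence about the text tends to align better with the NLI training format than a bare label.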