deberta-v3-large-zeroshot-v2.0

Maintained By
MoritzLaurer

DeBERTa-v3-Large Zero-Shot Classifier v2.0

Parameters: 435M
License: MIT
Paper: arXiv:2312.17543
Base Model: microsoft/deberta-v3-large

What is deberta-v3-large-zeroshot-v2.0?

This model is a zero-shot text classifier built on the DeBERTa-v3-large architecture and designed for classification without task-specific training data. It is intended as a universal classifier: trained on commercially-friendly data, it can handle arbitrary classification tasks by reformulating them as natural language inference (NLI).

Implementation Details

The model is trained on a combination of synthetic data generated with Mixtral-8x7B-Instruct-v0.1 and commercially-friendly NLI datasets such as MNLI and FEVER-NLI. It classifies by determining whether a hypothesis is "true" or "not true" (entailed or not entailed) given an input text.

  • Architecture: DeBERTa-v3-large with 435M parameters
  • Input Processing: 512 token context window
  • Commercial-friendly training data
  • Supports both single-label and multi-label classification
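The NLI reformulation described above can be sketched in a few lines: each candidate label is slotted into a hypothesis template, and the model then scores whether the input text entails each hypothesis. The template string below is an illustrative assumption, not necessarily the model's default:

```python
# Sketch of how zero-shot classification is framed as NLI.
# The hypothesis template here is an assumed example; real pipelines
# let you customize it per task.

def build_nli_pairs(text, labels, template="This example is about {}."):
    """Turn an input text and candidate labels into (premise, hypothesis)
    pairs that an NLI model can score for entailment."""
    return [(text, template.format(label)) for label in labels]

pairs = build_nli_pairs(
    "The new GPU delivers twice the throughput of its predecessor.",
    ["technology", "sports", "politics"],
)
for premise, hypothesis in pairs:
    print(hypothesis)
```

The label whose hypothesis receives the highest entailment score becomes the predicted class.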

Core Capabilities

  • Zero-shot classification without training data
  • Flexible hypothesis template customization
  • Strong performance across 28 different classification tasks
  • Average f1_macro score of 0.676, significantly outperforming BART-large-mnli
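To illustrate how single-label and multi-label modes differ, here is a minimal sketch of how per-label NLI scores can be combined. The logit values are made up for illustration, and the exact internals of a real inference pipeline may differ:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Assumed per-label (entailment, contradiction) logits from an NLI model;
# the numbers are illustrative, not real model outputs.
logits = {
    "technology": (3.1, -2.0),
    "sports": (-1.5, 2.2),
    "politics": (-0.8, 1.0),
}

# Single-label: softmax over entailment logits across ALL labels,
# so the scores compete and sum to 1.
single = dict(zip(logits, softmax([e for e, _ in logits.values()])))

# Multi-label: each label is scored independently against its own
# contradiction logit, so several labels can score high at once.
multi = {lab: softmax([e, c])[0] for lab, (e, c) in logits.items()}
```

In single-label mode the scores form a probability distribution over labels; in multi-label mode each score is an independent probability that its label applies.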

Frequently Asked Questions

Q: What makes this model unique?

The model combines commercial-friendly training data with state-of-the-art performance, making it especially suitable for business applications while maintaining high accuracy across diverse classification tasks.

Q: What are the recommended use cases?

The model excels in sentiment analysis, topic classification, content moderation, and general text categorization tasks. It's particularly valuable when training data is scarce or when quick deployment is needed without fine-tuning.
