MetricX-23-Large-v2p0

Property	Value
License	Apache 2.0
Author	Google
Framework	PyTorch
Paper	MetricX-23: The Google Submission to the WMT 2023 Metrics Shared Task

What is metricx-23-large-v2p0?

MetricX-23-Large is part of Google's family of models designed for automatic evaluation of translations. It's a reference-based model initialized with mT5 and fine-tuned on a combination of direct assessment and MQM (Multidimensional Quality Metrics) data. The model outputs scores in the range of 0-25, where lower scores indicate better translation quality.
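Because lower scores mean better translations, downstream code that expects "higher is better" needs to flip the scale. The following is a minimal sketch of one way to do that. The helper names `to_quality` and `rank_candidates` and the linear rescaling are illustrative assumptions, not part of MetricX itself.

```python
MAX_SCORE = 25.0  # upper end of the 0-25 score range described above


def to_quality(score: float) -> float:
    """Map an error-style score (0 = best, 25 = worst) to [0, 1], where 1 is best.

    Hypothetical helper: the linear rescaling is an illustrative convention.
    """
    clipped = min(max(score, 0.0), MAX_SCORE)  # guard against out-of-range values
    return 1.0 - clipped / MAX_SCORE


def rank_candidates(scored):
    """Sort (translation, score) pairs so the best (lowest-scoring) comes first."""
    return sorted(scored, key=lambda pair: pair[1])
```

For example, `rank_candidates([("a", 3.2), ("b", 0.8)])` places `"b"` first, since 0.8 is the lower (better) score.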

Implementation Details

The model was trained in the T5X framework and then converted for use in PyTorch. It was trained with a maximum input length of 1024 tokens, and its training data includes synthetic examples that cover various translation edge cases.

Trained on combined DA and MQM datasets
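Training data includes synthetic examples of failure modes such as undertranslation, overtranslation, and gibberish
Scores a translation against a reference; reference-free (QE) evaluation is handled by the separate MetricX-23-QE models
Smaller and faster than the XL and XXL variants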
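The 1024-token training limit suggests checking input length before scoring. Here is a rough sketch of such a check. The `candidate: ... reference: ...` layout in `build_input` is an assumption about how the two texts are joined and is not confirmed by this card. The whitespace tokenizer is only a stand-in: the real model uses a subword tokenizer, which usually produces more tokens than a whitespace split.

```python
from typing import Callable, List

MAX_INPUT_TOKENS = 1024  # training-time maximum input length stated above


def build_input(hypothesis: str, reference: str) -> str:
    """Join a translation and its reference into one input string.

    The "candidate: ... reference: ..." layout is an assumed format.
    """
    return f"candidate: {hypothesis} reference: {reference}"


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer; undercounts relative to a subword tokenizer."""
    return text.split()


def exceeds_limit(
    text: str,
    tokenize: Callable[[str], List[str]],
    limit: int = MAX_INPUT_TOKENS,
) -> bool:
    """Return True when the tokenized input is longer than the limit."""
    return len(tokenize(text)) > limit
```

In practice the `tokenize` argument would be replaced by the tokenizer that ships with the model, so that the count matches what the model actually sees.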

Core Capabilities

Automatic evaluation of translation quality
Handles multiple language pairs effectively
Robust detection of translation issues such as undertranslation and overtranslation
System-level and segment-level correlation with human judgments
Efficient processing with balanced performance-speed trade-off
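The model scores one segment at a time, so a system-level score has to be derived from the segment-level scores. A common convention, assumed here rather than taken from this card, is to average them per system. The function name and the sample values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def system_scores(segment_scores):
    """Average segment-level scores per system.

    segment_scores: iterable of (system_name, score) pairs, where each score
    is a segment-level score on the 0-25 scale (lower is better).
    """
    by_system = defaultdict(list)
    for system, score in segment_scores:
        by_system[system].append(score)
    return {system: mean(scores) for system, scores in by_system.items()}


# Made-up scores for two hypothetical systems.
example = [("sysA", 2.0), ("sysA", 4.0), ("sysB", 1.0)]
averaged = system_scores(example)
```

As with segment scores, the system with the lower average is the better one.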

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its robust handling of translation edge cases. Its training data includes synthetic examples of undertranslation, overtranslation, and gibberish output. It also offers a practical balance between evaluation quality and computational cost.

Q: What are the recommended use cases?

MetricX-23-Large is recommended when processing speed is a priority and a good, though not the most accurate, assessment of translation quality is sufficient. It is particularly suitable for large-scale translation evaluation and for workflows that need fast feedback.