metricx-23-large-v2p0

Maintained By
google

MetricX-23-Large-v2p0

Property: Value
License: Apache 2.0
Author: Google
Framework: PyTorch
Paper: MetricX-23: The Google Submission to the WMT 2023 Metrics Shared Task

What is metricx-23-large-v2p0?

MetricX-23-Large is part of Google's MetricX family of models for automatic evaluation of translations. It is a reference-based model initialized from mT5 and fine-tuned on a combination of direct assessment (DA) and MQM (Multidimensional Quality Metrics) data. The model outputs a score in the range 0 to 25, where lower scores indicate better translation quality.
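Because lower scores mean better translations, it is easy to read the numbers the wrong way round when mixing this metric with higher-is-better metrics. The following is a minimal sketch of one way to handle that, assuming only the 0 to 25 range described above. The helper names `to_quality` and `pick_better` are hypothetical and are not part of any MetricX release.

```python
MAX_SCORE = 25.0  # upper end of the score range described above (0 = best, 25 = worst)


def to_quality(score: float) -> float:
    """Map an error-style score in [0, 25] to a quality value in [0, 1].

    Hypothetical helper: 1.0 means best, 0.0 means worst. Values outside
    the expected range are clamped before conversion.
    """
    clamped = min(max(score, 0.0), MAX_SCORE)
    return 1.0 - clamped / MAX_SCORE


def pick_better(score_a: float, score_b: float) -> str:
    """Return which of two translations is preferred (lower score wins)."""
    if score_a == score_b:
        return "tie"
    return "a" if score_a < score_b else "b"
```

The linear rescaling is only a convenience for display; it does not change the ranking of translations.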

Implementation Details

The model was trained in T5X and then converted for use in PyTorch. It was trained with a maximum input length of 1024 tokens, and its training data includes synthetic examples that cover translation edge cases.

  • Trained on combined DA and MQM datasets
  • Uses synthetic training examples to improve robustness to failure cases
  • Reference-based scoring; reference-free (QE) evaluation is covered by separate MetricX-23 QE checkpoints
  • Faster than the larger XXL variant
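A reference-based metric scores a candidate translation against a reference, and both must fit within the 1024-token input limit. The sketch below shows one way to assemble an input string and flag over-length examples before scoring. The `candidate: ... reference: ...` template is an assumption based on the public MetricX code and should be checked against the official repository. The helper names are hypothetical, and the whitespace token counter is only a stand-in for the model's real mT5 tokenizer.

```python
from typing import Callable

MAX_INPUT_TOKENS = 1024  # maximum input length stated above


def build_input(hypothesis: str, reference: str) -> str:
    """Join a candidate translation and its reference into one input string.

    ASSUMPTION: this template mirrors the public MetricX-23 code; verify it
    against the official repository before relying on it.
    """
    return f"candidate: {hypothesis} reference: {reference}"


def whitespace_token_count(text: str) -> int:
    """Stand-in token counter. A real pipeline would use the mT5 tokenizer."""
    return len(text.split())


def exceeds_limit(
    text: str,
    count_tokens: Callable[[str], int] = whitespace_token_count,
    limit: int = MAX_INPUT_TOKENS,
) -> bool:
    """Return True if the input is longer than the model's input limit."""
    return count_tokens(text) > limit
```

Flagging long inputs rather than silently truncating them keeps it visible when part of the reference would be cut off.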

Core Capabilities

  • Automatic evaluation of translation quality
  • Handles multiple language pairs effectively
  • Robust detection of translation issues like under/over-translation
  • System-level and segment-level correlation with human judgments
  • Efficient processing with balanced performance-speed trade-off
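Segment-level scores can be rolled up into system-level scores for comparing translation systems. The sketch below averages per-segment scores for each system and ranks systems from best to worst. Plain averaging is a common convention and is assumed here, not confirmed by the source. The function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def system_scores(segment_scores: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average segment-level scores per system.

    segment_scores: (system_name, score) pairs, one per scored segment.
    """
    by_system = defaultdict(list)
    for system, score in segment_scores:
        by_system[system].append(score)
    return {system: mean(scores) for system, scores in by_system.items()}


def rank_systems(segment_scores: Iterable[Tuple[str, float]]) -> List[str]:
    """Return system names ordered best first (lower average score is better)."""
    averages = system_scores(segment_scores)
    return sorted(averages, key=averages.get)
```

For example, with illustrative values `[("sys_a", 1.0), ("sys_a", 3.0), ("sys_b", 4.0), ("sys_b", 6.0)]`, `sys_a` averages 2.0 and `sys_b` averages 5.0, so `sys_a` ranks first.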

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its robust handling of translation edge cases through synthetic data training, including undertranslation, overtranslation, and gibberish detection. It provides a practical balance between performance and computational efficiency.

Q: What are the recommended use cases?

MetricX-23-Large is recommended when processing speed is a priority but good translation quality assessment is still required. It is particularly suitable for large-scale translation evaluation tasks that need fast turnaround.
