t5-small-spanish-nahuatl

Maintained By
somosnlp-hackathon-2022

Property          Value
Parameter Count   60.5M
Model Type        Translation (Text-to-Text)
License           Apache 2.0
Languages         Spanish ↔ Nahuatl

What is t5-small-spanish-nahuatl?

t5-small-spanish-nahuatl is a machine translation model for Spanish and Nahuatl, Mexico's most widely spoken indigenous language. It is built on the T5-small architecture and uses a two-stage training approach to compensate for the small amount of parallel Spanish-Nahuatl data.

Implementation Details

Training proceeds in two stages. In the first, the model learns Spanish from English-Spanish paired data (118,964 samples). In the second, it adapts to Nahuatl using a combined dataset of approximately 23,000 sentences drawn from the Axolotl corpus, bible-corpus, and web-sourced content. Training runs for 660k steps with a batch size of 16 and a learning rate of 2e-5.

  • Uses T5's text-to-text training strategy, in which a task prefix is prepended to each input
  • Normalizes the Nahuatl data with py-elotl's 'sep' normalization
  • Combines multiple data sources to enhance robustness
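The prefix strategy and the two training stages described above can be sketched as follows. This is a minimal illustration, not the authors' training code: the two prefix strings, the `Example` type, and the helper names are assumptions introduced here, and the sentence pairs are illustrative rather than taken from the training data. The last helper only does the arithmetic implied by the stated step count and batch size.

```python
from dataclasses import dataclass

# Assumed prefix strings; the exact wording used in training is not stated above.
STAGE1_PREFIX = "translate Spanish to English: "
STAGE2_PREFIX = "translate Spanish to Nahuatl: "


@dataclass(frozen=True)
class Example:
    source: str  # prefixed input text fed to the encoder
    target: str  # text the decoder is trained to produce


def make_example(prefix: str, src: str, tgt: str) -> Example:
    """Build one text-to-text pair by prepending the task prefix to the input."""
    return Example(source=prefix + src.strip(), target=tgt.strip())


def sequences_seen(steps: int, batch_size: int) -> int:
    """Number of training sequences processed over a run (steps x batch size)."""
    return steps * batch_size


# Stage 1: Spanish paired with English (illustrative pair).
stage1 = [make_example(STAGE1_PREFIX, "Me gusta leer.", "I like to read.")]

# Stage 2: Spanish paired with Nahuatl (illustrative pair).
stage2 = [make_example(STAGE2_PREFIX, "muchas flores son blancas", "miak xochitl istak")]

print(stage2[0].source)
print(sequences_seen(660_000, 16))
```

Because both stages share one input format and differ only in the prefix, the same model and training loop can be reused when switching from the first stage to the second.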

Core Capabilities

  • Efficient translation of short Spanish-Nahuatl sentences
  • BLEU score of 6.18 and chrF score of 28.21 on the validation set
  • Handles multiple Nahuatl variants
  • Maintains Spanish language understanding while adapting to Nahuatl
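To give a feel for what the chrF figure measures, here is a simplified character n-gram F-score in plain Python. It is a sketch of the idea only: the function names are hypothetical, whitespace handling and averaging are simplified, and it will not reproduce the numbers of a standard evaluation tool. The tool used to obtain the reported scores is not stated here.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of order n, ignoring spaces."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF: average n-gram precision and recall, combined as F-beta."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    b2 = beta ** 2
    return 100.0 * (1 + b2) * p * r / (b2 * p + r)


print(simple_chrf("miak xochitl istak", "miak xochitl"))
```

Character-level matching gives partial credit for shared stems and affixes, which is one reason chrF is often reported alongside BLEU for morphologically rich languages.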

Frequently Asked Questions

Q: What makes this model unique?

The two-stage training approach is the model's distinguishing feature: it lets the model cope with the scarce resources available for Nahuatl translation. Normalizing the data and combining several sources help it deal with the spread of Nahuatl variants and the limited training data.

Q: What are the recommended use cases?

The model is best suited to translating short sentences between Spanish and Nahuatl, which makes it useful for basic communication, education, and cultural preservation efforts. Performance may vary on complex or lengthy sentences.
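A short-sentence translation call might look like the sketch below, which uses the Hugging Face `transformers` library and covers the Spanish to Nahuatl direction only. Several details are assumptions rather than facts from this page: the Hub identifier `MODEL_ID`, the task prefix, and the `MAX_WORDS` cutoff used to reject long inputs are all hypothetical choices made for illustration.

```python
# Assumed values: none of these three are stated on this page.
MODEL_ID = "somosnlp-hackathon-2022/t5-small-spanish-nahuatl"  # assumed Hub ID
PREFIX = "translate Spanish to Nahuatl: "                      # assumed task prefix
MAX_WORDS = 20                                                 # hypothetical "short sentence" cutoff


def build_prompt(sentence: str) -> str:
    """Collapse whitespace, enforce the short-sentence limit, and add the prefix."""
    sentence = " ".join(sentence.split())
    if not sentence:
        raise ValueError("sentence is empty")
    if len(sentence.split()) > MAX_WORDS:
        raise ValueError(f"sentence longer than {MAX_WORDS} words; split it first")
    return PREFIX + sentence


def translate(sentence: str) -> str:
    """Translate one short Spanish sentence (downloads the model on first use)."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    input_ids = tokenizer(build_prompt(sentence), return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=64)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


print(build_prompt("muchas flores son blancas"))
```

Calling `translate("muchas flores son blancas")` would run the model itself; longer texts are better split into sentences and translated one at a time.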
