english-filipino-wav2vec2-l-xls-r-test-09

Maintained By
Khalsuu

Property: Value
License: Apache 2.0
Base Model: wav2vec2-large-xlsr-53-english
Downloads: 18,976
Final WER: 57.50%

What is english-filipino-wav2vec2-l-xls-r-test-09?

This speech recognition model is fine-tuned from wav2vec2-large-xlsr-53-english for English-Filipino speech recognition. Fine-tuning starts from the pretrained weights of the English checkpoint and adapts them to Filipino speech, rather than training a wav2vec2 model from scratch.

Implementation Details

The model was trained with PyTorch using native AMP mixed-precision training. Key training parameters include a learning rate of 0.002, a batch size of 16 (with gradient accumulation), and the Adam optimizer. Training ran for 20 epochs with a linear learning rate scheduler and 500 warmup steps.
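The schedule described above (linear warmup for 500 steps up to 0.002, then linear decay) can be sketched in plain Python. This is a minimal illustration, not the author's training code: the function name `linear_warmup_lr` is made up here, and `total_steps` is a hypothetical value, since the total number of optimizer steps depends on the dataset size, which the card does not state.

```python
def linear_warmup_lr(step, peak_lr=0.002, warmup_steps=500, total_steps=10_000):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0.

    peak_lr and warmup_steps come from the model card;
    total_steps is a HYPOTHETICAL placeholder (not given in the card).
    """
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: ramp linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 250, 500, 5_250, 10_000):
    print(step, linear_warmup_lr(step))
```

Halfway through warmup (step 250) the rate is half the peak, and it reaches the full 0.002 only at step 500. With a peak this high, the warmup phase is what keeps the earliest updates small.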

  • Architecture: Wav2vec2-Large-XLSR based
  • Training Dataset: filipino_voice
  • Validation Loss: 1.0054
  • Word Error Rate: 57.50%
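For readers unfamiliar with the metric, Word Error Rate is the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the model output, divided by the number of reference words. The sketch below is a plain-Python illustration of that definition. It is not the evaluation script used for this model, and the example sentences are invented for demonstration.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences only: one of five reference words is dropped.
print(word_error_rate("magandang umaga sa inyong lahat",
                      "magandang umaga sa lahat"))  # 0.2
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.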

Core Capabilities

  • Automatic Speech Recognition for Filipino and English
  • Support for mixed-language speech processing
  • Training metrics available for TensorBoard monitoring
  • Compatible with Transformers pipeline
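As a rough sketch of the pipeline compatibility listed above, the snippet below loads the model through the Transformers `pipeline` API. The Hub identifier is an assumption pieced together from the maintainer name and model name on this page, and `transcribe` and the audio path are hypothetical. Running it requires the `transformers` and `torch` packages, plus ffmpeg for decoding audio files.

```python
# ASSUMPTION: the Hub ID is inferred from the maintainer and model name
# shown on this page; verify it before use.
MODEL_ID = "Khalsuu/english-filipino-wav2vec2-l-xls-r-test-09"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file with the Transformers ASR pipeline."""
    # Imported lazily so this module can be loaded without transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # For a file path, the pipeline decodes and resamples the audio to the
    # sampling rate the model's feature extractor expects.
    result = asr(audio_path)
    return result["text"]


# Example call ("sample.wav" is a hypothetical local file):
# print(transcribe("sample.wav"))
```

Building the pipeline inside the function keeps the sketch short. For repeated calls it would be cheaper to construct the pipeline once and reuse it.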

Frequently Asked Questions

Q: What makes this model unique?

This model targets Filipino-English speech recognition and is built on the wav2vec2-large-xlsr-53 architecture. Its WER improved steadily during training, falling from 95.95% to 57.50%.
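To put the two WER figures in perspective, the short calculation below works out the absolute and relative improvement. It uses only the two numbers quoted above.

```python
initial_wer = 95.95  # WER (%) reported at the start of the range
final_wer = 57.50    # final WER (%) reported for the model

absolute_drop = initial_wer - final_wer            # in percentage points
relative_drop = absolute_drop / initial_wer * 100  # as a share of the start

print(f"{absolute_drop:.2f} points, {relative_drop:.1f}% relative")
# 38.45 points, 40.1% relative
```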

Q: What are the recommended use cases?

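The model is suited to automatic speech recognition on Filipino and English audio, particularly mixed-language speech. A WER of 57.50% corresponds to roughly 57 word-level errors per 100 reference words, so the output is best treated as a rough draft: it fits experimentation and applications that can tolerate or correct a high error rate, not use cases that need accurate transcripts.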
