english-filipino-wav2vec2-l-xls-r-test-09
Property | Value |
---|---|
License | Apache 2.0 |
Base Model | wav2vec2-large-xlsr-53-english |
Downloads | 18,976 |
Final WER | 57.50% |
What is english-filipino-wav2vec2-l-xls-r-test-09?
This is a speech recognition model fine-tuned from wav2vec2-large-xlsr-53-english for English-Filipino speech recognition. It uses transfer learning to adapt an English wav2vec2 checkpoint to Filipino and mixed English-Filipino speech.
Implementation Details
The model was trained using PyTorch with native AMP mixed-precision training. Key training parameters include a learning rate of 0.002, a batch size of 16 (with gradient accumulation), and the Adam optimizer. Training ran for 20 epochs with a linear learning rate scheduler and 500 warmup steps.
- Architecture: Wav2vec2-Large-XLSR based
- Training Dataset: filipino_voice
- Validation Loss: 1.0054
- Word Error Rate: 57.50%
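The linear scheduler with warmup described above can be sketched in plain Python. This is a minimal illustration, not the training code used for this model: the peak learning rate (0.002) and warmup length (500 steps) come from the parameters listed above, while `total_steps` is a hypothetical value chosen for the example, since the total number of optimizer steps is not stated here.

```python
def linear_schedule_lr(step: int,
                       peak_lr: float = 0.002,      # from the model card
                       warmup_steps: int = 500,     # from the model card
                       total_steps: int = 10_000):  # ASSUMPTION: not given in the card
    """Learning rate at a given optimizer step for linear warmup + linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 250, 500, 5_250, 10_000):
        print(s, linear_schedule_lr(s))
```

The rate peaks exactly at the end of warmup and reaches zero at the final step, which is the shape usually meant by a "linear" scheduler with warmup.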
Core Capabilities
- Automatic Speech Recognition for Filipino and English
- Support for mixed-language speech processing
- Training metrics logged for monitoring in TensorBoard
- Compatible with Transformers pipeline
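As a rough sketch of the Transformers pipeline compatibility noted above, the model could be loaded as follows. The `MODEL_ID` value is a placeholder: Hub identifiers normally take the form `<owner>/<name>`, and the owner is not given here, so it must be filled in before this will run. The `transcribe` helper and the audio file name are hypothetical names introduced for illustration.

```python
# PLACEHOLDER: replace with the full Hub id ("<owner>/<name>"); the owner is not stated in this card.
MODEL_ID = "english-filipino-wav2vec2-l-xls-r-test-09"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with the automatic-speech-recognition pipeline."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # returns a dict with a "text" key
    return result["text"]
```

A call such as `transcribe("sample.wav")` (with `sample.wav` standing in for your own recording) would download the checkpoint on first use. Given the reported WER, expect transcripts to need correction.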
Frequently Asked Questions
Q: What makes this model unique?
This model specializes in Filipino-English speech recognition and is built on the wav2vec2-large-xlsr-53 architecture. WER improved steadily during training, falling from 95.95% to 57.50%, an absolute drop of 38.45 points (about 40% relative).
Q: What are the recommended use cases?
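The model is suitable for automatic speech recognition tasks involving Filipino and English content, particularly where speakers mix the two languages. It's best used in applications where a WER of around 57.50% is acceptable; at that rate, substitutions, deletions, and insertions together amount to more than half the number of words in the reference transcript.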
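To make the WER figure concrete, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It is a plain-Python illustration with a hypothetical function name and made-up example sentences, not the evaluation script used for this model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative sentences only: one deleted word ("ka") and one substitution.
    wer = word_error_rate("kumusta ka how are you today",
                          "kumusta how is you today")
    print(f"{wer:.2%}")  # 2 errors / 6 reference words
```

Because insertions count as errors, WER can exceed 100%, so it is an error ratio and not a percentage of words transcribed correctly.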