# wav2vec2-large-danish-npsc-nst
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Base Model | chcaa/xls-r-300m-danish |
| Best WER | 6.69% |
| Training Steps | 142,000 |
## What is wav2vec2-large-danish-npsc-nst?
This is a Danish speech recognition model built on the wav2vec2 architecture, fine-tuned from the XLS-R-300M Danish base model (chcaa/xls-r-300m-danish). After extensive fine-tuning, the model achieves a Word Error Rate (WER) of 6.69%.
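WER counts the word-level substitutions, deletions, and insertions needed to turn the hypothesis transcript into the reference, divided by the reference length. A minimal sketch of the metric (standard Levenshtein distance over words, not the exact script used to score this model):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein edit distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

# One substituted word out of five reference words -> WER 0.2
print(wer("det er en god dag", "det var en god dag"))  # -> 0.2
```

A reported WER of 6.69% therefore means roughly one word-level error per fifteen reference words on the evaluation set.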
## Implementation Details
The model was trained for 15 epochs using mixed-precision training with Native AMP. Key training parameters included a learning rate of 0.0001, an effective batch size of 32 (per-device batch size of 16 with 2 gradient accumulation steps), and 2,000 warmup steps.
- Optimizer: Adam (β1=0.9, β2=0.999, ε=1e-8)
- Learning rate scheduler: Linear with warmup
- Training duration: 142,000 steps, with steady improvement over the course of training
- Final validation loss: 0.0587
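The settings above can be sketched as a standard linear-with-warmup schedule plus the effective-batch-size arithmetic. The step counts and peak learning rate come from this card; the schedule shape is the usual linear rule and is illustrative, not the training code itself:

```python
PEAK_LR = 1e-4        # learning rate from the training config
WARMUP_STEPS = 2_000  # linear warmup steps
TOTAL_STEPS = 142_000 # total training steps

def linear_warmup_lr(step: int) -> float:
    """Linear ramp from 0 to PEAK_LR over WARMUP_STEPS, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Effective batch size = per-device batch size * gradient accumulation steps.
effective_batch = 16 * 2
print(effective_batch)                 # -> 32
print(linear_warmup_lr(WARMUP_STEPS))  # peak of 1e-4 at the end of warmup
```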
## Core Capabilities
- State-of-the-art Danish speech recognition
- Robust performance with low WER
- Optimized for production deployment with PyTorch
- Supports inference endpoints for scalable usage
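Like other wav2vec2 ASR models, this one emits per-frame token predictions that are decoded with CTC: consecutive repeats are collapsed, then the blank token is dropped. A minimal greedy-decoding sketch (the token IDs and toy vocabulary are illustrative, not the model's actual vocabulary):

```python
from itertools import groupby

BLANK = 0  # illustrative blank-token ID

def ctc_greedy_decode(frame_ids: list[int], id_to_char: dict[int, str]) -> str:
    """Collapse repeated frame predictions, then remove CTC blanks."""
    collapsed = [tid for tid, _ in groupby(frame_ids)]  # merge consecutive repeats
    return "".join(id_to_char[tid] for tid in collapsed if tid != BLANK)

# Toy vocabulary: 0 = blank, 1 = 'h', 2 = 'e', 3 = 'j'  ("hej" is Danish for "hi")
vocab = {1: "h", 2: "e", 3: "j"}
print(ctc_greedy_decode([1, 1, 0, 2, 2, 0, 0, 3], vocab))  # -> hej
```

In practice the model's tokenizer performs this step; real deployments often add a language model on top of greedy decoding to push WER lower.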
## Frequently Asked Questions
**Q: What makes this model unique?**
The model stands out for its exceptional performance on Danish speech recognition, achieving a very low WER of 6.69% through careful fine-tuning and extensive training. It builds upon the proven wav2vec2 architecture while being specifically optimized for Danish language processing.
**Q: What are the recommended use cases?**
This model is ideal for Danish automatic speech recognition tasks, including transcription services, voice assistants, and audio content analysis. It's particularly well-suited for applications requiring high accuracy in Danish speech processing.