wav2vec2-large-danish-npsc-nst

Maintained by NbAiLab

Property        Value
License         Apache 2.0
Base Model      chcaa/xls-r-300m-danish
Best WER        6.69%
Training Steps  142,000

What is wav2vec2-large-danish-npsc-nst?

This is a Danish automatic speech recognition model built on the wav2vec2 architecture and fine-tuned from the chcaa/xls-r-300m-danish XLS-R-300M base model. After 142,000 training steps it reaches a best Word Error Rate (WER) of 6.69%.
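
For reference, here is a minimal transcription sketch. It assumes the checkpoint is published on the Hugging Face Hub as NbAiLab/wav2vec2-large-danish-npsc-nst (the exact Hub identifier is an assumption) and that a local Danish audio file is available; the file name below is hypothetical.

```python
# Minimal inference sketch for this model. The Hub ID and the audio
# file name are assumptions, not confirmed by the model card.
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

MODEL_ID = "NbAiLab/wav2vec2-large-danish-npsc-nst"  # assumed Hub identifier

processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
model.eval()

# Load a Danish clip and resample to the 16 kHz rate wav2vec2 expects.
waveform, sample_rate = torchaudio.load("danish_sample.wav")  # hypothetical file
if sample_rate != 16_000:
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = processor(waveform.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: pick the most likely token per frame, then
# collapse repeats and blanks during batch_decode.
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)
```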

Implementation Details

The model was trained for 15 epochs using mixed-precision training with native AMP. Key hyperparameters included a learning rate of 1e-4, an effective batch size of 32 (per-device batch size of 16 with gradient accumulation over 2 steps), and 2,000 warmup steps; the sketch after the list below shows how these map onto a standard training configuration.

  • Optimizer: Adam (β1=0.9, β2=0.999, ε=1e-8)
  • Learning rate scheduler: Linear with warmup
  • Training duration: 142,000 steps with consistent performance improvements
  • Final validation loss: 0.0587
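
The following is one way to express the reported hyperparameters as Hugging Face TrainingArguments. The numeric values come from this card; the output path is illustrative, and the original training script may have differed in other respects.

```python
# Sketch mapping the card's hyperparameters onto TrainingArguments.
# Only the numbers above are from the source; output_dir is hypothetical.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./wav2vec2-large-danish-npsc-nst",  # hypothetical path
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,    # effective batch size of 32
    learning_rate=1e-4,
    warmup_steps=2000,
    num_train_epochs=15,
    fp16=True,                        # mixed precision (native AMP)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",       # linear decay after warmup
)
```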

Core Capabilities

  • State-of-the-art Danish speech recognition
  • Robust performance with low WER
  • Optimized for production deployment with PyTorch
  • Supports inference endpoints for scalable usage (see the pipeline sketch below)
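
For quick experimentation, the transformers ASR pipeline offers a shorter path than the manual sketch above. The same assumed Hub identifier and a hypothetical audio file are used here:

```python
# One-call transcription via the automatic-speech-recognition pipeline.
# Model ID and audio file name are assumptions carried over from above.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="NbAiLab/wav2vec2-large-danish-npsc-nst",  # assumed Hub identifier
)
result = asr("danish_sample.wav")  # hypothetical 16 kHz mono audio file
print(result["text"])
```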

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its exceptional performance on Danish speech recognition, achieving a very low WER of 6.69% through careful fine-tuning and extensive training. It builds upon the proven wav2vec2 architecture while being specifically optimized for Danish language processing.

Q: What are the recommended use cases?

This model is ideal for Danish automatic speech recognition tasks, including transcription services, voice assistants, and audio content analysis. It's particularly well-suited for applications requiring high accuracy in Danish speech processing.
