NB-Whisper Large
| Property | Value |
|---|---|
| Parameter Count | 1.54B |
| License | Apache 2.0 |
| Languages | Norwegian, Norwegian Bokmål, Norwegian Nynorsk, English |
| Training Data | 66,000 hours of speech |
| Base Model | OpenAI Whisper Large |
What is nb-whisper-large?
NB-Whisper Large is an automatic speech recognition (ASR) model developed by the National Library of Norway and designed specifically for Norwegian. It builds on OpenAI's Whisper architecture and is the largest variant in the NB-Whisper series, with 1.54B parameters. It was trained on a dataset of 8 million samples totaling 66,000 hours of speech.
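As a quick sanity check, the two dataset figures above imply an average clip length of just under 30 seconds, which is consistent with the 30-second input window that Whisper models operate on. The variable names below are illustrative only.

```python
# Dataset figures quoted in the description above.
TOTAL_HOURS = 66_000
NUM_SAMPLES = 8_000_000

# Average clip length, in seconds, implied by those two figures.
avg_clip_seconds = TOTAL_HOURS * 3600 / NUM_SAMPLES

print(f"Average clip length: {avg_clip_seconds:.1f} s")
```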
Implementation Details
The model uses Whisper's transformer encoder-decoder architecture and was trained for 250,000 steps on diverse Norwegian speech data. It is distributed in several formats, including PyTorch, TensorFlow, and ONNX, so it can be deployed across different runtimes.
- Supports both transcription and translation tasks
- Handles long-form audio through chunk-based processing
- Provides word-level and sentence-level timestamps
- Compatible with WhisperX for speaker diarization
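The features listed above can be exercised through the Hugging Face Transformers speech recognition pipeline. The following is a minimal sketch, not an official recipe. It assumes the checkpoint is published on the Hugging Face Hub as `NbAiLab/nb-whisper-large`, that `"no"` is the language code you want, and that a 30-second chunk length suits your audio; `build_call_kwargs` and `transcribe` are hypothetical helper names introduced here for illustration.

```python
from typing import Any, Dict, Union

# Assumption: the Hub ID under which the checkpoint is published.
MODEL_ID = "NbAiLab/nb-whisper-large"


def build_call_kwargs(task: str = "transcribe",
                      language: str = "no",
                      chunk_length_s: float = 30.0,
                      timestamps: Union[bool, str] = True) -> Dict[str, Any]:
    """Assemble keyword arguments for a Transformers ASR pipeline call."""
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    return {
        # Long-form audio is split into chunks of this many seconds.
        "chunk_length_s": chunk_length_s,
        # True gives segment-level timestamps, "word" gives word-level ones.
        "return_timestamps": timestamps,
        # Passed through to Whisper's generate() call.
        "generate_kwargs": {"task": task, "language": language},
    }


def transcribe(audio_path: str, **options: Any) -> Dict[str, Any]:
    """Run the model on one audio file (downloads the weights on first use)."""
    from transformers import pipeline  # imported lazily: heavy dependency

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(audio_path, **build_call_kwargs(**options))
```

A call such as `transcribe("speech.mp3", timestamps="word")`, where `speech.mp3` is a placeholder file name, would return a dictionary with the full text and a list of timestamped chunks. Check the model card for the recommended chunk length and language settings before relying on these defaults.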
Core Capabilities
- Automatic Speech Recognition in Norwegian (Bokmål and Nynorsk)
- English translation support
- Real-time transcription through whisper.cpp
- Flexible output formatting with semantic and verbatim versions
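Timestamped output is often post-processed into a subtitle format. The sketch below converts a list of timestamped chunks into SRT text. It assumes the chunk layout returned by the Transformers ASR pipeline when timestamps are requested (a `"text"` string and a `"timestamp"` pair of start and end seconds); the helper names and the sample sentence are made up for illustration.

```python
from typing import Dict, List


def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def chunks_to_srt(chunks: List[Dict]) -> str:
    """Convert timestamped chunks into SRT subtitle text."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:  # defensive: a final chunk may lack an end time
            end = start
        blocks.append(
            f"{index}\n{srt_time(start)} --> {srt_time(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    return "\n".join(blocks)


# Made-up sample input in the assumed chunk layout.
sample = [{"timestamp": (0.0, 2.5), "text": " Hei verden."}]
print(chunks_to_srt(sample))
```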
Frequently Asked Questions
Q: What makes this model unique?
The model is specialized for Norwegian. Its training on 66,000 hours of speech makes it particularly effective for Norwegian ASR tasks, where it aims to outperform general-purpose multilingual models.
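No benchmark figures are given here, so a comparison with a general-purpose model is best checked on your own data. A common metric for this is word error rate (WER): the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch with simple lowercase and whitespace normalization; the function name and the example sentences are illustrative assumptions, and real evaluations usually apply more careful text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Row-by-row dynamic programming over the edit-distance table.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("jeg bor i oslo", "jeg bodde i oslo"))
```

Running the same reference set through two models and comparing the resulting WER values gives a like-for-like measure of which one transcribes your audio more accurately.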
Q: What are the recommended use cases?
The model is well suited to transcribing Norwegian speech in a range of contexts, including parliamentary speeches, broadcast media, audiobooks, and general conversational audio. It is particularly suitable for applications that require high-accuracy Norwegian transcription.