NB-Whisper Large Beta

Property	Value
Parameter Count	1.54B parameters
License	CC-BY-4.0
Languages	Norwegian (Bokmål, Nynorsk), English
Training Data	20,000 hours of labeled data
Paper	Coming Soon

What is nb-whisper-large-beta?

NB-Whisper Large Beta is a state-of-the-art automatic speech recognition (ASR) model developed by the National Library of Norway. Built on OpenAI's Whisper architecture, it is the largest variant in the NB-Whisper series and is optimized specifically for Norwegian. It has 1.54 billion parameters and was trained on 20,000 hours of labeled Norwegian speech data.

Implementation Details

The model was trained with JAX/Flax and then converted to several other formats, including PyTorch, TensorFlow, whisper.cpp, and ONNX, for broader accessibility. It uses the Transformer architecture and was fine-tuned from the original Whisper model with specific optimizations for Norwegian language varieties.
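Because a PyTorch conversion is among the listed formats, one way to try the model is through the Hugging Face transformers pipeline. The following is a minimal sketch, not official usage: the repository ID NbAiLab/nb-whisper-large-beta and the helper names load_asr and transcribe are assumptions made for illustration, and the transformers and torch packages must be installed separately.

```python
MODEL_ID = "NbAiLab/nb-whisper-large-beta"  # assumed Hugging Face repository ID


def load_asr(model_id: str = MODEL_ID, device: str = "cpu"):
    """Return a transformers ASR pipeline for the given checkpoint.

    The import is lazy so that this module can be imported
    even where transformers is not installed.
    """
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id, device=device)


def transcribe(asr, audio_path: str) -> str:
    """Run the pipeline on one audio file and return the plain transcript."""
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"].strip()
```

Calling transcribe(load_asr(), "sample.mp3") would download the checkpoint on first use (several gigabytes for a 1.54B-parameter model) and return the transcript; sample.mp3 is a placeholder file name.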

Trained on TPUv4 hardware with significant computational resources
Supports both transcription and timestamp generation
Handles multiple Norwegian language variants (Bokmål and Nynorsk)
Environmental impact: 247.77 kgCO₂ (100% offset)
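To illustrate how transcription, timestamps, and the choice between Bokmål and Nynorsk might be requested together, here is a hedged sketch that builds the keyword arguments for a transformers ASR pipeline call. The helper name build_call_kwargs and the 28-second chunk length are assumptions, as is the mapping of Bokmål to Whisper's general Norwegian code "no"; "nn" is Whisper's code for Nynorsk.

```python
from typing import Any, Dict

# Whisper language codes. Mapping Bokmål to "no" is an assumption of this sketch.
LANGUAGE_CODES = {"bokmål": "no", "nynorsk": "nn", "english": "en"}


def build_call_kwargs(variant: str = "bokmål", timestamps: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for a transformers ASR pipeline call."""
    try:
        language = LANGUAGE_CODES[variant.lower()]
    except KeyError:
        raise ValueError(f"unsupported variant: {variant!r}") from None
    return {
        # Split long audio into chunks shorter than Whisper's 30-second window.
        "chunk_length_s": 28,
        # Ask the pipeline to return (start, end) times alongside the text.
        "return_timestamps": timestamps,
        # Forwarded to the model's generate() call.
        "generate_kwargs": {"task": "transcribe", "language": language},
    }
```

With a pipeline object named asr, a Nynorsk transcription with timestamps would then be requested as asr("speech.mp3", **build_call_kwargs("nynorsk")).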

Core Capabilities

High-accuracy Norwegian speech recognition
Timestamp generation for precise audio alignment
Support for both written standards of Norwegian, Bokmål and Nynorsk
Ability to "translate" spoken language into grammatically correct written form
Handling of both formal and conversational Norwegian speech
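As an illustration of what timestamp output can be used for, the sketch below converts timestamped segments into SubRip (SRT) subtitles. It assumes the segment layout returned by the transformers ASR pipeline when timestamps are requested: a list of dicts, each with a "timestamp" pair in seconds and a "text" string. The function names and the 2-second fallback for a missing end time are hypothetical choices.

```python
from typing import Iterable, Mapping


def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def chunks_to_srt(chunks: Iterable[Mapping], fallback_duration: float = 2.0) -> str:
    """Turn timestamped chunks into the text of an SRT subtitle file."""
    entries = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:
            # The final chunk may lack an end time; use an assumed duration.
            end = start + fallback_duration
        text = chunk["text"].strip()
        entries.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    # A blank line separates consecutive subtitle entries.
    return "\n".join(entries)
```

Given a pipeline result named result, chunks_to_srt(result["chunks"]) would produce subtitle text that can be written to a .srt file.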

Frequently Asked Questions

Q: What makes this model unique?

This model is optimized specifically for Norwegian and was trained on an extensive dataset of Norwegian speech, which makes it the current state-of-the-art for Norwegian ASR tasks. It is particularly notable for handling different Norwegian dialects and for converting spoken language into proper written form.

Q: What are the recommended use cases?

The model is well suited to Norwegian speech transcription tasks. Because it is still in beta, it is not recommended for production use without a proper risk assessment. It is particularly suitable for academic research, prototype development, and non-critical applications that require Norwegian speech recognition.
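One input to such a risk assessment could be the word error rate (WER) measured on your own recordings. The sketch below computes WER as the word-level edit distance divided by the number of reference words. It is a generic illustration, not the evaluation procedure used by the model's authors, and the lowercase, whitespace-split normalization is a simplifying assumption.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref: List[str] = reference.lower().split()
    hyp: List[str] = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] is the edit distance between ref[:i - 1] and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

For example, comparing the reference "jeg bor i oslo" with the hypothesis "jeg bodde i oslo" gives one substitution over four reference words, a WER of 0.25.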