Dolphin-Small
| Property | Value |
|---|---|
| Parameter Count | 372M |
| Model Type | ASR (Automatic Speech Recognition) |
| Architecture | Joint CTC-Attention with E-Branchformer encoder |
| License | Apache 2.0 |
| Average WER | 25.2% |
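For readers unfamiliar with the Average WER figure above, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` is an illustrative choice, and this is not the evaluation script behind the 25.2% figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

For languages written without spaces between words, the same routine is commonly applied to characters instead of words. Which unit the figure above uses is not stated here.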
What is dolphin-small?
Dolphin-small is a multilingual ASR model developed jointly by DataoceanAI and Tsinghua University. It targets Eastern languages: it supports 40 languages across East Asia, South Asia, Southeast Asia, and the Middle East, plus 22 Chinese dialects. It was trained on over 210,000 hours of data.
Implementation Details
The model uses a joint CTC-Attention architecture with an E-Branchformer-based encoder and a standard Transformer decoder. Its two-level language token system handles linguistic and regional diversity through separate language and region tokens. Key details:
- 372M parameters, balancing accuracy against compute cost
- Trained on a mix of proprietary and open-source datasets
- Requires FFmpeg to convert input audio to WAV format
- Streamlined architecture: the model performs speech recognition only and does not support translation
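To illustrate what "joint CTC-Attention" usually means at decoding time, here is a minimal sketch of the standard score interpolation: each candidate transcript gets a weighted sum of its CTC log-probability and its attention-decoder log-probability. The `Hypothesis` class, the function names, the weight of 0.3, and the numbers in the example are all illustrative assumptions, not values taken from the Dolphin code or configuration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Hypothesis:
    text: str
    ctc_logprob: float  # log-probability of the text under the CTC branch
    att_logprob: float  # log-probability of the text under the attention decoder


def joint_score(hyp: Hypothesis, ctc_weight: float = 0.3) -> float:
    """Interpolate the two branch scores: w * CTC + (1 - w) * attention."""
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must be in [0, 1]")
    return ctc_weight * hyp.ctc_logprob + (1.0 - ctc_weight) * hyp.att_logprob


def pick_best(hyps: Iterable[Hypothesis], ctc_weight: float = 0.3) -> Hypothesis:
    """Return the candidate with the highest joint score."""
    return max(hyps, key=lambda h: joint_score(h, ctc_weight))


if __name__ == "__main__":
    # Made-up scores: the attention decoder slightly prefers the first
    # candidate, while the CTC branch strongly prefers the second.
    candidates = [
        Hypothesis("candidate one", ctc_logprob=-4.0, att_logprob=-2.0),
        Hypothesis("candidate two", ctc_logprob=-1.0, att_logprob=-2.5),
    ]
    print(pick_best(candidates, ctc_weight=0.3).text)
```

Setting `ctc_weight` to 0 reduces this to attention-only scoring and setting it to 1 gives CTC-only scoring, so the weight controls how much the CTC branch can overrule the decoder.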
Core Capabilities
- Speech Recognition across 40 languages
- Voice Activity Detection (VAD)
- Audio Segmentation
- Language Identification (LID)
- Support for 22 Chinese dialects
Frequently Asked Questions
Q: What makes this model unique?
The model's distinguishing features are its focus on Eastern languages and Chinese dialects and its two-level language token system, which represents language and region as separate tokens. Its reported average WER is 25.2%.
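As a rough sketch of the two-level idea, the snippet below builds a decoder prefix from a language code and an optional region code, so that one language can be paired with several regions without a separate token for every combination. The angle-bracket spelling, the `LanguageTag` type, and the codes used in the example are hypothetical placeholders and may differ from the model's actual vocabulary.

```python
from typing import List, NamedTuple, Optional


class LanguageTag(NamedTuple):
    language: str                 # language code (hypothetical example: "zh")
    region: Optional[str] = None  # region code (hypothetical example: "CN")


def prompt_tokens(tag: LanguageTag) -> List[str]:
    """Return the language token, followed by the region token if one is given.

    The "<...>" token format is an assumption made for illustration only.
    """
    tokens = [f"<{tag.language.lower()}>"]
    if tag.region:
        tokens.append(f"<{tag.region.upper()}>")
    return tokens


if __name__ == "__main__":
    print(prompt_tokens(LanguageTag("zh", "cn")))
    print(prompt_tokens(LanguageTag("zh")))
```

Leaving the region unset in this sketch yields only the language token, which mirrors the case where the language is known but the regional variety is not.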
Q: What are the recommended use cases?
The model suits applications that need multilingual ASR for Eastern languages, such as transcription services, voice assistants, and automated content processing systems. It is most useful where the audio includes Chinese dialects or other Asian languages.
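In a transcription pipeline of this kind, input files typically need converting to WAV with FFmpeg before recognition, as noted under Implementation Details. The following is a minimal sketch of that step using Python's `subprocess` module. The helper names and the choice of 16 kHz mono output are assumptions, since the required sample rate and channel layout are not stated here.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Union

PathLike = Union[str, Path]


def build_ffmpeg_command(src: PathLike, dst: PathLike, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that converts src to a mono WAV file at sample_rate Hz."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-i", str(src),           # input file in any format ffmpeg can decode
        "-ac", "1",               # downmix to a single audio channel
        "-ar", str(sample_rate),  # resample to the target rate (assumed 16 kHz)
        str(dst),                 # output container is inferred from the .wav extension
    ]


def convert_to_wav(src: PathLike, dst: PathLike, sample_rate: int = 16000) -> Path:
    """Run the conversion and return the output path; raises if ffmpeg is missing or fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
    return Path(dst)
```

Keeping the command construction separate from its execution makes the arguments easy to inspect or log before any audio is processed.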