whisper-tiny

Maintained By
openai

Whisper Tiny

Property          Value
Parameter Count   37.8M parameters
Model Type        Automatic Speech Recognition
License           Apache 2.0
Paper             View Paper

What is whisper-tiny?

Whisper-tiny is the smallest variant of OpenAI's Whisper family, built for automatic speech recognition and speech translation. It is a transformer-based encoder-decoder model that supports 99 languages with a footprint of only 37.8M parameters, which keeps memory and compute requirements low.

Implementation Details

The model uses a sequence-to-sequence architecture trained on 680,000 hours of multilingual audio data. It converts input audio to log-Mel spectrograms, and special tokens in the decoder prompt select the task: transcription in the spoken language or translation to English.

  • Available as a multilingual checkpoint (whisper-tiny) and an English-only variant (whisper-tiny.en)
  • Handles audio chunks of up to 30 seconds
  • Includes timestamp prediction capabilities
  • Stores weights as F32 tensors
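
As a rough illustration of the 30-second input window and the decoder prompt described above, the sketch below pads or trims raw samples to a fixed length and builds the list of special tokens for a task. The 16 kHz sample rate and the token strings follow Whisper's published conventions but are not stated in this card, and the helper names (pad_or_trim, split_into_chunks, decoder_prompt) are hypothetical. The non-overlapping split is a simplification, not the library's own long-form algorithm.

```python
SAMPLE_RATE = 16_000                      # assumption: Whisper expects 16 kHz mono audio
CHUNK_SECONDS = 30                        # fixed input window from the list above
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def pad_or_trim(samples, length=CHUNK_SAMPLES):
    """Return exactly `length` samples: drop the excess or zero-pad the tail."""
    samples = list(samples[:length])
    return samples + [0.0] * (length - len(samples))


def split_into_chunks(samples, length=CHUNK_SAMPLES):
    """Split a recording into consecutive, non-overlapping windows of `length` samples.

    The last window is zero-padded. Real long-form decoding typically
    overlaps windows or uses timestamps; this is only a simplification.
    """
    return [pad_or_trim(samples[i:i + length], length)
            for i in range(0, len(samples), length)]


def decoder_prompt(language="en", task="transcribe", timestamps=False):
    """Build the special-token prefix that tells the decoder what to do."""
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")
    tokens = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        # Without this token the decoder is free to emit timestamp tokens.
        tokens.append("<|notimestamps|>")
    return tokens
```

For example, decoder_prompt("fr", "translate") yields the prefix for translating French speech to English without timestamps.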

Core Capabilities

  • Multilingual ASR supporting 99 languages
  • Speech-to-text transcription with a 7.54% word error rate (WER) on the LibriSpeech test-clean set
  • Speech translation to English
  • Long-form transcription through chunking
  • Robust performance across various accents and background noise conditions
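
To make the WER figure above concrete, here is a minimal sketch of how word error rate is usually computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name word_error_rate is hypothetical, and the lowercase-and-split normalization is an assumption. Published benchmark numbers typically apply a more careful text normalizer, so this will not reproduce them exactly.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

A WER of 7.54% therefore corresponds to roughly 7 or 8 word-level errors per 100 reference words.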

Frequently Asked Questions

Q: What makes this model unique?

Whisper-tiny combines multilingual support with a small parameter count of 37.8M. It achieves reasonable accuracy for its size, which makes it suitable for deployment in resource-constrained environments.

Q: What are the recommended use cases?

The model is ideal for lightweight ASR applications, development and testing environments, and scenarios where resource efficiency is crucial. It's particularly well-suited for English transcription tasks, basic multilingual transcription, and prototyping speech recognition solutions.
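
For prototyping, one possible starting point is the Hugging Face transformers pipeline. The sketch below assumes the checkpoint is published on the Hub as openai/whisper-tiny, that transformers and a backend such as PyTorch are installed, and that ffmpeg is available for decoding audio files. The wrapper name transcribe_file is hypothetical.

```python
def transcribe_file(path, translate=False, timestamps=False):
    """Transcribe (or translate to English) one audio file with whisper-tiny."""
    # Imported lazily so this module loads even where transformers is absent.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-tiny",   # assumed Hub id for this model
        chunk_length_s=30,             # long audio is processed in 30-second chunks
    )
    task = "translate" if translate else "transcribe"
    # The result is a dict with a "text" key; with timestamps enabled it
    # also carries a "chunks" list of timestamped segments.
    return asr(path, return_timestamps=timestamps, generate_kwargs={"task": task})
```

Calling transcribe_file("sample.wav")["text"] would return the transcript as a string, where sample.wav stands in for any local audio file. The first call downloads the model weights.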
