Kokoro-82M-ONNX
| Property | Value |
|---|---|
| Parameter Count | 82 million |
| Model Type | Text-to-Speech (TTS) |
| Model Format | ONNX |
| Hugging Face URL | Link |
| Sample Rate | 24 kHz |
What is Kokoro-82M-ONNX?
Kokoro-82M-ONNX is a text-to-speech model that delivers high-quality voice synthesis with only 82 million parameters. It provides multiple voice personas with American and British English accents, in both male and female voices.
Implementation Details
The model is distributed in ONNX format with several quantization options for efficient deployment. Precision ranges from FP32 down to 4-bit quantization, letting users trade model size against output quality. The model takes tokenized text and a style vector as input and produces audio at a 24 kHz sample rate.
- Multiple quantization options ranging from 326 MB (FP32) to 86 MB (mixed precision)
- Support for custom voice styling through reference vectors
- Context length of up to 512 tokens
- Easy integration with both JavaScript and Python environments
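The input contract described in this section (token IDs plus a style vector, within a 512-token context) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the model's documented API: the input names `input_ids`, `style`, and `speed`, the pad token ID 0, and the choice to count two boundary pad tokens toward the 512-token limit are all assumptions, and `split_tokens` and `build_feed` are hypothetical helpers.

```python
from typing import Dict, List, Sequence

MAX_CONTEXT = 512  # context length stated for the model
PAD_ID = 0         # assumption: ID 0 marks the start and end of a sequence


def split_tokens(token_ids: Sequence[int],
                 max_context: int = MAX_CONTEXT) -> List[List[int]]:
    """Split a long token sequence into chunks that fit the context window.

    Two slots per chunk are reserved for boundary pad tokens (assumption).
    """
    budget = max_context - 2
    if budget <= 0:
        raise ValueError("max_context must be greater than 2")
    return [list(token_ids[i:i + budget])
            for i in range(0, len(token_ids), budget)]


def build_feed(token_ids: Sequence[int],
               style: Sequence[float],
               speed: float = 1.0) -> Dict[str, list]:
    """Build a feed dictionary for one chunk of tokens.

    The keys are assumed input names; check them against the actual
    model graph before use.
    """
    if len(token_ids) + 2 > MAX_CONTEXT:
        raise ValueError("token sequence exceeds the 512-token context")
    return {
        "input_ids": [[PAD_ID, *token_ids, PAD_ID]],  # shape (1, seq_len)
        "style": [list(style)],                       # shape (1, style_dim)
        "speed": [float(speed)],                      # shape (1,)
    }
```

The returned lists would still need converting to typed arrays (for example 64-bit integers for the token IDs and 32-bit floats for the rest) before being passed to an ONNX Runtime session. Splitting on a fixed token budget can cut a sentence in half, so splitting at sentence boundaries first is likely to sound better.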
Core Capabilities
- 11 distinct voice personas including American and British accents
- Flexible deployment options through various quantization levels
- High-quality speech synthesis with natural prosody
- Real-time inference capability with optimized performance
- Cross-platform support through ONNX runtime
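One way to check the real-time claim on a given machine is to measure the real-time factor: the time spent synthesizing divided by the duration of the audio produced, where a value below 1 means faster than real time. The sketch below uses only the Python standard library. It assumes the model returns mono floating-point samples in the range [-1, 1], which the source does not state, and `real_time_factor` and `write_wav` are hypothetical helper names.

```python
import struct
import time
import wave
from typing import Callable, Sequence, Tuple

SAMPLE_RATE = 24_000  # output sample rate stated for the model


def audio_seconds(num_samples: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Duration in seconds of a mono clip with the given sample count."""
    return num_samples / sample_rate


def real_time_factor(
    synthesize: Callable[[], Sequence[float]],
) -> Tuple[float, Sequence[float]]:
    """Time one synthesis call and return (elapsed / audio duration, samples)."""
    start = time.perf_counter()
    samples = synthesize()
    elapsed = time.perf_counter() - start
    if len(samples) == 0:
        raise ValueError("synthesis produced no samples")
    return elapsed / audio_seconds(len(samples)), samples


def write_wav(path: str, samples: Sequence[float],
              sample_rate: int = SAMPLE_RATE) -> None:
    """Write float samples as 16-bit mono PCM, clipping to [-1, 1]."""
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
```

Here `synthesize` stands in for whatever callable wraps the actual inference session. Timing a single call includes any warm-up cost, so averaging over several runs gives a steadier figure.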
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its balance between size and quality, offering production-ready speech synthesis with only 82 million parameters. Its multiple voices and quantization options make it adaptable to a range of deployment scenarios.
Q: What are the recommended use cases?
Kokoro-82M-ONNX is ideal for applications requiring high-quality text-to-speech conversion, including audiobook generation, virtual assistants, accessibility tools, and content creation platforms. Its various quantization options make it suitable for both server-side and edge deployment.
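For the server-versus-edge decision, one simple approach is to pick the largest variant that fits a size budget. The sketch below uses only the two sizes quoted in the implementation details (326 MB for FP32 and 86 MB for mixed precision). The variant labels are illustrative rather than real file names, `pick_variant` is a hypothetical helper, and the rule that a larger file means higher fidelity is an assumption.

```python
from typing import Dict, Optional

# Sizes in MB as quoted for the model; the keys are illustrative labels.
VARIANT_SIZES_MB: Dict[str, int] = {
    "fp32": 326,
    "mixed-precision": 86,
}


def pick_variant(budget_mb: float,
                 sizes: Dict[str, int] = VARIANT_SIZES_MB) -> Optional[str]:
    """Return the largest variant that fits the budget, or None if none fit.

    Assumes a larger file corresponds to higher fidelity.
    """
    fitting = {name: size for name, size in sizes.items() if size <= budget_mb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)
```

A server with ample storage would get the FP32 variant under this rule, while a device limited to around 100 MB would get the mixed-precision one. Intermediate quantization levels could be added to the table once their sizes are known.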