viXTTS

Property	Value
Author	capleaf
License	Coqui Public Model License
Base Model	XTTS-v2.0.3
Model URL	https://huggingface.co/capleaf/viXTTS

What is viXTTS?

viXTTS is a text-to-speech model fine-tuned for Vietnamese that retains the multilingual capabilities of its base model, XTTS-v2.0.3. It can clone a voice across languages from a 6-second reference audio sample. The model was fine-tuned on the viVoice dataset, and its tokenizer was expanded to cover Vietnamese.

Implementation Details

viXTTS extends XTTS-v2.0.3 to Vietnamese. Its tokenizer is expanded to handle Vietnamese text, and the 17 other languages remain supported. The model builds on the XTTS architecture and is adapted to Vietnamese speech patterns through fine-tuning on the viVoice dataset.

Fine-tuned on viVoice dataset
Expanded tokenizer for Vietnamese language
Based on XTTS-v2.0.3 architecture
Requires minimal audio input (6 seconds) for voice cloning
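
The 6-second requirement above can be checked before a reference clip is passed to the model. The following is a minimal sketch, assuming the reference audio is an uncompressed PCM WAV file. The function names and the constant MIN_REFERENCE_SECONDS are hypothetical and not part of viXTTS or any TTS library, and treating 6 seconds as a lower bound is an interpretation of the model card.

```python
import wave

# Assumption: the model card's "6 seconds" is treated as a minimum length.
MIN_REFERENCE_SECONDS = 6.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        frame_count = wav_file.getnframes()
        sample_rate = wav_file.getframerate()
    return frame_count / float(sample_rate)


def is_long_enough(path: str, min_seconds: float = MIN_REFERENCE_SECONDS) -> bool:
    """Check whether a reference clip meets the minimum duration."""
    return wav_duration_seconds(path) >= min_seconds
```

A caller could reject or re-record clips for which is_long_enough returns False. Other formats such as MP3 or FLAC would need a different reader.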

Core Capabilities

Support for 18 languages including English, Spanish, French, German, and Vietnamese
Voice cloning with minimal audio input
Optimized performance for Vietnamese language
Cross-lingual voice synthesis

Frequently Asked Questions

Q: What makes this model unique?

viXTTS combines the multilingual coverage of XTTS with Vietnamese-specific fine-tuning. It can clone a voice in any of its 18 supported languages from just 6 seconds of reference audio, and its fine-tuning on Vietnamese data makes it particularly effective for Vietnamese speech synthesis.

Q: What are the recommended use cases?

The model suits applications that need Vietnamese text-to-speech, multilingual voice cloning, or cross-lingual speech synthesis, especially when only a short reference recording is available. For the best Vietnamese output quality, use sentences longer than 10 words.
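
One way to follow the sentence-length recommendation is to merge short sentences into longer chunks before synthesis. The sketch below is illustrative only: the function names are hypothetical, sentences are split on ".", "!" and "?", and "words" are counted as whitespace-separated tokens, which for Vietnamese means syllables. Whether the recommendation refers to syllables or to multi-syllable words is not stated in the source, so the threshold is an assumption.

```python
import re

# Assumption: "longer than 10 words" is read as at least 11 whitespace-separated tokens.
MIN_WORDS = 11


def split_sentences(text: str) -> list[str]:
    """Split text after sentence-ending punctuation followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def merge_short_sentences(text: str, min_words: int = MIN_WORDS) -> list[str]:
    """Group consecutive sentences until each chunk has at least min_words tokens."""
    chunks: list[str] = []
    buffer = ""
    for sentence in split_sentences(text):
        buffer = f"{buffer} {sentence}".strip()
        if len(buffer.split()) >= min_words:
            chunks.append(buffer)
            buffer = ""
    if buffer:
        # A short tail is attached to the previous chunk when one exists.
        if chunks:
            chunks[-1] = f"{chunks[-1]} {buffer}"
        else:
            chunks.append(buffer)
    return chunks
```

Each returned chunk can then be synthesized as a single input. If the whole text is shorter than the threshold, it is returned as one chunk unchanged.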
