wav2vec2-large-xlsr-persian-v3

Maintained By
m3hrdadfi

Property      Value
Author        m3hrdadfi
Task          Automatic Speech Recognition
Language      Persian (Farsi)
WER Score     10.36%

What is wav2vec2-large-xlsr-persian-v3?

This is a fine-tuned version of Facebook's wav2vec2-large-xlsr-53 model for Persian (Farsi) speech recognition. It was fine-tuned on the Common Voice dataset and reports a Word Error Rate (WER) of 10.36%. The model expects audio sampled at 16 kHz and comes with normalization steps specific to Persian text.
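For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that definition; the function name `word_error_rate` is illustrative, and how the 10.36% figure was aggregated over the test set is not stated here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.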

Implementation Details

The model uses the wav2vec2 architecture and adds custom preprocessing steps for Persian. Running it requires the transformers and torchaudio packages, along with the custom normalizers used for Persian text.

  • Built on wav2vec2-large-xlsr-53 architecture
  • Includes specialized Persian text normalization
  • Optimized for 16kHz audio input
  • Supports batch processing for efficient inference
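The points above can be sketched as a small batch-inference routine. This is an illustration under assumptions, not the author's script: the Hub id in `MODEL_ID` is inferred from the author and model name, the helper names `chunked` and `transcribe` are hypothetical, and the Persian normalizers are left out. The heavy dependencies are imported inside `transcribe` so that the batching helper can be used on its own.

```python
from typing import Iterator, List, Sequence

# Assumed Hub id, built from the author and model name given on this page.
MODEL_ID = "m3hrdadfi/wav2vec2-large-xlsr-persian-v3"
TARGET_SR = 16_000  # the model expects 16 kHz input


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def transcribe(paths: Sequence[str], batch_size: int = 4) -> List[str]:
    """Transcribe audio files in batches with greedy CTC decoding."""
    # Imported lazily: these packages are only needed for actual inference.
    import torch
    import torchaudio
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID).eval()

    texts: List[str] = []
    for batch in chunked(paths, batch_size):
        waves = []
        for path in batch:
            wave, sr = torchaudio.load(path)  # shape: (channels, samples)
            wave = wave.mean(dim=0)           # downmix to mono
            if sr != TARGET_SR:
                wave = torchaudio.functional.resample(wave, sr, TARGET_SR)
            waves.append(wave.numpy())

        # Pad the batch to a common length and convert to tensors.
        inputs = processor(waves, sampling_rate=TARGET_SR,
                           return_tensors="pt", padding=True)
        with torch.no_grad():
            logits = model(inputs.input_values,
                           attention_mask=inputs.get("attention_mask")).logits
        ids = torch.argmax(logits, dim=-1)    # greedy pick per frame
        texts.extend(processor.batch_decode(ids))
    return texts
```

Calling `transcribe(["clip1.wav", "clip2.wav"])` with real file paths would download the checkpoint on first use; the file names here are placeholders.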

Core Capabilities

  • Accurate Persian speech recognition
  • Robust handling of various Persian dialects
  • Efficient batch processing of audio files
  • Custom text normalization for Persian language
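To give a sense of what Persian text normalization typically involves, here is a minimal sketch. It is not the normalizer shipped with the model; the function name `normalize_persian` and the specific rules (mapping Arabic yeh and kaf to their Persian forms, dropping tatweel and short-vowel diacritics, collapsing whitespace) are assumptions chosen as common examples.

```python
import re

# Arabic code points commonly replaced by their Persian counterparts.
CHAR_MAP = str.maketrans({
    "\u064A": "\u06CC",  # Arabic yeh  -> Persian yeh
    "\u0643": "\u06A9",  # Arabic kaf  -> Persian keheh
    "\u0640": "",        # tatweel (elongation mark) is dropped
})

# Short-vowel and related diacritics (fathatan through sukun).
DIACRITICS = re.compile("[\u064B-\u0652]")
WHITESPACE = re.compile(r"\s+")


def normalize_persian(text: str) -> str:
    """Apply a few illustrative normalization rules to Persian text."""
    text = text.translate(CHAR_MAP)
    text = DIACRITICS.sub("", text)
    return WHITESPACE.sub(" ", text).strip()
```

Applying the same normalization to both reference transcripts and model output keeps spelling variants of the same word from being counted as recognition errors.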

Frequently Asked Questions

Q: What makes this model unique?

The model is fine-tuned specifically for Persian and reports a WER of 10.36%. It also includes custom normalization tools designed for Persian text, which makes it a good fit for Persian ASR tasks.

Q: What are the recommended use cases?

The model suits Persian speech recognition applications such as transcription services, voice assistants, and automated subtitling systems. Input audio should be sampled at 16 kHz, or resampled to that rate before inference.
