Qari-OCR-0.2.2.1-VL-2B-Instruct

Maintained By
NAMAA-Space

Qari-OCR-0.2.2.1-VL-2B-Instruct

PropertyValue
Base ModelQwen2 VL
Parameters2 Billion
LicenseFollows Qwen2 VL licensing terms
AuthorNAMAA-Space
Model URLHugging Face

What is Qari-OCR-0.2.2.1-VL-2B-Instruct?

Qari-OCR-0.2.2.1-VL-2B-Instruct is a state-of-the-art Arabic Optical Character Recognition (OCR) model fine-tuned on Qwen2-VL-2B-Instruct. It represents a significant advancement in Arabic text recognition, achieving impressive metrics with a Word Error Rate (WER) of 0.221 and Character Error Rate (CER) of 0.059.

Implementation Details

The model was trained on a comprehensive dataset of 50,000 records, incorporating various font sizes (14-40pt) and multiple page layouts including A4, Letter, and custom formats. It supports 12 different Arabic fonts, making it highly versatile for real-world applications.

  • Superior accuracy compared to existing solutions like easyOCR and pytesseract
  • Full diacritics (tashkeel) support including fatḥah, kasrah, ḍammah, and more
  • Flexible layout handling for various document formats
  • Trained on multiple font styles and sizes

Core Capabilities

  • High-accuracy Arabic text extraction (BLEU score: 0.597)
  • Complete diacritical mark recognition
  • Support for multiple page layouts and formats
  • Robust performance across various font styles
  • Enhanced handling of complex Arabic typography

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its exceptional accuracy in Arabic OCR, particularly in handling diacritical marks and various font styles. It achieves significantly better performance metrics compared to existing solutions, with a WER of 0.221 versus competitors' 0.757-1.294.

Q: What are the recommended use cases?

The model is ideal for digitizing Arabic documents, processing academic texts with diacritics, handling business documents, and converting printed Arabic text to digital format. It works best with font sizes between 14-40pt and supports various standard document layouts.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.