lemone-router-s
Property | Value |
---|---|
Parameter Count | 118M |
License | Apache 2.0 |
Base Model | intfloat/multilingual-e5-base |
Accuracy | 90.61% |
Training Data | 7 French taxation datasets |
What is lemone-router-s?
lemone-router-s is a French taxation classification model intended to route questions within multi-agent systems for tax law interpretation. Built on multilingual-e5-base, it was fine-tuned on 49,000 synthetic questions generated with GPT-4 Turbo and Llama 3.1 70B, then refined through evol-instruction tuning and manual curation.
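A minimal inference sketch, assuming the model is published on the Hugging Face Hub and loads with the `transformers` text-classification pipeline. The repository id `louisbrulenaudet/lemone-router-s` and the helper names `load_classifier` and `top_category` are assumptions for illustration, not details taken from this page.

```python
from typing import Callable, Dict, List, Tuple

# Assumed Hub repository id; it is not stated on this page.
MODEL_ID = "louisbrulenaudet/lemone-router-s"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads weights on first use)."""
    # Imported lazily so top_category() can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def top_category(question: str,
                 classifier: Callable[[str], List[Dict]]) -> Tuple[str, float]:
    """Return the highest-scoring (label, score) pair for one question."""
    # The pipeline returns a list of {"label": ..., "score": ...} dicts.
    predictions = classifier(question)
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], float(best["score"])
```

A call would look like `top_category("Comment sont imposées les plus-values immobilières ?", load_classifier())`. The label strings it returns depend on the model's configuration, so they are not shown here.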
Implementation Details
The model classifies inputs into 8 categories derived from the Bulletin officiel des finances publiques-impôts. It uses a transformer architecture with weights stored as F32 tensors. Training ran on an NVIDIA H100 NVL GPU with PyTorch 2.4.1, using a linear learning rate scheduler with warmup and the following hyperparameters:
- Batch size: 8 for training, 64 for evaluation
- Learning rate: 9.736e-05
- Training epochs: 4
- Final validation loss: 0.4446
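To make the scheduler concrete, here is a rough sketch of a linear schedule with warmup: the rate ramps from zero to its peak, then decays linearly back to zero. The peak value is the learning rate listed above. The warmup length and total step count are not given on this page, so the values in the demo loop are hypothetical.

```python
PEAK_LR = 9.736e-05  # learning rate reported for this model


def linear_warmup_lr(step: int, total_steps: int, warmup_steps: int,
                     peak_lr: float = PEAK_LR) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Hypothetical step counts, chosen only for illustration.
    for step in (0, 50, 100, 550, 1000):
        print(step, linear_warmup_lr(step, total_steps=1000, warmup_steps=100))
```

With these illustrative numbers the rate reaches its peak at step 100 and returns to zero at step 1000.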
Core Capabilities
- Eight-category French tax law classification
- Multilingual base model, fine-tuned for French
- High accuracy (90.61%) in tax document routing
- Categories include professional benefits, tax control, and cross-cutting provisions, among others
Frequently Asked Questions
Q: What makes this model unique?
The model is specialized for French taxation law: it reaches 90.61% accuracy across eight categories drawn from the official tax bulletin. Its multilingual-e5-base foundation provides general language understanding, while fine-tuning on tax questions supplies the domain-specific behavior.
Q: What are the recommended use cases?
The model is suited to automated tax document classification, legal research assistance, and integration into tax advisory systems. It is particularly useful for organizations dealing with French tax law, such as accounting firms and legal technology platforms.
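As an illustration of the routing use case, the sketch below dispatches each question to a handler chosen by the predicted category, with a fallback for unknown labels or low-confidence predictions. Everything here is hypothetical: `make_router`, the label strings, the stub classifier, and the 0.5 threshold are assumptions, and `classify` stands in for any callable that returns a `(label, score)` pair.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[str], str]


def make_router(classify: Callable[[str], Tuple[str, float]],
                handlers: Dict[str, Handler],
                fallback: Handler,
                min_score: float = 0.5) -> Handler:
    """Return a function that sends each question to the matching handler."""
    def route(question: str) -> str:
        label, score = classify(question)
        handler = handlers.get(label)
        # Unknown labels and low-confidence predictions go to the fallback.
        if handler is None or score < min_score:
            return fallback(question)
        return handler(question)
    return route


# Hypothetical label strings; the real ones come from the model configuration.
handlers = {
    "tax_control": lambda q: f"[tax control agent] {q}",
    "professional_benefits": lambda q: f"[professional benefits agent] {q}",
}


def stub_classify(question: str) -> Tuple[str, float]:
    """Stand-in classifier used only to make the example runnable."""
    return ("tax_control", 0.9)


router = make_router(stub_classify, handlers,
                     fallback=lambda q: f"[general agent] {q}")
```

Keeping the classifier behind a plain callable means the dispatch logic can be tested without loading the model, and the threshold can be tuned separately from the handlers.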