tamil-codemixed-abusive-MuRIL
| Property | Value |
|---|---|
| License | AFL-3.0 |
| Language | Tamil-English (Code-mixed) |
| Research Paper | View Paper |
| Downloads | 666,021 |
What is tamil-codemixed-abusive-MuRIL?
tamil-codemixed-abusive-MuRIL is a specialized natural language processing model designed to detect abusive speech in code-mixed Tamil-English text. Built on the MuRIL architecture, this model addresses the challenging task of content moderation in multilingual Indian social media contexts.
Implementation Details
The model is fine-tuned from the MuRIL base model with a learning rate of 2e-5. It is a binary classifier that labels each input text as either normal (LABEL_0) or abusive (LABEL_1). It is implemented in PyTorch with the Transformers library, so it can be loaded and served with standard Transformers tooling.
- Built on MuRIL's multilingual understanding capabilities
- Optimized for Tamil-English code-mixed content
- Implements a binary classification architecture (normal vs. abusive)
- Supports Inference Endpoints for scalable deployment
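The label scheme described above can be sketched as a small inference wrapper. This is a minimal sketch, not the authors' reference code: the Hub repository id in `MODEL_ID` is an assumption (the page only gives the model name), and `load_classifier` and `classify` are hypothetical helper names. The `pipeline("text-classification", ...)` call is the standard Transformers API and downloads the weights on first use.

```python
from typing import Callable, Dict, List

# Assumed Hub repository id; the page itself only states the model name.
MODEL_ID = "Hate-speech-CNERG/tamil-codemixed-abusive-MuRIL"

# Label mapping as stated in the description: LABEL_0 = normal, LABEL_1 = abusive.
LABEL_NAMES = {"LABEL_0": "normal", "LABEL_1": "abusive"}


def load_classifier(model_id: str = MODEL_ID):
    """Build a Transformers text-classification pipeline (needs transformers + torch)."""
    from transformers import pipeline  # imported lazily so the helpers below work without it

    return pipeline("text-classification", model=model_id)


def classify(texts: List[str], classifier: Callable) -> List[Dict]:
    """Run the classifier on a batch and translate raw labels into readable names."""
    results = classifier(texts)  # one {'label': ..., 'score': ...} dict per input text
    return [
        {
            "text": text,
            "label": LABEL_NAMES.get(result["label"], result["label"]),
            "score": result["score"],
        }
        for text, result in zip(texts, results)
    ]
```

Typical use would be `classify(["some code-mixed comment"], load_classifier())`. Passing the classifier in as an argument keeps the label mapping testable without downloading the model.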
Core Capabilities
- Detection of abusive content in code-mixed text
- Handles both Tamil and English language elements
- Optimized for social media content analysis
- Supports real-time content moderation
Frequently Asked Questions
Q: What makes this model unique?
This model specifically addresses the challenge of detecting abusive content in code-mixed Tamil-English text, a task that monolingual models struggle with because a single message can switch between the two languages. It is built on the MuRIL architecture and is documented in an accompanying research paper.
Q: What are the recommended use cases?
The model is ideal for social media platforms, content moderation systems, and online communities where Tamil-English code-mixed communications are common. It can be integrated into automated content filtering systems or used for research in online behavior analysis.
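One way to integrate the classifier into an automated filtering system is to turn each prediction into a moderation action. The sketch below is illustrative only: the `Thresholds` values are hypothetical and should be tuned on your own validation data, and it assumes each prediction is a `{'label', 'score'}` dictionary where `score` is the probability of the predicted label, so that for this two-class model the probability of the other class is `1 - score`.

```python
from dataclasses import dataclass

# Per the model description: LABEL_1 = abusive, LABEL_0 = normal.
ABUSIVE_LABEL = "LABEL_1"


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical cut-offs on P(abusive); not taken from the model card."""

    block: float = 0.90   # at or above this, hide the content automatically
    review: float = 0.60  # at or above this, queue for a human moderator


def abusive_probability(prediction: dict) -> float:
    """Convert a {'label', 'score'} prediction into P(abusive)."""
    score = float(prediction["score"])
    return score if prediction["label"] == ABUSIVE_LABEL else 1.0 - score


def moderation_action(prediction: dict, thresholds: Thresholds = Thresholds()) -> str:
    """Map one prediction to 'block', 'review', or 'allow'."""
    p = abusive_probability(prediction)
    if p >= thresholds.block:
        return "block"
    if p >= thresholds.review:
        return "review"
    return "allow"
```

The middle "review" band reflects a common design choice: borderline scores go to a human moderator rather than being blocked or allowed automatically.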