tamil-codemixed-abusive-MuRIL

Maintained By
Hate-speech-CNERG

Property          Value
License           AFL-3.0
Language          Tamil-English (Code-mixed)
Research Paper    View Paper
Downloads         666,021

What is tamil-codemixed-abusive-MuRIL?

tamil-codemixed-abusive-MuRIL is a text classification model that detects abusive speech in code-mixed Tamil-English text. Built on MuRIL, it targets content moderation for multilingual Indian social media.

Implementation Details

The model is fine-tuned from the MuRIL base model with a learning rate of 2e-5. It performs binary classification, labeling text as either normal (LABEL_0) or abusive (LABEL_1). It is built with PyTorch and the Transformers library, so it can be loaded with standard Transformers tooling.

  • Built on MuRIL's multilingual understanding capabilities
  • Optimized for Tamil-English code-mixed content
  • Implements binary classification architecture
  • Supports Inference Endpoints for scalable deployment
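
As a rough illustration of the label scheme described above, the following minimal sketch loads the model through the Transformers `pipeline` API and maps the raw labels to readable names. The Hub identifier `Hate-speech-CNERG/tamil-codemixed-abusive-MuRIL` is an assumption formed from the maintainer and model name on this page, and the helper names `load_classifier` and `readable` are hypothetical.

```python
from typing import Dict

# Label meanings as stated on this page: LABEL_0 = normal, LABEL_1 = abusive.
LABEL_NAMES: Dict[str, str] = {"LABEL_0": "normal", "LABEL_1": "abusive"}

# Assumption: Hub ID built from the maintainer and model name shown above.
MODEL_ID = "Hate-speech-CNERG/tamil-codemixed-abusive-MuRIL"


def load_classifier(model_id: str = MODEL_ID):
    """Create a text-classification pipeline (downloads weights on first use)."""
    # Imported lazily so the label helper below works without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def readable(prediction: Dict[str, object]) -> Dict[str, object]:
    """Replace the raw LABEL_n name in a pipeline prediction with its meaning."""
    return {
        "label": LABEL_NAMES[str(prediction["label"])],
        "score": prediction["score"],
    }


if __name__ == "__main__":
    classifier = load_classifier()
    # Placeholder input; substitute real Tamil-English code-mixed text.
    for prediction in classifier(["example text"]):
        print(readable(prediction))
```

Keeping the label mapping separate from model loading means downstream code can be tested without downloading the weights.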

Core Capabilities

  • Detection of abusive content in code-mixed text (normal vs. abusive)
  • Handles both Tamil and English language elements
  • Optimized for social media content analysis
  • Supports real-time content moderation

Frequently Asked Questions

Q: What makes this model unique?

This model targets abusive-content detection in code-mixed Tamil-English text, a task that monolingual models tend to struggle with because a single message mixes two languages. It is built on MuRIL and is described in an accompanying research paper.

Q: What are the recommended use cases?

The model is ideal for social media platforms, content moderation systems, and online communities where Tamil-English code-mixed communications are common. It can be integrated into automated content filtering systems or used for research in online behavior analysis.
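
One way to wire the classifier into an automated filtering system is to turn each prediction into a moderation decision. The sketch below is illustrative only: the function name `moderation_action` and the thresholds `block_at` and `review_at` are hypothetical, and it assumes the prediction has the `{"label", "score"}` shape returned by a Transformers text-classification pipeline, with the score being a softmax probability over the two labels.

```python
from typing import Dict

ABUSIVE_LABEL = "LABEL_1"  # per this page: LABEL_1 = abusive, LABEL_0 = normal


def moderation_action(
    prediction: Dict[str, object],
    block_at: float = 0.9,   # hypothetical threshold, tune on your own data
    review_at: float = 0.5,  # hypothetical threshold, tune on your own data
) -> str:
    """Return 'block', 'review', or 'allow' for one classifier prediction."""
    if not 0.0 <= review_at <= block_at <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= review_at <= block_at <= 1")

    score = float(prediction["score"])
    # The score belongs to the predicted label; convert it to an
    # abusive-class probability (valid for a two-class softmax output).
    p_abusive = score if prediction["label"] == ABUSIVE_LABEL else 1.0 - score

    if p_abusive >= block_at:
        return "block"
    if p_abusive >= review_at:
        return "review"
    return "allow"
```

Routing mid-confidence cases to human review rather than blocking them outright is a design choice, not something this page prescribes; suitable thresholds depend on the platform's tolerance for false positives.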
