ruT5-base-detox

Maintained by: s-nlp

Property          Value
Parameter Count   223M
Model Type        Text-to-Text Generation
Base Model        ai-forever/ruT5-base
License           OpenRAIL++
Language          Russian

What is ruT5-base-detox?

ruT5-base-detox is a Russian language model specialized for text detoxification. Built on the ruT5-base architecture, it was trained to rewrite toxic Russian text from social media platforms such as Odnoklassniki, Pikabu, and Twitter into neutral, non-offensive language while preserving the original meaning.

Implementation Details

The model utilizes the T5 architecture and was trained on the RUSSE 2022 competition's training dataset. It employs a sequence-to-sequence approach, processing input text through the T5 encoder-decoder framework to generate detoxified output.

  • Based on the ruT5-base architecture with 223M parameters
  • Implemented in PyTorch with F32 (32-bit floating point) weights
  • Compatible with text generation inference endpoints
  • Stores model weights in the Safetensors format

Core Capabilities

  • Russian text detoxification
  • Preserves original message meaning while removing offensive content
  • Handles various types of toxic content from different social media sources
  • Supports batch processing and inference endpoints (see the batching sketch below)
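
As a rough illustration of batch use, several comments can be detoxified in a single forward pass. The tokenizer and model are loaded as in the previous example; padding and length values here are assumptions rather than documented settings.

```python
# Batch detoxification sketch: `tokenizer` and `model` come from the loading example above.
texts = [
    "...",  # toxic Russian comment 1
    "...",  # toxic Russian comment 2
]

# Pad to a common length so the batch can be processed in one generate() call.
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)
output_ids = model.generate(**batch, max_length=128, num_beams=5)
detoxified = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```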

Frequently Asked Questions

Q: What makes this model unique?

This model specifically targets Russian language detoxification, a specialized task that requires understanding of Russian cultural and linguistic nuances in toxic speech. It's trained on real-world data from multiple social media platforms, making it practical for actual use cases.

Q: What are the recommended use cases?

The model is ideal for content moderation systems, social media platforms, and applications requiring automatic transformation of toxic Russian text into more appropriate language. It can be integrated into content filtering systems, chatbots, or any application requiring text sanitization.
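
As one possible integration pattern (not part of the official release), the model can be wrapped behind a small HTTP service that a moderation pipeline calls before publishing user content. The FastAPI choice and the /detoxify endpoint name below are hypothetical.

```python
# Hypothetical moderation-service sketch exposing the model over HTTP.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "s-nlp/ruT5-base-detox"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

app = FastAPI()

class DetoxRequest(BaseModel):
    text: str

@app.post("/detoxify")  # endpoint path is an illustrative choice
def detoxify(req: DetoxRequest):
    inputs = tokenizer(req.text, return_tensors="pt", truncation=True, max_length=128)
    output_ids = model.generate(**inputs, max_length=128, num_beams=5)
    return {"detoxified": tokenizer.decode(output_ids[0], skip_special_tokens=True)}
```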
