Toxic-BERT

Maintained by unitary

Property          Value
Parameter Count   109M
License           Apache 2.0
Framework         PyTorch, JAX
Papers            Link, Link

What is toxic-bert?

Toxic-BERT is a text classification model designed to detect various forms of toxic content in online communications. Developed by Unitary, it is built on the BERT architecture and trained on multiple Jigsaw challenge datasets, making it capable of identifying different types of toxicity, including threats, obscenity, insults, and identity-based hate.

Implementation Details

The model uses the BERT-base-uncased architecture and was trained with PyTorch Lightning and Hugging Face Transformers. It performs multi-label classification and scores 0.98636 on the original Toxic Comment Classification Challenge.

  • Supports multiple toxicity labels including toxic, severe_toxic, obscene, threat, insult, and identity_hate
  • Implements bias-aware classification to minimize unintended bias towards identity groups
  • Offers multilingual support for seven languages: English, French, Spanish, Italian, Portuguese, Turkish, and Russian
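Multi-label classification here means each toxicity label receives an independent sigmoid probability rather than competing in a softmax, so a single comment can be flagged as both obscene and insulting at once. A minimal sketch of that scoring step, assuming the six Jigsaw labels above; the logits and the 0.5 threshold are illustrative, not real model output:

```python
import numpy as np

# The six Jigsaw toxicity labels the model predicts.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def multi_label_scores(logits, threshold=0.5):
    """Apply an independent sigmoid per label (multi-label classification,
    not single-label softmax) and collect labels above the threshold."""
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    flagged = [label for label, p in zip(LABELS, probs) if p >= threshold]
    return dict(zip(LABELS, probs.round(3))), flagged

# Hypothetical logits for one comment -- not actual toxic-bert output.
scores, flagged = multi_label_scores([2.1, -3.0, 1.4, -4.2, 0.9, -3.5])
print(flagged)  # ['toxic', 'obscene', 'insult'] -- several labels fire at once
```

Because the probabilities are independent, they need not sum to one; that is the key design difference from a single-label classifier.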

Core Capabilities

  • Multi-label toxic content classification
  • Bias-aware prediction system
  • Multilingual support
  • High accuracy with 109M parameters
  • Easy integration through PyTorch and JAX frameworks

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its comprehensive approach to toxicity detection, combining high accuracy with bias awareness and multilingual capabilities. It's been trained on multiple Jigsaw challenge datasets, making it robust across different types of toxic content.

Q: What are the recommended use cases?

The model is primarily intended for research purposes and content moderation assistance. It's particularly useful for automated content filtering systems, helping content moderators flag potentially harmful content more efficiently. However, users should be aware of potential limitations regarding biases and consider the ethical implications discussed in the model's documentation.
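For moderation assistance, a common pattern is to route rather than auto-delete: hide only very high-confidence toxic comments and queue borderline ones for human review. A sketch of such a triage policy; the score dictionary, thresholds, and function name are illustrative assumptions, not part of the model's API:

```python
def triage(scores, hide_at=0.9, review_at=0.5):
    """Route a comment based on its highest per-label toxicity score.
    scores: dict mapping label -> probability (e.g. toxic-bert output)."""
    top_label, top_score = max(scores.items(), key=lambda kv: kv[1])
    if top_score >= hide_at:
        return "hide", top_label      # high confidence: auto-hide
    if top_score >= review_at:
        return "review", top_label    # borderline: send to a human moderator
    return "publish", top_label       # low toxicity: let through

# Hypothetical scores, not real model output.
action, label = triage({"toxic": 0.72, "insult": 0.55, "threat": 0.02})
print(action, label)  # review toxic
```

Keeping a human in the loop for mid-range scores is one way to mitigate the bias limitations noted above, since the model's errors concentrate in exactly that uncertain band.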
