roberta-spam

Maintained By
mshenoda

RoBERTa Spam Detection Model

Property         Value
Parameter Count  125M
License          MIT
Accuracy         99.06%
Paper            RoBERTa Paper

What is roberta-spam?

roberta-spam is a text classification model built on the RoBERTa architecture and fine-tuned to detect spam messages. It reports 99.71% precision and 99.34% recall, which makes it well suited to protecting organizations against spam.

Implementation Details

The model is fine-tuned from RoBERTa-base on a dataset merged from three sources: SMS Spam Collection, Telegram Spam Ham, and Enron Spam. It performs binary classification, with label 0 for ham and label 1 for spam.
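The label convention above (0 for ham, 1 for spam) can be illustrated with a small decoding helper. This is a minimal sketch rather than the model's own code: it assumes the classification head emits two raw logits ordered as [ham, spam], and the names `ID2LABEL`, `softmax`, and `decode_logits` are hypothetical.

```python
import math

# Label convention stated in the text: 0 = ham, 1 = spam.
ID2LABEL = {0: "ham", 1: "spam"}


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_logits(logits):
    """Map a pair of logits to (label, probability of that label).

    Assumption: index 0 holds the ham score and index 1 the spam score.
    """
    if len(logits) != 2:
        raise ValueError("expected exactly two logits for binary classification")
    probs = softmax(logits)
    class_id = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[class_id], probs[class_id]
```

Calling `decode_logits([0.2, 3.1])` would return the label "spam" together with its probability, since the second logit is the larger one.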

  • Built on RoBERTa-base architecture
  • Training data split: 80% training, 10% validation, 10% testing
  • Ships weights in the safetensors format for safe, fast loading
  • Supports the PyTorch framework
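The merge and the 80/10/10 split listed above could be sketched as follows. The source does not say how the split was performed, so the shuffling, the fixed seed, and the function name `merge_and_split` are assumptions made for illustration; the inputs stand in for the three datasets.

```python
import random


def merge_and_split(sources, seed=0):
    """Merge labeled examples from several sources and split them 80/10/10.

    `sources` is a list of lists of (text, label) pairs,
    where label 0 means ham and label 1 means spam.
    """
    merged = [example for source in sources for example in source]

    # Shuffle with a fixed seed so the split is reproducible (an assumption).
    rng = random.Random(seed)
    rng.shuffle(merged)

    n = len(merged)
    n_train = n * 8 // 10  # integer arithmetic avoids float rounding
    n_val = n // 10
    train = merged[:n_train]
    val = merged[n_train:n_train + n_val]
    test = merged[n_train + n_val:]  # remainder, so no example is dropped
    return train, val, test
```

A stratified split that preserves the spam-to-ham ratio in each part would be a reasonable refinement if the merged data is imbalanced.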

Core Capabilities

  • Binary classification of messages as spam or ham
  • High precision spam detection (99.71%)
  • Effective handling of various message formats
  • Production-ready with inference endpoints
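For readers who want to check figures such as the precision quoted above on their own data, the standard definitions can be computed directly from true and predicted labels. This is a generic sketch, not the evaluation script used for the model; the function name `binary_metrics` is hypothetical, and spam (label 1) is treated as the positive class.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, and accuracy for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # spam correctly flagged
        elif pred == positive:
            fp += 1  # ham wrongly flagged as spam
        elif truth == positive:
            fn += 1  # spam that slipped through
        else:
            tn += 1  # ham correctly passed

    total = tp + fp + fn + tn
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
    }
```

Precision answers how many flagged messages were really spam, while recall answers how much of the spam was caught; a filter that quarantines mail usually prioritizes precision so that legitimate messages are not lost.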

Frequently Asked Questions

Q: What makes this model unique?

The model pairs the RoBERTa architecture with a dataset merged from multiple sources and reports 99.06% accuracy in spam detection. Its high precision (99.71%) and recall (99.34%) make it reliable for production environments.

Q: What are the recommended use cases?

The model is ideal for organizations looking to enhance their security infrastructure against spam messages, particularly those containing malicious links or phishing attempts. It can be integrated into email systems, messaging platforms, and content moderation systems.
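One way such an integration might look is a small routing layer that accepts any classifier callable. Everything here is a sketch under stated assumptions: `route_messages`, `load_hf_classifier`, and the 0.9 threshold are hypothetical, the loader assumes the `transformers` library and a backend such as PyTorch are installed, and it assumes the checkpoint names its spam class "spam" or with a label ending in "1" (for example "LABEL_1").

```python
from typing import Callable, Iterable, List, Tuple

# A classifier is any callable mapping text to (label, score),
# where label is "spam" or "ham" and score is a confidence in [0, 1].
Classifier = Callable[[str], Tuple[str, float]]


def route_messages(
    messages: Iterable[str], classify: Classifier, threshold: float = 0.9
) -> Tuple[List[str], List[str]]:
    """Split messages into (inbox, quarantine).

    A message is quarantined only when it is labeled spam with a score
    at or above `threshold`; everything else is delivered.
    """
    inbox: List[str] = []
    quarantine: List[str] = []
    for text in messages:
        label, score = classify(text)
        if label == "spam" and score >= threshold:
            quarantine.append(text)
        else:
            inbox.append(text)
    return inbox, quarantine


def load_hf_classifier(model_id: str) -> Classifier:
    """Wrap a Hugging Face text-classification pipeline as a Classifier.

    Assumption: the spam class is labeled "spam" or ends in "1".
    """
    from transformers import pipeline  # lazy import; optional dependency

    pipe = pipeline("text-classification", model=model_id)

    def classify(text: str) -> Tuple[str, float]:
        result = pipe(text, truncation=True)[0]
        raw = str(result["label"]).lower()
        label = "spam" if raw == "spam" or raw.endswith("1") else "ham"
        return label, float(result["score"])

    return classify
```

The model identifier is not given in the text; a value such as `"mshenoda/roberta-spam"` is inferred from the maintainer and model name and should be verified before use. Keeping the classifier behind a plain callable lets the routing logic be tested with a stub and swapped to the real model later.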
