roberta-base-openai-detector

Maintained By
openai-community

Property     Value
Parameters   125M
License      MIT
Paper        Release Strategies and the Social Impacts of Language Models
Accuracy     ~95% on GPT-2 generated text

What is roberta-base-openai-detector?

The roberta-base-openai-detector is a model developed by OpenAI for detecting text generated by GPT-2. It is built on the RoBERTa base architecture and fine-tuned to distinguish human-written text from content generated by the 1.5B parameter GPT-2 model. Its purpose is narrow: it identifies GPT-2 output, not AI-generated content in general.

Implementation Details

The model is a sequence classifier built on the RoBERTa base architecture and fine-tuned on outputs from the 1.5B GPT-2 model together with WebText data. It performs binary classification, labeling a text as either human-written or AI-generated.

  • Based on RoBERTa base architecture (125M parameters)
  • Fine-tuned on GPT-2 1.5B model outputs
  • Targets text generated with various sampling methods, including temperature, Top-K, and nucleus sampling
  • Achieves approximately 95% detection accuracy on GPT-2 generated text
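
The binary classification described above can be sketched at a lower level: a sequence classifier emits one logit per class, and a softmax turns those logits into probabilities. The following is a minimal sketch, not the authors' implementation. The helper names `softmax`, `decide`, and `score_text` are hypothetical, and the Hub id `openai-community/roberta-base-openai-detector` is an assumption inferred from the maintainer and model name listed on this page. Label names are read from the checkpoint's own configuration rather than hard-coded.

```python
import math
from typing import Dict, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Convert raw classifier logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits: List[float], id2label: Dict[int, str]) -> Tuple[str, float]:
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def score_text(
    text: str,
    model_id: str = "openai-community/roberta-base-openai-detector",  # assumed Hub id
) -> Tuple[str, float]:
    """Run the checkpoint directly (requires torch and transformers)."""
    # Imported lazily so softmax/decide stay usable without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    # Truncate long inputs to the model's maximum sequence length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    # Map the winning class index to the label name stored in the model config.
    return decide(logits, model.config.id2label)
```

Calling `score_text("some passage")` would download the checkpoint on first use and return a (label, probability) pair. The probability is the model's confidence in its own prediction, not a calibrated likelihood that the text is machine-written.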

Core Capabilities

  • Binary classification of text (Real vs AI-generated)
  • Robust performance across different text sampling methods
  • Particularly effective with GPT-2 generated content
  • Integrates with the Hugging Face Transformers pipeline API
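
As a minimal sketch of the pipeline integration listed above, the following assumes the checkpoint is available on the Hugging Face Hub as `openai-community/roberta-base-openai-detector` (inferred from the maintainer and model name on this page) and that the `text-classification` pipeline returns one dict with `label` and `score` keys per input. The helper name `detect` is hypothetical, and the label strings depend on the checkpoint's configuration.

```python
from typing import Callable, Dict, List, Optional

# Assumed Hub id, combining the maintainer and model name given on this page.
MODEL_ID = "openai-community/roberta-base-openai-detector"


def detect(texts: List[str], classifier: Optional[Callable] = None) -> List[Dict]:
    """Classify each text and return one {'text', 'label', 'score'} dict per input."""
    if classifier is None:
        # Imported lazily so the helper can also be driven by a stand-in classifier.
        from transformers import pipeline

        classifier = pipeline("text-classification", model=MODEL_ID)
    # truncation=True keeps long inputs within the model's maximum sequence length.
    predictions = classifier(texts, truncation=True)
    return [
        {"text": text, "label": pred["label"], "score": float(pred["score"])}
        for text, pred in zip(texts, predictions)
    ]
```

For example, `detect(["Hello world! Is this content AI-generated?"])` would return a single-element list holding the predicted label and its score. Passing a pre-built `classifier` avoids reloading the model on every call.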

Frequently Asked Questions

Q: What makes this model unique?

This model is trained specifically to detect GPT-2 generated text, with roughly 95% accuracy on that task. It is one of the first dedicated AI content detectors that OpenAI released alongside its language models.

Q: What are the recommended use cases?

The model is best suited for research on synthetic text generation and detection. It should not be used as a standalone basis for serious allegations that a text is AI-generated. Because it was trained on GPT-2 output, it is likely to be less reliable on text from newer models such as ChatGPT.
