granite-guardian-hap-38m

Maintained By: ibm-granite

Granite Guardian HAP 38M

Property         Value
Parameter Count  38.5M
License          Apache 2.0
Developer        IBM Research
Architecture     4-layer RoBERTa-based
Release Date     September 6th, 2024

What is granite-guardian-hap-38m?

Granite Guardian HAP 38M is IBM's lightweight toxicity classifier for detecting hateful, abusive, and profane (HAP) content in English text. It is a compressed RoBERTa-style encoder: relative to the RoBERTa base configuration, the number of hidden layers drops from 12 to 4 and the hidden size from 768 to 576. The reduction is intended to cut inference cost while keeping classification quality high.

Implementation Details

The compressed architecture runs on both CPU and GPU. The weights are stored as F32 tensors, and the model is implemented in PyTorch. It targets high-throughput scenarios and can serve as an efficient guardrail for large language models.

  • Reduced parameter count (38.5M) compared with the 12-layer RoBERTa architecture it derives from
  • Reduced hidden size (576) and intermediate size (768)
  • Compatible with Transformers library and Safetensors
  • Supports batch processing and real-time inference
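
As a rough sanity check on the figures above, the parameter count can be estimated from the layer sizes. This is a back-of-the-envelope sketch, not the official accounting: the vocabulary size (50265), the position table size (514), and the RoBERTa-style two-layer classification head are assumptions taken from the stock RoBERTa configuration, not from this page.

```python
def linear(n_in: int, n_out: int) -> int:
    """Parameters of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


def layer_norm(n: int) -> int:
    """Parameters of a LayerNorm: scale plus shift."""
    return 2 * n


def estimate_params(
    hidden: int = 576,        # hidden size stated on this page
    intermediate: int = 768,  # intermediate size stated on this page
    layers: int = 4,          # 4-layer architecture stated on this page
    vocab: int = 50265,       # ASSUMPTION: stock RoBERTa vocabulary size
    positions: int = 514,     # ASSUMPTION: stock RoBERTa position table
    labels: int = 2,          # binary classifier
) -> int:
    # Token, position, and token-type embeddings, then one LayerNorm.
    embeddings = vocab * hidden + positions * hidden + hidden + layer_norm(hidden)
    per_layer = (
        4 * linear(hidden, hidden)       # query, key, value, attention output
        + layer_norm(hidden)
        + linear(hidden, intermediate)   # feed-forward up-projection
        + linear(intermediate, hidden)   # feed-forward down-projection
        + layer_norm(hidden)
    )
    # ASSUMPTION: RoBERTa-style head (dense layer followed by output projection).
    head = linear(hidden, hidden) + linear(hidden, labels)
    return embeddings + layers * per_layer + head


total = estimate_params()
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the estimate lands at roughly 38.5M, in line with the listed parameter count. Most of that total sits in the token embedding table, which is why shrinking the hidden size reduces the model more than the layer count alone would suggest.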

Core Capabilities

  • Binary classification of toxic content
  • Low-latency inference suitable for real-time applications
  • Efficient CPU performance without compromising accuracy
  • Bulk document processing support
  • Integration with data preparation workflows
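
The capabilities above can be sketched with the Transformers library. This is a minimal sketch under stated assumptions, not IBM's reference code: the Hub id `ibm-granite/granite-guardian-hap-38m` is inferred from the maintainer and model name on this page, class index 1 is assumed to be the toxic label, and the helper names (`batched`, `load`, `toxicity_scores`) and the batch size of 32 are illustrative.

```python
from typing import Iterator, List, Sequence

MODEL_ID = "ibm-granite/granite-guardian-hap-38m"  # ASSUMPTION: Hub repository id


def batched(texts: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` texts for bulk processing."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])


def load(model_id: str = MODEL_ID):
    """Load tokenizer and model. Imported lazily because the dependency is heavy."""
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # inference only: disable dropout
    return tokenizer, model


def toxicity_scores(texts: Sequence[str], tokenizer, model,
                    batch_size: int = 32) -> List[float]:
    """Return one toxicity probability per input text."""
    import torch

    scores: List[float] = []
    for chunk in batched(texts, batch_size):
        inputs = tokenizer(chunk, padding=True, truncation=True,
                           return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        # ASSUMPTION: class index 1 is the toxic label.
        scores.extend(torch.softmax(logits, dim=-1)[:, 1].tolist())
    return scores
```

Calling `toxicity_scores(texts, *load())` would download the weights on first use and return probabilities in input order. Thresholding those probabilities gives the binary decision, and the batch size is the main knob for trading latency against throughput.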

Frequently Asked Questions

Q: What makes this model unique?

The model's distinguishing feature is its balance of accuracy and resource efficiency: it offers accuracy comparable to larger models at significantly lower inference latency. It is designed for production deployments where response time is critical.

Q: What are the recommended use cases?

The model is ideal for content moderation systems, AI safety guardrails, bulk content analysis, and real-time text filtering applications. It's particularly suitable for scenarios requiring high-throughput processing or where computational resources are limited.
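
The guardrail use case can be illustrated with a small wrapper that screens both the prompt and the model's reply. This is a hedged sketch: `guarded_generate`, the refusal message, and the 0.5 threshold are hypothetical choices, and the scorer and generator below are stubs standing in for the HAP classifier and a real LLM call.

```python
from typing import Callable

Scorer = Callable[[str], float]   # returns a toxicity probability in [0, 1]
Generator = Callable[[str], str]  # stands in for an LLM call

REFUSAL = "[blocked by HAP filter]"  # hypothetical placeholder message


def guarded_generate(prompt: str, generate: Generator, score: Scorer,
                     threshold: float = 0.5) -> str:
    """Screen the prompt before generation and the reply after it."""
    if score(prompt) >= threshold:
        return REFUSAL  # toxic input: skip the expensive LLM call entirely
    reply = generate(prompt)
    if score(reply) >= threshold:
        return REFUSAL  # toxic output: do not return it to the user
    return reply


# Stub components so the sketch runs without downloading a model.
def keyword_scorer(text: str) -> float:
    return 1.0 if "badword" in text.lower() else 0.0


def echo_model(prompt: str) -> str:
    return f"You said: {prompt}"


print(guarded_generate("hello", echo_model, keyword_scorer))
print(guarded_generate("some BADWORD here", echo_model, keyword_scorer))
```

In a real deployment the `score` argument would be backed by the classifier rather than a keyword check. Because the filter runs twice per request, a small, low-latency classifier is what makes this pattern practical.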
