tiny-bert-sst2-distilled

Maintained By: philschmid

Property     Value
License      Apache 2.0
Framework    PyTorch 1.9.1
Dataset      GLUE SST2
Accuracy     83.26%

What is tiny-bert-sst2-distilled?

tiny-bert-sst2-distilled is a lightweight, distilled version of BERT specifically optimized for sentiment analysis tasks. Based on google/bert_uncased_L-2_H-128_A-2, this model demonstrates impressive efficiency while maintaining strong performance on the SST2 (Stanford Sentiment Treebank) dataset.

Implementation Details

The model uses a compact architecture with just 2 transformer layers and a hidden size of 128, trained with mixed precision and the Adam optimizer. Training ran for 7 epochs with a learning rate of roughly 0.00072 and a batch size of 1024.

  • Native AMP mixed precision training
  • Linear learning rate scheduler
  • Optimized batch size for efficiency
  • Achieves 83.26% accuracy on the SST2 evaluation set
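
The hyperparameters above can be reproduced with the Hugging Face Trainer API. The following is a minimal sketch under those assumptions; the original run used knowledge distillation from a larger teacher model, which is omitted here for brevity, so the author's actual script and argument choices may differ.

```python
# Minimal fine-tuning sketch matching the reported hyperparameters.
# NOTE: this omits the distillation loss against a teacher model and is
# only an illustration of the stated training configuration.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

checkpoint = "google/bert_uncased_L-2_H-128_A-2"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# GLUE SST2: binary sentiment labels over single sentences.
sst2 = load_dataset("glue", "sst2")
encoded = sst2.map(
    lambda batch: tokenizer(batch["sentence"], truncation=True),
    batched=True,
)

args = TrainingArguments(
    output_dir="tiny-bert-sst2-distilled",
    num_train_epochs=7,                # 7 epochs
    learning_rate=7.2e-4,              # ~0.00072
    per_device_train_batch_size=1024,  # large batch size
    fp16=True,                         # native AMP mixed precision
    lr_scheduler_type="linear",        # linear learning rate schedule
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```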

Core Capabilities

  • Binary sentiment classification
  • Efficient inference with minimal computational requirements
  • Balanced trade-off between model size and performance
  • Suitable for resource-constrained environments
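
For reference, a short usage sketch with the transformers pipeline API is shown below; it assumes the checkpoint is published on the Hugging Face Hub under the id philschmid/tiny-bert-sst2-distilled.

```python
# Binary sentiment classification with the distilled model.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="philschmid/tiny-bert-sst2-distilled",
)

print(classifier("This movie was absolutely wonderful!"))
# Returns a label/score pair, e.g. [{'label': 'positive', 'score': 0.99}]
# (exact label names depend on the model's config).
```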

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its extreme compression while maintaining strong performance, using only 2 layers compared to BERT-base's 12 layers, making it ideal for deployment in resource-constrained environments.
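
The size difference can be checked directly from the model configs; the snippet below is a sketch that assumes the same Hub id as above.

```python
# Compare the distilled model's architecture with BERT-base.
from transformers import AutoConfig

tiny = AutoConfig.from_pretrained("philschmid/tiny-bert-sst2-distilled")
base = AutoConfig.from_pretrained("bert-base-uncased")

print(tiny.num_hidden_layers, tiny.hidden_size)  # expected: 2, 128
print(base.num_hidden_layers, base.hidden_size)  # expected: 12, 768
```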

Q: What are the recommended use cases?

This model is best suited for sentiment analysis tasks where computational efficiency is crucial, such as real-time applications, mobile devices, or large-scale text processing with limited resources.
