tiny-bert-sst2-distilled
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Framework | PyTorch 1.9.1 |
| Dataset | GLUE SST2 |
| Accuracy | 83.26% |
What is tiny-bert-sst2-distilled?
tiny-bert-sst2-distilled is a lightweight, distilled version of BERT optimized for sentiment analysis. Built on google/bert_uncased_L-2_H-128_A-2, it retains strong performance on the SST-2 (Stanford Sentiment Treebank) dataset, reaching 83.26% accuracy, at a fraction of the size and compute cost of BERT-base.
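For quick experimentation, the checkpoint can be loaded through the Hugging Face `pipeline` API. The sketch below assumes the model is published on the Hub; `your-namespace/tiny-bert-sst2-distilled` is a placeholder repo id, and the exact label names depend on the checkpoint's config.

```python
from transformers import pipeline

# Placeholder repo id -- substitute the actual Hub path of this checkpoint.
classifier = pipeline(
    "text-classification",
    model="your-namespace/tiny-bert-sst2-distilled",
)

print(classifier("This movie was surprisingly good!"))
# e.g. [{'label': 'positive', 'score': 0.99}]  (label names depend on the config)
```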
Implementation Details
The model uses a compact architecture with just 2 transformer layers and a hidden size of 128. It was trained with mixed precision and the Adam optimizer over 7 epochs, with a learning rate of 0.00072 and a batch size of 1024; a configuration sketch follows the list below.
- Native AMP mixed precision training
- Linear learning rate scheduler
- Optimized batch size for efficiency
- Achieves 83.26% accuracy on evaluation
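As a rough illustration, the hyperparameters above map onto a `TrainingArguments` setup like the one below. This is a sketch only: it fine-tunes the 2-layer base checkpoint with the standard cross-entropy loss, whereas the released model was distilled, so the knowledge-distillation objective against a teacher is omitted, and any detail not listed in this card (warmup, weight decay, hardware) is an assumption.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Student starts from the 2-layer, 128-hidden BERT checkpoint named above.
model_name = "google/bert_uncased_L-2_H-128_A-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# GLUE SST-2, tokenized to a fixed length for simplicity.
dataset = load_dataset("glue", "sst2").map(
    lambda batch: tokenizer(
        batch["sentence"], truncation=True, padding="max_length", max_length=128
    ),
    batched=True,
)

# Hyperparameters taken from the card; a batch size of 1024 assumes ample GPU memory.
args = TrainingArguments(
    output_dir="tiny-bert-sst2-distilled",
    learning_rate=7.2e-4,
    per_device_train_batch_size=1024,
    per_device_eval_batch_size=1024,
    num_train_epochs=7,
    lr_scheduler_type="linear",
    fp16=True,  # native AMP mixed precision
)

# Plain cross-entropy fine-tuning; the actual distillation run would add a
# KD loss computed against a fine-tuned teacher model.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,
)
trainer.train()
```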
Core Capabilities
- Binary sentiment classification
- Efficient inference with minimal computational requirements
- Balanced trade-off between model size and performance (see the footprint check after this list)
- Suitable for resource-constrained environments
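The footprint is easy to verify by counting parameters directly. As before, the repo id below is a placeholder for the actual Hub path of this checkpoint.

```python
from transformers import AutoModelForSequenceClassification

# Placeholder repo id -- substitute the actual Hub path of this checkpoint.
model = AutoModelForSequenceClassification.from_pretrained(
    "your-namespace/tiny-bert-sst2-distilled"
)

n_params = sum(p.numel() for p in model.parameters())
size_mb = sum(p.numel() * p.element_size() for p in model.parameters()) / 1024**2
print(f"parameters: {n_params:,}  (~{size_mb:.1f} MB in fp32)")
```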
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its extreme compression: it uses only 2 transformer layers compared to BERT-base's 12 while still performing strongly on SST-2, which makes it well suited for deployment in resource-constrained environments.
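The compression can be seen by comparing the base checkpoint's config with BERT-base; the values in the comments follow from the `L-2_H-128_A-2` naming and the standard bert-base-uncased config.

```python
from transformers import AutoConfig

tiny = AutoConfig.from_pretrained("google/bert_uncased_L-2_H-128_A-2")
base = AutoConfig.from_pretrained("bert-base-uncased")

for name, cfg in [("tiny", tiny), ("bert-base", base)]:
    print(
        f"{name}: layers={cfg.num_hidden_layers}, "
        f"hidden={cfg.hidden_size}, heads={cfg.num_attention_heads}"
    )
# expected: tiny: layers=2, hidden=128, heads=2
#           bert-base: layers=12, hidden=768, heads=12
```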
Q: What are the recommended use cases?
This model is best suited for sentiment analysis tasks where computational efficiency is crucial, such as real-time applications, mobile devices, or large-scale text processing with limited resources.
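For large-scale processing, the same pipeline can be fed a list of texts with an explicit batch size; the repo id remains a placeholder.

```python
from transformers import pipeline

# Placeholder repo id -- substitute the actual Hub path of this checkpoint.
classifier = pipeline(
    "text-classification",
    model="your-namespace/tiny-bert-sst2-distilled",
    device=-1,  # CPU; set to a GPU index if one is available
)

reviews = [
    "An absolute delight from start to finish.",
    "The plot was a mess and the acting was worse.",
]

# Passing a list with an explicit batch_size keeps per-example overhead low.
for result in classifier(reviews, batch_size=64):
    print(result)
```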