tiny-bert-sst2-distilled
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Framework | PyTorch 1.9.1 |
| Dataset | GLUE SST2 |
| Accuracy | 83.26% |
What is tiny-bert-sst2-distilled?
tiny-bert-sst2-distilled is a lightweight, distilled version of BERT optimized for sentiment analysis. Built on google/bert_uncased_L-2_H-128_A-2, it retains strong performance on the SST-2 (Stanford Sentiment Treebank) dataset, reaching 83.26% accuracy, at a fraction of the size and compute cost of BERT-base.
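For quick experimentation, the checkpoint can be loaded through the Hugging Face `pipeline` API. The sketch below assumes the model is published on the Hub; `your-namespace/tiny-bert-sst2-distilled` is a placeholder repo id, and the exact label names depend on the checkpoint's config.

```python
from transformers import pipeline

# Placeholder repo id -- substitute the actual Hub path of this checkpoint.
classifier = pipeline(
    "text-classification",
    model="your-namespace/tiny-bert-sst2-distilled",
)

print(classifier("This movie was surprisingly good!"))
# e.g. [{'label': 'positive', 'score': 0.99}]  (label names depend on the config)
```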
Implementation Details
The model uses a compact architecture with just 2 transformer layers and a hidden size of 128. It was trained with mixed precision and the Adam optimizer over 7 epochs, with a learning rate of 0.00072 and a batch size of 1024; a configuration sketch follows the list below.
- Native AMP mixed precision training
- Linear learning rate scheduler
- Optimized batch size for efficiency
- Achieves 83.26% accuracy on evaluation
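As a rough illustration, the hyperparameters above map onto a `TrainingArguments` setup like the one below. This is a sketch only: it fine-tunes the 2-layer base checkpoint with the standard cross-entropy loss, whereas the released model was distilled, so the knowledge-distillation objective against a teacher is omitted, and any detail not listed in this card (warmup, weight decay, hardware) is an assumption.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Student starts from the 2-layer, 128-hidden BERT checkpoint named above.
model_name = "google/bert_uncased_L-2_H-128_A-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# GLUE SST-2, tokenized to a fixed length for simplicity.
dataset = load_dataset("glue", "sst2").map(
    lambda batch: tokenizer(
        batch["sentence"], truncation=True, padding="max_length", max_length=128
    ),
    batched=True,
)

# Hyperparameters taken from the card; a batch size of 1024 assumes ample GPU memory.
args = TrainingArguments(
    output_dir="tiny-bert-sst2-distilled",
    learning_rate=7.2e-4,
    per_device_train_batch_size=1024,
    per_device_eval_batch_size=1024,
    num_train_epochs=7,
    lr_scheduler_type="linear",
    fp16=True,  # native AMP mixed precision
)

# Plain cross-entropy fine-tuning; the actual distillation run would add a
# KD loss computed against a fine-tuned teacher model.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,
)
trainer.train()
```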
Core Capabilities
- Binary sentiment classification
- Efficient inference with minimal computational requirements
- Balanced trade-off between model size and performance (see the footprint check after this list)
- Suitable for resource-constrained environments
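The footprint is easy to verify by counting parameters directly. As before, the repo id below is a placeholder for the actual Hub path of this checkpoint.

```python
from transformers import AutoModelForSequenceClassification

# Placeholder repo id -- substitute the actual Hub path of this checkpoint.
model = AutoModelForSequenceClassification.from_pretrained(
    "your-namespace/tiny-bert-sst2-distilled"
)

n_params = sum(p.numel() for p in model.parameters())
size_mb = sum(p.numel() * p.element_size() for p in model.parameters()) / 1024**2
print(f"parameters: {n_params:,}  (~{size_mb:.1f} MB in fp32)")
```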
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its extreme compression: it uses only 2 transformer layers compared to BERT-base's 12 while still performing strongly on SST-2, which makes it well suited for deployment in resource-constrained environments.
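The compression can be seen by comparing the base checkpoint's config with BERT-base; the values in the comments follow from the `L-2_H-128_A-2` naming and the standard bert-base-uncased config.

```python
from transformers import AutoConfig

tiny = AutoConfig.from_pretrained("google/bert_uncased_L-2_H-128_A-2")
base = AutoConfig.from_pretrained("bert-base-uncased")

for name, cfg in [("tiny", tiny), ("bert-base", base)]:
    print(
        f"{name}: layers={cfg.num_hidden_layers}, "
        f"hidden={cfg.hidden_size}, heads={cfg.num_attention_heads}"
    )
# expected: tiny: layers=2, hidden=128, heads=2
#           bert-base: layers=12, hidden=768, heads=12
```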
Q: What are the recommended use cases?
This model is best suited for sentiment analysis tasks where computational efficiency is crucial, such as real-time applications, mobile devices, or large-scale text processing with limited resources.
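For large-scale processing, the same pipeline can be fed a list of texts with an explicit batch size; the repo id remains a placeholder.

```python
from transformers import pipeline

# Placeholder repo id -- substitute the actual Hub path of this checkpoint.
classifier = pipeline(
    "text-classification",
    model="your-namespace/tiny-bert-sst2-distilled",
    device=-1,  # CPU; set to a GPU index if one is available
)

reviews = [
    "An absolute delight from start to finish.",
    "The plot was a mess and the acting was worse.",
]

# Passing a list with an explicit batch_size keeps per-example overhead low.
for result in classifier(reviews, batch_size=64):
    print(result)
```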