bert-tiny-Massive-intent-KD-BERT

By gokuls

A compact BERT model fine-tuned for intent classification on the MASSIVE dataset, achieving 85.34% accuracy through knowledge distillation.

  • License: Apache 2.0
  • Base Model: google/bert_uncased_L-2_H-128_A-2
  • Accuracy: 85.34%
  • Training Dataset: MASSIVE

What is bert-tiny-Massive-intent-KD-BERT?

This is a lightweight BERT model designed for intent classification, built with knowledge distillation on the MASSIVE dataset. It is based on the BERT-tiny architecture (2 transformer layers, hidden size 128, and 2 attention heads, per the base model google/bert_uncased_L-2_H-128_A-2), making it efficient while maintaining strong performance at 85.34% accuracy.
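A minimal inference sketch, assuming the checkpoint is available on the Hugging Face Hub under the id shown above; the example utterance is illustrative:

```python
# Minimal inference sketch using the transformers text-classification pipeline.
# Assumes the checkpoint is hosted at gokuls/bert-tiny-Massive-intent-KD-BERT.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="gokuls/bert-tiny-Massive-intent-KD-BERT",
)

# Returns a list of {"label": <intent>, "score": <float>} dicts.
print(classifier("wake me up at nine am on friday"))
```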

Implementation Details

The model was trained with a learning rate of 5e-05, a batch size of 16, and 50 epochs using the Adam optimizer, with native AMP (Automatic Mixed Precision) for efficient computation. Additional training details (a configuration sketch follows the list below):

  • Linear learning rate scheduler
  • Mixed precision training implementation
  • Trained on the MASSIVE dataset with English (en-US) configuration
  • Validation loss of 0.8380
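The original training script is not reproduced here; the following is a hedged sketch of how the reported hyperparameters map onto the transformers TrainingArguments API. The output_dir is a placeholder, and Trainer's default AdamW optimizer stands in for the reported Adam:

```python
# Sketch of the reported training configuration via transformers TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-tiny-massive-intent-kd",  # placeholder path
    learning_rate=5e-5,                        # as reported
    per_device_train_batch_size=16,            # batch size 16
    per_device_eval_batch_size=16,
    num_train_epochs=50,                       # 50 epochs
    lr_scheduler_type="linear",                # linear LR scheduler
    fp16=True,                                 # native AMP mixed precision
    report_to="tensorboard",                   # TensorBoard-compatible logging
)
```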

Core Capabilities

  • Text classification optimized for intent detection
  • Efficient inference with a compact model architecture
  • Suitable for production deployment with a PyTorch backend (see the sketch after this list)
  • Compatible with TensorBoard for monitoring
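For deployments that bypass the pipeline abstraction, a hedged sketch of direct PyTorch inference follows; the model id matches the card above, and the input sentence is illustrative:

```python
# Direct PyTorch inference without the pipeline wrapper.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "gokuls/bert-tiny-Massive-intent-KD-BERT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("set a timer for ten minutes", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring class index back to its intent label.
intent = model.config.id2label[logits.argmax(dim=-1).item()]
print(intent)
```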

Frequently Asked Questions

Q: What makes this model unique?

This model combines the efficiency of BERT-tiny architecture with knowledge distillation techniques to achieve high accuracy (85.34%) on intent classification tasks while maintaining a small footprint.
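For readers unfamiliar with the technique, below is a generic knowledge-distillation loss sketch in PyTorch. The teacher model, temperature, and loss weighting used for this particular checkpoint are not stated in the card, so all values here are hypothetical:

```python
# Generic knowledge-distillation loss: blend soft teacher targets with hard labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-scaled distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard targets: standard cross-entropy against the gold intent labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```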

Q: What are the recommended use cases?

The model is particularly well-suited for intent classification in conversational AI applications, chatbots, and other natural language understanding tasks where computational efficiency is important.
