twitter-roberta-base-dec2021-tweet-topic-multi-all
| Property | Value |
|---|---|
| Author | CardiffNLP |
| Downloads | 24,276 |
| Task Type | Multi-label Text Classification |
| Base Architecture | RoBERTa |
What is twitter-roberta-base-dec2021-tweet-topic-multi-all?
This is a RoBERTa-based model fine-tuned for multi-label topic classification of tweets. Developed by CardiffNLP, it achieves a micro F1 score of 0.76 and a macro F1 score of 0.62 on the test set.
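The gap between the two F1 scores is easier to read with the definitions in hand. The sketch below is a minimal illustration on made-up toy data (not the model's actual predictions, and not the evaluation script used for the reported numbers). Micro F1 pools true positives, false positives, and false negatives over all labels, so frequent topics dominate; macro F1 averages the per-label F1 scores, so a poorly predicted rare topic pulls it down more.

```python
from typing import List, Sequence, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(
    y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]
) -> Tuple[float, float]:
    """Compute (micro F1, macro F1) for binary multi-label indicator rows."""
    n_labels = len(y_true[0])
    counts: List[Tuple[int, int, int]] = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))

    # Micro: pool the counts over all labels, then compute one F1.
    micro = f1_from_counts(
        sum(c[0] for c in counts),
        sum(c[1] for c in counts),
        sum(c[2] for c in counts),
    )
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1_from_counts(*c) for c in counts) / n_labels
    return micro, macro


# Toy data: label 0 is frequent and always right, label 2 is rare and missed.
toy_true = [[1, 0, 0], [1, 0, 0], [1, 1, 0], [1, 0, 1]]
toy_pred = [[1, 0, 0], [1, 0, 0], [1, 1, 0], [1, 0, 0]]
micro_f1, macro_f1 = micro_macro_f1(toy_true, toy_pred)
print(f"micro={micro_f1:.3f} macro={macro_f1:.3f}")
```

In this toy case one missed rare label leaves micro F1 high while macro F1 drops by a third, which is the same pattern as a micro score above the macro score.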
Implementation Details
The model starts from the twitter-roberta-base-dec2021 checkpoint and is fine-tuned on the tweet_topic_multi dataset. It performs multi-label classification, so it can assign several relevant topics to a single tweet.
- Fine-tuned on the train_all split of the tweet_topic_multi dataset
- Validated on the test_2021 split
- Uses the PyTorch framework with the Transformers library
- Implements sigmoid activation for multi-label prediction
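As a rough illustration of the sigmoid step, the sketch below turns per-topic logits into independent probabilities and keeps every topic above a threshold. Several assumptions are made here and are not stated in the source: the 0.5 threshold, the three example topic names, and the Hub identifier `cardiffnlp/twitter-roberta-base-dec2021-tweet-topic-multi-all` used in the optional `classify_tweet` helper. That helper is only defined, not called, because it downloads the model.

```python
import math
from typing import Dict, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(
    logits: Sequence[float], labels: Sequence[str], threshold: float = 0.5
) -> Dict[str, float]:
    """Apply a sigmoid to each logit independently and keep labels whose
    probability is at or above the threshold (the 0.5 default is an assumption)."""
    probs = [sigmoid(z) for z in logits]
    return {label: p for label, p in zip(labels, probs) if p >= threshold}


def classify_tweet(
    text: str,
    model_id: str = "cardiffnlp/twitter-roberta-base-dec2021-tweet-topic-multi-all",
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Hypothetical end-to-end helper; model_id is an assumed Hub identifier.
    Requires torch and transformers, and network access on first use."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    labels = [model.config.id2label[i] for i in range(len(logits))]
    return decode_multilabel(logits, labels, threshold)


# Illustrative logits and topic names (not real model output).
example_labels = ["sports", "music", "news"]
example_logits = [2.0, -1.0, 0.0]
selected = decode_multilabel(example_logits, example_labels)
print(selected)
```

Unlike a softmax, the sigmoid scores each topic on its own, so the probabilities need not sum to 1 and a tweet can pass the threshold for zero, one, or several topics.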
Core Capabilities
- Multi-label topic classification of tweets
- Handles complex social media text including mentions and hashtags
- Achieves 54.85% accuracy on the multi-label classification task
- Provides probability scores for each topic category
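The accuracy figure can look low next to an F1 score in the 0.7 range. One plausible reading, which is an assumption here because the source does not define the metric, is that accuracy means exact-match (subset) accuracy: a tweet counts as correct only when the full set of predicted topics equals the full set of gold topics. A minimal sketch on toy data:

```python
from typing import Sequence


def exact_match_accuracy(
    y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]
) -> float:
    """Fraction of examples whose predicted label vector matches the gold
    vector in every position. This is an assumed definition of accuracy."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    hits = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return hits / len(y_true)


# Toy data: the second example gets one of its two topics right,
# which still counts as a miss under exact match.
toy_gold = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
toy_guess = [[1, 0, 0], [1, 0, 0], [0, 0, 1], [1, 0, 1]]
toy_accuracy = exact_match_accuracy(toy_gold, toy_guess)
print(toy_accuracy)
```

Under this definition a partially correct prediction earns no credit, so exact-match accuracy is typically stricter than label-level scores such as micro F1.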
Frequently Asked Questions
Q: What makes this model unique?
This model performs multi-label topic classification for tweets, using a RoBERTa model trained on social media content. Because it can assign several topics to one tweet, it suits content that does not fit a single category.
Q: What are the recommended use cases?
The model is suited to social media analytics, content categorization, trend analysis, and automated content tagging. It is particularly useful for applications that need to identify several topics at once in short-form social media content.