tweet-topic-21-multi
| Property | Value |
|---|---|
| License | MIT |
| Paper | TweetTopic (COLING 2022) |
| Training Data | 124M tweets (2018-2021) |
| Framework | PyTorch/TensorFlow |
What is tweet-topic-21-multi?
tweet-topic-21-multi is a multi-label topic classification model for tweets. Built by CardiffNLP, it starts from a TimeLMs language model pretrained on 124 million tweets posted between January 2018 and December 2021. The model classifies tweets into 19 topic categories, ranging from arts & culture to youth & student life.
Implementation Details
The model is available through the Hugging Face Transformers library and can be used from both PyTorch and TensorFlow. It uses a RoBERTa-based architecture fine-tuned on 11,267 annotated tweets. Classification is multi-label: each topic is scored on its own, so a single tweet can be assigned several topics at once.
- Built on a TimeLMs language model (RoBERTa-based) pretrained on Twitter data
- Supports 19 distinct topic categories
- Uses threshold-based prediction for multi-label classification: a topic is assigned when its score reaches 0.5
- Provides both PyTorch and TensorFlow compatibility
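The threshold-based prediction listed above can be sketched in plain Python. This is a minimal illustration, not the authors' code: it assumes the model emits one raw logit per topic, that a sigmoid turns each logit into an independent score, and that a topic is kept when its score reaches 0.5. The function names, the four label strings, and the logit values are hypothetical.

```python
import math

THRESHOLD = 0.5  # decision threshold stated in the list above


def sigmoid(x: float) -> float:
    """Map a raw logit to a score in (0, 1), avoiding overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def select_topics(logits, labels, threshold=THRESHOLD):
    """Return (label, score) pairs for every topic whose score reaches the threshold.

    Each topic is judged independently, so zero, one, or several labels
    can be returned for the same tweet.
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    scores = [sigmoid(value) for value in logits]
    return [(label, score) for label, score in zip(labels, scores) if score >= threshold]


# Hypothetical logits for one tweet over four of the 19 topics.
example_labels = ["sports", "music", "gaming", "news_&_social_concern"]
example_logits = [2.0, -1.5, 0.0, 0.3]

for label, score in select_topics(example_logits, example_labels):
    print(f"{label}: {score:.3f}")
```

Unlike a softmax, which forces the topics to compete for a single label, the per-topic sigmoid lets a tweet about a televised match be tagged with both a sports topic and a TV topic. Raising the threshold trades recall for precision.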
Core Capabilities
- Multi-label topic classification of tweets
- High-accuracy prediction across diverse categories
- Real-time classification support
- Handles complex, multi-themed content
- Supports batch processing and inference endpoints
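Batch processing could look like the sketch below, which feeds tweets to the model in fixed-size chunks. It rests on several assumptions not stated in this document: that the checkpoint is published on the Hugging Face Hub under the id `cardiffnlp/tweet-topic-21-multi`, that PyTorch and Transformers are installed, and that a batch size of 32 and a 512-token limit are reasonable defaults. `chunked` and `classify_batch` are hypothetical helper names.

```python
from typing import Iterator, List, Sequence

MODEL_NAME = "cardiffnlp/tweet-topic-21-multi"  # assumed Hugging Face Hub id


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def classify_batch(texts: Sequence[str], batch_size: int = 32,
                   threshold: float = 0.5) -> List[List[str]]:
    """Return the predicted topic labels for each input text."""
    # Imported lazily so `chunked` stays usable without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
    model.eval()  # inference mode: disables dropout
    id2label = model.config.id2label

    results: List[List[str]] = []
    for batch in chunked(texts, batch_size):
        # Pad to the longest tweet in the batch; max_length is an assumed cap.
        inputs = tokenizer(batch, padding=True, truncation=True,
                           max_length=512, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        scores = torch.sigmoid(logits)  # one independent score per topic
        for row in scores.tolist():
            results.append([id2label[i] for i, s in enumerate(row) if s >= threshold])
    return results
```

Calling `classify_batch(["first tweet", "second tweet"])` downloads the checkpoint on first use and returns one list of labels per tweet. A list may be empty when no topic reaches the threshold.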
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its pretraining on Twitter data from 2018 to 2021 and its ability to assign several topics to a single tweet. Its TimeLMs foundation and pretraining corpus of 124M tweets make it well suited to analysing modern social media content.
Q: What are the recommended use cases?
The model is suited to social media monitoring, content categorization, trend analysis, and research that needs topic classification of tweets. It is particularly useful when a single post may belong to several topics at once.