bert-base-turkish-sentiment-cased
| Property | Value |
|---|---|
| Parameter Count | 111M |
| Model Type | BERT-based Sentiment Analysis |
| Paper | arXiv:2401.17396 |
| Accuracy | 95.4% |
| Training Data Size | 48,290 samples |
What is bert-base-turkish-sentiment-cased?
This is a Turkish sentiment analysis model based on BERTurk, fine-tuned for binary sentiment classification. It was trained on a merged dataset of 48,290 samples drawn from movie reviews, product reviews, and tweets, and reaches 95.4% accuracy in evaluation.
Implementation Details
The model is fine-tuned from the dbmdz/bert-base-turkish-cased checkpoint using the Transformers library. Input text is tokenized with a cased tokenizer, and the model outputs a binary sentiment label (positive/negative) together with a confidence score.
- Built on BERTurk architecture with 111M parameters
- Trained on a merged dataset from multiple Turkish sentiment sources
- Supports F32 tensor operations
- Supports inference on both CPU and GPU
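The confidence score mentioned above is typically the softmax probability of the predicted class. The sketch below illustrates that step in plain Python for a two-class output. The label order in `ID2LABEL` is an assumption (the actual mapping is stored in the checkpoint's configuration), and the logits are made-up values rather than real model output.

```python
import math

# Assumed label order for illustration only; check the checkpoint's
# id2label configuration for the real mapping.
ID2LABEL = {0: "negative", 1: "positive"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Pick the most probable class and report its probability as the score."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "score": probs[best]}


# Hypothetical logits for a single input sentence.
example = decode([-2.0, 2.0])
```

A score near 0.5 means the two classes were almost equally likely, so downstream code may want to treat such predictions as uncertain.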
Core Capabilities
- Binary sentiment classification (positive/negative)
- Processing of Turkish text with cased tokenization
- Confidence score output for predictions
- Batch processing support
- Integration with Hugging Face Transformers pipeline
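The pipeline integration and batch processing listed above could be used roughly as follows. This is a minimal sketch, not the authors' reference code: the repository identifier in `MODEL_ID` is an assumption (this page does not state the Hub namespace), `build_classifier` and `classify_batch` are hypothetical helper names, and the label strings returned by the pipeline depend on the checkpoint's configuration (they may be generic names such as `LABEL_0`/`LABEL_1`).

```python
# Assumed Hugging Face Hub identifier; verify it before use.
MODEL_ID = "savasy/bert-base-turkish-sentiment-cased"


def build_classifier(model_id=MODEL_ID, device=-1):
    """Create a sentiment pipeline. device=-1 selects CPU, 0 the first GPU."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    return pipeline(
        "sentiment-analysis",
        model=model_id,
        tokenizer=model_id,
        device=device,
    )


def classify_batch(classifier, texts, batch_size=16):
    """Run a list of texts through the classifier and pair inputs with outputs."""
    texts = list(texts)
    results = classifier(texts, batch_size=batch_size)
    return [(text, r["label"], r["score"]) for text, r in zip(texts, results)]
```

Usage would look like `classify_batch(build_classifier(), ["Bu film harikaydı.", "Ürün çok kötü."])`, which downloads the checkpoint on first call. Passing the classifier in as an argument keeps the batching helper easy to test with a stand-in callable.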
Frequently Asked Questions
Q: What makes this model unique?
This model is fine-tuned specifically for Turkish sentiment analysis on a mix of movie reviews, product reviews, and social media content (tweets). Its 95.4% accuracy and Turkish-specific base model make it a good fit for Turkish text analysis tasks.
Q: What are the recommended use cases?
The model is suited to analyzing customer feedback, social media posts, product reviews, and movie reviews in Turkish. It fits applications that need a binary positive/negative label accompanied by a confidence score.
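For feedback or social media monitoring, predictions are usually aggregated rather than read one by one. The sketch below shows one possible way to do that, counting only predictions above a confidence threshold. The `summarize` helper, the 0.9 threshold, the `positive`/`negative` label strings, and the sample predictions are all illustrative assumptions, not part of the model.

```python
from collections import Counter


def summarize(predictions, min_score=0.9):
    """Count confident positive/negative predictions; the rest are 'uncertain'.

    `predictions` is a list of dicts with 'label' and 'score' keys, the shape
    returned by a Transformers text-classification pipeline.
    """
    confident = [p for p in predictions if p["score"] >= min_score]
    counts = Counter(p["label"] for p in confident)
    return {
        "positive": counts.get("positive", 0),
        "negative": counts.get("negative", 0),
        "uncertain": len(predictions) - len(confident),
    }


# Made-up predictions standing in for real pipeline output.
sample = [
    {"label": "positive", "score": 0.98},
    {"label": "negative", "score": 0.95},
    {"label": "positive", "score": 0.61},
]
summary = summarize(sample)
```

The threshold trades coverage for reliability: raising it moves more items into the uncertain bucket, which can then be routed to manual review.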