heBERT Sentiment Analysis
| Property | Value |
|---|---|
| Author | avichr |
| Paper | arXiv:2102.01909 |
| Architecture | BERT-Base |
| Task | Hebrew Sentiment Analysis |
What is heBERT_sentiment_analysis?
heBERT is a Hebrew language model based on the BERT architecture and fine-tuned for sentiment analysis. It was pre-trained on a large Hebrew corpus: 9.8GB of text from OSCAR, 650MB from Hebrew Wikipedia, and 150MB of user-generated content.
Implementation Details
The model performs strongly on sentiment analysis, reaching 97% accuracy in polarity classification. It categorizes text into three sentiment classes: neutral (labeled "natural" in the published checkpoint), positive, and negative, with particularly strong results on negative (F1: 0.98) and positive (F1: 0.94) sentiment.
- Built on BERT-Base architecture
- Trained on over 1 billion words and 20.8 million sentences
- Incorporates crowd-annotated emotion data from news site comments
- Supports both masked language modeling and sentiment classification (see the sketch after this list)
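As a minimal sketch of the masked-language-modeling side, assuming the base checkpoint is published on the Hugging Face Hub as avichr/heBERT (the sentiment checkpoint is avichr/heBERT_sentiment_analysis):

```python
from transformers import pipeline

# Fill-mask pipeline over the base heBERT checkpoint.
# Model ID avichr/heBERT is assumed from the Hugging Face Hub; adjust if it differs.
fill_mask = pipeline("fill-mask", model="avichr/heBERT", tokenizer="avichr/heBERT")

# [MASK] is BERT's mask token; the Hebrew sentence means "I love [MASK]".
for prediction in fill_mask("אני אוהב [MASK]"):
    print(prediction["token_str"], round(prediction["score"], 3))
```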
Core Capabilities
- High-accuracy sentiment classification for Hebrew text
- Pretrained masked language modeling functionality
- AWS deployment support
- Easy integration with the Hugging Face Transformers pipeline (sketched below)
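A hedged sketch of sentiment classification through the Transformers pipeline; the model ID is assumed from the Hugging Face Hub, and the exact label strings may vary across checkpoint versions:

```python
from transformers import pipeline

# Text-classification pipeline over the sentiment checkpoint.
# Model ID assumed from the Hugging Face Hub: avichr/heBERT_sentiment_analysis.
sentiment = pipeline(
    "text-classification",
    model="avichr/heBERT_sentiment_analysis",
    tokenizer="avichr/heBERT_sentiment_analysis",
    top_k=None,  # return scores for all sentiment classes, not just the top one
)

# The Hebrew sentence means "The service was excellent".
results = sentiment("השירות היה מצוין")
print(results)  # one {'label': ..., 'score': ...} entry per sentiment class
```

With `top_k=None`, the pipeline returns a score for each of the three classes rather than only the highest-scoring label, which is useful when downstream code needs the full probability distribution.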
Frequently Asked Questions
Q: What makes this model unique?
This model is specifically designed for Hebrew language processing, trained on an extensive and diverse Hebrew corpus, making it particularly effective for Hebrew sentiment analysis tasks. Its high accuracy scores and ability to handle both masked language modeling and sentiment classification make it a versatile tool for Hebrew NLP applications.
Q: What are the recommended use cases?
The model is ideal for sentiment analysis of Hebrew text, particularly in applications requiring polarity detection (positive, negative, or neutral sentiment). It's well-suited for analyzing user comments, social media content, and customer feedback in Hebrew.