heBERT Sentiment Analysis
| Property | Value |
|---|---|
| Author | avichr |
| Paper | arXiv:2102.01909 |
| Architecture | BERT-Base |
| Task | Hebrew Sentiment Analysis |
What is heBERT_sentiment_analysis?
heBERT is a Hebrew language model based on the BERT architecture and fine-tuned for sentiment analysis. It was pre-trained on a large Hebrew corpus: 9.8GB of text from OSCAR, 650MB from Hebrew Wikipedia, and 150MB of user-generated content.
Implementation Details
The model performs strongly on sentiment analysis, reaching 97% accuracy in polarity classification. It categorizes text into three sentiment classes: neutral (labeled "natural" in the published checkpoint), positive, and negative, with particularly strong results on negative (F1: 0.98) and positive (F1: 0.94) sentiment.
- Built on BERT-Base architecture
- Trained on over 1 billion words and 20.8 million sentences
- Incorporates crowd-annotated emotion data from news site comments
- Supports both masked language modeling and sentiment classification (see the sketch after this list)
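As a minimal sketch of the masked-language-modeling side, assuming the base checkpoint is published on the Hugging Face Hub as avichr/heBERT (the sentiment checkpoint is avichr/heBERT_sentiment_analysis):

```python
from transformers import pipeline

# Fill-mask pipeline over the base heBERT checkpoint.
# Model ID avichr/heBERT is assumed from the Hugging Face Hub; adjust if it differs.
fill_mask = pipeline("fill-mask", model="avichr/heBERT", tokenizer="avichr/heBERT")

# [MASK] is BERT's mask token; the Hebrew sentence means "I love [MASK]".
for prediction in fill_mask("אני אוהב [MASK]"):
    print(prediction["token_str"], round(prediction["score"], 3))
```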
Core Capabilities
- High-accuracy sentiment classification for Hebrew text
- Pretrained masked language modeling functionality
- AWS deployment support
- Easy integration with the Hugging Face Transformers pipeline (sketched below)
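A hedged sketch of sentiment classification through the Transformers pipeline; the model ID is assumed from the Hugging Face Hub, and the exact label strings may vary across checkpoint versions:

```python
from transformers import pipeline

# Text-classification pipeline over the sentiment checkpoint.
# Model ID assumed from the Hugging Face Hub: avichr/heBERT_sentiment_analysis.
sentiment = pipeline(
    "text-classification",
    model="avichr/heBERT_sentiment_analysis",
    tokenizer="avichr/heBERT_sentiment_analysis",
    top_k=None,  # return scores for all sentiment classes, not just the top one
)

# The Hebrew sentence means "The service was excellent".
results = sentiment("השירות היה מצוין")
print(results)  # one {'label': ..., 'score': ...} entry per sentiment class
```

With `top_k=None`, the pipeline returns a score for each of the three classes rather than only the highest-scoring label, which is useful when downstream code needs the full probability distribution.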
Frequently Asked Questions
Q: What makes this model unique?
This model is specifically designed for Hebrew language processing, trained on an extensive and diverse Hebrew corpus, making it particularly effective for Hebrew sentiment analysis tasks. Its high accuracy scores and ability to handle both masked language modeling and sentiment classification make it a versatile tool for Hebrew NLP applications.
Q: What are the recommended use cases?
The model is ideal for sentiment analysis of Hebrew text, particularly in applications requiring polarity detection (positive, negative, or neutral sentiment). It's well-suited for analyzing user comments, social media content, and customer feedback in Hebrew.