bert-base-swedish-cased-ner
| Property | Value |
|---|---|
| Developer | KB (National Library of Sweden) |
| Framework | PyTorch |
| Language | Swedish |
| Downloads | 83,043 |
What is bert-base-swedish-cased-ner?
bert-base-swedish-cased-ner is a BERT model developed by the National Library of Sweden (KB) for Named Entity Recognition (NER) in Swedish text. It is an experimental version of KB's Swedish BERT, fine-tuned on the SUC 3.0 dataset to identify and classify named entities in Swedish content.
Implementation Details
The model is built upon the base Swedish BERT architecture, which was pretrained on approximately 15-20 GB of text (200M sentences, 3,000M tokens) from diverse sources including books, news, government publications, Swedish Wikipedia, and internet forums. The model is case-sensitive and was pretrained with whole word masking.
- Pretrained on comprehensive Swedish text corpus
- Case-sensitive implementation
- Specialized for NER tasks
- Supports identification of multiple entity types (TME, PRS, LOC, EVN, ORG)
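A minimal usage sketch with the Hugging Face transformers library, assuming the model is published on the Hub under the id `KB/bert-base-swedish-cased-ner` (KB's organisation naming); the example sentence is illustrative:

```python
# Minimal sketch: running Swedish NER through the transformers "ner" pipeline.
# Assumption: the model id below matches the Hub repository for this model.
MODEL_ID = "KB/bert-base-swedish-cased-ner"

def build_ner_pipeline():
    # Imported lazily so the module stays importable without transformers installed.
    from transformers import pipeline
    return pipeline("ner", model=MODEL_ID, tokenizer=MODEL_ID)

if __name__ == "__main__":
    ner = build_ner_pipeline()
    # Each result dict carries the token, its entity tag (TME/PRS/LOC/EVN/ORG),
    # and a confidence score.
    for ent in ner("KB ligger i Stockholm."):
        print(ent["word"], ent["entity"], round(ent["score"], 3))
```

Note that the pipeline operates on WordPiece tokens, so a single entity may come back split across several subword pieces.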
Core Capabilities
- Time expression recognition (TME)
- Personal name detection (PRS)
- Location identification (LOC)
- Event recognition (EVN)
- Organization name detection (ORG)
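Because BERT tokenizes into WordPieces, entity words can come back from the pipeline split into subword fragments marked with a leading `##`. The sketch below shows one way to merge those fragments and map the five tag names above to readable labels; the sample tokens are illustrative stand-ins for real pipeline output:

```python
# Sketch: post-processing pipeline-style NER output from the Swedish model.
# The tag names follow the entity types listed above; the sample data below
# is hypothetical, mimicking the dict format the transformers NER pipeline emits.

LABELS = {
    "TME": "time expression",
    "PRS": "person",
    "LOC": "location",
    "EVN": "event",
    "ORG": "organisation",
}

def merge_tokens(tokens):
    """Merge WordPiece continuations ("##...") back into whole entity words."""
    merged = []
    for tok in tokens:
        if tok["word"].startswith("##") and merged:
            merged[-1]["word"] += tok["word"][2:]
        else:
            merged.append(dict(tok))
    return merged

# Hypothetical pipeline output: "Stockholm" split into two WordPieces.
sample = [
    {"word": "Kalle", "entity": "PRS"},
    {"word": "Stock", "entity": "LOC"},
    {"word": "##holm", "entity": "LOC"},
]

for ent in merge_tokens(sample):
    print(ent["word"], "->", LABELS[ent["entity"]])
# Kalle -> person
# Stockholm -> location
```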
Frequently Asked Questions
Q: What makes this model unique?
This model is specifically optimized for Swedish-language NER, having been fine-tuned on annotated Swedish data on top of a BERT pretrained on a large Swedish corpus. Because it is case-sensitive and was pretrained with whole word masking, it is well suited to named entity detection, where capitalisation is a strong cue.
Q: What are the recommended use cases?
The model is ideal for applications requiring named entity extraction from Swedish text, such as information extraction systems, content analysis tools, and automated document processing. It's particularly useful for identifying organizations, locations, personal names, time expressions, and events in Swedish text.