bert-base-swedish-cased-ner
| Property | Value |
|---|---|
| Developer | KB (National Library of Sweden) |
| Framework | PyTorch |
| Language | Swedish |
| Downloads | 83,043 |
What is bert-base-swedish-cased-ner?
bert-base-swedish-cased-ner is a BERT model developed by the National Library of Sweden (KB) for Named Entity Recognition (NER) in Swedish text. It is an experimental version of KB's Swedish BERT, fine-tuned on the SUC 3.0 dataset to identify and classify named entities in Swedish content.
Implementation Details
The model is built upon the base Swedish BERT architecture, which was pretrained on approximately 15-20 GB of text (200M sentences, 3,000M tokens) from diverse sources including books, news, government publications, Swedish Wikipedia, and internet forums. The model is case-sensitive and was pretrained with whole word masking.
- Pretrained on comprehensive Swedish text corpus
- Case-sensitive implementation
- Specialized for NER tasks
- Supports identification of multiple entity types (TME, PRS, LOC, EVN, ORG)
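A minimal usage sketch with the Hugging Face transformers library, assuming the model is published on the Hub under the id `KB/bert-base-swedish-cased-ner` (KB's organisation naming); the example sentence is illustrative:

```python
# Minimal sketch: running Swedish NER through the transformers "ner" pipeline.
# Assumption: the model id below matches the Hub repository for this model.
MODEL_ID = "KB/bert-base-swedish-cased-ner"

def build_ner_pipeline():
    # Imported lazily so the module stays importable without transformers installed.
    from transformers import pipeline
    return pipeline("ner", model=MODEL_ID, tokenizer=MODEL_ID)

if __name__ == "__main__":
    ner = build_ner_pipeline()
    # Each result dict carries the token, its entity tag (TME/PRS/LOC/EVN/ORG),
    # and a confidence score.
    for ent in ner("KB ligger i Stockholm."):
        print(ent["word"], ent["entity"], round(ent["score"], 3))
```

Note that the pipeline operates on WordPiece tokens, so a single entity may come back split across several subword pieces.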
Core Capabilities
- Time expression recognition (TME)
- Personal name detection (PRS)
- Location identification (LOC)
- Event recognition (EVN)
- Organization name detection (ORG)
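Because BERT tokenizes into WordPieces, entity words can come back from the pipeline split into subword fragments marked with a leading `##`. The sketch below shows one way to merge those fragments and map the five tag names above to readable labels; the sample tokens are illustrative stand-ins for real pipeline output:

```python
# Sketch: post-processing pipeline-style NER output from the Swedish model.
# The tag names follow the entity types listed above; the sample data below
# is hypothetical, mimicking the dict format the transformers NER pipeline emits.

LABELS = {
    "TME": "time expression",
    "PRS": "person",
    "LOC": "location",
    "EVN": "event",
    "ORG": "organisation",
}

def merge_tokens(tokens):
    """Merge WordPiece continuations ("##...") back into whole entity words."""
    merged = []
    for tok in tokens:
        if tok["word"].startswith("##") and merged:
            merged[-1]["word"] += tok["word"][2:]
        else:
            merged.append(dict(tok))
    return merged

# Hypothetical pipeline output: "Stockholm" split into two WordPieces.
sample = [
    {"word": "Kalle", "entity": "PRS"},
    {"word": "Stock", "entity": "LOC"},
    {"word": "##holm", "entity": "LOC"},
]

for ent in merge_tokens(sample):
    print(ent["word"], "->", LABELS[ent["entity"]])
# Kalle -> person
# Stockholm -> location
```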
Frequently Asked Questions
Q: What makes this model unique?
This model is specifically optimized for Swedish-language NER, having been fine-tuned on annotated Swedish data on top of a BERT pretrained on a large Swedish corpus. Because it is case-sensitive and was pretrained with whole word masking, it is well suited to named entity detection, where capitalisation is a strong cue.
Q: What are the recommended use cases?
The model is ideal for applications requiring named entity extraction from Swedish text, such as information extraction systems, content analysis tools, and automated document processing. It's particularly useful for identifying organizations, locations, personal names, time expressions, and events in Swedish text.