MedNER-CR-JA
| Property | Value |
|---|---|
| Parameter Count | 110M |
| License | CC-BY-4.0 |
| Language | Japanese |
| Framework | PyTorch with Transformers |
| Training Data | MedTxt-CR-JA-training-v2.xml |
What is MedNER-CR-JA?
MedNER-CR-JA is a Named Entity Recognition (NER) model for Japanese medical documents. Developed by sociocom, it identifies and classifies medical entities in clinical text, including diseases, medications, and temporal expressions.
Implementation Details
Built on the BERT architecture and implemented with PyTorch and Transformers, the model treats entity recognition in Japanese medical text as a token classification task. Inference depends on several supporting files (id_to_tags.pkl, key_attr.pkl, NER_medNLP.py, predict.py); with these in place, the model can be run from a short Python script.
- Has 110M parameters
- Supports SafeTensors format for efficient model loading
- Includes specialized tagging for medical entities (d, m-key, timex3)
- Evaluated on NTCIR-16 Real-MedNLP subtask 1
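To illustrate the token classification approach described above, the following is a minimal sketch of how per-token tags can be grouped into entity spans. It assumes a BIO labeling scheme (`B-`/`I-`/`O` prefixes on the d, m-key, and timex3 tags); that scheme, the function name `bio_to_spans`, and the example tokens are assumptions for illustration and are not taken from the model's own predict.py.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) entity spans.

    Assumes tags look like "B-d", "I-d", or "O" (a hypothetical scheme).
    An "I-" tag that does not continue the current entity is treated as "O".
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the entity collected so far, if any.
        if current_label is not None:
            # Japanese text is joined without spaces between tokens.
            spans.append((current_label, "".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:
            flush()
            current_label = None
            current_tokens = []
    flush()
    return spans


# Hypothetical tokenization and tags for a short clinical sentence.
tokens = ["昨日", "から", "頭痛", "が", "あり", "ロキソ", "ニン", "を", "内服"]
tags = ["B-timex3", "O", "B-d", "O", "O", "B-m-key", "I-m-key", "O", "O"]
spans = bio_to_spans(tokens, tags)
print(spans)
```

In practice the tag sequence would come from the model's per-token predictions, mapped to tag names through a lookup such as the one stored in id_to_tags.pkl.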
Core Capabilities
- Disease and condition identification with certainty attribution
- Medication mention detection with state attributes
- Temporal expression recognition and classification
- Structured output format with XML-style annotations
- Batch processing of medical documents
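The XML-style output listed above can be pictured with a short sketch that wraps recognized character spans in inline tags. The tag names (d, m-key, timex3) and the certainty and state attributes come from this description; the `type` attribute on timex3, all attribute values, and the helper name `annotate` are assumptions for illustration, and the model's actual output format may differ.

```python
from typing import Dict, List, Tuple
from xml.sax.saxutils import escape, quoteattr

# (start, end, tag, attributes); start/end are character offsets, end exclusive.
Entity = Tuple[int, int, str, Dict[str, str]]


def annotate(text: str, entities: List[Entity]) -> str:
    """Wrap each entity span of `text` in an XML-style inline tag."""
    out: List[str] = []
    cursor = 0
    for start, end, tag, attrs in sorted(entities, key=lambda e: e[0]):
        if start < cursor:
            raise ValueError("overlapping entities are not supported in this sketch")
        # Copy the untagged text before this entity, escaping &, <, >.
        out.append(escape(text[cursor:start]))
        attr_str = "".join(f" {k}={quoteattr(v)}" for k, v in attrs.items())
        out.append(f"<{tag}{attr_str}>{escape(text[start:end])}</{tag}>")
        cursor = end
    out.append(escape(text[cursor:]))
    return "".join(out)


text = "昨日から頭痛があり、ロキソニンを内服した。"
entities: List[Entity] = [
    (0, 2, "timex3", {"type": "date"}),        # attribute name and value assumed
    (4, 6, "d", {"certainty": "positive"}),    # attribute value assumed
    (10, 15, "m-key", {"state": "executed"}),  # attribute value assumed
]
print(annotate(text, entities))
```

For batch processing, the same function can be mapped over a list of documents, for example `[annotate(t, e) for t, e in zip(texts, entity_lists)]`.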
Frequently Asked Questions
Q: What makes this model unique?
The model targets Japanese medical text specifically and goes beyond entity boundaries: diseases carry a certainty attribute and medications carry a state attribute. Its structured XML-style output is suited to clinical document processing systems.
Q: What are the recommended use cases?
The model is suited to Japanese clinical records, medical research documents, and other healthcare documentation. Typical applications are automated information extraction, clinical research support, and document analysis systems that need structured entity recognition.