mrebel-large
Property | Value |
---|---|
Parameter Count | 611M |
License | CC BY-NC-SA 4.0 |
Paper | RED^{FM}: a Filtered and Multilingual Relation Extraction Dataset |
Supported Languages | 18 languages, including Arabic, Chinese, English, and French |
What is mrebel-large?
mrebel-large is a multilingual relation extraction model that extends the original REBEL approach beyond a single language. Developed by Babelscape, it performs relation extraction in 18 languages by treating the task as sequence-to-sequence (seq2seq) generation: the model reads unstructured text and generates the relationships it expresses as structured output.
Implementation Details
Built on the mBART architecture, mrebel-large is a transformer-based sequence-to-sequence model with 611M parameters. It is distributed as PyTorch weights in the Safetensors format. The model accepts text in any of its supported languages and generates triplets, each consisting of a subject, a relation, and an object.
- Transformer-based seq2seq (encoder-decoder) architecture
- Weights stored as F32 (32-bit floating point) tensors
- Special tokens delimit the parts of each extracted relation
- Output is generated with beam search, with configurable generation parameters
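As a minimal sketch of the beam-search generation described above, the following uses the Hugging Face `transformers` library. The helper names `build_generation_kwargs` and `generate_linearized` are hypothetical, the default values are illustrative, and the language codes (`en_XX` for the source, `tp_XX` as the triplet target) are assumptions based on mBART-style conventions that should be checked against the official model card.

```python
def build_generation_kwargs(num_beams=3, max_length=256, num_return_sequences=1):
    """Collect beam-search settings for `generate` (values are illustrative)."""
    if num_beams < 1:
        raise ValueError("num_beams must be at least 1")
    if num_return_sequences > num_beams:
        # Beam search cannot return more sequences than it keeps beams.
        raise ValueError("num_return_sequences cannot exceed num_beams")
    return {
        "num_beams": num_beams,
        "max_length": max_length,
        "num_return_sequences": num_return_sequences,
    }


def generate_linearized(text, model_name="Babelscape/mrebel-large",
                        src_lang="en_XX", **overrides):
    """Return the raw decoded output strings for `text` (hypothetical helper)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Assumption: mBART-style language codes, with "tp_XX" as the
    # target "language" for triplet output.
    tokenizer = AutoTokenizer.from_pretrained(
        model_name, src_lang=src_lang, tgt_lang="tp_XX"
    )
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        decoder_start_token_id=tokenizer.convert_tokens_to_ids("tp_XX"),
        **build_generation_kwargs(**overrides),
    )
    # Keep special tokens: they carry the structure of the extracted relations.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=False)
```

Calling `generate_linearized("The Red Hot Chili Peppers were formed in Los Angeles.", num_beams=4)` downloads the model on first use and returns the decoded strings with their special tokens intact, ready for a separate parsing step.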
Core Capabilities
- Multilingual relation extraction across 18 languages
- Automatic triplet extraction from natural text
- Entity type classification
- Cross-lingual relation mapping
- Structured output generation in subject-relation-object format
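To illustrate how generated text can be turned into subject-relation-object records with entity types, here is a small parsing sketch. It assumes a linearized format of the form `<triplet> head <head_type> tail <tail_type> relation`; that format, the list of special tokens stripped, and the function name `parse_triplets` are assumptions made for illustration, and the model's actual output may differ (for example, by grouping several tails under one head).

```python
import re

# Assumption: wrapper tokens that may surround the decoded text.
SPECIAL = re.compile(r"</?s>|<pad>|tp_XX|__[a-z]{2}__")

# Assumed layout: <triplet> head <head_type> tail <tail_type> relation
TRIPLET = re.compile(
    r"<triplet>\s*(?P<head>[^<]+?)\s*<(?P<head_type>[^<>]+)>"
    r"\s*(?P<tail>[^<]+?)\s*<(?P<tail_type>[^<>]+)>"
    r"\s*(?P<relation>[^<]+?)\s*(?=<triplet>|$)"
)


def parse_triplets(decoded):
    """Parse one decoded string into a list of typed triplet dicts."""
    cleaned = SPECIAL.sub(" ", decoded).strip()
    return [match.groupdict() for match in TRIPLET.finditer(cleaned)]


# Hand-written example string, not real model output.
sample = ("<s>tp_XX<triplet> Red Hot Chili Peppers <org> Los Angeles <loc> "
          "location of formation</s>")
for triplet in parse_triplets(sample):
    print(triplet)
```

Each returned dictionary has the keys `head`, `head_type`, `tail`, `tail_type`, and `relation`, which map directly onto rows of a knowledge base table.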
Frequently Asked Questions
Q: What makes this model unique?
mrebel-large performs relation extraction across 18 languages with a single model. It frames relation extraction as a seq2seq task, generating triplets as text, which makes it more flexible and adaptable than traditional classification-based approaches.
Q: What are the recommended use cases?
The model is suited to multilingual information extraction, knowledge base construction, automated relationship mapping in documents, and cross-lingual information retrieval. It is particularly useful for organizations that need to extract structured relationships from multilingual content.