OneKE

Maintained by: zjunlp

Property    Value
License     CC-BY-NC-SA-4.0
Languages   English, Chinese
Paper       IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus
Downloads   1,888

What is OneKE?

OneKE is a bilingual (Chinese and English) large language model framework for knowledge extraction, developed jointly by Ant Group and Zhejiang University. It is built on Chinese-Alpaca-2-13B and performs generalized knowledge extraction in both languages across multiple domains and tasks.

Implementation Details

The model takes a schema-generalizable approach to information extraction: the types to extract are described by a schema rather than fixed in advance. Its key techniques are normalization and cleaning of extraction instructions, collection of hard negative samples, and schema-based batched instruction construction. Running it requires at least 20GB of VRAM.

  • Supports Named Entity Recognition (NER), Relation Extraction (RE), and Event Extraction (EE)
  • Handles all three tasks in one schema-driven extraction framework
  • Supports 4-bit quantization for deployment on GPUs with less memory
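The 4-bit deployment mentioned above could look roughly like the sketch below. It is not taken from this page: it assumes the weights are published on the Hugging Face Hub under the id zjunlp/OneKE, that the transformers, torch, bitsandbytes, and accelerate packages are installed, and that NF4 with double quantization is an acceptable setting. The names QUANT_SETTINGS and load_quantized are hypothetical.

```python
# Assumed 4-bit settings for Hugging Face transformers + bitsandbytes.
# These are common defaults, not values confirmed by this page.
QUANT_SETTINGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
}


def load_quantized(model_id="zjunlp/OneKE"):
    """Load tokenizer and model in 4-bit. The model id is an assumption."""
    # Imported here so the settings above can be inspected without a GPU stack.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16,
        **QUANT_SETTINGS,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # needs the accelerate package
    )
    model.eval()
    return tokenizer, model
```

Calling load_quantized() downloads the weights, so it is best run once and the returned objects reused.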

Core Capabilities

  • Bilingual processing in Chinese and English
  • Zero-shot generalization across multiple domains
  • Structured knowledge extraction with customizable schemas
  • Support for complex event and relation extraction tasks
  • Batch processing of multiple schemas
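Schema customization and batching can be illustrated with a minimal sketch. It assumes the prompt is a JSON string with "instruction", "schema", and "input" fields and that the model answers with a JSON object mapping each schema label to a list of results. The field names, instruction texts, batch sizes, and helper names below are assumptions for illustration, not details confirmed by this page.

```python
import json

# Hypothetical batch sizes: how many schema labels go into one prompt.
BATCH_SIZE = {"NER": 6, "RE": 4, "EE": 4}

# Hypothetical instruction texts, one per task.
INSTRUCTION = {
    "NER": "Extract entities that match the schema. Answer as JSON.",
    "RE": "Extract relation triples that match the schema. Answer as JSON.",
    "EE": "Extract events that match the schema. Answer as JSON.",
}


def build_instructions(task, schema, text):
    """Split a long schema into batches and build one prompt per batch."""
    size = BATCH_SIZE[task]
    batches = [schema[i:i + size] for i in range(0, len(schema), size)]
    return [
        json.dumps(
            {"instruction": INSTRUCTION[task], "schema": batch, "input": text},
            ensure_ascii=False,  # keep Chinese text readable in the prompt
        )
        for batch in batches
    ]


def merge_outputs(raw_outputs):
    """Merge per-batch JSON answers, skipping any that fail to parse."""
    merged = {}
    for raw in raw_outputs:
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue
        if not isinstance(parsed, dict):
            continue
        for label, items in parsed.items():
            if isinstance(items, list):
                merged.setdefault(label, []).extend(items)
    return merged
```

Splitting keeps each prompt short when a schema has many labels; the cost is one model call per batch, with the answers merged afterwards.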

Frequently Asked Questions

Q: What makes this model unique?

OneKE performs schema-generalizable information extraction across two languages and multiple domains, and it holds up in zero-shot scenarios where the schema was not seen during training. Because one framework covers entities, relations, and events, it reduces the cost of building domain-specific knowledge graphs.

Q: What are the recommended use cases?

The model is ideal for converting unstructured documents into structured knowledge, particularly in domains like medical information extraction, financial report analysis, and public sector document processing. It's especially useful for building knowledge graphs and enhancing other large language models by providing structured information.
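As a rough illustration of the knowledge-graph use case, the sketch below turns relation extraction output into (subject, relation, object) triples. It assumes the model returns a JSON object that maps each relation name to a list of {"subject", "object"} pairs; that output shape and the function name relations_to_triples are assumptions, not details stated on this page.

```python
import json


def relations_to_triples(raw_output):
    """Convert assumed relation-extraction JSON into graph triples."""
    parsed = json.loads(raw_output)
    triples = []
    for relation, pairs in parsed.items():
        for pair in pairs:
            triples.append((pair["subject"], relation, pair["object"]))
    return triples


# Made-up model output, for illustration only.
sample = '{"founded_by": [{"subject": "Acme Corp", "object": "Jane Doe"}], "located_in": []}'
triples = relations_to_triples(sample)
```

Each triple can then be written as an edge in whatever graph store the project uses.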
