NuExtract-tiny

Property	Value
Parameter Count	464M
Base Model	Qwen1.5-0.5B
License	MIT
Language	English
Tensor Type	F32

What is NuExtract-tiny?

NuExtract-tiny is a specialized information extraction model. It is a version of Qwen1.5-0.5B fine-tuned on a private, high-quality synthetic dataset to extract structured information from text according to a JSON template.

Implementation Details

The model is used through the transformers library. It is designed to process inputs of up to 2000 tokens and is purely extractive: every piece of text in the output appears verbatim in the input text.

Built on Qwen1.5-0.5B architecture
Supports JSON template-based extraction
Zero-shot capability with best results after fine-tuning (≥30 examples)
Purely extractive functionality
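
Because the output is meant to be purely extractive, a caller can check it mechanically. The following is a minimal sketch, not part of the model or of the transformers library: the helper names `iter_string_leaves` and `non_extractive_values` are hypothetical, and the sample text and outputs are made up for illustration. It assumes the model output has already been isolated as a JSON string.

```python
import json

def iter_string_leaves(value):
    """Yield every string found anywhere inside a decoded JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for child in value.values():
            yield from iter_string_leaves(child)
    elif isinstance(value, list):
        for child in value:
            yield from iter_string_leaves(child)

def non_extractive_values(source_text, output_json):
    """Return the non-empty extracted strings that do not occur verbatim in source_text."""
    extracted = json.loads(output_json)
    return [s for s in iter_string_leaves(extracted) if s and s not in source_text]

# Illustrative input and outputs (hypothetical, not real model results).
text = "Mistral 7B was released by Mistral AI under the Apache 2.0 license."
good = '{"Model": {"Name": "Mistral 7B", "License": "Apache 2.0"}}'
bad = '{"Model": {"Name": "Mistral-7B", "License": ""}}'

print(non_extractive_values(text, good))  # []
print(non_extractive_values(text, bad))   # ['Mistral-7B']
```

Empty strings are skipped on the assumption that an unfilled template field means "not found" rather than a violation.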

Core Capabilities

Structured information extraction using JSON templates
Support for example-based formatting
Processing of texts up to 2000 tokens
Zero-shot and fine-tuning compatibility
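
Template-based extraction with optional examples can be pictured as assembling one prompt string. The sketch below is an assumption-laden illustration: the function name `build_prompt` is hypothetical, and the `<|input|>` / `<|output|>` markers and the `### Template:`, `### Example:` and `### Text:` headings are assumed here rather than stated in this document, so they should be checked against the published model card before use.

```python
import json

def build_prompt(template, text, examples=()):
    """Assemble a prompt from a JSON template, optional filled-in examples, and the input text."""
    # Assumed layout: markers and section headings are not confirmed by this document.
    parts = ["<|input|>", "### Template:", json.dumps(template, indent=4)]
    for example in examples:
        # Each example is a filled-in copy of the template showing the desired formatting.
        parts += ["### Example:", json.dumps(example, indent=4)]
    parts += ["### Text:", text, "<|output|>", ""]
    return "\n".join(parts)

# Empty strings mark the fields the model is expected to fill in.
template = {"Model": {"Name": "", "License": ""}}
prompt = build_prompt(template, "Mistral 7B was released under the Apache 2.0 license.")
print(prompt)
```

The resulting string would then be tokenized and passed to the model's generate call; keeping the text within the 2000-token input size is left to the caller.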

Frequently Asked Questions

Q: What makes this model unique?

NuExtract-tiny focuses exclusively on information extraction with JSON templates, which makes it particularly effective for structured data extraction tasks despite its relatively small size of 464M parameters.

Q: What are the recommended use cases?

The model is suited to scenarios that require structured information extraction from text, particularly when the target structure can be described by a JSON template. While it performs well in zero-shot settings, it is designed to be fine-tuned on a specific task with 30 or more examples for optimal performance.
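
As a rough illustration of preparing task-specific fine-tuning data, the sketch below serializes annotated examples as JSON Lines and refuses to proceed with fewer than 30 of them. The function name `to_finetune_jsonl`, the record fields (`template`, `text`, `target`) and the sample data are all hypothetical; the actual training format depends on the fine-tuning tooling used.

```python
import json

def to_finetune_jsonl(samples, template, min_examples=30):
    """Serialize (text, filled_template) pairs as JSONL, one training record per line."""
    if len(samples) < min_examples:
        raise ValueError(f"need at least {min_examples} examples, got {len(samples)}")
    lines = []
    for text, filled in samples:
        # Hypothetical record layout; adapt the field names to your training code.
        record = {"template": template, "text": text, "target": filled}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)

# Placeholder data: one made-up example repeated to reach the minimum count.
template = {"Name": "", "License": ""}
samples = [("Model X uses the MIT license.", {"Name": "Model X", "License": "MIT"})] * 30
jsonl = to_finetune_jsonl(samples, template)
print(len(jsonl.splitlines()))  # 30
```

In practice the 30 or more examples should be distinct and representative of the target task; the repetition here only keeps the sketch short.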
