HTML-Pruner-Llama-1B

Maintained By
zstanjj


Property         Value
Parameter Count  1.24B
License          Apache 2.0
Paper            HtmlRAG Paper
Base Model       meta-llama/Llama-3.2-1B

What is HTML-Pruner-Llama-1B?

HTML-Pruner-Llama-1B is a specialized language model for efficient HTML content pruning in Retrieval-Augmented Generation (RAG) systems. It is a key component of the HtmlRAG framework, which uses HTML rather than plain text as the format for external knowledge in RAG pipelines. The model implements a two-step, block-tree-based HTML pruning approach that trims retrieved documents to fit the context window while preserving their semantic structure.

Implementation Details

The model operates in a two-step pruning process: HTML blocks are first scored and filtered with an embedding model, then further pruned by this generative model, which scores block paths within the HTML tree. It is built on the Llama-3.2-1B architecture and runs in BF16 precision, making it both efficient and effective for production deployments.

  • Implements Lossless HTML Cleaning for maintaining semantic integrity
  • Features Block-Tree-Based HTML pruning for optimal content selection
  • Supports flexible context window management
  • Includes built-in tokenization and processing capabilities
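To make the "Lossless HTML Cleaning" idea concrete, here is a minimal, standalone sketch (not the HtmlRAG implementation) of the kind of cleaning the framework describes: stripping scripts, styles, and attributes while keeping tag structure and visible text. The class and function names are illustrative, and it uses only Python's standard-library `html.parser`.

```python
from html.parser import HTMLParser

class LosslessCleaner(HTMLParser):
    """Strips <script>/<style> content and all attributes while
    preserving tag structure and visible text."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # >0 while inside a skipped subtree

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
            return
        if self.skip_depth == 0:
            self.out.append(f"<{tag}>")  # drop attributes

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.out.append(data.strip())

def clean_html(html: str) -> str:
    parser = LosslessCleaner()
    parser.feed(html)
    return "".join(parser.out)

# Example: scripts and attributes are removed, structure survives
print(clean_html("<div class='x'><script>var a=1;</script><p>Hello</p></div>"))
# → <div><p>Hello</p></div>
```

A real pipeline would also normalize whitespace and handle self-closing tags more carefully; this sketch only shows the structure-preserving intent of the cleaning step.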

Core Capabilities

  • Efficient HTML document processing and cleaning
  • Intelligent content ranking and selection
  • Integration with various embedding models
  • Support for custom tokenizer implementations
  • Competitive performance across multiple benchmark datasets
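The content ranking and selection step can be pictured as a budgeted pruning problem: given relevance scores for HTML blocks (in HtmlRAG these come from the embedding and generative models), keep the highest-scoring blocks that fit the context budget and preserve document order. The sketch below is a simplified greedy stand-in, with hypothetical names (`Block`, `prune_blocks`) and a crude whitespace token count in place of a real tokenizer.

```python
from dataclasses import dataclass

@dataclass
class Block:
    html: str     # HTML fragment for this block
    score: float  # relevance score, e.g. from an embedding model

def rough_token_count(text: str) -> int:
    # Crude whitespace proxy for a real tokenizer
    return len(text.split())

def prune_blocks(blocks, budget):
    """Greedy budgeted pruning: keep the highest-scoring blocks
    that fit the token budget, then restore document order."""
    kept, used = [], 0
    for idx, block in sorted(enumerate(blocks), key=lambda x: -x[1].score):
        cost = rough_token_count(block.html)
        if used + cost <= budget:
            kept.append((idx, block))
            used += cost
    return [block.html for idx, block in sorted(kept)]

docs = [
    Block("<p>intro text here</p>", 0.2),
    Block("<p>answer span</p>", 0.9),
    Block("<p>footer junk</p>", 0.1),
]
print(prune_blocks(docs, budget=5))
# → ['<p>intro text here</p>', '<p>answer span</p>']
```

The actual framework prunes over a block tree rather than a flat list, so selecting a block also constrains its ancestors and descendants; the greedy flat version above only conveys the score-under-budget trade-off.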

Frequently Asked Questions

Q: What makes this model unique?

This model is specifically designed for HTML processing in RAG systems, offering a novel approach to content pruning while maintaining HTML structure integrity. It achieves competitive results across various datasets including ASQA, HotpotQA, and NQ.

Q: What are the recommended use cases?

The model is ideal for RAG systems that need to process HTML content efficiently, particularly in applications requiring intelligent content selection and summarization while maintaining HTML structure. It's especially useful in scenarios where context length is a constraint.
