inf-retriever-v1

Maintained by: infly

INF-Retriever-v1

Property              Value
Model Size            7B parameters
Embedding Dimension   3584
Max Input Tokens      32768
Languages             Chinese & English (effective in other languages)
Base Model            GTE-Qwen2-7B-instruct
Published             January 2025

What is inf-retriever-v1?

INF-Retriever-v1 is a dense retrieval model developed by INF TECH and built on GTE-Qwen2-7B-instruct. As of January 2025, it ranks #1 on both the AIR-Bench 24.04 and 24.05 benchmarks, showing strong performance on heterogeneous information retrieval tasks. The model is strongest on Chinese and English content and remains effective in other languages.

Implementation Details

The model has 7B parameters and produces 3584-dimensional embeddings. It accepts inputs of up to 32,768 tokens, which makes it suitable for long documents. It can be used through either the Sentence Transformers or the Hugging Face Transformers library.

  • Specialized instruction-tuning for retrieval tasks
  • State-of-the-art performance on multilingual benchmarks
  • Efficient dense embedding generation
  • Optimized for both accuracy and computational efficiency
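
The dense-embedding workflow described above can be sketched as follows: encode the query and the documents, then rank documents by cosine similarity. This is a minimal illustration, not the maintainers' reference code. The Hugging Face model ID `infly/inf-retriever-v1`, the helper names, and the toy 3-dimensional vectors are assumptions; the real model returns 3584-dimensional vectors and may expect a retrieval instruction on the query side, which this sketch does not cover.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def embed(texts, model_id="infly/inf-retriever-v1"):
    """Encode texts with Sentence Transformers.

    The model ID is an assumption, and calling this downloads a 7B model.
    """
    # Imported lazily so the ranking logic above runs without the dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id, trust_remote_code=True)
    return model.encode(texts, normalize_embeddings=True).tolist()


# Toy vectors stand in for real embeddings so the example runs without the model.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_documents(query, docs)
print([i for i, _ in ranking])  # [1, 2, 0]
```

With real embeddings, `embed(documents)` and `embed([query_text])[0]` would replace the toy vectors; the ranking step stays the same.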

Core Capabilities

  • Superior performance in bilingual (Chinese/English) retrieval tasks
  • Effective cross-language information retrieval
  • Robust performance across diverse domains (healthcare, law, finance, etc.)
  • High accuracy in document matching and semantic search
  • Scalable to large-scale retrieval applications
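
For large-scale use, the 3584-dimensional embedding size drives storage cost. The following back-of-the-envelope sketch estimates the raw size of a flat vector index. The function name and the choice of float32 and float16 storage are assumptions for illustration, and real vector stores add overhead on top of these figures.

```python
EMBEDDING_DIM = 3584  # embedding dimension from the property table


def index_size_bytes(num_vectors, dim=EMBEDDING_DIM, bytes_per_value=4):
    """Raw size of a flat vector index, ignoring any index overhead."""
    return num_vectors * dim * bytes_per_value


per_vector = index_size_bytes(1)                                   # one float32 vector
one_million = index_size_bytes(1_000_000)                          # float32
one_million_fp16 = index_size_bytes(1_000_000, bytes_per_value=2)  # float16

print(per_vector)               # 14336
print(one_million / 1e9)        # 14.336 (GB)
print(one_million_fp16 / 1e9)   # 7.168 (GB)
```

At roughly 14 KB per float32 vector, a collection of one million documents needs about 14.3 GB before any index structure is added, so reduced precision or quantization may be worth evaluating for larger collections.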

Frequently Asked Questions

Q: What makes this model unique?

INF-Retriever-v1 stands out for its top ranking on the AIR-Bench benchmarks and for its bilingual strength. Although it was trained primarily on Chinese and English, it performs well across 13 languages, which makes it usable for multilingual applications.

Q: What are the recommended use cases?

The model is ideal for enterprise search systems, content recommendation engines, and large-scale information retrieval systems. It excels in scenarios requiring accurate document matching, semantic search, and cross-lingual information retrieval, particularly in professional domains like healthcare, finance, and legal documentation.
