nomic-embed-text-v1.5-GGUF

Maintained By
nomic-ai

Parameter Count: 137M
License: Apache 2.0
Context Length: 8192 tokens
Model Type: BERT-based embedding model

What is nomic-embed-text-v1.5-GGUF?

nomic-embed-text-v1.5-GGUF is a text embedding model packaged in the GGUF format for use with llama.cpp, designed for sentence similarity tasks and RAG applications. It is distributed in a range of quantizations, from a lightweight 48 MiB file up to a 262 MiB version, offering flexibility in the trade-off between model size and accuracy.

Implementation Details

The model requires task instruction prefixes (for example "search_document: " for passages and "search_query: " for queries) when generating embeddings, and supports an extended context length of 8192 tokens when configured accordingly. The original model uses Dynamic NTK-Aware RoPE scaling, which llama.cpp does not support, so llama.cpp users must use YaRN or linear scaling instead; a usage sketch follows the list below.

  • Multiple quantization options (Q2_K through F32) with documented MSE metrics
  • Supports batch processing for multiple embeddings
  • Compatible with llama.cpp builds from February 2024 onward
  • Context extension to 8192 tokens via RoPE scaling (YaRN or linear in llama.cpp)
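
As a concrete illustration, here is a minimal sketch using the llama-cpp-python bindings. The file name, quantization choice, and rope_freq_scale value are assumptions for illustration only; consult the model card for the recommended scaling settings when running at the full 8192-token context.

```python
# Minimal embedding sketch with llama-cpp-python (assumed bindings; any GGUF
# runner with embedding support works similarly).
from llama_cpp import Llama

llm = Llama(
    model_path="nomic-embed-text-v1.5.Q4_K_M.gguf",  # any quantization level works
    embedding=True,        # run the model in embedding mode, not text generation
    n_ctx=8192,            # extended context; needs RoPE scaling (see note above)
    rope_freq_scale=0.75,  # illustrative scaling value -- check the model card
    verbose=False,
)

# nomic-embed expects a task instruction prefix on every input.
docs = [
    "search_document: GGUF is a binary model format used by llama.cpp.",
    "search_document: RoPE scaling extends a model's usable context length.",
]
result = llm.create_embedding(docs)            # batch embedding in a single call
vectors = [d["embedding"] for d in result["data"]]
print(len(vectors), len(vectors[0]))           # vector count and embedding dimension
```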

Core Capabilities

  • High-quality text embeddings for semantic search
  • Extended context length support (8192 tokens)
  • Efficient processing with various quantization options
  • Optimized for RAG (Retrieval-Augmented Generation) applications (see the retrieval sketch below)
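
To make the retrieval use case concrete, here is a small semantic-search sketch built on the embeddings from the previous example; it assumes the llm, docs, and vectors names defined there and uses NumPy only for the cosine-similarity arithmetic.

```python
# Rank stored document embeddings against a query embedding (RAG-style retrieval).
import numpy as np

def cosine(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Queries use the "search_query: " prefix; documents were embedded with "search_document: ".
query = "search_query: how do I extend the context window?"
q_vec = llm.create_embedding(query)["data"][0]["embedding"]

# Sort documents by similarity to the query and take the best match.
ranked = sorted(enumerate(vectors), key=lambda kv: cosine(q_vec, kv[1]), reverse=True)
print("best match:", docs[ranked[0][0]])
```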

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its efficient GGUF format implementation and extensive quantization options, making it highly versatile for different deployment scenarios while maintaining embedding quality. The documented MSE metrics for each quantization level allow users to make informed decisions about the trade-off between model size and accuracy.

Q: What are the recommended use cases?

The model is particularly well-suited for RAG applications, semantic search, and any use case requiring high-quality text embeddings. Its various quantization options make it adaptable to different hardware constraints while maintaining acceptable performance levels.
