Tesslate_Tessa-T1-3B-GGUF

Maintained By
bartowski

Property                 Value
Original Model           Tessa-T1-3B
Quantization Framework   llama.cpp (b4978)
Size Range               1.14GB - 6.18GB
Model Link               https://huggingface.co/bartowski/Tesslate_Tessa-T1-3B-GGUF

What is Tesslate_Tessa-T1-3B-GGUF?

Tesslate_Tessa-T1-3B-GGUF is a collection of quantized versions of the Tessa-T1-3B model, covering a range of use cases and hardware configurations. The quantization levels were produced with the imatrix option and range from full BF16 weights down to highly compressed IQ2_M variants.
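
As a minimal sketch of how one file from the collection might be addressed, the helper below builds a direct download URL from the repository in the model link. The `resolve/<revision>/<file>` path is the standard Hugging Face download pattern; the `<prefix>-<quant>.gguf` file naming and the function names are assumptions for illustration, so the repository's file list should be checked before relying on them.

```python
from urllib.parse import quote

# Repository ID taken from the model link in this document.
REPO_ID = "bartowski/Tesslate_Tessa-T1-3B-GGUF"


def gguf_filename(quant: str, prefix: str = "Tesslate_Tessa-T1-3B") -> str:
    """Build a file name for one quantization level.

    ASSUMPTION: the '<prefix>-<quant>.gguf' pattern is not stated in this
    document; verify it against the repository's file list.
    """
    return f"{prefix}-{quant}.gguf"


def resolve_url(quant: str, revision: str = "main") -> str:
    """Return a direct download URL for the file of the given quant level."""
    filename = quote(gguf_filename(quant))  # percent-encode unsafe characters
    return f"https://huggingface.co/{REPO_ID}/resolve/{revision}/{filename}"


if __name__ == "__main__":
    print(resolve_url("Q4_K_M"))
```

The resulting URL can be passed to any HTTP downloader; the same repository ID and file name would also work with a Hub client library.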

Implementation Details

The quantized models require a specific prompt format that uses system and user delimiters. Quantization was performed with llama.cpp release b4978, and certain variants support online repacking of weights for ARM and AVX CPU inference.

  • Multiple quantization options (Q8_0 to IQ2_M)
  • Special handling for embed/output weights in certain variants
  • Optimized performance for different hardware configurations
  • Support for online weight repacking

Core Capabilities

  • Flexible deployment options for different RAM/VRAM configurations
  • Quality-size tradeoff options for various use cases
  • Optimized performance on both CPU and GPU systems
  • Special variants for ARM and AVX architecture optimization

Frequently Asked Questions

Q: What makes this model unique?

This collection stands out for its wide range of quantization options, which let users trade off model size, quality, and performance to suit their specific hardware. It also includes variants with online repacking and special handling of embed/output weights.

Q: What are the recommended use cases?

For maximum quality, choose the Q6_K_L or Q6_K variants. For a balance of size and quality, Q4_K_M is the recommended default. For systems with limited RAM, the IQ3 and IQ2 variants remain surprisingly usable at smaller sizes.
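
The guidance above can be sketched as a small selection helper. This is an illustrative sketch, not part of the collection: `pick_quant`, the `headroom_gb` default, and the idea of reserving memory for context are assumptions, and the file sizes must be supplied by the caller from the repository's file list.

```python
from typing import Mapping, Optional

# Preference order following the guidance above, best quality first.
# Entries are name prefixes, so "IQ3" matches any IQ3 variant.
PREFERENCE = ["Q6_K_L", "Q6_K", "Q4_K_M", "IQ3", "IQ2"]


def pick_quant(
    sizes_gb: Mapping[str, float],
    budget_gb: float,
    headroom_gb: float = 1.0,  # ASSUMPTION: memory reserved for context and runtime
) -> Optional[str]:
    """Return the most preferred quant whose file fits in the memory budget.

    sizes_gb maps quant names to file sizes in GB (supplied by the caller).
    Returns None when nothing fits.
    """
    usable = budget_gb - headroom_gb
    for prefix in PREFERENCE:
        # All available quants in this preference tier that fit in memory.
        fitting = {
            name: size
            for name, size in sizes_gb.items()
            if name.startswith(prefix) and size <= usable
        }
        if fitting:
            # Within a tier, take the largest file that still fits.
            return max(fitting, key=fitting.get)
    return None
```

Calling `pick_quant(sizes, budget_gb=8.0)` with a mapping of quant names to file sizes returns the highest-quality variant that fits, or `None` if even the smallest file exceeds the budget.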
