HomerCreativeAnvita-Mix-Qw7B

Maintained by: suayptalha


  • Parameter Count: 7.62B
  • Model Type: Merged Language Model
  • Architecture: Qwen2-based Transformer
  • Tensor Type: BF16

What is HomerCreativeAnvita-Mix-Qw7B?

HomerCreativeAnvita-Mix-Qw7B is a merged language model that combines two Qwen2.5-based models using the SLERP (Spherical Linear Interpolation) merge method. Currently ranked #1 on the Open LLM Leaderboard among models up to 13B parameters, it posts strong results across the benchmarks listed under Core Capabilities below.

Implementation Details

The model is built with mergekit, using a SLERP merge configuration that combines ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix and ZeroXClem/Qwen2.5-7B-HomerCreative-Mix. The configuration applies separate interpolation weightings to the attention and MLP sublayers across all 28 layers (see the sketch after the list below).

  • SLERP merge method with layer-wise interpolation weights
  • BFloat16 precision to balance throughput and memory usage
  • Separate mixing ratios for the attention and MLP sublayers
  • 28-layer architecture inherited from the Qwen2.5 base models
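
To make the merge concrete, here is a minimal sketch of what SLERP does to a single pair of weight tensors: both are treated as high-dimensional vectors and blended along the arc between them rather than along a straight line. This is illustrative Python, not mergekit's actual implementation, and the `t_attn`/`t_mlp` factors at the end are hypothetical placeholders rather than the model's real mixing ratios:

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate between two weight tensors with factor t in [0, 1]."""
    a_flat = a.flatten().float()
    b_flat = b.flatten().float()
    # Angle between the two tensors, treated as high-dimensional vectors
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    cos_omega = torch.clamp(torch.dot(a_unit, b_unit), -1.0, 1.0)
    omega = torch.acos(cos_omega)
    sin_omega = torch.sin(omega)
    if sin_omega.abs() < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation
        mixed = (1.0 - t) * a_flat + t * b_flat
    else:
        # Weights follow the great-circle arc between the two tensors
        mixed = (
            (torch.sin((1.0 - t) * omega) / sin_omega) * a_flat
            + (torch.sin(t * omega) / sin_omega) * b_flat
        )
    return mixed.reshape(a.shape).to(a.dtype)

# Hypothetical per-sublayer factors: attention and MLP tensors get different t values,
# which is the "layer-wise weighting" idea the merge configuration expresses in YAML.
t_attn, t_mlp = 0.3, 0.7
```

mergekit applies factors like these per layer and per sublayer, driven by the filters in the merge configuration; the function above only shows the interpolation step for one tensor pair.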

Core Capabilities

  • 78.08% accuracy on IFEval (0-shot)
  • 36.98% normalized accuracy on BBH (3-shot)
  • 31.04% exact match on MATH Level 5 (4-shot)
  • 38.28% accuracy on MMLU-PRO (5-shot)

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its SLERP merge strategy, which combines two specialized Qwen2.5 variants to reach state-of-the-art performance in its parameter class. Weighting the attention and MLP sublayers differently lets the merge draw on each parent's strengths, which shows up as consistent performance across diverse tasks.

Q: What are the recommended use cases?

Given its strong performance on various benchmarks, this model is particularly well-suited for text generation tasks, complex reasoning problems, and educational applications requiring mathematical comprehension. It performs especially well in zero-shot and few-shot scenarios.
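
As a usage sketch, the model should load like any other Qwen2.5-based checkpoint through the transformers library. The repository path below is assumed from the maintainer name and should be verified on the Hub before use:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hugging Face repo path assumed from the maintainer name; verify before use
model_id = "suayptalha/HomerCreativeAnvita-Mix-Qw7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the model's native BF16 tensor type
    device_map="auto",
)

# Qwen2.5-style chat template with a short reasoning prompt
messages = [
    {"role": "user", "content": "Explain spherical linear interpolation in one paragraph."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```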
