# HomerCreativeAnvita-Mix-Qw7B
| Property | Value |
|---|---|
| Parameter Count | 7.62B |
| Model Type | Merged Language Model |
| Architecture | Qwen2-based Transformer |
| Tensor Type | BF16 |
## What is HomerCreativeAnvita-Mix-Qw7B?
HomerCreativeAnvita-Mix-Qw7B is a merged language model that combines two Qwen2.5-7B base models using the SLERP (spherical linear interpolation) merge method. At the time of writing, it ranks #1 on the Open LLM Leaderboard among models up to 13B parameters and performs strongly across the leaderboard's benchmarks.
## Implementation Details
The model is built with mergekit, using a SLERP merge configuration that combines ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix and ZeroXClem/Qwen2.5-7B-HomerCreative-Mix. The configuration applies distinct attention and MLP layer weightings across all 28 layers; a sketch of the underlying interpolation follows the list below.
- SLERP merge method with layer-wise interpolation weights
- BFloat16 precision to balance quality and memory usage
- Separate mixing ratios for attention and MLP layers
- 28-layer architecture inherited from the Qwen2.5 base models
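To make the merge method concrete, here is a minimal, self-contained sketch of SLERP applied to two weight tensors. It illustrates the math only; it is not mergekit's actual implementation, and the function name and fallback threshold are choices made for this example:

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate between two weight tensors at fraction t in [0, 1]."""
    shape, dtype = v0.shape, v0.dtype
    a, b = v0.flatten().float(), v1.flatten().float()
    # Angle between the tensors, treated as high-dimensional vectors.
    cos_theta = torch.dot(a / (a.norm() + eps), b / (b.norm() + eps)).clamp(-1.0, 1.0)
    theta = torch.acos(cos_theta)
    if theta.abs() < 1e-4:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        merged = (1.0 - t) * a + t * b
    else:
        # Interpolate along the arc between the two tensors instead of
        # the straight line a plain weighted average would follow.
        sin_theta = torch.sin(theta)
        w0 = torch.sin((1.0 - t) * theta) / sin_theta
        w1 = torch.sin(t * theta) / sin_theta
        merged = w0 * a + w1 * b
    return merged.reshape(shape).to(dtype)
```

In the actual configuration, the interpolation factor `t` varies per layer and per parameter group (attention vs. MLP), which is what the mixing ratios listed above control.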
## Core Capabilities
- 78.08% accuracy on IFEval (0-shot)
- 36.98% normalized accuracy on BBH (3-shot)
- 31.04% exact match on MATH Level 5 (4-shot)
- 38.28% accuracy on MMLU-PRO (5-shot)
## Frequently Asked Questions
**Q: What makes this model unique?**
The model's distinctive feature is its SLERP merge strategy, which combines two specialized Qwen2.5 variants into a model that leads the Open LLM Leaderboard in its parameter class. The per-layer balance of attention and MLP weightings contributes to consistent performance across diverse tasks.
**Q: What are the recommended use cases?**
Given its benchmark results above, the model is well-suited to text generation, complex reasoning problems, and educational applications requiring mathematical comprehension. It performs especially well in zero-shot and few-shot scenarios; a minimal loading sketch follows.
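As a quick-start sketch, the model can be loaded through the standard transformers API. The repo id below is an assumption inferred from the parent models' namespace, so substitute the actual Hub repository:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical Hub repo id -- replace with the model's actual repository.
model_id = "ZeroXClem/HomerCreativeAnvita-Mix-Qw7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merged checkpoint's BF16 tensor type
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain spherical linear interpolation in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Loading in bfloat16 matches the checkpoint's native precision and roughly halves memory use relative to float32.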