cybertron-v4-qw7B-UNAMGS
| Property | Value |
|---|---|
| Parameter Count | 7.62B |
| Base Model | Qwen2.5-7B-Instruct |
| License | Qwen License |
| Training Dataset | Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1 |
| Average Benchmark Score | 31.82 |
What is cybertron-v4-qw7B-UNAMGS?
Cybertron-v4-qw7B-UNAMGS is a language model built on Qwen2.5-7B-Instruct that incorporates two techniques: UNA (Uniform Neural Alignment) and MGS. It reached the #1 position among 7-8B LLMs with no detected benchmark contamination, and it performs strongly across a range of benchmarks.
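As a starting point, the sketch below loads the model with Hugging Face transformers. The repo id is inferred from the model name and is an assumption; verify the exact id on the Hugging Face Hub before use.

```python
# Minimal loading sketch using Hugging Face transformers.
# NOTE: the repo id below is assumed from the model name, not confirmed by the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fblgit/cybertron-v4-qw7B-UNAMGS"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 tensors
    device_map="auto",
)
```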
Implementation Details
The model was trained for one epoch on the Magpie-Align dataset with UNA applied to the MLP layers, using a total train batch size of 64 across 8 GPUs and the Adam optimizer. Benchmark comparisons against the original Qwen2.5 model indicate strong contamination resistance. A hedged configuration sketch follows the feature list below.
- Implements novel UNA (Uniform Neural Alignment) technique
- Utilizes proprietary MGS methodology
- Trained on high-quality Magpie-Align dataset
- Weights stored in BF16 for memory-efficient inference
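The full training recipe is not published beyond the details above. The sketch below only mirrors the stated hyperparameters (one epoch, effective batch size 64 on 8 GPUs, an Adam-family optimizer, BF16) using Hugging Face `TrainingArguments`; the per-device batch size and gradient accumulation split are assumptions.

```python
# Hedged sketch of the stated hyperparameters via transformers.TrainingArguments.
# Only epochs, optimizer family, precision, and the effective batch size
# (64 = 8 GPUs x per-device batch x accumulation) come from the card;
# the 2 x 4 split below is an assumption.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="cybertron-v4-qw7B-UNAMGS",
    num_train_epochs=1,                  # one epoch, per the card
    per_device_train_batch_size=2,       # assumed split
    gradient_accumulation_steps=4,       # 8 GPUs x 2 x 4 = 64 effective batch
    optim="adamw_torch",                 # Adam-family optimizer, per the card
    bf16=True,                           # BF16 training/storage, per the card
)
```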
Core Capabilities
- IFEval (0-Shot): 60.84% accuracy
- BBH (3-Shot): 37.71% normalized accuracy
- MATH Lvl 5 (4-Shot): 29.91% exact match
- MMLU-PRO (5-Shot): 38.89% accuracy
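Scores of this kind are typically produced with a standardized evaluation harness. As a rough illustration, the sketch below uses the Python API of EleutherAI's lm-evaluation-harness (v0.4+); the task name, few-shot setting, and repo id are assumptions, so check the harness documentation for the exact leaderboard task variants.

```python
# Hypothetical reproduction of a leaderboard-style score with lm-evaluation-harness.
# Task name, few-shot setting, and repo id are assumptions, not confirmed by the card.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=fblgit/cybertron-v4-qw7B-UNAMGS,dtype=bfloat16",
    tasks=["ifeval"],   # 0-shot instruction-following benchmark
    num_fewshot=0,
)
print(results["results"])
```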
Frequently Asked Questions
Q: What makes this model unique?
A: Its combination of the UNA and MGS techniques, which yields state-of-the-art performance in the 7-8B parameter class while keeping benchmark contamination low.
Q: What are the recommended use cases?
A: The model excels at text generation, particularly tasks requiring strong reasoning and accurate instruction following, as reflected in its high IFEval score. A generation sketch follows.
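As an illustration of instruction following, here is a minimal generation sketch. It assumes the model inherits a Qwen2.5-style chat template from its base model and that the repo id matches the model name; both are assumptions to verify on the Hub.

```python
# Minimal instruction-following example. Assumes a Qwen2.5-style chat template
# (inherited from the base model) and the repo id below; both are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fblgit/cybertron-v4-qw7B-UNAMGS"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "List three uses of BF16 precision."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```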