cybertron-v4-qw7B-UNAMGS
| Property | Value |
|---|---|
| Parameter Count | 7.62B |
| Base Model | Qwen2.5-7B-Instruct |
| License | Qwen License |
| Training Dataset | Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1 |
| Average Benchmark Score | 31.82 |
What is cybertron-v4-qw7B-UNAMGS?
Cybertron-v4-qw7B-UNAMGS is a language model built on Qwen2.5-7B-Instruct that incorporates two techniques: UNA (Uniform Neural Alignment) and MGS. It reached the #1 position among 7-8B LLMs with no detected benchmark contamination, and it performs strongly across a range of benchmarks.
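As a starting point, the sketch below loads the model with Hugging Face transformers. The repo id is inferred from the model name and is an assumption; verify the exact id on the Hugging Face Hub before use.

```python
# Minimal loading sketch using Hugging Face transformers.
# NOTE: the repo id below is assumed from the model name, not confirmed by the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fblgit/cybertron-v4-qw7B-UNAMGS"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 tensors
    device_map="auto",
)
```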
Implementation Details
The model was trained for one epoch on the Magpie-Align dataset with UNA applied to the MLP layers, using a total train batch size of 64 across 8 GPUs and the Adam optimizer. Benchmark comparisons against the original Qwen2.5 model indicate strong contamination resistance. A hedged configuration sketch follows the feature list below.
- Implements novel UNA (Uniform Neural Alignment) technique
- Utilizes proprietary MGS methodology
- Trained on high-quality Magpie-Align dataset
- Weights stored in BF16 for memory-efficient inference
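The full training recipe is not published beyond the details above. The sketch below only mirrors the stated hyperparameters (one epoch, effective batch size 64 on 8 GPUs, an Adam-family optimizer, BF16) using Hugging Face `TrainingArguments`; the per-device batch size and gradient accumulation split are assumptions.

```python
# Hedged sketch of the stated hyperparameters via transformers.TrainingArguments.
# Only epochs, optimizer family, precision, and the effective batch size
# (64 = 8 GPUs x per-device batch x accumulation) come from the card;
# the 2 x 4 split below is an assumption.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="cybertron-v4-qw7B-UNAMGS",
    num_train_epochs=1,                  # one epoch, per the card
    per_device_train_batch_size=2,       # assumed split
    gradient_accumulation_steps=4,       # 8 GPUs x 2 x 4 = 64 effective batch
    optim="adamw_torch",                 # Adam-family optimizer, per the card
    bf16=True,                           # BF16 training/storage, per the card
)
```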
Core Capabilities
- IFEval (0-Shot): 60.84% accuracy
- BBH (3-Shot): 37.71% normalized accuracy
- MATH Lvl 5 (4-Shot): 29.91% exact match
- MMLU-PRO (5-Shot): 38.89% accuracy
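Scores of this kind are typically produced with a standardized evaluation harness. As a rough illustration, the sketch below uses the Python API of EleutherAI's lm-evaluation-harness (v0.4+); the task name, few-shot setting, and repo id are assumptions, so check the harness documentation for the exact leaderboard task variants.

```python
# Hypothetical reproduction of a leaderboard-style score with lm-evaluation-harness.
# Task name, few-shot setting, and repo id are assumptions, not confirmed by the card.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=fblgit/cybertron-v4-qw7B-UNAMGS,dtype=bfloat16",
    tasks=["ifeval"],   # 0-shot instruction-following benchmark
    num_fewshot=0,
)
print(results["results"])
```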
Frequently Asked Questions
Q: What makes this model unique?
A: Its combination of the UNA and MGS techniques, which yields state-of-the-art performance in the 7-8B parameter class while keeping benchmark contamination low.
Q: What are the recommended use cases?
A: The model excels at text generation, particularly tasks requiring strong reasoning and accurate instruction following, as reflected in its high IFEval score. A generation sketch follows.
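As an illustration of instruction following, here is a minimal generation sketch. It assumes the model inherits a Qwen2.5-style chat template from its base model and that the repo id matches the model name; both are assumptions to verify on the Hub.

```python
# Minimal instruction-following example. Assumes a Qwen2.5-style chat template
# (inherited from the base model) and the repo id below; both are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fblgit/cybertron-v4-qw7B-UNAMGS"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "List three uses of BF16 precision."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```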