cybertron-v4-qw7B-UNAMGS

Maintained by: fblgit

  • Parameter Count: 7.62B
  • Base Model: Qwen2.5-7B-Instruct
  • License: Qwen License
  • Training Dataset: Magpie-Align/Magpie-Qwen2.5-Pro-1M-v0.1
  • Average Benchmark Score: 31.82

What is cybertron-v4-qw7B-UNAMGS?

Cybertron-v4-qw7B-UNAMGS is an advanced language model built on Qwen2.5-7B-Instruct that incorporates two novel techniques: UNA (Uniform Neural Alignment) and MGS. It reached the #1 position in the 7-8B LLM category with no detected benchmark contamination, posting strong scores across a range of evaluations.

Implementation Details

The model was trained for a single epoch on the Magpie-Align dataset, with UNA applied to the MLP layers. Training used a total batch size of 64 across 8 GPUs and the Adam optimizer. Benchmark comparisons against the base Qwen2.5 model indicate strong resistance to contamination.
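
For illustration, here is a minimal, hypothetical sketch of that setup expressed as Hugging Face TrainingArguments. Only the totals (1 epoch, batch size 64 over 8 GPUs, Adam, BF16) come from the card; the per-device/gradient-accumulation split is an assumption, and the actual UNA/MGS training code has not been released.

```python
# Hypothetical reconstruction of the stated setup: one epoch, an
# Adam-family optimizer, total train batch size 64 across 8 GPUs, BF16.
# The 1 (per device) x 8 (grad accum) x 8 (GPUs) = 64 split is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="cybertron-v4-qw7B-UNAMGS",
    num_train_epochs=1,             # single epoch, per the card
    per_device_train_batch_size=1,  # assumed per-device size
    gradient_accumulation_steps=8,  # 1 * 8 * 8 GPUs = 64 total
    optim="adamw_torch",            # Adam-family optimizer
    bf16=True,                      # matches the BF16 tensor type
)
```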

  • Implements novel UNA (Uniform Neural Alignment) technique
  • Utilizes proprietary MGS methodology
  • Trained on high-quality Magpie-Align dataset
  • BF16 tensor type for optimal performance (see the loading sketch after this list)
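
The following is a minimal inference sketch using Hugging Face transformers. It assumes the public checkpoint fblgit/cybertron-v4-qw7B-UNAMGS and a Qwen2.5-style chat template; adjust dtype and device mapping to your hardware.

```python
# Minimal inference sketch with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fblgit/cybertron-v4-qw7B-UNAMGS"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, matching the released tensor type
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Summarize what UNA does in one sentence."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```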

Core Capabilities

  • IFEval (0-Shot): 60.84% accuracy
  • BBH (3-Shot): 37.71% normalized accuracy
  • MATH Lvl 5 (4-Shot): 29.91% exact match
  • MMLU-PRO (5-Shot): 38.89% accuracy
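
These figures follow the Open LLM Leaderboard evaluation style. Below is a hedged sketch of how one might re-run them with EleutherAI's lm-evaluation-harness (pip install lm-eval); the leaderboard_* task names are assumptions based on the harness's leaderboard task group, and exact prompts and scores may not match the official runs.

```python
# Hedged sketch: leaderboard-style evaluation via lm-evaluation-harness.
# Task names are assumed from the harness's "leaderboard" task group.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=fblgit/cybertron-v4-qw7B-UNAMGS,dtype=bfloat16",
    tasks=[
        "leaderboard_ifeval",     # 0-shot instruction following
        "leaderboard_bbh",        # 3-shot Big-Bench Hard
        "leaderboard_math_hard",  # MATH level-5 subset
        "leaderboard_mmlu_pro",   # 5-shot MMLU-PRO
    ],
    batch_size=8,
)
print(results["results"])
```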

Frequently Asked Questions

Q: What makes this model unique?

A: Its implementation of the UNA and MGS techniques, which yields state-of-the-art performance in its parameter class while maintaining low contamination levels.

Q: What are the recommended use cases?

A: The model excels at text generation tasks, particularly those requiring strong reasoning and accurate instruction following, as reflected in its high IFEval score.
