SuperNova-Medius-GGUF

Maintained By
arcee-ai

Property          Value
Parameter Count   14.8B
License           Apache-2.0
Architecture      Qwen2.5-14B-Instruct
Author            arcee-ai

What is SuperNova-Medius-GGUF?

SuperNova-Medius-GGUF is the GGUF-format release of SuperNova-Medius, a 14.8B-parameter language model produced by cross-architecture knowledge distillation. It is built on Qwen2.5-14B-Instruct and distills knowledge from two larger teacher models, Qwen2.5-72B-Instruct and Llama-3.1-405B-Instruct.

Implementation Details

The model is trained with a multi-teacher distillation approach that uses both logit distillation and hidden-state distillation. Because the Llama and Qwen families use different tokenizers, the vocabularies are first aligned with mergekit-tokensurgeon. The result is then fine-tuned on a custom instruction dataset built with EvolKit.

  • Cross-architecture distillation from two teacher models
  • Vocabulary alignment across architectures with mergekit-tokensurgeon
  • Fine-tuning on a custom instruction dataset
  • Compact 14.8B-parameter student model
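To make the two distillation signals concrete, here is a minimal, generic sketch in plain Python. It is not the training code used for this model. It assumes that every teacher's logits have already been mapped onto the student's vocabulary and that teacher hidden states have already been projected to the student's width. The function names, the teacher weighting, and the temperature value are all hypothetical choices for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def multi_teacher_logit_loss(student_logits, teacher_logits_list,
                             weights=None, temperature=2.0):
    """Weighted sum of KL(teacher || student) over several teachers.

    Assumption: each teacher's logits are already aligned to the
    student's vocabulary (same length, same token order).
    """
    if weights is None:
        # Hypothetical default: weight all teachers equally.
        weights = [1.0 / len(teacher_logits_list)] * len(teacher_logits_list)
    student_probs = softmax(student_logits, temperature)
    loss = 0.0
    for weight, teacher_logits in zip(weights, teacher_logits_list):
        teacher_probs = softmax(teacher_logits, temperature)
        loss += weight * kl_divergence(teacher_probs, student_probs)
    # Conventional T^2 factor keeps gradient scale comparable across temperatures.
    return loss * temperature ** 2

def hidden_state_loss(student_hidden, teacher_hidden):
    """Mean squared error between two hidden-state vectors.

    Assumption: the teacher vector was already projected to the student's width.
    """
    squared = [(s - t) ** 2 for s, t in zip(student_hidden, teacher_hidden)]
    return sum(squared) / len(squared)

# Toy example: one student position, two teachers, a 3-token vocabulary.
student = [1.0, 2.0, 3.0]
teachers = [[1.2, 1.9, 3.1], [0.5, 2.5, 2.8]]
print(multi_teacher_logit_loss(student, teachers))
print(hidden_state_loss([0.1, 0.4], [0.0, 0.5]))
```

The logit loss is zero when the student matches every teacher exactly and grows as the distributions diverge. In a real pipeline the two losses would typically be combined with tunable weights and computed over batches of token positions.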

Core Capabilities

  • Instruction following: 0.832 on IFEval
  • Complex reasoning: 0.631 on BBH
  • Customer support and technical assistance
  • Content creation and generation
  • Resource-efficient deployment options
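As a rough guide to what resource-efficient deployment means for a GGUF model, the sketch below estimates weight-file sizes from the 14.8B parameter count in the property table. The three formats are common llama.cpp tensor types, with bits per weight following from their block layouts. Which quantization files this repository actually provides is not stated here, so treat the format list as an assumption. The estimate covers weights only and ignores metadata, mixed tensor types, the KV cache, and runtime overhead.

```python
PARAMS = 14.8e9  # parameter count from the property table

# Bits per weight for a few llama.cpp tensor formats.
# Q8_0 stores 32 int8 weights plus one fp16 scale per block (34 bytes -> 8.5 bpw).
# Q4_0 stores 32 4-bit weights plus one fp16 scale per block (18 bytes -> 4.5 bpw).
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,
    "Q4_0": 4.5,
}

def estimated_size_gb(params, bits_per_weight):
    """Approximate weight size in gigabytes (10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{estimated_size_gb(PARAMS, bpw):.1f} GB")
```

Under these assumptions the weights come to roughly 29.6 GB at F16, 15.7 GB at Q8_0, and 8.3 GB at Q4_0, which is why quantized GGUF files are usually the practical choice on a single consumer GPU or on CPU.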

Frequently Asked Questions

Q: What makes this model unique?

The model is distilled across architectures: it combines knowledge from both Llama and Qwen teacher models in a relatively compact 14.8B-parameter student. The goal is performance that approaches that of much larger models while remaining easier to deploy.

Q: What are the recommended use cases?

SuperNova-Medius is well suited to customer support automation, technical content creation, and complex reasoning tasks. Its balance of capability and size makes it a good fit for organizations that want strong instruction following and reasoning without the resource requirements of larger models.
