Higgs-Llama-3-70B

Maintained By
bosonai

Higgs-Llama-3-70B

PropertyValue
Parameter Count70.6B
Base ModelMeta-Llama-3-70B
LicenseLLama 3 Community License
Tensor TypeF32

What is Higgs-Llama-3-70B?

Higgs-Llama-3-70B is an advanced language model post-trained from Meta's Llama-3-70B, specifically engineered for enhanced role-playing capabilities while maintaining competitive performance in general instruction-following and reasoning tasks. The model demonstrates impressive results on challenging benchmarks like MMLU-Pro (63.2%) and Arena-Hard (49.6%), positioning it competitively among leading models like GPT-4 and Claude-3.

Implementation Details

The model underwent supervised fine-tuning using proprietary instruction-following and chat datasets. A distinctive feature is its semi-automated pipeline for preference optimization, utilizing both human labelers and private LLMs to align the model's behavior, particularly with system messages.

  • Specialized role-playing optimization
  • Benchmark-aware training approach avoiding overfitting
  • Compatible with standard transformers library implementation
  • Supports bfloat16 for efficient inference

Core Capabilities

  • Superior role-playing and character embodiment
  • Strong performance on MMLU-Pro (63.2%)
  • Competitive scores on Arena-Hard benchmark
  • Effective general instruction following
  • Enhanced system message adherence

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its specialized optimization for role-playing while maintaining strong general capabilities. It achieved this through a novel training approach that specifically avoids benchmark overfitting while focusing on real-world performance.

Q: What are the recommended use cases?

The model excels in scenarios requiring role-playing, character embodiment, and general instruction following. It's particularly suitable for applications needing consistent character maintenance while handling complex reasoning tasks.

The first platform built for prompt engineering