MAmmoTH2-8B-Plus

Maintained By
TIGER-Lab

MAmmoTH2-8B-Plus

PropertyValue
Parameter Count8.03B
Model TypeText Generation/Conversational
ArchitectureLlama-based
LicenseMIT
PaperarXiv:2405.03548

What is MAmmoTH2-8B-Plus?

MAmmoTH2-8B-Plus is an advanced language model developed by TIGER-Lab that represents a significant breakthrough in AI reasoning capabilities. Built on the Llama-3 architecture, this model has been fine-tuned using an innovative approach that harvested 10 million instruction-response pairs from web data, followed by additional training on public instruction datasets.

Implementation Details

The model utilizes BF16 tensor type and incorporates sophisticated training procedures that leverage the WEBINSTRUCT dataset. Its architecture is optimized for both general language understanding and specialized mathematical reasoning tasks.

  • 8.03 billion parameters for complex reasoning capabilities
  • Built on Llama-3 architecture with enhanced instruction tuning
  • Optimized using web-scale instruction data
  • Implements advanced mathematical reasoning capabilities

Core Capabilities

  • Exceptional performance on GSM8K (85.2% accuracy)
  • Strong results on MATH benchmark (43.0%)
  • Impressive scores on BBH (69.7%) and ARC-C (84.3%)
  • Versatile text generation and conversation abilities

Frequently Asked Questions

Q: What makes this model unique?

MAmmoTH2-8B-Plus stands out for its innovative training approach using web-harvested instruction data and its exceptional performance on mathematical reasoning tasks without specific domain training. The model achieved significant improvements over baseline models across multiple benchmarks.

Q: What are the recommended use cases?

The model excels in mathematical reasoning, problem-solving, and general conversation tasks. It's particularly well-suited for applications requiring complex mathematical understanding, educational tools, and general-purpose AI assistance.

The first platform built for prompt engineering