OpenELM-1_1B-Instruct

Maintained by: apple

Property           Value
Parameter Count    1.08B
Model Type         Instruction-tuned Language Model
License            Apple Sample Code License
Paper              arXiv Paper
Format             BF16

What is OpenELM-1_1B-Instruct?

OpenELM-1_1B-Instruct is part of Apple's OpenELM family of efficient language models. At 1.08B parameters, it strikes a balance between model size and performance, using a layer-wise scaling strategy that allocates parameters non-uniformly across transformer layers. The model was pretrained on approximately 1.8 trillion tokens drawn from RefinedWeb, PILE, RedPajama, and Dolma v1.6.
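
To make the layer-wise scaling idea concrete, here is a minimal sketch of the interpolation scheme described in the OpenELM paper: per-layer attention and feed-forward widths are scaled linearly from the first layer to the last, so deeper layers receive more parameters. The layer count and scaling ranges below are illustrative placeholders, not OpenELM-1_1B's actual configuration.

```python
# Illustrative only: layer-wise scaling interpolates per-layer widths so
# that later transformer layers receive more parameters. The alpha/beta
# ranges and layer count are placeholders, not the model's real config.
def layerwise_scaling(num_layers, alpha_min, alpha_max, beta_min, beta_max):
    """Return per-layer (attention scaler, FFN multiplier) pairs."""
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)  # 0.0 at the first layer, 1.0 at the last
        alpha = alpha_min + t * (alpha_max - alpha_min)  # scales attention width
        beta = beta_min + t * (beta_max - beta_min)      # scales FFN hidden size
        configs.append((alpha, beta))
    return configs

# Example run with made-up values, printing the per-layer scalers.
for layer, (alpha, beta) in enumerate(layerwise_scaling(28, 0.5, 1.0, 0.5, 4.0)):
    print(f"layer {layer:2d}: attention scaler {alpha:.2f}, FFN multiplier {beta:.2f}")
```

The practical effect is that a fixed parameter budget is spent where it helps most, rather than giving every layer an identical width.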

Implementation Details

The model employs several architectural and inference optimizations:

  • Layer-wise parameter scaling for more efficient allocation across transformer layers
  • Training and release tooling built on Apple's CoreNet library
  • Speculative generation for faster inference
  • Support for both lookup-token and model-wise assisted generation (see the sketch after this list)
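
Both assisted-generation modes can be exercised through the Hugging Face transformers generate() API, which the snippet below uses directly. This is an untested outline assuming a recent transformers release; the prompt, token counts, and the choice of OpenELM-270M-Instruct as the draft model are illustrative.

```python
# Rough sketch of both assisted-generation modes via transformers'
# generate() API. trust_remote_code=True is required because OpenELM
# ships custom modeling code.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Per the model card, OpenELM reuses the Llama-2 tokenizer (gated repo;
# requires access approval on Hugging Face).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-1_1B-Instruct", trust_remote_code=True
)

inputs = tokenizer("Once upon a time there was", return_tensors="pt")

# Lookup-token speculative generation: draft tokens come from n-gram
# matches within the prompt itself (prompt-lookup decoding).
out = model.generate(**inputs, prompt_lookup_num_tokens=10, max_new_tokens=64)

# Model-wise assisted generation: a smaller model drafts tokens that the
# 1.1B model then verifies in parallel.
assistant = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-270M-Instruct", trust_remote_code=True
)
out = model.generate(**inputs, assistant_model=assistant, max_new_tokens=64)

print(tokenizer.decode(out[0], skip_special_tokens=True))
```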

Core Capabilities

  • Strong benchmark performance (71.20% on HellaSwag, 70.00% on BoolQ)
  • Versatile text generation capabilities
  • Efficient parameter utilization through layer-wise scaling
  • Instruction-tuned for better task alignment

Frequently Asked Questions

Q: What makes this model unique?

OpenELM-1_1B-Instruct stands out for its efficient, layer-wise parameter allocation and its strong benchmark results despite a relatively modest size. The instruction-tuned variant often outperforms the base model on instruction-following tasks.

Q: What are the recommended use cases?

The model is well-suited for a range of natural language processing tasks, particularly those requiring instruction following. It performs especially well in multiple-choice tasks, question answering, and general text generation while maintaining computational efficiency.
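
As a starting point for such tasks, a basic text-generation call might look like the following sketch. It is untested; the prompt and sampling settings are arbitrary examples, and the Llama-2 tokenizer is assumed per the model card.

```python
# Minimal generation sketch for OpenELM-1_1B-Instruct; the prompt and
# sampling settings are arbitrary examples.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-1_1B-Instruct", trust_remote_code=True
)

prompt = "Explain the difference between a list and a tuple in Python."
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```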
