OpenELM-270M-Instruct

Maintained By
apple

Property          Value
Parameter Count   272M
Model Type        Instruction-tuned Language Model
License           Apple Sample Code License
Paper             arXiv:2404.14619
Tensor Type       BF16

What is OpenELM-270M-Instruct?

OpenELM-270M-Instruct is part of Apple's OpenELM family of efficient language models. This instruction-tuned variant has 272 million parameters and uses layer-wise scaling, a strategy that allocates parameters non-uniformly across the transformer layers instead of giving every layer the same size. The model was pretrained on approximately 1.8 trillion tokens drawn from RefinedWeb, PILE, RedPajama, and Dolma v1.6.
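Layer-wise scaling can be illustrated with a short sketch. The code below linearly interpolates two per-layer multipliers from the first layer to the last, one controlling the number of attention heads and one controlling the feed-forward width, which is the general idea described in the OpenELM paper. The function name, the `LayerConfig` type, and every numeric value (16 layers, model width 1280, head size 64, and both multiplier ranges) are illustrative assumptions, not the released configuration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LayerConfig:
    """Hypothetical per-layer settings produced by the sketch."""
    layer: int
    num_heads: int
    ffn_dim: int


def layerwise_scaling(
    num_layers: int,
    d_model: int,
    head_dim: int,
    alpha: Tuple[float, float] = (0.5, 1.0),  # assumed attention multiplier range
    beta: Tuple[float, float] = (0.5, 4.0),   # assumed feed-forward multiplier range
) -> List[LayerConfig]:
    """Interpolate head count and FFN width linearly across layers."""
    configs = []
    for i in range(num_layers):
        # Position of this layer in [0, 1]; a single-layer model uses 0.
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        a = alpha[0] + (alpha[1] - alpha[0]) * t
        b = beta[0] + (beta[1] - beta[0]) * t
        num_heads = max(1, round(a * d_model / head_dim))
        ffn_dim = round(b * d_model)
        configs.append(LayerConfig(layer=i, num_heads=num_heads, ffn_dim=ffn_dim))
    return configs


if __name__ == "__main__":
    for cfg in layerwise_scaling(num_layers=16, d_model=1280, head_dim=64):
        print(cfg)
```

Under these assumed values, early layers are narrow and later layers are wide, so the same overall parameter budget is distributed unevenly rather than uniformly.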

Implementation Details

The model uses a transformer architecture with several optimizations. Its weights are stored in BF16 precision, and using it requires a Hugging Face access token. Generation supports several strategies, including lookup-token (prompt lookup) speculative generation and model-wise speculative generation, in which a smaller assistant model drafts tokens that the main model then verifies.

  • Efficient layer-wise parameter scaling
  • Instruction tuning applied to the pretrained model
  • Support for speculative generation
  • Built on CoreNet library
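The speculative generation options mentioned above can be sketched with the Hugging Face Transformers `generate` API, which accepts `prompt_lookup_num_tokens` for prompt-lookup decoding and `assistant_model` for assisted decoding. This is a minimal sketch under stated assumptions: the helper names and strategy labels are hypothetical, the repository id `apple/OpenELM-270M-Instruct` and the tokenizer id `meta-llama/Llama-2-7b-hf` are assumptions about how the model is published, and the `token` argument assumes a recent Transformers release.

```python
from typing import Any, Dict, Optional


def build_generate_kwargs(
    strategy: str = "greedy",
    max_new_tokens: int = 64,
    prompt_lookup_num_tokens: int = 10,
    assistant_model: Optional[Any] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for `model.generate` (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"max_new_tokens": max_new_tokens}
    if strategy == "prompt_lookup":
        # Draft tokens are copied from n-gram matches in the prompt.
        kwargs["prompt_lookup_num_tokens"] = prompt_lookup_num_tokens
    elif strategy == "assisted":
        # A smaller model drafts tokens that the main model verifies.
        if assistant_model is None:
            raise ValueError("strategy 'assisted' needs an assistant_model")
        kwargs["assistant_model"] = assistant_model
    elif strategy != "greedy":
        raise ValueError(f"unknown strategy: {strategy!r}")
    return kwargs


def generate_text(prompt: str, hf_token: str, strategy: str = "greedy",
                  assistant_model: Optional[Any] = None) -> str:
    """Load the model and generate text; downloads weights on first use."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumed ids: the tokenizer repository is gated, hence the access token.
    tokenizer = AutoTokenizer.from_pretrained(
        "meta-llama/Llama-2-7b-hf", token=hf_token
    )
    model = AutoModelForCausalLM.from_pretrained(
        "apple/OpenELM-270M-Instruct",
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        **build_generate_kwargs(strategy, assistant_model=assistant_model),
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Keeping the argument assembly separate from model loading makes the strategy choice easy to inspect without downloading any weights; `generate_text("Once upon a time", hf_token="...", strategy="prompt_lookup")` would then run the full pipeline.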

Core Capabilities

  • Zero-shot average of 55.11% across the reported benchmarks
  • Higher scores than the base model on ARC-c (30.55%) and HellaSwag (52.07%)
  • Enhanced instruction following capabilities
  • Efficient text generation with customizable parameters

Frequently Asked Questions

Q: What makes this model unique?

OpenELM-270M-Instruct stands out for its layer-wise parameter allocation and for its benchmark results at a small size: with 272 million parameters, it reaches a 55.11% zero-shot average across the reported benchmarks.

Q: What are the recommended use cases?

The model is well-suited for general text generation tasks, especially those requiring instruction following. It performs particularly well in zero-shot scenarios and can be effectively used for tasks like question answering, text completion, and general language understanding.
