TinyLlama-1.1B-step-50K-105b

Maintained by: TinyLlama

  • Parameter Count: 1.1B parameters
  • Training Progress: 105B tokens (50K steps)
  • License: Apache 2.0
  • Architecture: LLaMA-based Transformer
  • Format: PyTorch with Safetensors

What is TinyLlama-1.1B-step-50K-105b?

TinyLlama-1.1B-step-50K-105b is an intermediate checkpoint from the TinyLlama project, which aims to pretrain a compact yet capable 1.1B parameter model on 3 trillion tokens within 90 days using 16 A100-40G GPUs. This checkpoint captures the model after 50K training steps, roughly 105B tokens into that run.

Implementation Details

The model adopts the same architecture and tokenizer as Llama 2, making it compatible with existing Llama-based projects. This checkpoint has been trained on a combination of the cerebras/SlimPajama-627B and bigcode/starcoderdata datasets, achieving a HellaSwag Acc_norm score of 43.50.

  • Compatible with transformers>=4.31 (see the loading sketch after this list)
  • Supports text generation tasks
  • Optimized for both CPU and GPU inference
  • Weights distributed as F32 tensors in PyTorch and Safetensors formats
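
The checkpoint can be loaded like any other Llama-style causal language model. Below is a minimal sketch using the transformers text-generation pipeline; it assumes the checkpoint is hosted on the Hugging Face Hub under the repo id TinyLlama/TinyLlama-1.1B-step-50K-105b and that torch and accelerate are installed.

  # Minimal sketch: text generation via the transformers pipeline.
  # Assumes the Hub repo id "TinyLlama/TinyLlama-1.1B-step-50K-105b" and transformers>=4.31.
  import torch
  from transformers import pipeline

  generator = pipeline(
      "text-generation",
      model="TinyLlama/TinyLlama-1.1B-step-50K-105b",  # assumed Hub repo id
      torch_dtype=torch.float16,  # half precision to reduce GPU memory use
      device_map="auto",          # requires accelerate; places the model on available devices
  )

  output = generator(
      "The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens.",
      max_new_tokens=64,
      do_sample=True,
      top_k=50,
      top_p=0.95,
  )
  print(output[0]["generated_text"])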

Core Capabilities

  • Text generation and completion
  • Efficient deployment in resource-constrained environments (see the CPU sketch after this list)
  • Plug-and-play compatibility with Llama ecosystem
  • Balanced performance with minimal computational requirements
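
For CPU-only or otherwise resource-constrained deployments, the sketch below loads the model directly with AutoModelForCausalLM, keeping the F32 weights and running greedy decoding on CPU. It makes the same assumption about the Hub repo id as the pipeline example above.

  # Minimal CPU-only sketch, assuming the Hub repo id "TinyLlama/TinyLlama-1.1B-step-50K-105b".
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "TinyLlama/TinyLlama-1.1B-step-50K-105b"  # assumed repo id
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)  # F32 weights on CPU
  model.eval()

  prompt = "Small language models are useful on edge devices because"
  inputs = tokenizer(prompt, return_tensors="pt")
  with torch.no_grad():
      generated = model.generate(**inputs, max_new_tokens=48, do_sample=False)  # greedy decoding
  print(tokenizer.decode(generated[0], skip_special_tokens=True))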

Frequently Asked Questions

Q: What makes this model unique?

TinyLlama stands out for its efficient architecture that maintains Llama 2 compatibility while requiring significantly fewer resources. At just 1.1B parameters, it's designed for applications where computational resources are limited but high-quality language processing is needed.

Q: What are the recommended use cases?

The model is particularly suited for applications requiring a small footprint while maintaining decent performance, such as edge devices, mobile applications, or scenarios where rapid deployment and inference are prioritized over maximum accuracy.