RedPajama-INCITE-7B-Base

Maintained by: togethercomputer

Parameter Count: 6.9B
License: Apache 2.0
Training Hardware: 512 nodes of 6xV100 (IBM Power9)
Training Tokens: 1.001T
Language: English

What is RedPajama-INCITE-7B-Base?

RedPajama-INCITE-7B-Base is a 6.9B-parameter English language model developed in a collaborative effort led by Together Computer with several research institutions. It was trained on the RedPajama-Data-1T dataset using 3,072 V100 GPUs (512 nodes with 6 GPUs each) provided through the INCITE 2023 project, and it is released as an open-source base model under the Apache 2.0 license.

Implementation Details

The model supports both GPU and CPU inference, with an Int8 option that lowers the GPU memory requirement. Training used pipeline parallelism of degree 12 and tensor parallelism of degree 2, with a global batch size of 4M tokens and a learning rate of 0.00012.
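As a rough illustration of how these training settings relate to each other, the arithmetic below derives the number of GPUs that hold one copy of the model and, under two assumptions, the implied number of data-parallel replicas and optimizer steps. The assumptions are not stated in the source: that all 3,072 GPUs took part in a single training job, and that "4M tokens" means exactly 4,000,000 tokens (it could also mean 2**22).

```python
# Figures quoted in this document.
PIPELINE_PARALLEL = 12
TENSOR_PARALLEL = 2
TOTAL_GPUS = 512 * 6                   # 512 nodes of 6 V100s = 3,072 GPUs
TRAINING_TOKENS = 1_001_000_000_000    # 1.001T tokens

# Assumption: "4M tokens" read as exactly 4,000,000.
GLOBAL_BATCH_TOKENS = 4_000_000


def gpus_per_replica(pipeline_parallel, tensor_parallel):
    """GPUs needed to hold one full copy of the model."""
    return pipeline_parallel * tensor_parallel


def data_parallel_degree(total_gpus, pipeline_parallel, tensor_parallel):
    """Model replicas that fit on the cluster.

    Assumption: every GPU participates in the same training job.
    """
    return total_gpus // gpus_per_replica(pipeline_parallel, tensor_parallel)


def approx_optimizer_steps(training_tokens, global_batch_tokens):
    """Approximate number of optimizer steps for the full token budget."""
    return training_tokens // global_batch_tokens


if __name__ == "__main__":
    print("GPUs per replica:", gpus_per_replica(PIPELINE_PARALLEL, TENSOR_PARALLEL))
    print("Data-parallel replicas:",
          data_parallel_degree(TOTAL_GPUS, PIPELINE_PARALLEL, TENSOR_PARALLEL))
    print("Approximate steps:",
          approx_optimizer_steps(TRAINING_TOKENS, GLOBAL_BATCH_TOKENS))
```

Under these assumptions, one model replica spans 24 GPUs, the cluster holds 128 replicas, and training takes roughly 250,000 optimizer steps.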

  • Three inference options: GPU inference (requires 16GB of GPU memory), Int8 GPU inference (requires 12GB), and CPU inference
  • Loads through the Hugging Face Transformers library on top of PyTorch
  • Includes 11 intermediate checkpoints from 240B to 1T tokens
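The three inference options can be sketched with the Hugging Face Transformers API as follows. This is a minimal sketch, not the maintainers' reference code. The repository id is assumed from the maintainer and model names on this page, and the dtype choices (float16 on GPU, bfloat16 on CPU) are assumptions that the text here does not state. `MODES`, `load`, and `generate` are illustrative names. The Int8 path also assumes the `bitsandbytes` and `accelerate` packages are installed.

```python
# Assumed Hugging Face repository id (maintainer name + model name).
MODEL_ID = "togethercomputer/RedPajama-INCITE-7B-Base"

# Assumed settings per inference option; dtypes are not given in the text.
MODES = {
    "gpu": {"dtype": "float16", "int8": False, "device": "cuda:0"},
    # With Int8 loading, placement is left to device_map="auto".
    "gpu-int8": {"dtype": "float16", "int8": True, "device": None},
    "cpu": {"dtype": "bfloat16", "int8": False, "device": "cpu"},
}


def load(mode="gpu"):
    """Download (if needed) and load the tokenizer and model for one mode."""
    # Heavy imports stay local so MODES can be inspected without them.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    cfg = MODES[mode]
    kwargs = {"torch_dtype": getattr(torch, cfg["dtype"])}
    if cfg["int8"]:
        # 8-bit weights via bitsandbytes; accelerate handles placement.
        kwargs["quantization_config"] = BitsAndBytesConfig(load_in_8bit=True)
        kwargs["device_map"] = "auto"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **kwargs)
    if cfg["device"] is not None:
        model = model.to(cfg["device"])
    return tokenizer, model


def generate(tokenizer, model, prompt, max_new_tokens=64):
    """Return only the newly generated continuation of `prompt`."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load("gpu-int8")` followed by `generate(tokenizer, model, "Alan Turing is")`. Because this is a base model, it continues the prompt rather than following instructions.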

Core Capabilities

  • Advanced text generation and language understanding
  • Efficient resource utilization through various optimization options
  • Comprehensive training coverage across diverse text domains
  • Flexible deployment options for different hardware configurations
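To see why the GPU options differ in memory requirements, a back-of-the-envelope estimate of weight storage is sketched below. It uses the 6.9B parameter count from this page and assumes 2 bytes per parameter for half precision and 1 byte for Int8. It counts weights only; activations, the attention cache, and framework overhead come on top, which is one plausible reason the quoted requirements (16GB and 12GB) are higher than these numbers.

```python
PARAMS = 6_900_000_000  # 6.9B parameters, as listed on this page

# Assumed storage cost per parameter for each weight format.
BYTES_PER_PARAM = {"float16": 2, "int8": 1}


def weight_gib(params, dtype):
    """Weight storage in GiB for the given format (weights only)."""
    return params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_gib(PARAMS, dtype):.2f} GiB")
```

Under these assumptions the half-precision weights take about 12.9 GiB and the Int8 weights about 6.4 GiB.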

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its extensive training process (1T tokens), collaborative development approach, and the availability of multiple intermediate checkpoints for research purposes. It also offers various deployment options to accommodate different hardware configurations.

Q: What are the recommended use cases?

The model is primarily designed for language modeling tasks, including text generation and analysis. It's particularly suitable for research applications and production environments where flexible deployment options are needed. However, it should not be used for generating harmful content or making critical decisions affecting individuals.
