OPT-1.3B

  • Developer: Meta AI (Facebook)
  • License: Custom (Other)
  • Paper: Open Pre-trained Transformer Language Models
  • Training Data: 180B tokens (800GB)

What is opt-1.3b?

OPT-1.3B is part of Meta AI's Open Pre-trained Transformer (OPT) series, designed to democratize access to large language models. This 1.3 billion parameter model implements a decoder-only architecture similar to GPT-3, trained on a diverse dataset including BookCorpus, CC-Stories, and filtered content from The Pile.

Implementation Details

The model uses a GPT-2-style byte-level BPE tokenizer with a vocabulary of 50,272 tokens, processes sequences of up to 2048 tokens, and was trained with a causal language modeling objective. Training was carried out on NVIDIA A100 GPUs.

  • Decoder-only transformer architecture
  • Pre-trained on 800GB of filtered text data
  • Supports both deterministic and top-k sampling generation (see the sketch after this list)
  • Trained with an emphasis on compute efficiency
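
As a rough illustration of these points, the snippet below loads the checkpoint from the Hugging Face Hub and contrasts deterministic (greedy) decoding with top-k sampling. It is a minimal sketch: the prompt, the float16 dtype, and the decoding parameters are illustrative choices, not settings prescribed by the model card.

```python
# Minimal sketch: load OPT-1.3B and compare greedy decoding with top-k sampling.
# Assumes the `torch` and `transformers` packages are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b", torch_dtype=torch.float16  # illustrative dtype choice
)
model.eval()

prompt = "Hello, I am conscious and"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")

# Deterministic (greedy) generation: always picks the highest-probability token.
greedy_ids = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(greedy_ids[0], skip_special_tokens=True))

# Top-k sampling: samples from the k most likely tokens at each step.
torch.manual_seed(0)  # fixed seed so the sampled output is reproducible
sampled_ids = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_k=50)
print(tokenizer.decode(sampled_ids[0], skip_special_tokens=True))
```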

Core Capabilities

  • Text generation and completion
  • Zero-shot and few-shot learning
  • Language understanding and processing
  • Custom prompt-based tasks
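
As a sketch of the zero-/few-shot prompting style these capabilities refer to, the snippet below builds a small in-context translation prompt and runs it through the text-generation pipeline. The prompt content and decoding settings are illustrative, not taken from the OPT paper.

```python
# Hypothetical few-shot prompt run through the text-generation pipeline.
from transformers import pipeline, set_seed

set_seed(42)  # make the run reproducible
generator = pipeline("text-generation", model="facebook/opt-1.3b")

# A few in-context examples followed by the query the model should complete.
few_shot_prompt = (
    "Translate English to French.\n"
    "English: cheese\nFrench: fromage\n"
    "English: bread\nFrench: pain\n"
    "English: apple\nFrench:"
)
output = generator(few_shot_prompt, max_new_tokens=5, do_sample=False)
print(output[0]["generated_text"])
```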

Frequently Asked Questions

Q: What makes this model unique?

OPT-1.3B stands out for its openly released weights and Meta AI's stated commitment to responsible AI research. It offers capabilities broadly comparable to similarly sized GPT-3-class models while being fully accessible to researchers studying bias, toxicity, and robustness in language models.

Q: What are the recommended use cases?

The model is best suited for text generation tasks, research purposes, and fine-tuning for specific downstream applications. It can be used directly with the Transformers pipeline for text generation or fine-tuned using the causal language modeling approach.
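
For the fine-tuning path mentioned above, the following is a hedged sketch of causal language modeling fine-tuning with the Transformers Trainer API. The dataset, sequence length, and hyperparameters are placeholders chosen for illustration, not recommendations from Meta.

```python
# Sketch: fine-tune OPT-1.3B with a causal language modeling objective.
# Assumes the `datasets` and `transformers` packages are installed.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "facebook/opt-1.3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder corpus; substitute your own text dataset.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")

def tokenize(batch):
    # Truncation length is an illustrative choice, not a model requirement.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

# mlm=False selects the causal (next-token prediction) objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="opt-1.3b-finetuned",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    fp16=True,
    logging_steps=50,
)

trainer = Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator)
trainer.train()
```

The 1.3B parameter count keeps this feasible on a single modern GPU when combined with gradient accumulation and mixed precision, which is why the batch size above is kept small.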
