OPT-6.7B

Developer: Meta AI (Facebook)
Paper: Open Pre-trained Transformer Language Models
License: Other (Research Only)
Training Data: 180B tokens (~800GB)
Architecture: Decoder-only Transformer

What is OPT-6.7B?

OPT-6.7B is part of Meta AI's Open Pre-trained Transformer (OPT) series, released to democratize access to large language models. This 6.7-billion-parameter model was trained to roughly match the performance of comparably sized GPT-3 models while promoting transparent and responsible AI research.

Implementation Details

The model implements a decoder-only transformer architecture trained with a causal language modeling objective. It uses the GPT-2 byte-level BPE tokenizer with a 50,272-token vocabulary and a maximum sequence length of 2,048 tokens. The training corpus combines diverse sources, including BookCorpus, CC-Stories, The Pile, Pushshift.io Reddit data, and CCNewsV2.

  • Trained on 180B tokens across multiple datasets
  • Uses half-precision (float16) for efficient inference
  • Supports both deterministic (greedy) and sampling-based text generation, as shown in the sketch after this list
  • Requires GPU acceleration for optimal performance
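
To make the loading and generation details concrete, here is a minimal sketch using Hugging Face Transformers. The facebook/opt-6.7b checkpoint ID is the public Hugging Face release; the prompt and generation parameters are illustrative, and a CUDA GPU with roughly 16 GB of memory is assumed for float16 inference.

```python
# Minimal sketch: load OPT-6.7B in float16 and compare decoding modes.
# Assumes a CUDA GPU with ~16 GB memory; prompt and parameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-6.7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision for efficient inference
).to("cuda")

prompt = "Open-source language models matter because"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Deterministic (greedy) decoding: reproducible, picks the argmax token each step.
greedy = model.generate(**inputs, max_new_tokens=40, do_sample=False)

# Sampling-based decoding: nucleus (top-p) sampling for more varied output.
sampled = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)

print(tokenizer.decode(greedy[0], skip_special_tokens=True))
print(tokenizer.decode(sampled[0], skip_special_tokens=True))
```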

Core Capabilities

  • Text generation and completion
  • Zero-shot and few-shot learning (see the few-shot sketch after this list)
  • Language understanding and processing
  • Prompt-based task solving
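
As an illustration of the few-shot and prompt-based capabilities above, the sketch below frames sentiment classification as text completion, reusing the model and tokenizer from the loading example. The prompt format and labels are illustrative, not an official OPT recipe.

```python
# Hedged sketch: few-shot sentiment classification via prompt completion.
# Reuses `model` and `tokenizer` from the loading sketch above.
few_shot_prompt = (
    "Review: The movie was fantastic.\nSentiment: positive\n\n"
    "Review: I wasted two hours of my life.\nSentiment: negative\n\n"
    "Review: A delightful surprise from start to finish.\nSentiment:"
)
inputs = tokenizer(few_shot_prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=3, do_sample=False)

# Decode only the newly generated tokens (the predicted label).
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```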

Frequently Asked Questions

Q: What makes this model unique?

OPT-6.7B stands out for its open-source nature and focus on research accessibility, while matching the capabilities of similar-sized proprietary models. It's designed for responsible AI research and comes with comprehensive documentation about its limitations and biases.

Q: What are the recommended use cases?

The model is best suited for research purposes, text generation tasks, and fine-tuning for specific applications; a minimal fine-tuning sketch follows. It's particularly useful for studying model behavior, analyzing bias, and developing more robust AI systems.
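
For the fine-tuning use case, here is a hedged sketch built on the Hugging Face Trainer. The wikitext-2 dataset stands in for your own corpus and every hyperparameter is a placeholder; in practice, full fine-tuning of a 6.7B-parameter model generally requires multiple GPUs or parameter-efficient methods such as LoRA.

```python
# Hedged sketch: causal-LM fine-tuning of OPT-6.7B with the Hugging Face Trainer.
# Dataset, hyperparameters, and output path are placeholders; full fine-tuning
# at this scale typically needs multiple GPUs or parameter-efficient methods.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "facebook/opt-6.7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# wikitext-2 as a stand-in corpus; truncate well below the 2,048-token limit
# to keep per-step memory manageable.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="opt-6.7b-finetuned",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,  # compensates for the tiny batch size
        num_train_epochs=1,
        fp16=True,  # mixed-precision training
        logging_steps=50,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```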
