bloom-3b

Maintained by: bigscience

BLOOM-3B Language Model

Property              Value
Parameter Count       3 Billion
Model Type            Decoder-only Transformer
License               RAIL-1.0
Supported Languages   46 Natural + 13 Programming
Architecture          Modified Megatron-LM GPT2

What is BLOOM-3B?

BLOOM-3B is a multilingual language model developed by BigScience as part of its open-source, open-science effort. It was trained on a corpus spanning 46 natural languages and 13 programming languages, which gives it unusually broad language coverage for a model of its size. The model uses FP16 precision, ALiBi positional encodings, and stable embeddings (layer normalization applied to the word embeddings).
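To make the positional-encoding idea concrete, here is a minimal sketch of how ALiBi biases can be computed. It follows the general formulation from the ALiBi paper, not any specific BLOOM code path: each head gets a fixed slope, and attention scores are penalized in proportion to the distance between query and key. The function names `alibi_slopes` and `alibi_bias` are hypothetical, the head count of 32 is the figure reported for this model, and folding the causal mask into the bias is a simplification made for illustration.

```python
def alibi_slopes(num_heads: int) -> list[float]:
    """Per-head slopes for a power-of-two head count: 2^(-8/n), 2^(-16/n), ..."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    return [2.0 ** (-8.0 * (i + 1) / num_heads) for i in range(num_heads)]


def alibi_bias(slope: float, seq_len: int) -> list[list[float]]:
    """Bias added to one head's attention scores before the softmax.

    Visible keys (key_pos <= query_pos) get -slope * distance.
    Future keys get -inf, which folds the causal mask into the bias
    (an illustrative simplification).
    """
    return [
        [-slope * (q - k) if k <= q else float("-inf") for k in range(seq_len)]
        for q in range(seq_len)
    ]


slopes = alibi_slopes(32)                 # 32 attention heads
bias = alibi_bias(slopes[0], seq_len=4)   # 4x4 bias matrix for the first head
```

Because the penalty depends only on relative distance, no learned position embeddings are needed, which is the property that motivates ALiBi in the first place.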

Implementation Details

The model architecture is built on a modified version of Megatron-LM GPT2, featuring 30 layers and 32 attention heads. It incorporates layer normalization in the word embeddings layer and uses GeLU activation functions. The model processes sequences up to 2048 tokens and was trained using cross-entropy loss with mean reduction.

  • 3B parameters total (642M embedding parameters)
  • Hidden layers are 2560-dimensional
  • Uses ALiBi positional encodings
  • Implements stable embeddings with layer normalization
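The parameter figures above can be roughly reproduced with a back-of-the-envelope count. The sketch below is illustrative only and rests on several assumptions that this card does not state: a padded vocabulary of 250,880 tokens, a 4x feed-forward expansion, bias terms on every linear layer, and an output projection that shares weights with the input embedding.

```python
HIDDEN = 2560      # hidden size, from the list above
LAYERS = 30        # number of transformer blocks, from the description above
VOCAB = 250_880    # ASSUMPTION: padded vocabulary size, not stated in this card


def block_params(h: int) -> int:
    """Parameters in one decoder block (ASSUMES GPT-style layout with biases)."""
    attention = (3 * h * h + 3 * h) + (h * h + h)   # fused QKV + output projection
    mlp = (4 * h * h + 4 * h) + (4 * h * h + h)     # h -> 4h -> h feed-forward
    layer_norms = 2 * (2 * h)                       # two norms, weight + bias each
    return attention + mlp + layer_norms


embedding = VOCAB * HIDDEN      # word embeddings (ASSUMED tied with the output layer)
embedding_norm = 2 * HIDDEN     # layer norm applied to the word embeddings
final_norm = 2 * HIDDEN         # layer norm after the last block

total = embedding + embedding_norm + LAYERS * block_params(HIDDEN) + final_norm

print(f"embedding parameters: {embedding:,}")
print(f"total parameters:     {total:,}")
```

Under these assumptions the embedding table comes to about 642M parameters and the total to roughly 3.0B, consistent with the figures listed above. Note that the embeddings alone account for more than a fifth of the model, a direct consequence of the large multilingual vocabulary.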

Core Capabilities

  • Multilingual text generation across 46 languages
  • Code generation in 13 programming languages
  • Zero-shot task performance
  • Cross-lingual understanding and generation
  • Task-specific fine-tuning potential
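As an illustration of zero-shot use, the sketch below builds a plain-text prompt and generates a continuation with the Hugging Face `transformers` library. It assumes the checkpoint is published on the Hugging Face Hub as `bigscience/bloom-3b` and that `torch` and `transformers` are installed. The helper names `build_zero_shot_prompt` and `generate`, the prompt layout, and the decoding settings are hypothetical choices for this example, not recommendations from the model authors.

```python
MODEL_ID = "bigscience/bloom-3b"   # ASSUMPTION: Hugging Face Hub identifier
MAX_CONTEXT = 2048                 # sequence length reported for this model


def build_zero_shot_prompt(task: str, text: str) -> str:
    """Format an instruction and an input as a single prompt (hypothetical layout)."""
    return f"{task}\n\nText: {text}\nAnswer:"


def generate(prompt: str, max_new_tokens: int = 40) -> str:
    """Greedy-decode a continuation of `prompt` and return only the new text."""
    # Imported lazily so the prompt helper works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Half precision on GPU; full precision on CPU, where FP16 kernels may be missing.
    dtype = torch.float16 if device == "cuda" else torch.float32

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=dtype).to(device)

    # Truncate so that prompt plus generated tokens fit in the context window.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_CONTEXT - max_new_tokens,
    ).to(device)

    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Drop the prompt tokens and decode only what the model added.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


prompt = build_zero_shot_prompt("Translate the text into French.", "Good morning.")
```

Calling `generate(prompt)` downloads several gigabytes of weights on first use, so it is left out of the snippet itself. Since this is a base language model and not an instruction-tuned one, output quality depends heavily on how the prompt is phrased.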

Frequently Asked Questions

Q: What makes this model unique?

BLOOM-3B stands out for its broad language coverage and open-science approach. As a smaller member of the BLOOM family, it balances model size against performance, which makes it practical for both research and production use cases.

Q: What are the recommended use cases?

The model is well-suited for text generation, information extraction, question answering, and summarization tasks. However, it should not be used for high-stakes decisions or in contexts requiring factual accuracy without human oversight.
