nomic-bert-2048

Maintained By
nomic-ai

  • Parameter Count: 137M
  • License: Apache 2.0
  • Tensor Type: F32
  • Max Sequence Length: 2048 tokens
  • Training Data: Wikipedia, BookCorpus

What is nomic-bert-2048?

nomic-bert-2048 is a BERT-style encoder designed to handle sequences of up to 2048 tokens, four times the 512-token context window of traditional BERT. The model incorporates modern architectural improvements while maintaining competitive performance on standard benchmarks such as GLUE.

Implementation Details

The model implements several key architectural innovations from recent research, including Rotary Position Embeddings for better handling of long contexts and SwiGLU activations for improved quality. It is trained with zero dropout and achieves results comparable to RoBERTa-base while supporting 4x longer sequences; a minimal loading sketch follows the list below.

  • Rotary Position Embeddings for context length extrapolation
  • SwiGLU activations for enhanced model performance
  • Dropout set to zero during training
  • Trained on Wikipedia and BookCorpus with 2048-token sequences
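
Because these layers are implemented in custom modeling code on the Hugging Face Hub, loading the checkpoint through transformers typically requires trust_remote_code=True. The sketch below is illustrative rather than canonical: the nomic-ai/nomic-bert-2048 repository id and the bert-base-uncased tokenizer pairing are assumptions drawn from the upstream model card.

```python
# Minimal loading sketch (assumed: the nomic-ai/nomic-bert-2048 Hub repo id and
# the bert-base-uncased tokenizer pairing; trust_remote_code pulls in the custom
# rotary/SwiGLU modeling code shipped with the checkpoint).
from transformers import AutoConfig, AutoModelForMaskedLM, AutoTokenizer

model_id = "nomic-ai/nomic-bert-2048"

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained(
    model_id, config=config, trust_remote_code=True
)
model.eval()
```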

Core Capabilities

  • Masked Language Modeling with extended context (see the example after this list)
  • Sequence Classification tasks
  • Strong performance on GLUE benchmark (0.84 average score)
  • Efficient handling of long-form text up to 2048 tokens
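
As a minimal illustration of the masked language modeling capability, the snippet below continues the loading sketch above (reusing the tokenizer and model objects) and predicts a single [MASK] token; the example sentence is arbitrary.

```python
# Continues the loading sketch above: `tokenizer` and `model` are reused.
import torch

text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Take the highest-scoring token at the [MASK] position.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
print(tokenizer.decode(logits[0, mask_pos].argmax(dim=-1)))
```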

Frequently Asked Questions

Q: What makes this model unique?

The model's ability to handle 2048-token sequences while maintaining competitive performance on standard benchmarks sets it apart from traditional BERT models that typically handle only 512 tokens.

Q: What are the recommended use cases?

The model is particularly well suited to tasks that benefit from an extended context window, such as document-level classification, long-form text analysis, and masked language modeling over long sequences.
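
For the document classification use case, one hedged sketch is to wrap the encoder with a sequence classification head and let the tokenizer truncate at the 2048-token window. This assumes the checkpoint's custom code is exposed through AutoModelForSequenceClassification; the num_labels value is an illustrative assumption, and the classification head is randomly initialized, so it must be fine-tuned before its predictions mean anything.

```python
# Hypothetical long-document classification setup. The num_labels value is an
# illustrative assumption; the classification head is untrained.
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

model_id = "nomic-ai/nomic-bert-2048"

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
config.num_labels = 2  # e.g. a binary document-classification task
classifier = AutoModelForSequenceClassification.from_pretrained(
    model_id, config=config, trust_remote_code=True
)

long_document = "..."  # placeholder: any document up to ~2048 tokens
inputs = tokenizer(
    long_document,
    truncation=True,
    max_length=2048,  # the model's full context window
    return_tensors="pt",
)
logits = classifier(**inputs).logits
```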
