RWKV7-Goose-World3-2.9B-HF

Maintained By
RWKV

RWKV7-Goose-World3-2.9B-HF

PropertyValue
Parameter Count2.9B
LicenseApache-2.0
TokenizerRWKV World tokenizer (65,536 vocab)
Training Tokens3.119 trillion
Final Loss1.8745

What is RWKV7-Goose-World3-2.9B-HF?

RWKV7-Goose-World3-2.9B-HF is an advanced language model developed by the RWKV Project under the LF AI & Data Foundation. It represents a significant evolution in the RWKV architecture, implementing flash-linear attention format for improved efficiency and performance. The model was trained on a massive dataset of 3.119 trillion tokens using World v3 data.

Implementation Details

The model utilizes a sophisticated training regime with bfloat16 precision and employs a delayed cosine decay learning rate schedule ranging from 4e-4 to 1e-5, combined with a weight decay of 0.1. The implementation leverages the flash-linear-attention framework and requires the latest version of the transformers library (>=4.48.0) for optimal performance.

  • Flash-linear attention architecture for efficient processing
  • Custom RWKV World tokenizer with 65,536 vocabulary size
  • Optimized for English language tasks
  • Implements advanced training techniques with varying batch sizes

Core Capabilities

  • Large-scale text generation and completion
  • Efficient processing with flash-linear attention
  • Seamless integration with HuggingFace transformers library
  • Support for chat-template formatting and generation

Frequently Asked Questions

Q: What makes this model unique?

The model combines the innovative RWKV7 architecture with flash-linear attention, providing efficient processing while maintaining high performance. Its training on 3.119 trillion tokens and custom World tokenizer makes it particularly effective for English language tasks.

Q: What are the recommended use cases?

The model is well-suited for text generation, completion tasks, and chatbot applications. It can be easily integrated into existing pipelines using the HuggingFace transformers library and supports sophisticated chat templating.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.