Yi-34B

Maintained by: 01-ai

Property         Value
Parameter Count  34.4B
License          Apache 2.0
Architecture     Transformer-based (Llama architecture)
Training Data    3T tokens multilingual corpus
Context Length   200K tokens
Paper            Yi Tech Report

What is Yi-34B?

Yi-34B is an open-source large language model trained from scratch by 01.AI on a 3T-token multilingual corpus. It is a bilingual model, targeting both English and Chinese. At the time of its release in late 2023, it ranked first among open-source models on several benchmarks, including the Hugging Face Open LLM Leaderboard and C-Eval.

Implementation Details

Yi-34B uses the Llama architecture but is not derived from Llama weights: 01.AI trained it from scratch with its own data, training pipelines, and infrastructure. The model can be run in BF16 precision or with 4-bit or 8-bit quantization to reduce memory use at deployment.

  • Supports context windows up to 200K tokens (in the long-context Yi-34B-200K variant)
  • Implements efficient training pipelines
  • Provides compatibility with existing Llama ecosystem tools
  • Offers multiple deployment options including Docker and vLLM
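
To make the precision options above concrete, here is a minimal sketch of how the model might be loaded with the Hugging Face transformers library, together with a rough estimate of weight memory for each option. The repository name 01-ai/Yi-34B, the helper names estimate_weight_gb and load_yi, and the specific quantization settings are assumptions for illustration, not 01.AI's documented procedure. The estimate covers weights only and ignores activations and the KV cache.

```python
SUPPORTED_PRECISIONS = ("bf16", "8bit", "4bit")


def estimate_weight_gb(param_count: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return param_count * bits_per_param / 8 / 1e9


def load_yi(precision: str = "bf16", model_id: str = "01-ai/Yi-34B"):
    """Load tokenizer and model in the requested precision (hypothetical helper).

    Assumes torch, transformers, and accelerate are installed, plus
    bitsandbytes for the 8-bit and 4-bit paths.
    """
    if precision not in SUPPORTED_PRECISIONS:
        raise ValueError(f"precision must be one of {SUPPORTED_PRECISIONS}")

    # Imported lazily so the memory estimate above works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    kwargs = {"device_map": "auto"}  # let accelerate place layers on available devices
    if precision == "bf16":
        kwargs["torch_dtype"] = torch.bfloat16
    elif precision == "8bit":
        kwargs["quantization_config"] = BitsAndBytesConfig(load_in_8bit=True)
    else:  # "4bit"
        kwargs["quantization_config"] = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.bfloat16,
        )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, **kwargs)
    return tokenizer, model


if __name__ == "__main__":
    # 34.4B parameters, as listed in the property table.
    for name, bits in (("bf16", 16), ("8bit", 8), ("4bit", 4)):
        print(f"{name}: ~{estimate_weight_gb(34.4e9, bits):.1f} GB of weights")
```

With 34.4B parameters, the weights alone come to roughly 68.8 GB in BF16, 34.4 GB in 8-bit, and 17.2 GB in 4-bit, which is why quantization matters for single-GPU or small multi-GPU deployments.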

Core Capabilities

  • Superior performance in language understanding and reasoning tasks
  • Exceptional reading comprehension abilities
  • Strong performance in both English and Chinese evaluations
  • Advanced mathematical and coding capabilities
  • Efficient handling of long-context scenarios

Frequently Asked Questions

Q: What makes this model unique?

Yi-34B stands out for strong performance at a moderate size: it outperforms many larger models, including some with 70B parameters. It is also released under the Apache 2.0 license, which permits commercial use.

Q: What are the recommended use cases?

The model is suitable for a wide range of applications including personal, academic, and commercial use. It excels in tasks requiring strong reasoning, comprehension, and bilingual capabilities. The 200K context window makes it particularly suitable for long-document analysis and complex reasoning tasks.
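
For long-document work, it helps to check whether a tokenized document fits the context window and to fall back to overlapping windows when it does not. The following is a minimal sketch in plain Python; the limit of 200,000 tokens is an assumed reading of "200K", and the helper names fits_in_context and split_into_windows are hypothetical. Token lists would come from the model's tokenizer.

```python
from typing import Iterator, List, Sequence

CONTEXT_LENGTH = 200_000  # assumed value for the "200K" window


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """True if the prompt plus the generation budget fits in the window."""
    return prompt_tokens + max_new_tokens <= context_length


def split_into_windows(tokens: Sequence[int], window: int,
                       overlap: int = 0) -> Iterator[List[int]]:
    """Yield consecutive windows of at most `window` tokens.

    Adjacent windows share `overlap` tokens so that text near a boundary
    appears in both neighbours.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    step = window - overlap
    for start in range(0, len(tokens), step):
        yield list(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the last window already reached the end of the document


if __name__ == "__main__":
    doc = list(range(10))  # stand-in for real token ids
    for chunk in split_into_windows(doc, window=4, overlap=1):
        print(chunk)
```

Reserving room for max_new_tokens in the check matters because generated tokens share the same window as the prompt.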
