Baichuan2-13B-Chat

Maintained by: baichuan-inc

Property        Value
License         Apache 2.0 with additional terms
Training Data   2.6 trillion tokens
Languages       Chinese, English
Framework       PyTorch 2.0

What is Baichuan2-13B-Chat?

Baichuan2-13B-Chat is a 13-billion-parameter chat model developed by Baichuan Intelligence. It belongs to the second-generation (v2.0) release, which improves on the previous generation in bilingual capability, mathematical reasoning, and complex instruction following. The model is trained on a high-quality corpus of 2.6 trillion tokens and targets both Chinese and English; its reported benchmark scores are listed under Core Capabilities.

Implementation Details

The model uses PyTorch 2.0's F.scaled_dot_product_attention (torch.nn.functional.scaled_dot_product_attention) to speed up inference, which is why a PyTorch 2.0 environment is required. It is available in full precision and as a 4-bit quantized version, so it can be deployed on hardware with different memory budgets. It can be used for academic research and, subject to the license terms, for commercial applications.
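To make the attention call less of a black box, the following is a minimal pure-Python sketch of the computation that scaled dot-product attention performs: softmax(Q K^T / sqrt(d)) V. It is a reference for the math only, not the model's code. It omits masks, attention biases, dropout, batching, and multiple heads, and the function and variable names are illustrative.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(q, k, v):
    """Reference attention for one head.

    q: list of n_q query vectors, each of length d
    k: list of n_k key vectors, each of length d
    v: list of n_k value vectors, each of length d_v
    Returns n_q output vectors of length d_v.
    """
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    outputs = []
    for q_row in q:
        # Similarity of this query to every key, scaled by 1/sqrt(d).
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append(
            [sum(w * v_row[j] for w, v_row in zip(weights, v)) for j in range(len(v[0]))]
        )
    return outputs


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[10.0, 0.0], [0.0, 10.0]]
    print(scaled_dot_product_attention(q, k, v))
```

The fused PyTorch function computes the same quantity on tensors, typically with far less memory traffic than this explicit loop, which is where the inference speedup comes from.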

  • Supports context-aware chat interactions
  • Implements efficient token processing
  • Offers both full precision and 4-bit quantized versions
  • Requires a PyTorch 2.0 environment
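As a rough guide to choosing between the full-precision and 4-bit versions, the sketch below estimates the memory needed for the weights alone. It rests on assumptions that are not stated in this page: the parameter count is taken as a nominal 13 billion from the model name, and "full precision" is shown at both 32 and 16 bits per parameter because the page does not say which applies. Real quantized checkpoints also store scaling factors and may keep some tensors in higher precision, and activations and the KV cache add to the total, so treat the numbers as lower bounds.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


# Assumption: nominal parameter count inferred from the "13B" in the model name.
NUM_PARAMS = 13e9

if __name__ == "__main__":
    for label, bits in [("32-bit", 32), ("16-bit", 16), ("4-bit", 4)]:
        print(f"{label:>6}: about {weight_memory_gib(NUM_PARAMS, bits):.1f} GiB for weights")
```

Whatever the exact parameter count, 4-bit storage needs one quarter of the weight memory of 16-bit storage, which is what makes the quantized version practical on a single consumer GPU.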

Core Capabilities

  • Strong performance on C-Eval (58.10%), MMLU (59.17%), and CMMLU (61.97%)
  • Enhanced mathematical and logical reasoning abilities
  • Sophisticated instruction following
  • Bilingual proficiency in Chinese and English
  • Support for long-context understanding (up to 192K tokens)

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its balanced performance in Chinese and English, with reported scores of 58.10% on C-Eval, 59.17% on MMLU, and 61.97% on CMMLU. The v2.0 release is particularly notable for its improved mathematical reasoning and instruction following.

Q: What are the recommended use cases?

The model suits general conversation, academic research, and, with proper licensing, commercial products. It is strongest on tasks that require bilingual understanding, mathematical reasoning, or complex instruction following.
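For the general-conversation use case, a multi-turn exchange might be sketched as follows. Several things here are assumptions rather than facts from this page: the Hugging Face repository id is inferred from the maintainer and model name, the chat(tokenizer, messages) method is assumed to be provided by the repository's remote code, and the helper names add_turn, chat_once, and load_model are hypothetical. Check the repository before relying on any of them.

```python
# Assumption: repository id inferred from the maintainer and model name.
MODEL_ID = "baichuan-inc/Baichuan2-13B-Chat"


def add_turn(history, role, content):
    """Return a new message list with one more turn appended."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unexpected role: {role!r}")
    return history + [{"role": role, "content": content}]


def chat_once(model, tokenizer, history, user_text):
    """Send one user turn; return (reply, updated_history).

    Passing the full history each time is what makes the chat context-aware.
    """
    history = add_turn(history, "user", user_text)
    # Assumption: chat() comes from the repository's remote code.
    reply = model.chat(tokenizer, history)
    return reply, add_turn(history, "assistant", reply)


def load_model():
    """Load tokenizer and model. Imports are deferred because they are heavy."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        MODEL_ID, use_fast=False, trust_remote_code=True
    )
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",  # needs the accelerate package installed
        trust_remote_code=True,
    )
    return model, tokenizer
```

A session would call load_model() once, then call chat_once repeatedly, feeding the returned history back in so that each reply can draw on earlier turns. Note that trust_remote_code=True executes code from the model repository, so it should only be used with a source you trust.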
