DeepSeek-V2-Lite

Maintained by: deepseek-ai

Total Parameters: 15.7B
Active Parameters: 2.4B
Context Length: 32K tokens
License: DeepSeek Model License
Paper: arXiv:2405.04434

What is DeepSeek-V2-Lite?

DeepSeek-V2-Lite is a Mixture-of-Experts (MoE) language model built for efficient deployment. Trained from scratch on 5.7T tokens, it activates only 2.4B of its 15.7B parameters per token and can be served on a single 40GB GPU. The model combines Multi-head Latent Attention (MLA) with the DeepSeekMoE architecture to deliver strong performance at economical inference cost.
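The single-GPU claim translates into a short Hugging Face transformers quickstart. The sketch below is a minimal, unofficial version that assumes the deepseek-ai/DeepSeek-V2-Lite checkpoint on the Hub; trust_remote_code=True is needed because the repository ships custom DeepSeek-V2 modeling code.

```python
# Minimal sketch: load DeepSeek-V2-Lite in BF16 and generate text.
# Assumes the deepseek-ai/DeepSeek-V2-Lite checkpoint on the Hugging Face Hub
# and a single GPU with roughly 40GB of memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-V2-Lite"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,   # BF16 weights, per the model card
    trust_remote_code=True,       # custom DeepSeek-V2 modeling code
    device_map="auto",
)

inputs = tokenizer("Mixture-of-Experts models work by", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```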

Implementation Details

The model has 27 layers with a hidden dimension of 2048 and 16 attention heads. Notable architectural choices include a KV compression dimension of 512 for MLA, and MoE layers built from 2 shared experts plus 64 routed experts, with 6 routed experts activated per token (a toy sketch of this routing scheme follows the list below).

  • Efficient inference through MLA architecture
  • DeepSeekMoE implementation for optimal resource utilization
  • BF16 format for balanced precision and performance
  • 32k context length support
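To make the expert arithmetic concrete, here is a toy, self-contained PyTorch sketch of DeepSeekMoE-style routing: 2 shared experts that process every token, plus top-6 selection from 64 routed experts. The expert width and gating details are illustrative assumptions, not the model's actual implementation, and training-time pieces such as load-balancing losses are omitted.

```python
# Toy DeepSeekMoE-style routing: 2 always-on shared experts plus top-6 of 64
# routed experts. The counts come from the card; expert width and gating
# details are illustrative assumptions.
import torch
import torch.nn.functional as F

HIDDEN, N_ROUTED, N_SHARED, TOP_K = 2048, 64, 2, 6

def make_expert(width=256):
    # Hypothetical toy FFN expert, kept small so the sketch runs quickly.
    return torch.nn.Sequential(
        torch.nn.Linear(HIDDEN, width),
        torch.nn.ReLU(),
        torch.nn.Linear(width, HIDDEN),
    )

def moe_forward(x, gate_w, shared, routed):
    """x: (batch, HIDDEN). Returns (batch, HIDDEN)."""
    scores = F.softmax(x @ gate_w, dim=-1)               # token-to-expert affinities
    topk_scores, topk_idx = scores.topk(TOP_K, dim=-1)   # choose 6 of 64 experts
    shared_out = sum(e(x) for e in shared)               # shared experts see every token
    routed_out = torch.stack([
        sum(s * routed[i](x[b]) for s, i in zip(topk_scores[b], topk_idx[b]))
        for b in range(x.size(0))
    ])
    return shared_out + routed_out

shared = [make_expert() for _ in range(N_SHARED)]
routed = [make_expert() for _ in range(N_ROUTED)]
gate_w = torch.randn(HIDDEN, N_ROUTED) * 0.02

tokens = torch.randn(4, HIDDEN)
print(moe_forward(tokens, gate_w, shared, routed).shape)  # torch.Size([4, 2048])
```

The shared experts give every token a common pathway, while the router spreads the remaining capacity across specialized experts; this is why only a small fraction of the 15.7B parameters is exercised per token.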

Core Capabilities

  • Strong performance on English benchmarks (MMLU: 58.3, BBH: 44.1)
  • Excellence in Chinese tasks (C-Eval: 60.3, CMMLU: 64.3)
  • Robust coding capabilities (HumanEval: 29.9, MBPP: 43.2)
  • Advanced mathematical reasoning (GSM8K: 41.1, MATH: 17.1)

Frequently Asked Questions

Q: What makes this model unique?

DeepSeek-V2-Lite's uniqueness lies in its efficient MoE architecture that achieves high performance with only 2.4B active parameters, making it deployable on a single GPU while outperforming larger dense models.
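As a rough back-of-envelope check on the single-GPU claim (our arithmetic, not the card's): BF16 stores 2 bytes per parameter, so the full weights fit with headroom for activations and the KV cache, which MLA's 512-dimensional KV compression keeps small.

```python
# Back-of-envelope weight-memory estimate (assumes 2 bytes/param for BF16).
total_params = 15.7e9
weight_gb = total_params * 2 / 1e9
print(f"~{weight_gb:.1f} GB of weights")  # ~31.4 GB, within a 40GB GPU
```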

Q: What are the recommended use cases?

The model performs well across multilingual text processing, coding tasks, and mathematical reasoning. It is particularly well suited to deployments where resource efficiency is critical, since only 2.4B of its 15.7B parameters are active for any given token.
