DeepSeek-Prover-V1.5-RL
Property | Value |
---|---|
Parameter Count | 6.91B |
Model Type | Theorem Proving LLM |
License | DeepSeek License |
Paper | arXiv:2408.08152 |
Tensor Type | BF16 |
What is DeepSeek-Prover-V1.5-RL?
DeepSeek-Prover-V1.5-RL is an advanced language model specifically designed for theorem proving in Lean 4. It represents a significant evolution from its predecessor, incorporating reinforcement learning from proof assistant feedback (RLPAF) and a novel Monte-Carlo tree search variant called RMaxTS. The model achieves state-of-the-art performance on key benchmarks, demonstrating 63.5% accuracy on miniF2F and 25.3% on ProofNet.
Implementation Details
The model builds upon DeepSeekMath-Base and undergoes a sophisticated training process including supervised fine-tuning and reinforcement learning. It introduces RMaxTS, an innovative approach that uses intrinsic-reward-driven exploration to generate diverse proof paths, moving beyond simple single-pass proof generation.
- Pre-trained on DeepSeekMath-Base with formal mathematical language specialization
- Enhanced through supervised fine-tuning using an improved theorem proving dataset
- Optimized via reinforcement learning with proof assistant feedback
- Implements RMaxTS for advanced proof path exploration
Core Capabilities
- State-of-the-art performance on high school level miniF2F benchmark (63.5%)
- Superior results on undergraduate level ProofNet benchmark (25.3%)
- Advanced theorem proving in Lean 4
- Diverse proof path generation using Monte-Carlo tree search
Frequently Asked Questions
Q: What makes this model unique?
The model's unique combination of reinforcement learning from proof assistant feedback and RMaxTS-based proof exploration sets it apart, enabling significantly improved theorem-proving capabilities compared to previous approaches.
Q: What are the recommended use cases?
This model is specifically designed for formal theorem proving in Lean 4, making it ideal for mathematical proof verification, academic research, and educational applications in formal mathematics.