DeepSeek-Prover-V1.5-RL
| Property | Value |
|---|---|
| Parameter Count | 6.91B |
| Model Type | Theorem Proving LLM |
| License | DeepSeek License |
| Paper | arXiv:2408.08152 |
| Tensor Type | BF16 |
What is DeepSeek-Prover-V1.5-RL?
DeepSeek-Prover-V1.5-RL is a language model for theorem proving in Lean 4. It is built on DeepSeekMath-Base and combines reinforcement learning with a variant of Monte-Carlo tree search, which the authors report gave state-of-the-art results on the miniF2F and ProofNet benchmarks.
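To make the task concrete, here is a toy Lean 4 example of what a prover of this kind works with. It is an illustration only, not taken from miniF2F or ProofNet, and the theorem names are made up. The first declaration is a statement with its proof left open; the second is a completed proof that Lean 4 accepts.

```lean
-- Input: a formal statement whose proof is still missing.
theorem toy_add_comm (a b : Nat) : a + b = b + a := by
  sorry

-- Output: the same statement with a proof that the Lean 4 checker accepts.
theorem toy_add_comm' (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```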
Implementation Details
The model is trained with reinforcement learning from proof assistant feedback (RLPAF) and is paired at inference time with RMaxTS, a variant of Monte-Carlo tree search that uses intrinsic rewards to explore diverse proof paths. Together these improve on its predecessor, particularly on harder proofs. The main components are:
- Specialized pre-training on mathematical formal languages
- Enhanced supervised fine-tuning using an improved theorem proving dataset
- Implementation of RMaxTS for diverse proof path generation
- Integration of proof assistant feedback for reinforcement learning
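The proof assistant feedback in the last item can be sketched as a binary reward: a candidate proof scores 1 if the checker accepts it and 0 otherwise. The snippet below is a minimal sketch under stated assumptions. The `verify` callable is a hypothetical stand-in for a real Lean 4 check, and the group-relative normalization is one common way to turn such rewards into a training signal, not a detail confirmed by this page.

```python
from typing import Callable, List


def proof_rewards(candidates: List[str], verify: Callable[[str], bool]) -> List[float]:
    """Binary reward: 1.0 if the (hypothetical) verifier accepts the proof, else 0.0."""
    return [1.0 if verify(c) else 0.0 for c in candidates]


def group_relative_advantages(rewards: List[float]) -> List[float]:
    """Normalize rewards within one group of samples for the same theorem.

    Assumption: a mean/std baseline over the group. If every sample has the
    same reward, there is no signal and all advantages are zero.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    if std == 0.0:
        return [0.0] * n
    return [(r - mean) / std for r in rewards]


if __name__ == "__main__":
    # Fake verifier for illustration: accepts any candidate without `sorry`.
    fake_verify = lambda proof: "sorry" not in proof
    samples = ["exact Nat.add_comm a b", "sorry", "simp; sorry", "omega"]
    rewards = proof_rewards(samples, fake_verify)
    print(rewards, group_relative_advantages(rewards))
```

A sparse 0/1 reward like this only carries signal when a group contains both accepted and rejected proofs, which is why the all-equal case returns zeros.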
Core Capabilities
- Achieves a 63.5% pass rate on the miniF2F test set (high-school level)
- Achieves a 25.3% pass rate on ProofNet (undergraduate level)
- Generates diverse proof paths through intrinsic-reward-driven exploration
- Supports complex mathematical reasoning and formal proof generation
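The intrinsic-reward-driven exploration mentioned above can be illustrated with a small sketch. It assumes a simplified rule: a search step earns a reward of 1 when it adds at least one new node to the tree and 0 otherwise. Representing nodes as tuples of tactic strings and the function name `insert_path` are choices made for this illustration; the actual search operates on proof states and includes selection and backpropagation steps that are omitted here.

```python
from typing import Sequence, Set, Tuple

# A node is identified by the tactic prefix that leads to it (an assumption
# made for this sketch).
Node = Tuple[str, ...]


def insert_path(tree: Set[Node], tactics: Sequence[str]) -> float:
    """Insert every prefix of a tactic sequence into the tree.

    Returns the intrinsic reward: 1.0 if at least one prefix was new,
    0.0 if the whole path had already been explored.
    """
    added = False
    for i in range(1, len(tactics) + 1):
        prefix = tuple(tactics[:i])
        if prefix not in tree:
            tree.add(prefix)
            added = True
    return 1.0 if added else 0.0


if __name__ == "__main__":
    tree: Set[Node] = set()
    print(insert_path(tree, ["intro x", "simp"]))  # new path
    print(insert_path(tree, ["intro x", "simp"]))  # same path again
    print(insert_path(tree, ["intro x", "ring"]))  # shares a prefix, new branch
```

Because repeating an already explored path earns nothing, a search guided by this reward is pushed toward branches it has not tried, which is the source of the diversity in proof paths.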
Frequently Asked Questions
Q: What makes this model unique?
The model combines reinforcement learning from proof assistant feedback with RMaxTS, a variant of Monte-Carlo tree search, both tailored to formal theorem proving. RMaxTS uses intrinsic rewards to steer the search toward proof paths it has not yet explored.
Q: What are the recommended use cases?
The model is designed for generating formal proofs in Lean 4. Typical uses are automated theorem proving, assistance in mathematical research, and teaching formal mathematics.
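For whole-proof generation, a prompt is typically a Lean 4 file prefix that ends where the proof should begin. The helper below is a minimal sketch of building such a prompt. The instruction wording, the `lean4` fence tag, and the default `import Mathlib` header are assumptions made for illustration, not a format confirmed by this page, and `build_prompt` is a hypothetical name.

```python
def build_prompt(formal_statement: str, header: str = "import Mathlib\n\n") -> str:
    """Assemble a completion-style prompt ending at the start of the proof.

    Assumption: the model continues an open Lean 4 code fence. The fence is
    built from single backticks so this snippet stays easy to embed in docs.
    """
    fence = "`" * 3
    return (
        "Complete the following Lean 4 code:\n\n"
        f"{fence}lean4\n"
        f"{header}{formal_statement}"
    )


if __name__ == "__main__":
    statement = "theorem toy_add_comm (a b : Nat) : a + b = b + a := by\n"
    print(build_prompt(statement))
```

The returned string can be passed to whichever generation API is in use; the generated continuation should then be checked by Lean 4 before it is treated as a proof.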