MathBERT

Author: tbs17
Downloads: 25,588
Tags: Fill-Mask, Transformers, PyTorch, BERT
Training Data: 100M tokens from pre-K to graduate math texts

What is MathBERT?

MathBERT is a BERT-based transformer model designed specifically for mathematical language understanding. It was pretrained on a diverse corpus of mathematical texts ranging from pre-K to graduate-level content, including curriculum materials from engageNY, Utah Math, and Illustrative Math, college math textbooks, and arXiv math paper abstracts.

Implementation Details

The model implements a masked language modeling (MLM) approach with WordPiece tokenization and a vocabulary of 30,522 tokens. It was trained on 8-core cloud TPUs for 600k steps with a batch size of 128, using the Adam optimizer.

  • Trained using MLM and Next Sentence Prediction objectives
  • 15% token masking during training
  • 512 token sequence length
  • Implements bidirectional representation learning
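
As a concrete illustration of the MLM objective, the sketch below loads the model through the Hugging Face transformers library and asks it to fill a masked token in a short math sentence. The checkpoint ID tbs17/MathBERT is an assumption based on the author handle; substitute the actual ID if it differs.

    # Minimal masked-token prediction sketch (assumes the checkpoint ID "tbs17/MathBERT").
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="tbs17/MathBERT")

    # BERT's [MASK] token marks the position to predict; the pipeline returns scored candidates.
    for prediction in fill_mask("The derivative of x^2 with respect to x is [MASK]."):
        print(f"{prediction['token_str']:>12}  score={prediction['score']:.3f}")

Since this is a pretrained checkpoint rather than a task-specific one, masked-token prediction is what it supports out of the box; classification or question answering requires fine-tuning, as described below.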

Core Capabilities

  • Mathematical text understanding and processing
  • Masked language modeling for math-specific content
  • Math-domain-specific predictions that avoid the general-language biases of vanilla BERT
  • Suitable for fine-tuning on downstream math-related tasks (a fine-tuning sketch follows this list)
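
To make the fine-tuning path concrete, the sketch below wraps the pretrained encoder in a sequence-classification head via the Hugging Face transformers Trainer API. The checkpoint ID tbs17/MathBERT, the two-label setup, and the toy dataset are assumptions for illustration only; substitute your own labelled math corpus.

    # Hypothetical fine-tuning sketch: sequence classification on labelled math text.
    from datasets import Dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              DataCollatorWithPadding, Trainer, TrainingArguments)

    model_id = "tbs17/MathBERT"  # assumed Hugging Face checkpoint ID
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

    # Toy labelled data standing in for a real corpus (1 = algebra, 0 = geometry).
    data = Dataset.from_dict({
        "text": ["Solve for x: 2x + 3 = 7.",
                 "The interior angles of a triangle sum to 180 degrees."],
        "label": [1, 0],
    })

    def tokenize(batch):
        # Truncate to the 512-token limit used during pretraining.
        return tokenizer(batch["text"], truncation=True, max_length=512)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="mathbert-finetuned", num_train_epochs=1),
        train_dataset=data.map(tokenize, batched=True),
        data_collator=DataCollatorWithPadding(tokenizer),
    )
    trainer.train()

The same pattern applies to token classification or question answering by swapping in the corresponding head, e.g. AutoModelForTokenClassification or AutoModelForQuestionAnswering.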

Frequently Asked Questions

Q: What makes this model unique?

MathBERT's uniqueness lies in its specialized training on mathematical content, making it particularly effective for math-related language tasks while avoiding general language biases found in traditional BERT models.

Q: What are the recommended use cases?

The model is best suited for mathematical text analysis, sequence classification, token classification, and question answering in mathematical contexts. It's particularly effective for tasks that require understanding of mathematical language and concepts.
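
For the text-analysis use case, a common pattern is to use the pretrained encoder as a frozen feature extractor and pool its hidden states into sentence embeddings. The sketch below is a minimal example of that pattern; the checkpoint ID tbs17/MathBERT and the mean-pooling choice are assumptions, not part of the original release.

    # Sketch: MathBERT as a frozen encoder producing sentence embeddings for analysis.
    import torch
    from transformers import AutoModel, AutoTokenizer

    model_id = "tbs17/MathBERT"  # assumed checkpoint ID
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    sentences = ["A prime number has exactly two positive divisors.",
                 "The integral of 1/x is ln|x| + C."]

    with torch.no_grad():
        encoded = tokenizer(sentences, padding=True, truncation=True,
                            max_length=512, return_tensors="pt")
        hidden = model(**encoded).last_hidden_state       # (batch, seq_len, hidden_size)
        mask = encoded["attention_mask"].unsqueeze(-1)    # zero out padding positions
        embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)

    # `embeddings` can feed clustering, similarity search, or a lightweight classifier.
    print(embeddings.shape)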
