bert-base-uncased-contracts

Maintained By
nlpaueb

Property        Value
Parameters      110M
License         cc-by-sa-4.0
Architecture    BERT (12-layer, 768-hidden, 12-heads)
Training Data   76,366 US contracts from EDGAR

What is bert-base-uncased-contracts?

bert-base-uncased-contracts is a BERT model from the LEGAL-BERT family, pre-trained specifically on legal contracts. It was developed by the Natural Language Processing Group at AUEB (Athens University of Economics and Business) and trained on a corpus of 76,366 US contracts from the SEC's EDGAR database. It targets domain-specific language understanding for legal documents, where general-purpose vocabulary and phrasing are a poor match.

Implementation Details

The model follows the BERT-BASE architecture: 12 transformer layers, a hidden size of 768, and 12 attention heads. It was trained for 1 million steps on a Google Cloud TPU v3-8, with batches of 256 sequences of length 512. Pre-training otherwise followed the original BERT setup, with an initial learning rate of 1e-4.

  • Pre-trained on 76,366 US contracts from the EDGAR database
  • Maintains BERT-BASE architecture specifications
  • Optimized for legal contract understanding and analysis
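
As a rough sanity check on the figures above, the sketch below derives the parameter count from the architecture and an upper bound on the number of tokens processed during pre-training. The layer count, hidden size, step count, batch size, and sequence length come from this page; the vocabulary size (30,522), feed-forward size (3,072), position-table size (512), and the presence of a pooler layer are assumptions taken from the standard BERT-BASE uncased configuration and may not match this checkpoint exactly.

```python
def bert_param_count(
    vocab_size: int = 30522,     # ASSUMPTION: standard bert-base-uncased vocabulary
    hidden: int = 768,           # from the model description
    layers: int = 12,            # from the model description
    intermediate: int = 3072,    # ASSUMPTION: standard BERT-BASE feed-forward size
    max_positions: int = 512,    # ASSUMPTION: matches the training sequence length
    type_vocab: int = 2,         # ASSUMPTION: segment A / segment B embeddings
) -> int:
    """Count weights and biases of a BERT encoder with a pooler layer."""
    layer_norm = 2 * hidden  # scale and bias vectors
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections, followed by a LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers (expand, then contract), followed by a LayerNorm.
    feed_forward = (
        (hidden * intermediate + intermediate)
        + (intermediate * hidden + hidden)
        + layer_norm
    )
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


def max_training_tokens(steps: int, batch_size: int, seq_len: int) -> int:
    """Upper bound on tokens processed, assuming every sequence is full length."""
    return steps * batch_size * seq_len


total_params = bert_param_count()
total_tokens = max_training_tokens(steps=1_000_000, batch_size=256, seq_len=512)

print(f"{total_params:,}")  # 109,482,240
print(f"{total_tokens:,}")  # 131,072,000,000
```

Under these assumptions the count comes to about 109.5M, consistent with the 110M listed above. The token figure is an upper bound only, since padded or shorter sequences would lower it.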

Core Capabilities

  • Prediction of contract-specific terms in context
  • Language modeling of legal text, grounded in contract pre-training data
  • Masked token prediction tuned to contract terminology
  • A starting point for fine-tuning on contract-related NLP tasks
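
Masked token prediction can be tried with the Hugging Face `transformers` fill-mask pipeline. The sketch below is a minimal illustration, not the authors' own code: the Hub ID is inferred from the model name and maintainer shown on this page, and `mask_word` and `predict_masked` are hypothetical helper names introduced here.

```python
from typing import List, Tuple

# ASSUMPTION: Hub ID built from the maintainer and model name on this page.
MODEL_ID = "nlpaueb/bert-base-uncased-contracts"


def mask_word(sentence: str, word: str, mask_token: str = "[MASK]") -> str:
    """Replace the first whitespace-delimited occurrence of `word` with the mask token."""
    tokens = sentence.split()
    for i, token in enumerate(tokens):
        if token.lower() == word.lower():
            tokens[i] = mask_token
            return " ".join(tokens)
    raise ValueError(f"{word!r} not found in sentence")


def predict_masked(sentence: str, top_k: int = 5) -> List[Tuple[str, float]]:
    """Return (token, score) candidates for a sentence containing one [MASK]."""
    # Imported lazily so the text helper above works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    return [(p["token_str"], p["score"]) for p in fill_mask(sentence, top_k=top_k)]


# Hypothetical example sentence, written for illustration only.
prompt = mask_word(
    "This Employment Agreement is between the Company and the Executive .",
    "Employment",
)
```

Calling `predict_masked(prompt)` downloads the weights on first use and returns the highest-scoring fillers for the masked position. Comparing its output with the same call against a general-purpose BERT checkpoint is one simple way to see what the contract pre-training changes.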

Frequently Asked Questions

Q: What makes this model unique?

The model is pre-trained exclusively on legal contracts, which is intended to give it an advantage over general-purpose BERT models on contract-related tasks. Because its training data comes from the EDGAR database, it is best suited to the analysis of US contracts.

Q: What are the recommended use cases?

The model is suited to contract analysis, legal document processing, contract term extraction, and other NLP tasks involving legal agreements. Its main strength is handling contract-specific terminology and structure.
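
Contracts are usually much longer than the 512-token sequences the model was trained on, so contract analysis typically needs some way to split a document. The sketch below shows one common option, overlapping sliding windows over token IDs. It is an assumption about usage rather than something the model authors prescribe, and the default `[CLS]`/`[SEP]` IDs (101 and 102) are those of the standard bert-base-uncased vocabulary; in practice they should be read from the model's own tokenizer.

```python
from typing import List


def chunk_token_ids(
    token_ids: List[int],
    max_len: int = 512,   # training sequence length stated for this model
    stride: int = 128,    # ASSUMPTION: tokens shared by neighbouring windows
    cls_id: int = 101,    # ASSUMPTION: [CLS] id in the bert-base-uncased vocabulary
    sep_id: int = 102,    # ASSUMPTION: [SEP] id in the bert-base-uncased vocabulary
) -> List[List[int]]:
    """Split token ids into overlapping windows, each wrapped in [CLS] ... [SEP]."""
    body = max_len - 2  # room left after the two special tokens
    if body <= 0:
        raise ValueError("max_len must be greater than 2")
    if not 0 <= stride < body:
        raise ValueError("stride must satisfy 0 <= stride < max_len - 2")

    step = body - stride  # how far each window advances
    chunks: List[List[int]] = []
    start = 0
    while True:
        window = token_ids[start:start + body]
        chunks.append([cls_id] + window + [sep_id])
        if start + body >= len(token_ids):
            break  # this window reached the end of the document
        start += step
    return chunks


# Hypothetical 1,200-token document, represented by placeholder ids.
document_ids = list(range(1000, 2200))
windows = chunk_token_ids(document_ids)
print(len(windows), [len(w) for w in windows])
```

The overlap means a clause cut at one window boundary appears whole in the next window, at the cost of encoding some tokens twice. Per-window predictions then need to be merged, for example by keeping the highest-scoring label for each token.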
