Bioformer-8L

Maintained by: bioformers

  • Parameter Count: 42.8M
  • License: Apache 2.0
  • Paper: arXiv:2302.01588
  • Architecture: 8-layer BERT, hidden size 512, 8 attention heads

What is bioformer-8L?

Bioformer-8L is a compact BERT model designed specifically for biomedical text mining. It achieves performance comparable to or better than larger models on biomedical tasks while running roughly 3x faster than BERT-base. The model was pre-trained from scratch on a large biomedical corpus: 33 million PubMed abstracts and 1 million PMC full-text articles.

Implementation Details

The model employs a custom-trained cased WordPiece vocabulary of 32,768 tokens, specifically optimized for biomedical text. Pre-training was conducted using whole-word masking with a 15% masking rate and includes both masked language modeling (MLM) and next sentence prediction (NSP) objectives. The training process was completed on a single Cloud TPU device over 2 million steps.
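The whole-word masking scheme mentioned above can be sketched in plain Python: rather than masking WordPiece tokens independently, all sub-tokens of a selected word (the `##`-prefixed continuations) are masked together. This is an illustrative sketch of the idea, not the actual pre-training code; only the 15% rate comes from the description above.

```python
import random

def whole_word_mask(tokens, mask_rate=0.15, seed=0):
    """Mask whole words: a '##'-prefixed WordPiece belongs to the
    preceding word, so it is masked together with it."""
    rng = random.Random(seed)
    # Group token indices into words (a word = a token plus its '##' continuations).
    words = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and words:
            words[-1].append(i)
        else:
            words.append([i])
    n_to_mask = max(1, round(mask_rate * len(words)))
    out = list(tokens)
    for word in rng.sample(words, n_to_mask):
        for i in word:
            out[i] = "[MASK]"
    return out

tokens = ["the", "bio", "##former", "model", "encodes", "pub", "##med", "abstracts"]
masked = whole_word_mask(tokens)
```

With per-token masking, `bio` could be masked while `##former` stays visible, making the prediction trivial; masking whole words forces the model to recover complete biomedical terms.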

  • 8 transformer layers with 512 hidden embedding size
  • 8 self-attention heads
  • Maximum input sequence length of 512
  • Trained with batch size 256
  • Specialized biomedical vocabulary
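Because Bioformer-8L is a standard BERT, the specifications above can be expressed as a Hugging Face `BertConfig`, and a randomly initialized model built from it lands at roughly the published parameter count. This is a sketch: the `intermediate_size` of 2048 (4x the hidden size, the usual BERT ratio) is an assumption not stated in the list above, and `BertForPreTraining` is used because training included both MLM and NSP heads.

```python
from transformers import BertConfig, BertForPreTraining

# Values from the specification list above; intermediate_size is an
# assumption (4x hidden size, the standard BERT ratio).
config = BertConfig(
    vocab_size=32768,              # custom biomedical WordPiece vocabulary
    hidden_size=512,
    num_hidden_layers=8,
    num_attention_heads=8,
    intermediate_size=2048,        # assumed, not stated in the card
    max_position_embeddings=512,
)
model = BertForPreTraining(config)  # MLM + NSP heads, as in pre-training
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")  # ~42.8M, matching the count above
```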

Core Capabilities

  • Efficient biomedical text processing and analysis
  • Masked language modeling for biomedical terms
  • High performance on downstream biomedical NLP tasks
  • Award-winning performance in COVID-19 topic classification
  • Seamless integration with standard BERT workflows
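Since the checkpoint uses the standard BERT architecture, it drops into the usual transformers workflow. A minimal masked-LM sketch, assuming the Hub id `bioformers/bioformer-8L` (inferred from the maintainer and model names above):

```python
from transformers import pipeline

# Hub id is an assumption based on the maintainer/model names above.
unmasker = pipeline("fill-mask", model="bioformers/bioformer-8L")
predictions = unmasker("Aspirin inhibits [MASK] aggregation.")
for p in predictions[:3]:
    print(p["token_str"], round(p["score"], 3))
```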

Frequently Asked Questions

Q: What makes this model unique?

Bioformer-8L stands out for its efficiency-to-performance ratio, delivering BERT-base level results with significantly reduced computational requirements. Its specialized biomedical vocabulary and focused training make it particularly effective for healthcare and medical research applications.

Q: What are the recommended use cases?

The model is ideal for biomedical text mining tasks, including medical literature analysis, clinical text processing, and healthcare documentation analysis. It has proven particularly effective in multi-label topic classification for medical literature.
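Multi-label topic classification of the kind described above is typically set up by adding a classification head with `problem_type="multi_label_classification"`, so each topic gets an independent sigmoid probability. A schematic setup with a randomly initialized model of the same architecture (the label set here is a hypothetical example, not the actual fine-tuned classifier):

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Architecture values from the table above; the labels are hypothetical.
labels = ["Treatment", "Diagnosis", "Prevention", "Mechanism"]
config = BertConfig(
    vocab_size=32768, hidden_size=512, num_hidden_layers=8,
    num_attention_heads=8, intermediate_size=2048,
    num_labels=len(labels), problem_type="multi_label_classification",
)
model = BertForSequenceClassification(config)

# One dummy batch: token ids (batch=2, seq=16) and multi-hot topic targets.
input_ids = torch.randint(0, config.vocab_size, (2, 16))
targets = torch.tensor([[1., 0., 1., 0.], [0., 1., 0., 0.]])
out = model(input_ids=input_ids, labels=targets)

# out.loss is BCE-with-logits, the multi-label training objective;
# sigmoid(out.logits) gives independent per-topic probabilities.
probs = torch.sigmoid(out.logits)
```

Unlike single-label softmax classification, the sigmoid probabilities are independent, so one abstract can be assigned several topics at once.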
