meditron-70b

Maintained By: epfl-llm

Meditron-70B

Property: Value
Parameter Count: 70 billion
Base Model: Llama-2-70B
Context Length: 4K tokens
License: LLAMA 2 COMMUNITY LICENSE
Paper: MediTron-70B: Scaling Medical Pretraining for Large Language Models
Knowledge Cutoff: August 2023

What is Meditron-70B?

Meditron-70B is a medical language model adapted from Llama-2-70B through continued pretraining on a medical corpus. It was trained on 48.1B tokens drawn from clinical guidelines, PubMed articles, and medical abstracts, combined with general-domain data from RedPajama-v1.

Implementation Details

The model was trained with a three-way parallelism scheme (data, pipeline, and tensor parallelism), implemented with the Megatron-LLM distributed training library. Training ran on 128 NVIDIA A100 GPUs connected via NVLink and NVSwitch.

  • Architecture: Causal decoder-only transformer
  • Training Data: Includes 46K clinical guidelines, 16.1M medical abstracts, and 5M PubMed papers
  • Performance: Achieves 72.0% average accuracy across medical benchmarks
  • Training Infrastructure: 16 nodes, each with 8x NVIDIA A100 (80GB) GPUs
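
The hardware figures above can be cross-checked with a little arithmetic. The sketch below multiplies nodes by GPUs per node and compares the result with a three-way parallel layout. The specific split (data parallel 2, pipeline parallel 8, tensor parallel 8) is an assumption for illustration and is not stated on this page; `ParallelLayout` is a hypothetical helper, not part of Megatron-LLM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelLayout:
    """Hypothetical helper describing a three-way parallel split."""
    data_parallel: int
    pipeline_parallel: int
    tensor_parallel: int

    @property
    def world_size(self) -> int:
        # Every GPU holds one (data, pipeline, tensor) coordinate,
        # so the GPU count is the product of the three degrees.
        return self.data_parallel * self.pipeline_parallel * self.tensor_parallel


# Figures taken from the text above.
NODES = 16
GPUS_PER_NODE = 8
TOTAL_GPUS = NODES * GPUS_PER_NODE

# ASSUMPTION: this particular split is illustrative, not quoted from this page.
layout = ParallelLayout(data_parallel=2, pipeline_parallel=8, tensor_parallel=8)

if layout.world_size != TOTAL_GPUS:
    raise ValueError(
        f"layout needs {layout.world_size} GPUs but the cluster has {TOTAL_GPUS}"
    )

print(f"{TOTAL_GPUS} GPUs = "
      f"{layout.data_parallel} (DP) x {layout.pipeline_parallel} (PP) "
      f"x {layout.tensor_parallel} (TP)")
```

Under this assumed split, the tensor-parallel degree equals the number of GPUs in one node, so the most communication-heavy dimension could stay on the intra-node NVLink/NVSwitch fabric.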

Core Capabilities

  • Medical exam question answering
  • Supporting differential diagnosis
  • Disease information query handling
  • General health information processing
  • Reported to outperform Llama-2-70B and GPT-3.5 on medical tasks

Frequently Asked Questions

Q: What makes this model unique?

Meditron-70B is specialized through continued pretraining on a curated corpus of medical literature and clinical guidelines. As a result, it outperforms general-purpose LLMs such as Llama-2-70B and GPT-3.5 on medical reasoning tasks.

Q: What are the recommended use cases?

The model is recommended for research and experimental purposes in medical contexts, including medical exam preparation and disease information queries. However, it should not be used in production or clinical settings without extensive testing and validation.
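
For research experiments such as exam-style question answering, a minimal sketch with the Hugging Face `transformers` library might look like the following. Several things here are assumptions rather than documented behavior: the repository id `epfl-llm/meditron-70b` is inferred from the maintainer and model name, the 4K context is taken to mean 4096 tokens, and the prompt layout in `build_prompt` is a hypothetical format, not an official template. Loading also requires `torch` and `accelerate` plus enough GPU memory (in bfloat16 the 70B weights alone take roughly 140 GB).

```python
MODEL_ID = "epfl-llm/meditron-70b"  # ASSUMPTION: inferred Hugging Face repo id
MAX_CONTEXT_TOKENS = 4096           # ASSUMPTION: "4K tokens" read as 4096


def build_prompt(question: str, options: list[str]) -> str:
    """Format a multiple-choice question (hypothetical layout, not an official template)."""
    lines = [f"Question: {question}"]
    for index, option in enumerate(options):
        letter = chr(ord("A") + index)  # A, B, C, ...
        lines.append(f"{letter}. {option}")
    lines.append("Answer:")
    return "\n".join(lines)


def generate_answer(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model and greedily decode a continuation of the prompt."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",  # shards across available GPUs; needs `accelerate`
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_length = inputs["input_ids"].shape[1]
    if prompt_length + max_new_tokens > MAX_CONTEXT_TOKENS:
        raise ValueError("prompt plus generation budget exceeds the context length")

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)


if __name__ == "__main__":
    prompt = build_prompt(
        "Which vitamin deficiency causes scurvy?",
        ["Vitamin A", "Vitamin C", "Vitamin D"],
    )
    print(prompt)
    # print(generate_answer(prompt))  # uncomment on a machine with sufficient GPU memory
```

Because Meditron-70B is a pretrained base model rather than a chat model, outputs from a sketch like this should be treated as experimental and checked against reference answers, in line with the guidance above.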
