Meditron-7B

Maintained by: epfl-llm

  • Parameter Count: 7 Billion
  • Model Type: Causal decoder-only transformer
  • Base Model: Llama-2-7B
  • License: LLAMA 2 COMMUNITY LICENSE
  • Context Length: 2K tokens
  • Published: September 2023

What is meditron-7b?

Meditron-7B is a specialized medical language model developed by the EPFL LLM Team for healthcare applications. It is built on Llama-2-7B through continued pretraining on a comprehensive medical corpus comprising PubMed articles, clinical guidelines, and general-domain replay data from RedPajama-v1. The model outperforms its Llama-2-7B base on a range of medical reasoning benchmarks.

Implementation Details

The model retains the Llama-2 architecture: 4096 hidden dimensions, 32 attention heads, and 32 transformer layers (a loading sketch follows the list below). It was trained on 8x NVIDIA A100 GPUs using a three-way parallelism scheme that combines data, pipeline, and tensor parallelism. Training consumed 48.1B tokens of medical literature, clinical guidelines, and general-domain replay data, and the authors also report the run's environmental impact.

  • Trained on comprehensive medical corpus including 46K clinical guidelines
  • Implements bf16 precision with cosine learning rate scheduling
  • Uses advanced distributed training through Megatron-LLM library
  • Achieves significant performance improvements on medical benchmarks
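As a minimal sketch of putting the details above into practice (assuming the transformers, torch, and accelerate packages, and the epfl-llm/meditron-7b checkpoint published on Hugging Face), the following loads the model and checks the architecture parameters listed above:

```python
# Minimal sketch: load Meditron-7B with Hugging Face transformers and
# inspect the architecture described above. Assumes transformers, torch,
# and accelerate are installed.
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "epfl-llm/meditron-7b"

config = AutoConfig.from_pretrained(model_id)
print(config.hidden_size)          # hidden dimensions (4096)
print(config.num_attention_heads)  # attention heads (32)
print(config.num_hidden_layers)    # transformer layers (32)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bf16 training precision
    device_map="auto",           # shard across available GPUs
)
```

Loading in bf16 keeps memory usage close to the training configuration; on hardware without bf16 support, float16 is a common substitute.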

Core Capabilities

  • Medical exam question answering with 57.5% average accuracy across benchmarks (see the prompting sketch after this list)
  • Supporting differential diagnosis
  • Disease information query handling
  • General health information processing
  • Enhanced medical reasoning compared to baseline models
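Continuing from the loading sketch above, here is a hedged example of exam-style question answering. Meditron-7B is a raw pretrained model rather than an instruction-tuned one, so the prompt template below is illustrative, not an official format:

```python
# Sketch: exam-style medical QA, reusing the tokenizer and model loaded
# above. The prompt format is an assumption; the raw model has no
# official prompt template.
prompt = (
    "Question: Which vitamin deficiency causes scurvy?\n"
    "Options: (A) Vitamin A (B) Vitamin B12 (C) Vitamin C (D) Vitamin D\n"
    "Answer:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=16,
    do_sample=False,  # greedy decoding for a deterministic answer
)

# Decode only the newly generated tokens, not the echoed prompt.
answer = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(answer.strip())
```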

Frequently Asked Questions

Q: What makes this model unique?

Meditron-7B stands out due to its specialized medical training data, including a novel dataset of international clinical guidelines, and its superior performance on medical reasoning tasks compared to other 7B parameter models. It achieves a 28.3% accuracy on medical truthfulness tests, significantly outperforming Llama-2-7B's 12.6%.

Q: What are the recommended use cases?

While the model shows promise in medical exam question answering, differential diagnosis support, and health information queries, it is not recommended for direct clinical use without extensive testing and further alignment. It is best suited for research and development in medical AI applications.
