Meta-Llama-3-120B-Instruct

Maintained By
mlabonne


Property         Value
Parameter Count  122 Billion
Model Type       Instruction-tuned LLM
Architecture     Self-merged Llama-3
License          Other
Tensor Type      FP16

What is Meta-Llama-3-120B-Instruct?

Meta-Llama-3-120B-Instruct is a large language model created by self-merging Meta-Llama-3-70B-Instruct with MergeKit. The merge expands the 70B base into a 122B-parameter model optimized for creative writing tasks, while retaining the base model's 8K context window.

Implementation Details

The model is built with a layer-wise merge strategy: seven layer ranges from the base 70B model are stacked to produce the 122B-parameter result. It is implemented in float16 precision, and quantized versions are available in GGUF, EXL2, and MLX formats for various deployment scenarios.

  • Multi-layer merge architecture spanning 0-80 layers
  • Passthrough merge method with float16 precision
  • 8K default context window (extensible with rope theta)
  • Available in multiple quantized versions for efficient deployment
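A passthrough merge of this kind is described in a MergeKit configuration file. The sketch below is illustrative only, assuming the overlapping layer ranges and passthrough method described above; the exact ranges in the published config may differ:

```yaml
slices:
  - sources:
      - model: meta-llama/Meta-Llama-3-70B-Instruct
        layer_range: [0, 20]
  - sources:
      - model: meta-llama/Meta-Llama-3-70B-Instruct
        layer_range: [10, 30]
  # ... further overlapping ranges, ending at [60, 80] ...
merge_method: passthrough
dtype: float16
```

With the passthrough method, layers are copied verbatim into the new stack rather than averaged, which is how the merged model ends up with more parameters than its base.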

Core Capabilities

  • Exceptional creative writing performance
  • Advanced text generation with customizable parameters
  • Support for chat-based interactions using Llama 3 chat template
  • Flexible deployment options through various quantized versions
  • Extended context understanding capabilities
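Chat interactions use the standard Llama 3 chat template. As a rough sketch of how a single-turn prompt is assembled from its special tokens (in practice you would call a tokenizer's `apply_chat_template` rather than build the string by hand):

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama 3 chat format."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a creative writing assistant.",
    "Write the opening line of a mystery novel.",
)
```

The trailing assistant header leaves the prompt open for the model to generate its reply, which is then terminated by its own `<|eot_id|>` token.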

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its self-merge architecture, combining multiple layer ranges of the same base model to create a more powerful system. This approach, coupled with its optimization for creative writing, sets it apart from traditional language models.

Q: What are the recommended use cases?

The model excels primarily at creative writing. While it can handle various applications, it is particularly suited to narrative generation and creative content creation. Users should note that it can be quirky, including a tendency toward uppercase text and typos.
