Llama-3.1-Swallow-8B-Instruct-v0.1

Maintained By
tokyotech-llm

Llama-3.1-Swallow-8B-Instruct-v0.1

PropertyValue
Parameter Count8.03B
Model TypeLLaMA Architecture
LicenseMETA LLAMA 3.1 COMMUNITY LICENSE & Gemma Terms of Use
LanguagesJapanese, English
PaperLLaMA 3 Paper

What is Llama-3.1-Swallow-8B-Instruct-v0.1?

Llama-3.1-Swallow-8B-Instruct is an advanced language model that enhances the Japanese language capabilities of Meta's LLaMA 3.1 while maintaining strong English performance. It was developed through continual pre-training using approximately 200 billion tokens from Japanese web corpus, Wikipedia articles, and specialized content.

Implementation Details

The model underwent extensive training using the Megatron-LM framework and was fine-tuned on carefully curated instruction datasets. It leverages both synthetic and human-curated data to ensure high-quality responses in both Japanese and English contexts.

  • Built on LLaMA 3.1 architecture with 8B parameters
  • Trained on Swallow Corpus Version 2 and multilingual content
  • Supports both Japanese and English instruction following
  • Implements advanced tokenization for efficient processing

Core Capabilities

  • Strong performance in Japanese NLP tasks (achieving top scores in multiple benchmarks)
  • Maintains competitive English language capabilities
  • Excels in tasks like translation, summarization, and question-answering
  • Specialized instruction-following abilities in both languages

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its enhanced Japanese language capabilities while maintaining strong English performance, achieving state-of-the-art results in various Japanese NLP benchmarks while preserving LLaMA 3.1's English capabilities.

Q: What are the recommended use cases?

The model is well-suited for bilingual applications including translation, summarization, question-answering, and general instruction following in both Japanese and English contexts. It's particularly effective for tasks requiring deep understanding of Japanese language and culture.

🍰 Interesting in building your own agents?
PromptLayer provides Huggingface integration tools to manage and monitor prompts with your whole team. Get started here.