Llama-3-Taiwan-8B-Instruct

Maintained By
yentinglin

Property         Value
Parameter Count  8.03B
Model Type       Instruction-tuned LLM
Base Model       Meta-Llama-3-8B
License          Llama 3
Context Length   8K tokens
Languages        Traditional Chinese, English

What is Llama-3-Taiwan-8B-Instruct?

Llama-3-Taiwan-8B-Instruct is a specialized language model designed to serve both Traditional Chinese and English users. Built on Meta's Llama 3 architecture, this 8B-parameter model has been fine-tuned specifically for Traditional Chinese and English tasks, with particular attention to Taiwan-specific context and applications.

Implementation Details

The model was developed using NVIDIA's NeMo Framework and supports inference through NVIDIA TensorRT-LLM. It posts strong scores on Taiwan-focused benchmarks, including TMLU (59.50%), Taiwan Truthful QA (61.11%), and Legal Evaluation (53.11%).

  • Training Framework: NVIDIA NeMo and NeMo Megatron
  • Inference Framework: NVIDIA TensorRT-LLM
  • Context Window: 8K tokens (a 128K-context variant is available)
  • Supported Functions: Text generation, multi-turn dialogue, RAG capabilities
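As a rough sketch of what prompt assembly looks like for a Llama 3-family model, the snippet below hand-rolls the published Llama 3 chat layout (`<|begin_of_text|>`, header, and `<|eot_id|>` special tokens). In practice you would call the tokenizer's `apply_chat_template()`, which uses the template bundled with the checkpoint; the Traditional Chinese system prompt here is purely illustrative, not the model's documented default.

```python
# Illustrative rendering of the Llama 3 chat format. The checkpoint's own
# chat template (via tokenizer.apply_chat_template) is authoritative.

def format_llama3_prompt(messages):
    """Render a list of {"role", "content"} dicts into a Llama 3 prompt string."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n")
        parts.append(msg["content"] + "<|eot_id|>")
    # End with an open assistant header so the model generates the next turn.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "你是一個來自台灣的AI助理。"},  # example system prompt
    {"role": "user", "content": "請介紹台北101。"},
]
prompt = format_llama3_prompt(messages)
```

The resulting string can be tokenized and fed to any Llama 3-compatible runtime, including a TensorRT-LLM engine built from this checkpoint.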

Core Capabilities

  • Multi-turn dialogue in Traditional Chinese and English
  • Retrieval-Augmented Generation (RAG) support
  • Formatted output generation
  • Entity recognition
  • Function calling with JSON mode support
  • Legal and domain-specific knowledge processing

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialized optimization for Traditional Chinese language processing while maintaining strong English capabilities. It's particularly notable for its performance on Taiwan-specific tasks and benchmarks, making it ideal for applications requiring cultural and linguistic alignment with Taiwan.

Q: What are the recommended use cases?

The model excels in multiple applications including conversational AI, document analysis, content generation, and specialized tasks requiring Traditional Chinese language understanding. It's particularly effective for RAG implementations and structured output generation.
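A minimal sketch of the RAG pattern the model supports: retrieve the most relevant passage, then prepend it to the question. Real deployments would use an embedding model and a vector store; the character-overlap scorer and the prompt wording below are stand-ins for illustration.

```python
# Naive retrieval + prompt assembly to illustrate the RAG flow.

def score(query: str, passage: str) -> int:
    """Crude relevance: count distinct query characters present in the
    passage (character-level matching works tolerably for Chinese text)."""
    return sum(1 for ch in set(query) if ch in passage)

def retrieve(query, passages, k=1):
    """Return the k passages with the highest overlap score."""
    return sorted(passages, key=lambda p: score(query, p), reverse=True)[:k]

def build_rag_prompt(query, context):
    """Prepend retrieved context to the user's question."""
    return f"根據以下資料回答問題：\n{context}\n\n問題：{query}"

passages = [
    "台北101樓高508公尺，曾是世界第一高樓。",
    "阿里山以日出、雲海與森林鐵路聞名。",
]
query = "台北101有多高？"
top = retrieve(query, passages, k=1)
prompt = build_rag_prompt(query, "\n".join(top))
```

The assembled prompt would then be sent to the model as the user turn of a chat request, letting it ground its answer in the retrieved passage.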
