openchat

Maintained By
openchat

OpenChat

PropertyValue
Base ModelLLaMA-13B
LicenseNon-commercial (LLaMA license)
Context Length2048 (base) / 8192 (extended)
LanguageEnglish

What is OpenChat?

OpenChat represents a significant advancement in efficient language model fine-tuning, demonstrating that high performance can be achieved with minimal but high-quality data. Built on LLaMA-13B, it achieves remarkable results using only 6,000 carefully filtered GPT-4 conversations from a larger ShareGPT dataset of 90,000 conversations.

Implementation Details

The model implements a specific conversation template system using concatenated tokens and includes a special end-of-turn token. It's available in two variants: the standard 2048 context length version and an extended 8192 context length version (OpenChat-8192).

  • Custom tokenization implementation with special tokens
  • Flexible role prefix system for human/assistant interactions
  • Efficient conversation template generation
  • Compatible with ChatCompletions API

Core Capabilities

  • 105.7% of ChatGPT performance on Vicuna GPT-4 evaluation (base model)
  • 80.9% Win-rate on AlpacaEval
  • Multi-round conversation handling
  • Extended context processing (8192 version)
  • Web UI support for better user interaction

Frequently Asked Questions

Q: What makes this model unique?

OpenChat's most distinctive feature is its ability to achieve superior performance using an extremely small, carefully curated dataset of just 6,000 conversations, demonstrating that quality can outweigh quantity in model training.

Q: What are the recommended use cases?

The model is well-suited for general conversation tasks, multi-round dialogues, and applications requiring high-quality responses. The 8192 context version is particularly useful for tasks requiring longer context understanding.

The first platform built for prompt engineering