OpenChat

Property	Value
Base Model	LLaMA-13B
License	Non-commercial (LLaMA license)
Context Length	2048 (base) / 8192 (extended)
Language	English

What is OpenChat?

OpenChat represents a significant advancement in efficient language model fine-tuning, demonstrating that high performance can be achieved with minimal but high-quality data. Built on LLaMA-13B, it achieves remarkable results using only 6,000 carefully filtered GPT-4 conversations from a larger ShareGPT dataset of 90,000 conversations.

Implementation Details

The model implements a specific conversation template system using concatenated tokens and includes a special end-of-turn token. It's available in two variants: the standard 2048 context length version and an extended 8192 context length version (OpenChat-8192).

Custom tokenization implementation with special tokens
Flexible role prefix system for human/assistant interactions
Efficient conversation template generation
Compatible with ChatCompletions API

Core Capabilities

105.7% of ChatGPT performance on Vicuna GPT-4 evaluation (base model)
80.9% Win-rate on AlpacaEval
Multi-round conversation handling
Extended context processing (8192 version)
Web UI support for better user interaction

Frequently Asked Questions

Q: What makes this model unique?

OpenChat's most distinctive feature is its ability to achieve superior performance using an extremely small, carefully curated dataset of just 6,000 conversations, demonstrating that quality can outweigh quantity in model training.

Q: What are the recommended use cases?

The model is well-suited for general conversation tasks, multi-round dialogues, and applications requiring high-quality responses. The 8192 context version is particularly useful for tasks requiring longer context understanding.

openchat