Vicuna-13B v1.1
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Training Data | 70K ShareGPT conversations |
| Development Team | UC Berkeley, CMU, Stanford, UC San Diego |
| Release Date | March-April 2023 |
What is Vicuna-13B v1.1?
Vicuna-13B v1.1 is an open-source chatbot built by fine-tuning the LLaMA base model on user-shared conversations collected from ShareGPT. This version improves on its predecessor with a refined tokenization scheme and a corrected supervised fine-tuning loss computation for better model quality.
Implementation Details
The model implements an auto-regressive transformer architecture with several technical improvements in v1.1, including a shift from "###" separators to the EOS token `</s>` for better generation control and cross-library compatibility.
- Transformer-based architecture utilizing LLaMA foundation
- Improved tokenization system
- Enhanced supervised fine-tuning loss computation
- Integration with text-generation-inference framework
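The separator change above can be illustrated with a short sketch. The system prompt and role labels below follow the published v1.1 conversation template, and the v1.0 rendering approximates the older "###"-delimited style; the helper functions themselves are illustrative, not part of the Vicuna codebase.

```python
# Sketch of how v1.0 and v1.1 delimit conversation turns.
EOS = "</s>"  # LLaMA's end-of-sequence token, used as the v1.1 separator

def render_v1_1(turns):
    """Render (user, assistant) pairs in the v1.1 conversation format."""
    system = ("A chat between a curious user and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the user's questions.")
    parts = [system]
    for user, assistant in turns:
        # Each assistant turn ends with the EOS token instead of "###",
        # so generation stops cleanly at the end of a reply.
        parts.append(f"USER: {user} ASSISTANT: {assistant}{EOS}")
    return " ".join(parts)

def render_v1_0(turns):
    """Render the same pairs in the older '###'-separated v1.0 style."""
    parts = []
    for user, assistant in turns:
        parts.append(f"### Human: {user}\n### Assistant: {assistant}")
    return "\n".join(parts)

example = [("What is LLaMA?", "A family of foundation language models.")]
print(render_v1_1(example))
print(render_v1_0(example))
```

Using a real EOS token rather than an ad-hoc "###" marker means any library that respects the tokenizer's EOS id can stop generation correctly without custom stop-string logic.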
Core Capabilities
- Advanced conversational AI interactions
- Research-focused natural language processing
- Flexible text generation capabilities
- Improved cross-library compatibility
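A minimal inference sketch with Hugging Face transformers follows. The repository id `lmsys/vicuna-13b-v1.1` and the generation settings are assumptions; substitute your own checkpoint path and parameters as needed.

```python
def build_prompt(user_message: str) -> str:
    """Single-turn prompt in the v1.1 conversation format."""
    system = ("A chat between a curious user and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the user's questions.")
    return f"{system} USER: {user_message} ASSISTANT:"

def chat(user_message: str, repo: str = "lmsys/vicuna-13b-v1.1") -> str:
    """Generate one assistant reply (requires the 13B weights and a GPU)."""
    # Deferred import so the prompt helper is usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256,
                            do_sample=True, temperature=0.7)
    # Strip the prompt tokens and decode only the newly generated reply.
    reply_ids = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(reply_ids, skip_special_tokens=True)
```

Because v1.1 terminates replies with the EOS token, `generate` stops at the end of the assistant's answer by default, with no custom stopping criteria required.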
Frequently Asked Questions
Q: What makes this model unique?
Vicuna-13B v1.1 stands out due to its optimization for research purposes, extensive training on real-world conversations, and significant improvements in tokenization and loss computation compared to previous versions. It has been evaluated using GPT-4 as a judge across 80 diverse questions.
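The GPT-4-as-judge setup mentioned above can be sketched as a prompt-construction step: the judge model is shown a question and two candidate answers and asked to score both. The exact prompt wording below is an approximation, not the evaluation's verbatim template.

```python
# Illustrative sketch of a pairwise "LLM-as-a-judge" prompt; the wording
# is an approximation of the style used in the Vicuna evaluation.
def build_judge_prompt(question: str, answer_a: str, answer_b: str) -> str:
    return (
        "You are a helpful and impartial judge. Rate the two AI assistant "
        "responses below on helpfulness, relevance, accuracy, and level of "
        "detail. Give each a score from 1 to 10 and explain your reasoning.\n\n"
        f"[Question]\n{question}\n\n"
        f"[Assistant A]\n{answer_a}\n\n"
        f"[Assistant B]\n{answer_b}"
    )
```

The resulting string would be sent to the judge model (GPT-4 in the Vicuna evaluation) once per benchmark question, here 80 in total.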
Q: What are the recommended use cases?
The model is primarily intended for research in natural language processing, machine learning, and AI. It's particularly suited for researchers and hobbyists working on understanding and advancing conversational AI systems.