vicuna-13b-1.1

Maintained By
eachadea

Vicuna-13B v1.1

License: Apache 2.0
Training Data: 70K ShareGPT conversations
Development Team: UC Berkeley, CMU, Stanford, UC San Diego
Release Date: March-April 2023

What is vicuna-13b-1.1?

Vicuna-13B v1.1 is an open-source chatbot created by fine-tuning LLaMA on roughly 70K user-shared conversations collected from ShareGPT. Compared with its predecessor, v1.1 refines the tokenization scheme and corrects the supervised fine-tuning loss computation, yielding better model quality.
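For orientation, the sketch below shows one way to load a Vicuna-13B v1.1 checkpoint with Hugging Face transformers and generate a reply. The repo id and generation settings are illustrative assumptions, not part of the official release notes.

```python
# Minimal sketch: loading a Vicuna-13B v1.1 checkpoint with transformers.
# The repo id is an assumption (a community-hosted copy of the weights);
# substitute the path or hub id of whichever copy you actually use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "eachadea/vicuna-13b-1.1"  # assumed hub id; adjust as needed

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # ~26 GB of GPU memory for 13B params in fp16
    device_map="auto",           # requires the `accelerate` package
)

prompt = "USER: Explain what Vicuna-13B is in one sentence. ASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```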

Implementation Details

The model implements an auto-regressive transformer architecture with several technical refinements in v1.1, including a switch from the "###" separator to the EOS token "</s>", which improves generation control and compatibility across libraries (see the prompt-format sketch after the list below).

  • Transformer-based architecture utilizing LLaMA foundation
  • Improved tokenization system
  • Enhanced supervised fine-tuning loss computation
  • Integration with text-generation-inference framework
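As a concrete illustration of the separator change, the following sketch builds a v1.1-style prompt that terminates completed assistant turns with the EOS token "</s>" rather than the "###" separator used by v0. The system-prompt wording follows the common FastChat conversation template and is an assumption here, not a quotation from the release notes.

```python
# Illustrative sketch of the Vicuna v1.1 prompt format. The system prompt
# wording below is an assumption based on the FastChat template; the key
# point is that each finished assistant turn ends with the EOS token "</s>"
# instead of the "###" separator used by v0.

SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def build_vicuna_v11_prompt(turns):
    """turns: list of (user_message, assistant_message_or_None) pairs."""
    prompt = SYSTEM_PROMPT + " "
    for user_msg, assistant_msg in turns:
        prompt += f"USER: {user_msg} ASSISTANT:"
        if assistant_msg is None:
            break  # leave the last turn open for the model to complete
        prompt += f" {assistant_msg}</s>"
    return prompt

print(build_vicuna_v11_prompt([
    ("What changed in v1.1?", "The separator moved from '###' to the EOS token."),
    ("Why does that matter?", None),
]))
```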

Core Capabilities

  • Advanced conversational AI interactions
  • Research-focused natural language processing
  • Flexible text generation capabilities
  • Improved cross-library compatibility
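Because the model integrates with the text-generation-inference framework noted above, it can also be served over HTTP and queried from any client. The sketch below assumes a TGI server is already running locally on port 8080; the URL and parameters are illustrative, not prescribed by the model card.

```python
# Minimal sketch of querying a text-generation-inference (TGI) server that
# is already serving the model. The endpoint URL is an assumption; point it
# at your own deployment.
import requests

TGI_URL = "http://localhost:8080/generate"  # assumed local TGI endpoint

payload = {
    "inputs": "USER: What are good research uses for Vicuna-13B? ASSISTANT:",
    "parameters": {"max_new_tokens": 128, "temperature": 0.7},
}
response = requests.post(TGI_URL, json=payload, timeout=120)
response.raise_for_status()
print(response.json()["generated_text"])
```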

Frequently Asked Questions

Q: What makes this model unique?

Vicuna-13B v1.1 stands out due to its optimization for research purposes, extensive training on real-world conversations, and significant improvements in tokenization and loss computation compared to previous versions. It has been evaluated using GPT-4 as a judge across 80 diverse questions.

Q: What are the recommended use cases?

The model is primarily intended for research in natural language processing, machine learning, and AI. It's particularly suited for researchers and hobbyists working on understanding and advancing conversational AI systems.
