Vicuna-13B v1.1
| Property | Value |
|---|---|
| License | Apache 2.0 |
| Training Data | 70K ShareGPT conversations |
| Development Team | UC Berkeley, CMU, Stanford, UC San Diego |
| Release Date | March-April 2023 |
What is Vicuna-13B v1.1?
Vicuna-13B v1.1 is an open-source chatbot built by fine-tuning the LLaMA base model on user-shared conversations collected from ShareGPT. This version improves on its predecessor with a refined tokenization scheme and a corrected supervised fine-tuning loss computation for better model quality.
Implementation Details
The model implements an auto-regressive transformer architecture with several technical improvements in v1.1, including a shift from "###" separators to the EOS token `</s>` for better generation control and cross-library compatibility.
- Transformer-based architecture utilizing LLaMA foundation
- Improved tokenization system
- Enhanced supervised fine-tuning loss computation
- Integration with text-generation-inference framework
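The separator change above can be illustrated with a short sketch. The system prompt and role labels below follow the published v1.1 conversation template, and the v1.0 rendering approximates the older "###"-delimited style; the helper functions themselves are illustrative, not part of the Vicuna codebase.

```python
# Sketch of how v1.0 and v1.1 delimit conversation turns.
EOS = "</s>"  # LLaMA's end-of-sequence token, used as the v1.1 separator

def render_v1_1(turns):
    """Render (user, assistant) pairs in the v1.1 conversation format."""
    system = ("A chat between a curious user and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the user's questions.")
    parts = [system]
    for user, assistant in turns:
        # Each assistant turn ends with the EOS token instead of "###",
        # so generation stops cleanly at the end of a reply.
        parts.append(f"USER: {user} ASSISTANT: {assistant}{EOS}")
    return " ".join(parts)

def render_v1_0(turns):
    """Render the same pairs in the older '###'-separated v1.0 style."""
    parts = []
    for user, assistant in turns:
        parts.append(f"### Human: {user}\n### Assistant: {assistant}")
    return "\n".join(parts)

example = [("What is LLaMA?", "A family of foundation language models.")]
print(render_v1_1(example))
print(render_v1_0(example))
```

Using a real EOS token rather than an ad-hoc "###" marker means any library that respects the tokenizer's EOS id can stop generation correctly without custom stop-string logic.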
Core Capabilities
- Advanced conversational AI interactions
- Research-focused natural language processing
- Flexible text generation capabilities
- Improved cross-library compatibility
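A minimal inference sketch with Hugging Face transformers follows. The repository id `lmsys/vicuna-13b-v1.1` and the generation settings are assumptions; substitute your own checkpoint path and parameters as needed.

```python
def build_prompt(user_message: str) -> str:
    """Single-turn prompt in the v1.1 conversation format."""
    system = ("A chat between a curious user and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the user's questions.")
    return f"{system} USER: {user_message} ASSISTANT:"

def chat(user_message: str, repo: str = "lmsys/vicuna-13b-v1.1") -> str:
    """Generate one assistant reply (requires the 13B weights and a GPU)."""
    # Deferred import so the prompt helper is usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256,
                            do_sample=True, temperature=0.7)
    # Strip the prompt tokens and decode only the newly generated reply.
    reply_ids = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(reply_ids, skip_special_tokens=True)
```

Because v1.1 terminates replies with the EOS token, `generate` stops at the end of the assistant's answer by default, with no custom stopping criteria required.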
Frequently Asked Questions
Q: What makes this model unique?
Vicuna-13B v1.1 stands out due to its optimization for research purposes, extensive training on real-world conversations, and significant improvements in tokenization and loss computation compared to previous versions. It has been evaluated using GPT-4 as a judge across 80 diverse questions.
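The GPT-4-as-judge setup mentioned above can be sketched as a prompt-construction step: the judge model is shown a question and two candidate answers and asked to score both. The exact prompt wording below is an approximation, not the evaluation's verbatim template.

```python
# Illustrative sketch of a pairwise "LLM-as-a-judge" prompt; the wording
# is an approximation of the style used in the Vicuna evaluation.
def build_judge_prompt(question: str, answer_a: str, answer_b: str) -> str:
    return (
        "You are a helpful and impartial judge. Rate the two AI assistant "
        "responses below on helpfulness, relevance, accuracy, and level of "
        "detail. Give each a score from 1 to 10 and explain your reasoning.\n\n"
        f"[Question]\n{question}\n\n"
        f"[Assistant A]\n{answer_a}\n\n"
        f"[Assistant B]\n{answer_b}"
    )
```

The resulting string would be sent to the judge model (GPT-4 in the Vicuna evaluation) once per benchmark question, here 80 in total.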
Q: What are the recommended use cases?
The model is primarily intended for research in natural language processing, machine learning, and AI. It's particularly suited for researchers and hobbyists working on understanding and advancing conversational AI systems.