Nous-Hermes-13b
| Property | Value |
|---|---|
| Base Model | LLaMA 13B |
| License | GPL |
| Training Data | 300,000+ instructions |
| Training Duration | 50+ hours |
| Hardware Used | 8x A100 80GB DGX |
What is Nous-Hermes-13b?
Nous-Hermes-13b is a language model developed by Nous Research in collaboration with Teknium and Karan4D. It is built on the LLaMA 13B base model and fine-tuned on more than 300,000 curated instructions. Its developers report performance comparable to GPT-3.5-turbo across a variety of benchmarks.
Implementation Details
The model was fine-tuned largely on synthetic GPT-4 outputs drawn from sources such as GPTeacher, roleplay datasets, code instruct datasets, and other specialized instruction sets. Training used a sequence length of 2000 tokens and ran for more than 50 hours on an 8x A100 80GB DGX machine.
- Trained primarily on GPT-4-generated content
- Incorporates specialized datasets for biology, physics, chemistry, and mathematics
- Uses the Alpaca prompt format for interaction
- Available in FP16 format, with GGML and GPTQ 4-bit quantizations planned
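Since the model expects Alpaca-style prompts, a small helper can make the layout concrete. The following is a minimal sketch, not an official utility: the function name `build_alpaca_prompt` and its `context` parameter are hypothetical, and it emits only the `### Instruction:`, `### Input:`, and `### Response:` section headers. Whether the introductory sentence from the original Alpaca template is also needed is not stated here, so it is omitted as an assumption.

```python
from typing import Optional


def build_alpaca_prompt(instruction: str, context: Optional[str] = None) -> str:
    """Format a request using Alpaca-style section headers.

    `context` is optional supporting text placed in the "### Input:" section.
    (Helper name and parameters are illustrative, not part of any library.)
    """
    instruction = instruction.strip()
    if context and context.strip():
        return (
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context.strip()}\n\n"
            "### Response:\n"
        )
    # No extra input: only the instruction and an empty response slot.
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


if __name__ == "__main__":
    print(build_alpaca_prompt("Summarize the text.", "LLaMA is a family of language models."))
```

The prompt ends with an empty response section so that the model's continuation becomes the answer.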
Core Capabilities
- Strong results on the ARC-c, ARC-e, and HellaSwag benchmarks
- Long-form response generation with low hallucination rates
- Robust instruction-following capabilities
- Advanced code generation and scientific reasoning
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for generating long, coherent responses while keeping its hallucination rate low. It does not include OpenAI's censorship mechanisms, and it has been reported to rank at the top of several benchmark categories.
Q: What are the recommended use cases?
The model is suited to creative text generation, complex instruction following, code generation, and scientific reasoning tasks. It can be integrated into chatbots, Discord bots, and other text generation applications.
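To illustrate the chatbot use case, here is a minimal sketch of a single-turn message handler. It assumes a hypothetical `generate` callable that takes a prompt string and returns generated text; how that callable reaches the model (a local runtime, a server, and so on) is left open. The clean-up steps, dropping an echoed prompt and cutting the text if the model starts a new `### Instruction:` turn, are assumptions about typical backend behavior rather than documented properties of this model.

```python
from typing import Callable

INSTRUCTION_HEADER = "### Instruction:"
RESPONSE_HEADER = "### Response:"


def answer(user_message: str, generate: Callable[[str], str]) -> str:
    """Wrap one user message in an Alpaca-style prompt and return the reply text.

    `generate` is a hypothetical backend: prompt string in, generated text out.
    """
    prompt = f"{INSTRUCTION_HEADER}\n{user_message.strip()}\n\n{RESPONSE_HEADER}\n"
    raw = generate(prompt)
    # Some backends echo the prompt before the completion; drop it if present.
    text = raw[len(prompt):] if raw.startswith(prompt) else raw
    # If the model begins a new instruction block, keep only the first reply.
    text = text.split(INSTRUCTION_HEADER, 1)[0]
    return text.strip()


if __name__ == "__main__":
    # Stub backend standing in for a real model call.
    def stub_backend(prompt: str) -> str:
        return prompt + "Hello! How can I help?"

    print(answer("Say hello.", stub_backend))
```

A bot framework's message callback could call `answer` with the incoming text and send the returned string back to the channel.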