NexusRaven-13B
Property | Value |
---|---|
Base Model | CodeLlama-13b-Instruct-hf |
License | Llama 2 |
Paper | View Paper |
Framework | PyTorch 2.0.1 |
What is NexusRaven-13B?
NexusRaven-13B is an open-source language model for function calling: given a natural-language request and a set of available functions, it generates the call to execute. It is built on CodeLlama-13b-Instruct-hf and reports its strongest results on cybersecurity tool usage.
Implementation Details
The model is implemented with PyTorch and the Transformers library. It was trained with a learning rate of 3e-05 and a total train batch size of 128, using multi-GPU training across 8 devices with 16 gradient accumulation steps.
- Adam optimizer with betas=(0.9, 0.95)
- Constant learning rate scheduler
- 2 training epochs
- Compatible with Transformers 4.33.2 and Tokenizers 0.13.3
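The batch-size figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming the usual convention that the total train batch size equals the per-device batch size times the number of devices times the gradient accumulation steps. The per-device value is not stated in this section, so it is derived here rather than quoted, and the helper name is hypothetical.

```python
# Values stated in the section above.
NUM_DEVICES = 8
GRAD_ACCUM_STEPS = 16
TOTAL_TRAIN_BATCH_SIZE = 128


def per_device_batch_size(total, num_devices, grad_accum_steps):
    """Solve total = per_device * num_devices * grad_accum_steps for per_device.

    Assumption: the total batch size follows this product convention.
    """
    denom = num_devices * grad_accum_steps
    if total % denom != 0:
        raise ValueError("total is not divisible by devices * accumulation steps")
    return total // denom


per_device = per_device_batch_size(
    TOTAL_TRAIN_BATCH_SIZE, NUM_DEVICES, GRAD_ACCUM_STEPS
)
print(per_device)  # 1
```

Under that assumption, each GPU processes one sequence per forward pass, and 16 accumulated passes on 8 devices make up one optimizer step of 128 sequences.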
Core Capabilities
- 95% success rate in cybersecurity tool usage
- Zero-shot generalization to functions not seen during training
- Suitable for commercial use: no training data generated by proprietary LLMs
- Function calling at lower cost than GPT-4
- Integration with LangChain
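To make the function-calling workflow concrete, here is a minimal sketch of the application side: describing the available tools in a prompt, then parsing and dispatching the call the model returns. Everything in it is an assumption for illustration. `scan_port`, `TOOLS`, `build_prompt`, and `execute_call` are hypothetical names, the prompt layout is not NexusRaven-13B's official template, and the model output is hard-coded rather than generated.

```python
import ast
import inspect


def scan_port(host: str, port: int) -> str:
    """Hypothetical tool: report the status of a port on a host (stubbed)."""
    return f"scanned {host}:{port}"


# Registry of callable tools exposed to the model (hypothetical).
TOOLS = {"scan_port": scan_port}


def build_prompt(tools, query):
    """Render each tool's signature and docstring, followed by the user query.

    Because tools are described in the prompt, new functions can be offered
    without retraining. The layout here is an assumed, generic format.
    """
    parts = []
    for name, fn in tools.items():
        doc = inspect.getdoc(fn)
        parts.append(f'def {name}{inspect.signature(fn)}:\n    """{doc}"""')
    return "\n\n".join(parts) + f"\n\nUser query: {query}\nCall:"


def execute_call(call_text, tools):
    """Parse a single call such as f(a=1) and dispatch it without eval()."""
    node = ast.parse(call_text.strip(), mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("expected a single plain function call")
    if node.func.id not in tools:
        raise ValueError(f"unknown function: {node.func.id}")
    if any(kw.arg is None for kw in node.keywords):
        raise ValueError("**kwargs unpacking is not supported")
    # literal_eval accepts only literals, so arguments cannot run code.
    args = [ast.literal_eval(a) for a in node.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return tools[node.func.id](*args, **kwargs)


prompt = build_prompt(TOOLS, "Check whether SSH is open on 10.0.0.5")
# Stand-in for the model's generated text (assumed output shape).
model_output = 'scan_port(host="10.0.0.5", port=22)'
print(execute_call(model_output, TOOLS))  # scanned 10.0.0.5:22
```

Parsing with `ast` instead of calling `eval()` restricts execution to registered functions with literal arguments, which matters when the generated call drives security tooling.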
Frequently Asked Questions
Q: What makes this model unique?
NexusRaven-13B is distinguished by its function calling accuracy: it reports a 95% success rate on cybersecurity tool usage, compared with 64% for GPT-4. It is also suitable for commercial use and costs less to operate.
Q: What are the recommended use cases?
The model is best suited to tasks that involve function calls, API interactions, and tool usage, particularly in cybersecurity. It fits developers who need accurate automated function execution and commercial deployments that depend on reliable function calling.