NSQL-Llama-2-7B
| Property | Value |
|---|---|
| Base Model | Llama-2 7B |
| License | Llama2 |
| Training Data | 1M SQL queries + Text-to-SQL pairs |
| Primary Use | SQL Generation |
What is NSQL-Llama-2-7B?
NSQL-Llama-2-7B is a large language model specialized for SQL generation. It is built on Meta's Llama-2 7B architecture, further pre-trained on about 1M general SQL queries, and then fine-tuned on text-to-SQL pairs, which makes it well suited to converting natural language questions into SQL queries.
Implementation Details
The model underwent a two-stage training process using 80GB A100s with data and model parallelism. The initial pre-training consisted of 3 epochs on 1M SQL queries from The Stack dataset, followed by 10 epochs of fine-tuning on text-to-SQL pairs from over 20 public sources.
- Cross-entropy loss computed on the SQL portion of each training example
- Can be loaded in torch.bfloat16 to reduce memory use
- Supports customizable generation parameters including max_length and do_sample
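Restricting the loss to the SQL portion is commonly done by masking the prompt tokens in the label sequence. The following is a minimal sketch of that idea, not the authors' training code: the function names are hypothetical, the token ids are made up, and the value -100 is used because it is the default `ignore_index` of `torch.nn.CrossEntropyLoss`. The one-position label shift used in causal language model training is left out for brevity.

```python
import math

# Default ignore_index of torch.nn.CrossEntropyLoss; positions carrying this
# label are skipped when the loss is averaged.
IGNORE_INDEX = -100


def mask_prompt_labels(prompt_ids, sql_ids):
    """Concatenate prompt and SQL tokens; only SQL tokens keep a real label."""
    input_ids = list(prompt_ids) + list(sql_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(sql_ids)
    return input_ids, labels


def masked_cross_entropy(target_log_probs, labels):
    """Average negative log-likelihood over non-ignored positions.

    target_log_probs[i] is the log-probability the model assigned to the
    correct token at position i (hypothetical values in this sketch).
    """
    kept = [-lp for lp, lab in zip(target_log_probs, labels) if lab != IGNORE_INDEX]
    return sum(kept) / len(kept) if kept else 0.0


# Toy example: three prompt tokens followed by two SQL tokens.
input_ids, labels = mask_prompt_labels([11, 12, 13], [21, 22])
# Prompt positions get an arbitrary low probability to show they are ignored.
log_probs = [math.log(0.01)] * 3 + [math.log(0.5)] * 2
loss = masked_cross_entropy(log_probs, labels)
```

In this toy case the prompt positions do not affect `loss`, so only the two SQL tokens determine the value.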
Core Capabilities
- Natural language to SQL query conversion
- Schema-aware query generation
- Support for complex database operations
- Optimized for SELECT query generation
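Schema-aware generation usually means placing the table definitions in the prompt ahead of the question. The sketch below shows one way to do this with Hugging Face Transformers. The prompt layout, the helper names `build_prompt` and `generate_sql`, and the default model id are assumptions for illustration and should be checked against the official model card before use.

```python
def build_prompt(schema_ddl: str, question: str) -> str:
    """Combine CREATE TABLE statements and a question into one prompt.

    The prompt ends with "SELECT" so the model continues the query.
    The exact layout is an assumption, not a documented requirement.
    """
    return (
        f"{schema_ddl.strip()}\n\n"
        "-- Using valid SQLite, answer the following questions "
        "for the tables provided above.\n\n"
        f"-- {question.strip()}\n\n"
        "SELECT"
    )


def generate_sql(prompt: str,
                 model_id: str = "NumbersStation/nsql-llama-2-7B",  # assumed repo id
                 max_length: int = 500,
                 do_sample: bool = False) -> str:
    """Load the model in bfloat16 and generate a completion for `prompt`."""
    # Heavy dependencies are imported here so build_prompt works without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_length=max_length, do_sample=do_sample)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example schema and question (illustrative only).
schema = "CREATE TABLE stadium (stadium_id number, capacity number)"
prompt = build_prompt(schema, "What is the maximum capacity of any stadium?")
```

Calling `generate_sql(prompt)` downloads the model weights, so it is kept out of the example run. Setting `do_sample=False` gives deterministic greedy decoding, which tends to be the safer default when the output must be valid SQL.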
Frequently Asked Questions
Q: What makes this model unique?
NSQL-Llama-2-7B stands out for its specialized training: continued pre-training on SQL queries followed by fine-tuning on text-to-SQL pairs, on top of the general-purpose Llama-2 7B base. This makes it particularly effective for database query generation tasks.
Q: What are the recommended use cases?
The model is best suited for converting natural language questions into SQL queries, particularly when working with well-defined database schemas. It excels at generating SELECT queries and can handle various database operations while following SQLite syntax.
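Because the output is expected to follow SQLite syntax, a generated query can be checked against the schema before it is run on real data. The following is a minimal sketch using Python's standard `sqlite3` module. The helper name `is_valid_sqlite_select` and the sample schema are hypothetical, and the check only confirms that SQLite can compile the query, not that it answers the question correctly.

```python
import sqlite3


def is_valid_sqlite_select(schema_ddl: str, query: str) -> bool:
    """Return True if `query` is a SELECT that SQLite can compile against the schema."""
    if not query.lstrip().upper().startswith("SELECT"):
        return False
    conn = sqlite3.connect(":memory:")
    try:
        # Create empty tables so that table and column names can be resolved.
        conn.executescript(schema_ddl)
        # EXPLAIN QUERY PLAN compiles the statement without executing it.
        conn.execute(f"EXPLAIN QUERY PLAN {query}")
        return True
    except (sqlite3.Error, sqlite3.Warning):
        # Syntax errors, unknown tables or columns, or multiple statements.
        return False
    finally:
        conn.close()


schema = "CREATE TABLE stadium (stadium_id INTEGER, capacity INTEGER);"
ok = is_valid_sqlite_select(schema, "SELECT MAX(capacity) FROM stadium")
bad_column = is_valid_sqlite_select(schema, "SELECT seats FROM stadium")
not_select = is_valid_sqlite_select(schema, "DROP TABLE stadium")
```

Running the check on an in-memory database keeps it isolated from production data, and rejecting anything that does not start with SELECT matches the model's primary use.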