NSQL-Llama-2-7B

Property	Value
Base Model	Llama-2 7B
License	Llama 2 Community License
Training Data	1M SQL queries + Text-to-SQL pairs
Primary Use	SQL Generation

What is NSQL-Llama-2-7B?

NSQL-Llama-2-7B is a large language model specialized for SQL generation. Built on Meta's Llama-2 7B, it was further pre-trained on a dataset of general SQL queries and then fine-tuned on text-to-SQL pairs, which makes it well suited to converting natural language questions into SQL queries.

Implementation Details

The model was trained in two stages on 80GB A100 GPUs using data and model parallelism: 3 epochs of pre-training on 1M SQL queries from The Stack dataset, followed by 10 epochs of fine-tuning on text-to-SQL pairs drawn from more than 20 public sources.

Trained with cross-entropy loss; during fine-tuning, the loss is computed only over the SQL portion of each text-to-SQL pair
Can be loaded in torch.bfloat16 to reduce memory use
Accepts generation parameters such as max_length and do_sample
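The idea of restricting the loss to the SQL portion can be illustrated with a small sketch. This is not the authors' training code: the helper names (build_labels, masked_cross_entropy) are hypothetical, the ignore value -100 is borrowed from the common PyTorch convention, and the logits are assumed to be already aligned with the labels (a real causal-LM trainer shifts them by one position).

```python
import math

# Assumption: -100 marks positions excluded from the loss, following the
# default ignore_index of PyTorch's CrossEntropyLoss.
IGNORE_INDEX = -100


def build_labels(prompt_ids, sql_ids):
    """Mask every prompt token so only SQL tokens contribute to the loss."""
    return [IGNORE_INDEX] * len(prompt_ids) + list(sql_ids)


def masked_cross_entropy(logits, labels):
    """Average cross-entropy over positions whose label is not masked.

    logits: one list of per-vocabulary scores for each position.
    labels: target token id for each position, or IGNORE_INDEX.
    """
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # prompt (question and schema) positions are skipped
        peak = max(row)
        # Numerically stable log-sum-exp of the scores at this position.
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[label]
        count += 1
    return total / count if count else 0.0


# Toy example: 3 prompt tokens, 2 SQL tokens, vocabulary of 3 ids.
labels = build_labels([5, 6, 7], [1, 2])
logits = [[0.0, 0.0, 0.0] for _ in labels]
loss = masked_cross_entropy(logits, labels)
```

Because the prompt positions are masked, changing the scores at those positions leaves the loss unchanged; only the model's predictions for the SQL tokens are penalized.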

Core Capabilities

Natural language to SQL query conversion
Schema-aware query generation
Support for complex database operations
Optimized for SELECT query generation
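Schema-aware generation means the table definitions travel with the question in the prompt. The sketch below shows one way to assemble such a prompt. The build_prompt helper is hypothetical, and the instruction wording and trailing SELECT follow the prompt style of the upstream model card as best recalled, so treat the exact text as an assumption and check it against the published card.

```python
def build_prompt(table_ddl, question):
    """Join CREATE TABLE statements and a question into a single prompt.

    table_ddl: list of CREATE TABLE strings describing the schema.
    question:  the natural language question to answer.
    """
    schema = "\n\n".join(ddl.strip() for ddl in table_ddl)
    return (
        f"{schema}\n\n"
        # Assumed instruction line; it states the target SQL dialect.
        "-- Using valid SQLite, answer the following questions "
        "for the tables provided above.\n\n"
        f"-- {question.strip()}\n\n"
        # Ending on SELECT nudges the model to complete a SELECT query.
        "SELECT"
    )


prompt = build_prompt(
    ["CREATE TABLE stadium (\n    stadium_id number,\n    capacity number\n)"],
    "What is the maximum capacity of any stadium?",
)
```

Ending the prompt with SELECT matches the model's strength at SELECT queries: the model only has to continue the statement rather than decide what kind of statement to write.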

Frequently Asked Questions

Q: What makes this model unique?

What sets NSQL-Llama-2-7B apart is its SQL-specific training: pre-training on general SQL queries followed by fine-tuning on text-to-SQL pairs, on top of the Llama-2 7B base model. This makes it particularly effective for database query generation.

Q: What are the recommended use cases?

The model is best suited to converting natural language questions into SQL queries when a well-defined database schema is available. It is strongest at SELECT queries, can handle other database operations, and follows SQLite syntax.
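A typical workflow is to generate a query and then check that it compiles against the schema before running it. The sketch below is one possible setup, not an official recipe. The repository id NumbersStation/nsql-llama-2-7B and both helper names are assumptions, the generation call uses the standard Hugging Face transformers API, and the check relies only on Python's built-in sqlite3 module.

```python
import sqlite3

MODEL_ID = "NumbersStation/nsql-llama-2-7B"  # assumed Hugging Face repo id


def generate_sql(prompt, max_length=500, do_sample=False):
    """Generate a query for a prompt that ends with the word SELECT."""
    # Imported lazily so the checker below works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    # max_length counts prompt tokens plus generated tokens.
    output_ids = model.generate(input_ids, max_length=max_length, do_sample=do_sample)

    # Keep only the new tokens, then restore the SELECT keyword that the
    # prompt already supplied and cut at the first statement terminator.
    new_tokens = output_ids[0][input_ids.shape[1]:]
    continuation = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return "SELECT " + continuation.split(";")[0].strip()


def compiles_in_sqlite(schema_sql, query):
    """Return True if the query compiles against the schema in SQLite.

    schema_sql: CREATE TABLE statements separated by semicolons.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        # EXPLAIN compiles the statement without reading any table rows.
        conn.execute("EXPLAIN " + query)
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False
    finally:
        conn.close()


schema = "CREATE TABLE stadium (stadium_id number, capacity number);"
ok = compiles_in_sqlite(schema, "SELECT MAX(capacity) FROM stadium")
```

The check catches syntax errors and references to tables or columns that are missing from the schema. It does not confirm that the query answers the question, so results still need review.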
