nsql-llama-2-7B

Maintained By: NumbersStation

NSQL-Llama-2-7B

Property | Value
Base Model | Llama-2 7B
License | Llama2
Training Data | 1M SQL queries + Text-to-SQL pairs
Primary Use | SQL Generation

What is NSQL-Llama-2-7B?

NSQL-Llama-2-7B is a large language model specialized for SQL generation. Built on Meta's Llama-2 7B architecture, it was further pre-trained on a dataset of general SQL queries and then fine-tuned on text-to-SQL pairs, which makes it well suited to converting natural language questions into SQL queries.

Implementation Details

The model underwent a two-stage training process using 80GB A100s with data and model parallelism. The initial pre-training consisted of 3 epochs on 1M SQL queries from The Stack dataset, followed by 10 epochs of fine-tuning on text-to-SQL pairs from over 20 public sources.

  • Cross-entropy loss, with optimization focused on the SQL portion of each text-to-SQL pair
  • Can be loaded in torch.bfloat16 for efficient processing
  • Supports customizable generation parameters including max_length and do_sample
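The first bullet above can be illustrated with a loss mask. The following is a minimal sketch of cross-entropy restricted to SQL tokens, not the maintainers' training code: the function name, the toy two-token vocabulary, and the 0/1 mask convention are all assumptions made for illustration.

```python
import math


def masked_cross_entropy(logits, targets, loss_mask):
    """Mean cross-entropy over positions where loss_mask is 1.

    logits:    list of per-position score lists (one score per vocabulary entry)
    targets:   list of target token ids, one per position
    loss_mask: list of 0/1 flags; 1 marks a token in the SQL part of the pair
    (All three names are hypothetical and chosen for this sketch.)
    """
    total, count = 0.0, 0
    for scores, target, keep in zip(logits, targets, loss_mask):
        if not keep:
            continue  # prompt tokens (schema, question) contribute no loss
        # log-sum-exp with max subtraction for numerical stability
        peak = max(scores)
        log_z = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_z - scores[target]  # negative log-likelihood of the target
        count += 1
    return total / count if count else 0.0


# Toy example: position 0 is a prompt token (masked), position 1 is a SQL token.
loss = masked_cross_entropy(
    logits=[[100.0, 0.0], [0.0, 0.0]],
    targets=[1, 1],
    loss_mask=[0, 1],
)
```

Because the first position is masked out, its badly wrong prediction does not affect the result; only the SQL position is scored.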

Core Capabilities

  • Natural language to SQL query conversion
  • Schema-aware query generation
  • Support for complex database operations
  • Optimized for SELECT query generation
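As a rough illustration of schema-aware generation, the sketch below builds a prompt from CREATE TABLE statements plus a question and passes it to the model through the Hugging Face transformers library. The repository id NumbersStation/nsql-llama-2-7B, the exact prompt layout, and the helper names build_prompt and generate_sql are assumptions that this page does not state; check them against the official model card before relying on them.

```python
def build_prompt(schema_ddl, question):
    """Assemble a text-to-SQL prompt: schema, instruction, question, then 'SELECT'.

    The layout is an assumed format, not one confirmed by this page.
    """
    tables = "\n\n".join(ddl.strip() for ddl in schema_ddl)
    return (
        f"{tables}\n\n"
        "-- Using valid SQLite, answer the following questions "
        "for the tables provided above.\n\n"
        f"-- {question.strip()}\n\n"
        "SELECT"
    )


def generate_sql(prompt, model_id="NumbersStation/nsql-llama-2-7B",
                 max_length=500, do_sample=False):
    """Run generation; model_id is an assumed Hugging Face repository name."""
    # Heavy imports stay local so build_prompt works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_length=max_length, do_sample=do_sample)
    # The decoded text contains the prompt followed by the generated continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


prompt = build_prompt(
    ["CREATE TABLE stadium (stadium_id number, capacity number)"],
    "What is the maximum capacity of all stadiums?",
)
```

Ending the prompt with SELECT nudges the model toward the query type it is optimized for. Calling generate_sql(prompt) downloads the model weights, so it is left out of the example run.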

Frequently Asked Questions

Q: What makes this model unique?

NSQL-Llama-2-7B stands out for its two-stage specialization: further pre-training on general SQL queries, followed by fine-tuning on text-to-SQL pairs. This targets database query generation directly while building on the Llama-2 7B base model.

Q: What are the recommended use cases?

The model is best suited for converting natural language questions into SQL queries, particularly when the database schema is well defined. It excels at generating SELECT queries, can handle various other database operations, and follows SQLite syntax.
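Since the output is meant to be SQLite-compatible SELECT queries, one possible safeguard is to compile each generated query against the schema before running it on real data. The sketch below uses Python's standard sqlite3 module; the helper name check_select and the sample schema are hypothetical, and the SELECT-only prefix check is a deliberate simplification (it rejects, for example, queries that start with WITH).

```python
import sqlite3


def check_select(schema_ddl, query):
    """Return (ok, message) after compiling `query` against an empty in-memory DB."""
    # Simplification: accept only statements that begin with SELECT.
    if not query.lstrip().upper().startswith("SELECT"):
        return False, "not a SELECT statement"
    conn = sqlite3.connect(":memory:")
    try:
        for ddl in schema_ddl:
            conn.execute(ddl)  # create the empty tables
        # EXPLAIN makes SQLite compile the statement, which surfaces syntax
        # errors and unknown tables or columns.
        conn.execute(f"EXPLAIN {query}")
        return True, "ok"
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()


schema = ["CREATE TABLE stadium (stadium_id number, capacity number)"]
good = check_select(schema, "SELECT MAX(capacity) FROM stadium")
bad = check_select(schema, "SELECT seats FROM stadium")
```

Here good reports success, while bad fails because the column seats does not exist in the schema.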
