CodeGen-350M-multi
| Property | Value |
|---|---|
| Author | Salesforce |
| License | BSD-3-Clause |
| Paper | A Conversational Paradigm for Program Synthesis |
| Training Data | 119.2B tokens from multiple programming languages |
What is CodeGen-350M-multi?
CodeGen-350M-multi is an autoregressive language model for program synthesis, developed by Salesforce. It generates executable code from natural language descriptions. The model was initialized from CodeGen-NL 350M and further pre-trained on a dataset of multiple programming languages, including C, C++, Go, Java, JavaScript, and Python.
Implementation Details
The model uses a transformer-based architecture and was trained with a cross-entropy loss, which maximizes the likelihood of each token given the tokens that precede it. Training ran on TPU-v4-512 hardware using both data and model parallelism.
- 350M trainable parameters
- Pre-trained on 119.2B tokens of multi-language code
- Built on the transformer architecture
- Runs on the PyTorch framework
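The training objective described above can be illustrated with a small worked example. This is a minimal sketch, not the Salesforce training code: it computes the average next-token cross-entropy for a toy sequence from hand-written logits. The helper names `log_softmax` and `next_token_cross_entropy`, the 3-token vocabulary, and all numeric values are assumptions made up for illustration.

```python
import math

def log_softmax(logits):
    """Return log-probabilities for one vector of logits (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def next_token_cross_entropy(logits_per_step, token_ids):
    """Average negative log-likelihood of token_ids[1:] given their prefixes.

    logits_per_step[t] holds the scores over the vocabulary predicted after
    seeing token_ids[: t + 1], so it is compared against token_ids[t + 1].
    """
    targets = token_ids[1:]
    if len(logits_per_step) != len(targets):
        raise ValueError("need one logit vector per predicted token")
    nll = 0.0
    for logits, target in zip(logits_per_step, targets):
        nll -= log_softmax(logits)[target]
    return nll / len(targets)

# Hypothetical toy data: a vocabulary of 3 tokens and the sequence [0, 2, 1].
toy_tokens = [0, 2, 1]
toy_logits = [
    [0.0, 0.0, 0.0],  # uniform guess for the second token
    [0.0, 5.0, 0.0],  # confident, correct guess for the third token
]
loss = next_token_cross_entropy(toy_logits, toy_tokens)
print(f"mean cross-entropy: {loss:.4f}")
```

Minimizing this quantity is equivalent to maximizing the likelihood of the sequence: the confident, correct prediction at the second step contributes almost nothing to the loss, while the uniform guess at the first step contributes log 3.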
Core Capabilities
- Program synthesis from natural language descriptions
- Code completion and generation
- Multi-language code generation
- Feature extraction from both natural language and programming language texts
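As one possible way to use the feature-extraction capability listed above, the sketch below averages the model's final hidden states into a single vector. It assumes the `transformers` and `torch` packages are installed and that the weights are hosted under the checkpoint name `Salesforce/codegen-350M-multi`; neither detail comes from this page. Mean pooling is also an assumption chosen for simplicity, not a method prescribed by the authors, and `mean_pool` and `extract_features` are hypothetical helper names.

```python
def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one feature vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]

def extract_features(text, checkpoint="Salesforce/codegen-350M-multi"):
    """Embed text with the model's final hidden states (downloads weights).

    Assumption: the checkpoint name matches the hosted weights and the
    transformers and torch packages are available. Imports are deferred so
    that mean_pool can be used without them.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, hidden_size)
    return mean_pool(hidden[0].tolist())

# The pooling step on its own, with made-up 2-dimensional token vectors.
print(mean_pool([[1.0, 2.0], [3.0, 4.0]]))
```

Calling `extract_features("def add(a, b): return a + b")` would return one vector per input text, which could then be compared across natural language and code snippets.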
Frequently Asked Questions
Q: What makes this model unique?
This model focuses specifically on program synthesis across multiple programming languages. Because it was pre-trained first on natural language and then on several programming languages, it is well suited to turning English descriptions into executable code.
Q: What are the recommended use cases?
The model is best suited to generating executable code from English prompts, where the prompt is written as a comment string. It can also complete partially generated code in its supported programming languages.
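A prompt written as a comment string might be built and passed to the model as sketched below. The helper names `build_prompt` and `generate_code`, the comment-prefix table, and the checkpoint name `Salesforce/codegen-350M-multi` are assumptions for illustration; the generation call uses the standard `transformers` causal language model interface and downloads the weights on first use.

```python
# Line-comment prefixes for the languages named on this page (assumed mapping).
COMMENT_PREFIX = {
    "python": "# ",
    "javascript": "// ",
    "java": "// ",
    "go": "// ",
    "c": "// ",
    "c++": "// ",
}

def build_prompt(description, language="python", code_start=""):
    """Turn an English description into comment lines, then append any code prefix."""
    prefix = COMMENT_PREFIX[language.lower()]
    lines = [prefix + line.strip() for line in description.strip().splitlines()]
    return "\n".join(lines) + "\n" + code_start

def generate_code(prompt, checkpoint="Salesforce/codegen-350M-multi", max_length=128):
    """Complete a prompt with the model (requires transformers and torch)."""
    # Deferred import so build_prompt works without the library installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    generated_ids = model.generate(input_ids, max_length=max_length)
    return tokenizer.decode(generated_ids[0], skip_special_tokens=True)

prompt = build_prompt("Return the n-th Fibonacci number.", "python", "def fib(n):")
print(prompt)
```

Passing `prompt` to `generate_code` would ask the model to continue from the function signature. Supplying a partial function body as `code_start` exercises the code-completion use case in the same way.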