CodeGen-350M-multi

Property	Value
Author	Salesforce
License	BSD-3-Clause
Paper	A Conversational Paradigm for Program Synthesis
Training Data	119.2B tokens from multiple programming languages

What is CodeGen-350M-multi?

CodeGen-350M-multi is an autoregressive language model for program synthesis, developed by Salesforce. Given a natural language description, it generates executable code. It was initialized from CodeGen-NL 350M and further pre-trained on a multi-language code dataset covering C, C++, Go, Java, JavaScript, and Python.

Implementation Details

The model uses a transformer-based architecture and was trained with a cross-entropy loss, which maximizes the likelihood of each token given the tokens that precede it. Training ran on TPU-v4-512 hardware using both data and model parallelism.

350M trainable parameters
Pre-trained on 119.2B tokens of multi-language source code
Transformer-based architecture
Available for the PyTorch framework
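
The cross-entropy objective mentioned above can be illustrated with a small, dependency-free sketch. This is not the Salesforce training code; the function names and the toy inputs are hypothetical, and the example only shows how a next-token loss is computed from per-position scores.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one vector of vocabulary scores."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]

def next_token_loss(logits_per_position, token_ids):
    """Mean cross-entropy of predicting token t+1 from the scores at position t.

    logits_per_position[t] holds the scores produced after reading tokens 0..t.
    """
    if len(token_ids) < 2:
        raise ValueError("need at least two tokens to form one prediction")
    total = 0.0
    # Scores at position t are compared with the token at position t + 1,
    # so the final position has no target and is skipped.
    for t in range(len(token_ids) - 1):
        log_probs = log_softmax(logits_per_position[t])
        total -= log_probs[token_ids[t + 1]]
    return total / (len(token_ids) - 1)

# Toy example (hypothetical data): vocabulary of 3 tokens, sequence of 3 ids.
toy_logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
toy_tokens = [0, 1, 2]
loss = next_token_loss(toy_logits, toy_tokens)
print(round(loss, 4))  # uniform scores give ln(3), printed as 1.0986
```

Minimizing this quantity is equivalent to maximizing the likelihood of the sequence, which is why a model that assigns uniform scores over three tokens ends up at ln(3).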

Core Capabilities

Program synthesis from natural language descriptions
Code completion and generation
Multi-language code generation
Feature extraction from both natural language and programming language texts
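
As a rough sketch of the feature extraction capability listed above, the snippet below averages the model's final hidden states into one vector per input. It assumes the checkpoint is published on the Hugging Face Hub under the identifier `Salesforce/codegen-350M-multi` and that the `transformers` and `torch` packages are installed. Mean pooling is one common choice made here for illustration, not a method prescribed by the model authors, and `mean_pool` and `embed` are hypothetical helper names.

```python
def mean_pool(token_vectors):
    """Average a list of per-token vectors into one fixed-size vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    width = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / count for i in range(width)]

def embed(text, checkpoint="Salesforce/codegen-350M-multi"):
    """Return one pooled feature vector for `text`.

    The checkpoint name is an assumption; weights are downloaded on first use.
    """
    # Imported lazily so mean_pool works without these packages installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # last_hidden_state has shape (batch=1, tokens, hidden); drop the batch axis.
    token_vectors = outputs.last_hidden_state[0].tolist()
    return mean_pool(token_vectors)
```

Calling `embed("def add(a, b): return a + b")` and `embed("Add two numbers.")` would give vectors of the same length, which can then be compared with a similarity measure of your choice.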

Frequently Asked Questions

Q: What makes this model unique?

The model targets program synthesis across multiple programming languages. Because it was initialized from a natural language checkpoint (CodeGen-NL 350M) and then pre-trained on code in six programming languages, it is suited to turning English descriptions into executable code.

Q: What are the recommended use cases?

The model is best suited to generating executable code from English prompts, where the prompt is written as a comment string. It can also complete partially written code in any of its training languages.
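
A minimal sketch of the comment-string workflow might look as follows. It assumes the checkpoint is available on the Hugging Face Hub as `Salesforce/codegen-350M-multi` and that `transformers` with a PyTorch backend is installed. The helpers `comment_prompt` and `complete` are hypothetical names introduced for this example, and the comment prefixes are a simplification (single-line comments only).

```python
def comment_prompt(description, language="python"):
    """Wrap an English task description in the comment syntax of `language`."""
    prefixes = {
        "python": "# ",
        "c": "// ",
        "c++": "// ",
        "go": "// ",
        "java": "// ",
        "javascript": "// ",
    }
    prefix = prefixes[language.lower()]
    lines = description.strip().splitlines()
    return "\n".join(prefix + line.strip() for line in lines) + "\n"

def complete(prompt, max_new_tokens=64, checkpoint="Salesforce/codegen-350M-multi"):
    """Generate a continuation of `prompt`.

    The checkpoint name is an assumption; weights are downloaded on first use.
    """
    # Imported lazily so comment_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    generated = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(generated[0], skip_special_tokens=True)

prompt = comment_prompt("Return the n-th Fibonacci number.", language="go")
print(prompt, end="")  # // Return the n-th Fibonacci number.
```

Passing the result to `complete(prompt)` would return the prompt followed by the model's continuation. For code completion, append a partial definition (for example a function signature) to the prompt before calling `complete`.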