# MusicGen Small
| Property | Value |
|---|---|
| Parameter Count | 591M |
| License | CC-BY-NC 4.0 |
| Author | Meta AI (Facebook) |
| Paper | Simple and Controllable Music Generation |
| Audio Sample Rate | 32 kHz |
## What is musicgen-small?
MusicGen-small is a compact text-to-music generation model developed by Meta AI's FAIR team. It uses a single-stage auto-regressive Transformer to generate instrumental music directly from text descriptions, operating on 4 codebooks sampled at 50 Hz and producing audio at a 32 kHz sample rate.
## Implementation Details
The model employs an EnCodec tokenizer for audio processing and generates all 4 codebooks in a single pass, requiring only 50 auto-regressive steps per second of audio. This efficient architecture allows for parallel prediction of codebooks with minimal delay between them.
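The cost savings from predicting codebooks in parallel can be made concrete. The sketch below is illustrative only (function and variable names are ours, not from the released codebase): it assumes the simple 1-step delay pattern described in the MusicGen paper, where codebook *k*'s token for frame *t* is emitted at step *t + k*, so K codebooks over T frames need T + K − 1 steps rather than T × K.

```python
# Sketch of a per-codebook delay pattern (assumption: 1-step delay per
# codebook, following the paper's description; names are hypothetical).

def emission_steps(num_frames: int, num_codebooks: int = 4):
    """Auto-regressive step at which each (codebook, frame) token is emitted."""
    return [[t + k for t in range(num_frames)] for k in range(num_codebooks)]

def total_steps(num_frames: int, num_codebooks: int = 4) -> int:
    """Steps needed to emit all tokens: last one lands at (T-1) + (K-1)."""
    return num_frames + num_codebooks - 1

# One second of audio at 50 Hz: 50 frames -> 53 steps with 4 codebooks,
# i.e. roughly the 50 steps/second quoted above, versus 200 if the
# codebooks were generated sequentially.
```

This is why the card can quote "only 50 auto-regressive steps per second": the delay adds a constant K − 1 steps regardless of clip length.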
- Architecture: Single-stage auto-regressive Transformer
- Tokenization: EnCodec with 4 codebooks
- Sampling Rate: 32kHz
- Parameter Size: 591M
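For orientation, here is a minimal usage sketch via the Hugging Face `transformers` integration. It assumes the `facebook/musicgen-small` checkpoint and the `MusicgenForConditionalGeneration` API (available in recent `transformers` releases); the helper function and its defaults are our own illustration, not an official recipe.

```python
# Hypothetical usage sketch; assumes `transformers` with MusicGen support
# is installed and the facebook/musicgen-small checkpoint is reachable.
MODEL_ID = "facebook/musicgen-small"
SAMPLE_RATE = 32_000  # the model outputs 32 kHz audio

def generate_music(prompt: str, max_new_tokens: int = 256):
    """Generate roughly max_new_tokens / 50 seconds of audio from a prompt."""
    from transformers import AutoProcessor, MusicgenForConditionalGeneration

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = MusicgenForConditionalGeneration.from_pretrained(MODEL_ID)
    inputs = processor(text=[prompt], padding=True, return_tensors="pt")
    # Each generated token corresponds to one 50 Hz EnCodec frame.
    audio = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return audio  # tensor of shape (batch, channels, samples)
```

With the default of 256 new tokens, the clip length is about 5 seconds (256 / 50 Hz); downloading the checkpoint and running generation requires a few GB of memory.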
## Core Capabilities
- Text-to-music generation with detailed control
- High-quality instrumental music synthesis
- Support for various musical styles and genres
- Generation of up to 8 seconds of music
- Parallel processing of audio codebooks
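The 8-second cap translates into a small, fixed compute and data budget. The arithmetic below simply combines the figures already stated on this card (50 Hz frame rate, 4 codebooks, 32 kHz output):

```python
# Back-of-the-envelope budget for the maximum 8-second clip,
# using only the figures quoted on this card.
FRAME_RATE_HZ = 50
NUM_CODEBOOKS = 4
SAMPLE_RATE_HZ = 32_000
MAX_SECONDS = 8

steps = MAX_SECONDS * FRAME_RATE_HZ       # auto-regressive steps: 400
tokens = steps * NUM_CODEBOOKS            # EnCodec tokens decoded: 1600
samples = MAX_SECONDS * SAMPLE_RATE_HZ    # PCM samples produced: 256000
```

So a full-length clip costs only 400 decoding steps, which is what makes the single-stage design practical on modest hardware.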
## Frequently Asked Questions
**Q: What makes this model unique?**
Unlike models such as MusicLM, MusicGen-small does not require a self-supervised semantic representation and generates all four codebooks in a single pass, making it more efficient and simpler to use.
**Q: What are the recommended use cases?**
The model is primarily intended for research purposes in AI-based music generation, including studying generative models' capabilities and experimenting with text-guided music creation. It should not be used for commercial applications without proper licensing and risk evaluation.