TY - GEN
T1 - VAMPNET
T2 - 24th International Society for Music Information Retrieval Conference, ISMIR 2023
AU - Flores García, Hugo
AU - Seetharaman, Prem
AU - Kumar, Rithesh
AU - Pardo, Bryan
N1 - Publisher Copyright: © H. Flores García, P. Seetharaman, R. Kumar, and B. Pardo.
PY - 2023
Y1 - 2023
AB - We introduce VampNet, a masked acoustic token modeling approach to music synthesis, compression, inpainting, and variation. We use a variable masking schedule during training which allows us to sample coherent music from the model by applying a variety of masking approaches (called prompts) during inference. VampNet is non-autoregressive, leveraging a bidirectional transformer architecture that attends to all tokens in a forward pass. With just 36 sampling passes, VampNet can generate coherent high-fidelity musical waveforms. We show that by prompting VampNet in various ways, we can apply it to tasks like music compression, inpainting, outpainting, continuation, and looping with variation (vamping). Appropriately prompted, VampNet is capable of maintaining style, genre, instrumentation, and other high-level aspects of the music. This flexible prompting capability makes VampNet a powerful music co-creation tool. Code and audio samples are available online.
UR - http://www.scopus.com/inward/record.url?scp=85200693553&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85200693553&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85200693553
T3 - 24th International Society for Music Information Retrieval Conference, ISMIR 2023 - Proceedings
SP - 359
EP - 366
BT - 24th International Society for Music Information Retrieval Conference, ISMIR 2023 - Proceedings
A2 - Sarti, Augusto
A2 - Antonacci, Fabio
A2 - Sandler, Mark
A2 - Bestagini, Paolo
A2 - Dixon, Simon
A2 - Liang, Beici
A2 - Richard, Gael
A2 - Pauwels, Johan
PB - International Society for Music Information Retrieval
Y2 - 5 November 2023 through 9 November 2023
ER -