COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations

Abstract

We present COCOLA (Coherence-Oriented Contrastive Learning for Audio), a contrastive learning method for musical audio representations that captures the harmonic and rhythmic coherence between samples. The method operates at the level of the stems (or combinations of stems) composing a music track and enables the objective evaluation of compositional music generation models on the task of accompaniment generation. We also introduce a new baseline for compositional music generation called CompoNet, based on ControlNet, which generalizes the tasks of MSDM, and we evaluate it against the latter using COCOLA. We release all models trained on public datasets containing separate stems (MUSDB18-HQ, MoisesDB, Slakh2100, and CocoChorales).
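To make the idea concrete, below is a minimal sketch of a coherence-oriented contrastive objective (InfoNCE-style) over pairs of stem embeddings. This is an illustrative assumption about the setup, not the paper's actual implementation: the function name, embedding shapes, and temperature are hypothetical, and the embeddings are assumed to come from an audio encoder applied to two disjoint stem combinations of the same track excerpt.

```python
# Hypothetical sketch: InfoNCE-style loss over stem-pair embeddings.
# Entry i of each tensor is a positive pair (stems from the same excerpt);
# all other entries in the batch act as negatives.
import torch
import torch.nn.functional as F


def coherence_contrastive_loss(anchor_emb: torch.Tensor,
                               partner_emb: torch.Tensor,
                               temperature: float = 0.07) -> torch.Tensor:
    """anchor_emb, partner_emb: (batch, dim) embeddings of two stem (sub)mixes."""
    a = F.normalize(anchor_emb, dim=-1)
    p = F.normalize(partner_emb, dim=-1)
    logits = a @ p.t() / temperature                      # (batch, batch) similarities
    targets = torch.arange(a.size(0), device=a.device)    # matching indices are positives
    # Symmetric cross-entropy: anchor -> partner and partner -> anchor.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))


# Usage with random stand-in embeddings (batch of 8, 512-dim):
loss = coherence_contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
```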

Publication
arXiv preprint
Emilian Postolache
Senior AI Research Scientist

Giorgio Mariani
PhD Student

Michele Mancusi
PhD Student

Luca Cosmo
Assistant Professor

Emanuele Rodolà
Full Professor