CLaMP: Contrastive Language-Music Pre-Training for Cross-Modal Symbolic Music Information Retrieval

doi:10.5281/zenodo.10265247

Published November 4, 2023 | Version v1

Conference paper (open access)

We introduce CLaMP (Contrastive Language-Music Pre-training), which learns cross-modal representations between natural language and symbolic music using a music encoder and a text encoder trained jointly with a contrastive loss. To pre-train CLaMP, we collected a large dataset of 1.4 million music-text pairs. CLaMP employs text dropout as a data augmentation technique and bar patching to represent music data efficiently, which reduces sequence length to less than 10% of the original. In addition, we developed a masked music model pre-training objective to enhance the music encoder's comprehension of musical context and structure. CLaMP integrates textual information to enable semantic search and zero-shot classification for symbolic music, surpassing the capabilities of previous models. To support the evaluation of semantic search and music classification, we publicly release WikiMusicText (WikiMT), a dataset of 1010 lead sheets in ABC notation, each accompanied by a title, artist, genre, and description. Compared with state-of-the-art models that require fine-tuning, zero-shot CLaMP demonstrates comparable or superior performance on score-oriented datasets. Our models and code are available at https://github.com/microsoft/muzic/tree/main/clamp.
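The abstract names two mechanisms that a small example can make concrete: joint training with a contrastive loss, and zero-shot classification by comparing a music embedding against text embeddings. The following is a minimal sketch in plain Python, not the authors' implementation. It assumes a symmetric softmax cross-entropy over cosine similarities, which is a common form of contrastive loss; the function names, the temperature value of 0.2, and the toy two-dimensional embeddings are all hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def similarity_logits(music_embs, text_embs, temperature=0.2):
    """Pairwise cosine similarities divided by a temperature (assumed value)."""
    music = [l2_normalize(v) for v in music_embs]
    text = [l2_normalize(v) for v in text_embs]
    return [
        [sum(a * b for a, b in zip(m, t)) / temperature for t in text]
        for m in music
    ]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def contrastive_loss(music_embs, text_embs, temperature=0.2):
    """Symmetric loss: average of music-to-text and text-to-music terms."""
    logits = similarity_logits(music_embs, text_embs, temperature)
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(transposed))


def zero_shot_classify(music_emb, label_embs):
    """Return the label whose text embedding is most similar to the music embedding."""
    music = l2_normalize(music_emb)
    scores = {
        label: sum(a * b for a, b in zip(music, l2_normalize(emb)))
        for label, emb in label_embs.items()
    }
    return max(scores, key=scores.get)


# Toy embeddings standing in for encoder outputs (hypothetical values).
music_batch = [[1.0, 0.0], [0.0, 1.0]]
matched_text = [[1.0, 0.0], [0.0, 1.0]]
mismatched_text = [[0.0, 1.0], [1.0, 0.0]]

loss_matched = contrastive_loss(music_batch, matched_text)
loss_mismatched = contrastive_loss(music_batch, mismatched_text)
predicted = zero_shot_classify([0.9, 0.1], {"jazz": [1.0, 0.0], "folk": [0.0, 1.0]})
```

In this sketch the loss is small when each piece of music is closest to its own text and large when the pairs are swapped, which is the signal that pulls matching pairs together during training. Zero-shot classification then needs no fine-tuning: each candidate label is embedded as text and the nearest one is returned.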

Files

000017.pdf

Files (456.5 kB)

Name	Size
000017.pdf (md5:f34ff63f4d2ac5baf2f9f49342efc652)	456.5 kB

	All versions	This version
Views	216	216
Downloads	204	204
Data volume	106.8 MB	106.8 MB

Resource type

Conference paper

Publisher

ISMIR

Imprint

Proceedings of the 24th International Society for Music Information Retrieval Conference, 157-165. Milan, Italy.

Conference

International Society for Music Information Retrieval Conference (ISMIR 2023), Milan, Italy, November 5-9, 2023

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: December 5, 2023
Modified: July 10, 2024
