Background

Background: The paper addresses the lack of publicly available music-related natural language corpora, which limits the ability of language models (LMs) to understand and generate music.
Existing Work: Previous research could not address this issue, mainly because no pre-training dataset dedicated to music existed, which limited the application of models in the music domain.

Core Contributions

The paper introduces a pre-training dataset called MusicPile and addresses two challenges:
- Challenge 1: Lack of music-related corpora. The researchers curated the MusicPile dataset by filtering existing large-scale corpora with music-related terms and topics, in order to inject musical capability into LLMs.
- Challenge 2: Limited performance in understanding and generating music. The researchers established MusicTheoryBench, a benchmark for evaluating LLMs' music understanding and reasoning capabilities. They also showed that ABC notation, a text-based music notation system, compresses and encodes musical structure effectively.
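
The term-based selection described under Challenge 1 can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the keyword list `MUSIC_TERMS`, the helper names, and the `min_hits` threshold are all assumptions made for the example, since the summary does not state which terms or thresholds were used.

```python
import re

# Hypothetical keyword list; the actual terms used to build MusicPile are not given here.
MUSIC_TERMS = ["chord", "melody", "harmony", "tempo", "sonata"]

# Whole-word, case-insensitive match for any of the terms.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, MUSIC_TERMS)) + r")\b",
    re.IGNORECASE,
)


def music_term_hits(text: str) -> int:
    """Count whole-word occurrences of any music term in the text."""
    return len(_PATTERN.findall(text))


def filter_music_documents(documents, min_hits=2):
    """Keep documents that mention at least `min_hits` music terms.

    `min_hits` is an assumed threshold chosen for illustration only.
    """
    return [doc for doc in documents if music_term_hits(doc) >= min_hits]


docs = [
    "The melody moves over a simple chord progression at a slow tempo.",
    "Quarterly revenue rose as shipping costs fell.",
    "She hummed a melody while cooking.",
]
selected = filter_music_documents(docs)
```

Requiring more than one hit is one simple way to avoid keeping documents that mention music only in passing; a real curation pipeline would likely combine several signals.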

Implementation and Deployment

According to the reported evaluation, the proposed methods and datasets improve music understanding and generation capabilities. On the understanding tests, all systems exceeded the random baseline, with GPT-4 reaching the highest music knowledge score of 58.2. Music reasoning scores were modest overall, yet ChatMusician-Base and ChatMusician scored 27.1 and 26.3 in the zero-shot setting, surpassing GPT-4. For music generation, the proposed ABC notation compression approach achieved a higher compression ratio, and human evaluators rated the music it generated as more musical.
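
To give a feel for why ABC notation is compact, the sketch below writes the same eight-note scale once as an ABC tune and once as a verbose event list. Both the tune and the event format are hand-written for illustration; the event list is a hypothetical stand-in for a MIDI-like encoding, and the character-count ratio is a crude proxy, not the compression ratio measured in the paper.

```python
# A short tune in ABC notation (hand-written for illustration, not from the paper).
# Header fields: X = tune index, T = title, M = meter, L = default note length, K = key.
abc_tune = """X:1
T:Example
M:4/4
L:1/4
K:C
C D E F | G A B c |]
"""

# The same eight quarter notes as MIDI pitch numbers (C major scale from middle C).
PITCHES = [60, 62, 64, 65, 67, 69, 71, 72]

# Hypothetical verbose encoding: one note-on/note-off pair per note, timed in beats.
events = [
    f"NOTE_ON pitch={p} time={i} | NOTE_OFF pitch={p} time={i + 1}"
    for i, p in enumerate(PITCHES)
]
event_text = "\n".join(events)


def compression_ratio(verbose: str, compact: str) -> float:
    """Ratio of character counts; values above 1 mean `compact` is shorter."""
    return len(verbose) / len(compact)


ratio = compression_ratio(event_text, abc_tune)
print(f"ABC: {len(abc_tune)} chars, events: {len(event_text)} chars, ratio: {ratio:.1f}")
```

A tokenizer-level comparison would be more faithful to how a language model sees the two encodings, but the character counts already show the general effect: ABC states the meter, key, and default note length once in the header instead of repeating timing information for every note.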

Summary

The paper makes substantial progress in an under-researched domain by creating the first music pre-training dataset and evaluation benchmark for language models, improving LLMs' performance in understanding and generating music.
