Miron M, Janer J, Gomez E. Monaural score-informed source separation for classical music using convolutional neural networks. ISMIR 2017
Score information has been shown to improve music source separation when incorporated into non-negative matrix factorization (NMF) frameworks. Recently, deep learning approaches have outperformed NMF methods in both separation quality and processing time, and there is scope to extend them with score information. In this paper, we propose a score-informed separation system for classical music based on deep learning. We propose a method to derive training features from audio files and the corresponding coarsely aligned scores for a set of classical music pieces. Additionally, we introduce a convolutional neural network (CNN) architecture that estimates time-frequency masks for source separation. Our system is trained on synthetic renditions derived from the original scores and can be used to separate real-life performances of the same scores, given a coarse audio-to-score alignment. On a dataset of Bach chorales, the proposed system achieves better performance (SDR and SIR) and is less computationally intensive than a score-informed NMF system.
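To make the idea of time-frequency masking concrete, the following is a minimal sketch of how per-source masks can be derived from magnitude estimates and applied to a mixture spectrogram. It assumes a common ratio-style soft mask; the abstract does not state the exact mask formulation used in the paper, and the function names `soft_masks` and `apply_masks` are hypothetical, not taken from the authors' code.

```python
import numpy as np


def soft_masks(source_magnitudes, eps=1e-8):
    """Turn per-source magnitude estimates into ratio masks.

    source_magnitudes: array of shape (n_sources, n_freq, n_frames) with
    non-negative values, e.g. the output of a separation network.
    The ratio-mask form is an assumption, not the paper's stated method.
    """
    est = np.asarray(source_magnitudes, dtype=float)
    total = est.sum(axis=0, keepdims=True)
    # Each mask is the source's share of the total estimated magnitude,
    # so the masks sum to (almost exactly) one in every time-frequency bin.
    return est / (total + eps)


def apply_masks(mixture_stft, masks):
    """Multiply the complex mixture STFT by each mask.

    mixture_stft: complex array of shape (n_freq, n_frames).
    Returns an array of shape (n_sources, n_freq, n_frames); each slice
    keeps the mixture phase and can be inverted with an inverse STFT.
    """
    return masks * mixture_stft[np.newaxis, ...]
```

Because the masks sum to one in each bin, the separated spectrograms add back up to the mixture, which is a convenient sanity check when experimenting with a mask of this kind.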
Additional material:
- The code is available through a GitHub repository.
- We test our method on Bach10, a well-known classical music dataset, which can be found online.
- The training data is generated from the audio samples in the RWC instrument samples dataset using the code in the GitHub repository.
- The separated tracks, the trained CNN model, and the .mat files containing the results in terms of SDR, SIR, and SAR can be found in the Zenodo repository.