These last months have been very intense for us and, as a result, three papers were recently uploaded to arXiv. Two of them have been accepted for presentation at ISMIR and are the result of a collaboration with Rong, an amazing PhD student (also advised by Xavier) working on Jingju music:
This journal article summarizes the most relevant results from my master's thesis research, namely those related to popular Western music. However, in the thesis we also describe a first attempt at remixing orchestral music to improve CI users' experience of classical music. Although the results for orchestral music are not conclusive, they provide useful intuition for designing future experiments and might be valuable for researchers interested in that topic.
Abstract – Many researchers use convolutional neural networks with small rectangular filters for music (spectrogram) classification. First, we discuss why there is no reason to use this filter setup by default and, second, we point out that more efficient architectures could be implemented if the characteristics of the music features are considered during the design process. Specifically, we propose a novel design strategy that might promote more expressive and intuitive deep learning architectures by efficiently exploiting the representational capacity of the first layer, using different filter shapes adapted to fit musical concepts within the first layer. The proposed architectures are assessed by measuring their accuracy in predicting the classes of the Ballroom dataset. We also make the code used available (together with the audio data) so that this research is fully reproducible.
These (preliminary) results suggest that CNN design for music informatics research (MIR) can be further optimized by considering the characteristics of music audio data. Designing musically motivated CNNs can yield a much more interpretable and efficient model. These results are the logical culmination of the previously presented discussion, which we recommend reading first.
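To make the idea of differently shaped first-layer filters more concrete, here is a minimal pure-Python sketch. It is not the code released with the paper: the toy spectrogram, the filter sizes (1×6, 5×1, 3×3) and all names are assumptions chosen only for illustration. It shows how a filter that is wide in time, one that is tall in frequency, and a small square one each see a different slice of the same spectrogram and produce differently sized feature maps.

```python
# Illustrative sketch only: filter sizes, names and the toy input are
# assumptions, not the settings used in the paper.

def conv2d_valid(spec, kernel):
    """Cross-correlate `kernel` with `spec` (both lists of rows), no padding."""
    n_freq, n_time = len(spec), len(spec[0])
    k_freq, k_time = len(kernel), len(kernel[0])
    out = []
    for f in range(n_freq - k_freq + 1):
        row = []
        for t in range(n_time - k_time + 1):
            acc = 0.0
            for i in range(k_freq):
                for j in range(k_time):
                    acc += spec[f + i][t + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def shape(matrix):
    """Return (rows, columns) of a list-of-rows matrix."""
    return (len(matrix), len(matrix[0]))


# Toy "spectrogram": 8 frequency bins (rows) x 16 time frames (columns).
N_FREQ, N_TIME = 8, 16
spectrogram = [[float((f + t) % 4) for t in range(N_TIME)] for f in range(N_FREQ)]

# Three hypothetical first-layer filter shapes (simple averaging weights).
filters = {
    "temporal": [[1.0 / 6] * 6],                     # 1 x 6: spans time only
    "frequency": [[1.0 / 5] for _ in range(5)],      # 5 x 1: spans frequency only
    "square": [[1.0 / 9] * 3 for _ in range(3)],     # 3 x 3: small generic filter
}

feature_maps = {name: conv2d_valid(spectrogram, k) for name, k in filters.items()}

for name, fmap in feature_maps.items():
    print(name, shape(filters[name]), "->", shape(fmap))
```

In a real model the weights would be learned rather than fixed averages, and a deep learning framework would replace the explicit loops; the point here is only that the filter shape determines which axis of the spectrogram (time, frequency, or both) each first-layer unit can model.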