Three new arXiv articles

The last few months have been very intense for us and, as a result, three papers were recently uploaded to arXiv. Two of them have been accepted for presentation at ISMIR and are the result of a collaboration with Rong, an amazing PhD student (also advised by Xavier) working on Jingju music:

The third paper was done in collaboration with Dario (an excellent master's student!), who was interested in using deep learning models that operate directly on the audio:

Journal article: Remixing music using source separation algorithms to improve the musical experience of cochlear implant users

This journal article summarizes the most relevant results from my master's thesis research, namely those related to Western popular music. However, the thesis also describes a first attempt at remixing orchestral music to improve CI users' experience of classical music. Although the results for orchestral music are not conclusive, they provide useful intuition for designing future experiments and might be valuable for researchers interested in the topic.

Conference paper: Designing efficient architectures for modeling temporal features with CNNs

Abstract – Many researchers use convolutional neural networks with small rectangular filters for music (spectrogram) classification. First, we discuss why there is no reason to use this filter setup by default; second, we point out that more efficient architectures could be implemented if the characteristics of the music features are considered during the design process. Specifically, we propose a novel design strategy that might promote more expressive and intuitive deep learning architectures by efficiently exploiting the representational capacity of the first layer, using different filter shapes adapted to fit musical concepts within that layer. The proposed architectures are assessed by measuring their accuracy in predicting the classes of the Ballroom dataset. We also make the code used available (together with the audio data) so that this research is fully reproducible.
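To give a rough feel for what "different filter shapes" means in a first layer, here is a minimal pure-Python sketch. It is not the code released with the paper: the toy spectrogram, the filter sizes (1x8, 8x1, 3x3) and the helper names conv2d_valid, shape and n_params are all assumptions made for illustration. A real CNN would learn many such filters rather than use fixed all-ones kernels.

```python
def conv2d_valid(spec, kernel):
    """Valid-mode 2D cross-correlation of a (freq x time) matrix with one kernel."""
    n_freq, n_time = len(spec), len(spec[0])
    k_freq, k_time = len(kernel), len(kernel[0])
    out = []
    for f in range(n_freq - k_freq + 1):
        row = []
        for t in range(n_time - k_time + 1):
            acc = 0.0
            for i in range(k_freq):
                for j in range(k_time):
                    acc += spec[f + i][t + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def shape(matrix):
    """Return (rows, columns) of a list-of-lists matrix."""
    return (len(matrix), len(matrix[0]))


def n_params(kernel):
    """Number of weights in a single kernel (bias ignored)."""
    return len(kernel) * len(kernel[0])


# Toy "spectrogram" (assumption): 8 frequency bins x 16 time frames,
# with energy in every bin on every 4th frame.
spec = [[1.0 if t % 4 == 0 else 0.0 for t in range(16)] for _ in range(8)]

# Hypothetical first-layer filter shapes, written as (freq x time).
temporal = [[1.0] * 8]                    # 1 x 8: spans time within one bin
frequency = [[1.0] for _ in range(8)]     # 8 x 1: spans all bins in one frame
small = [[1.0] * 3 for _ in range(3)]     # 3 x 3: small local patch

temporal_out = conv2d_valid(spec, temporal)
frequency_out = conv2d_valid(spec, frequency)
small_out = conv2d_valid(spec, small)

for name, kernel, out in [
    ("temporal 1x8", temporal, temporal_out),
    ("frequency 8x1", frequency, frequency_out),
    ("small 3x3", small, small_out),
]:
    print(name, "params:", n_params(kernel), "output shape:", shape(out))
```

In this sketch the 1x8 and 8x1 filters each use 8 weights, slightly fewer than the 9 of a 3x3 filter, yet a single filter covers a much longer stretch of time or frequency. That is the kind of trade-off the abstract alludes to; the shapes appropriate for a real model depend on the musical concept being targeted.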

Towards adapting CNNs for music spectrograms: first attempt

These (preliminary) results indicate that CNN design for music informatics research (MIR) can be further optimized by considering the characteristics of music audio data. By designing musically motivated CNNs, a much more interpretable and efficient model can be obtained. These results are the logical culmination of the previously presented discussion, which we recommend reading first.
