This was my second ISMIR, and I am super excited of being part of this amazing, diverse, and so inclusive community. It was fun to keep putting faces (and height, and weight) to these names I respect so much! This ISMIR has been very special for me, because I was returning to the city where I kicked off my academic career (5 years ago I was starting a research internship @ IRCAM!), and we won the best student paper award!
Many things have happened between the pioneering papers written by Lewis and Todd in the 80s and the current wave of GANs composers. Along that journey, connectionists’ work was forgotten during the AI winter, very influential names (like Schmidhuber or Ng) contributed seminal publications and, in the meantime, researchers have made tons of awesome progress.
I won’t be going through every single paper in the field of neural networks for music nor diving into technicalities, but I’ll cover what are the milestones that helped shaping the current state of music AI – this being a nice excuse to give credit to these wild researchers who decided to care about a signal that is nothing else but cool. Let’s start!
Here my first personal AMA interview! But wait, what’s an AMA interview? AMA stands for “Ask Me Anything” in Reddit jargon. After reading this interview you will know a bit more about my life and way of thinking 🙂 This interview is a dissemination effort done by the María de Maeztu program (who funds my PhD research), and the AI Grant (who supports our Freesound Datasets project). Let’s start!
1) Given that enough training data is available: waveform models (sampleCNN) > spectrogram models (musically motivated CNN).
2) But spectrogram models > waveform models when no sizable data are available.
3) Musically motivated CNNs achieve state-of-the-art results for the MTT & MSD datasets.