Slides: Deep learning for music data processing – a personal (re)view

I was invited to give a talk at the Deep Learning for Speech and Language Winter Seminar @ UPC, Barcelona. Since UPC is the university where I did my undergraduate studies, it was a great pleasure to give an introductory talk about how our community is using deep learning to approach music technology problems.

Download the slides!

Overall, the talk centered on reviewing the state of the art (1988-2016) in deep learning for music data processing, in order to spark some discussion about current trends. Several key papers were listed chronologically and briefly described: pioneering papers using MLPs [1], RNNs [2], LSTMs [3] and CNNs [4] for music data processing; and pioneering papers using symbolic data [1], spectrograms [5] and waveforms [6], among others.

Throughout the slides, I present a chronology where some papers are highlighted.

Do you agree with this chronology? Feel free to contact me with any suggestions (or claims) about which papers were the first to use well-known deep learning techniques for music data processing.

[1] J. P. Lewis. “Creation by refinement: A creativity paradigm for gradient descent learning networks”. International Conf. on Neural Networks. 1988.

[2] P. M. Todd. “A sequential network design for musical applications”. Proceedings of the Connectionist Models Summer School. 1988.

[3] D. Eck and J. Schmidhuber. “A first look at music composition using LSTM recurrent neural networks”. Istituto Dalle Molle Di Studi Sull Intelligenza Artificiale. 2002.

[4] H. Lee, P. Pham, Y. Largman, and A. Y. Ng. “Unsupervised feature learning for audio classification using convolutional deep belief networks”. Advances in Neural Information Processing Systems (NIPS). 2009.

[5] M. Marolt, A. Kavcic, and M. Privosnik. “Neural networks for note onset detection in piano music”. International Computer Music Conference (ICMC). 2002.

[6] S. Dieleman and B. Schrauwen. “End-to-end learning for music audio”. International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2014.

Thanks to @DocXavi for the picture!