Today we release “PodcastMix”, a dataset for separating music and speech in podcasts:
Our EUSIPCO 2017 paper, “Timbre Analysis of Music Audio Signals with Convolutional Neural Networks”, has been accepted! It was written in collaboration with Olga Slizovskaia, Rong Gong, Emilia Gómez and Xavier Serra.
I have also been awarded one of the AI Grants given by Nat Friedman for creating a dataset of sounds from Freesound and using it in my research. The AI Grants are an initiative of Nat Friedman, co-founder and CEO of Xamarin, to support open-source AI projects. The project I proposed is part of an initiative of the MTG to promote the use of Freesound.org for research. The goal is to create a large dataset of sounds, following the same principles as ImageNet, in order to make audio AI more accessible to everyone. The project will contribute to developing the infrastructure for a crowdsourcing tool that turns Freesound into a research dataset. The following video presents the aforementioned project:
After digging into the AudioSet ontology, we realized that a tree-like visualization could be very useful for understanding it. To this end, we adapted some code from Mike Bostock's collapsible tree example: https://bl.ocks.org/mbostock/4339083
Project done in collaboration with Xavier Favory, Eduardo Fonseca and Frederic Font.
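Before a d3 collapsible tree can render the ontology, the flat list of categories has to be nested into parent/child dictionaries. A minimal Python sketch of that step, assuming the AudioSet ontology's JSON format (entries carrying `id`, `name`, and `child_ids` fields); the sample entries below are illustrative, not taken from the real ontology file:

```python
import json

# Illustrative entries mimicking the AudioSet ontology JSON format;
# these specific ids and names are hypothetical placeholders.
flat = [
    {"id": "/m/root", "name": "Sounds", "child_ids": ["/m/music", "/m/speech"]},
    {"id": "/m/music", "name": "Music", "child_ids": []},
    {"id": "/m/speech", "name": "Speech", "child_ids": []},
]

def to_tree(entries, root_id):
    """Nest a flat id/child_ids list into the {name, children} dicts
    that d3's tree layouts expect."""
    by_id = {e["id"]: e for e in entries}
    def build(node_id):
        entry = by_id[node_id]
        return {"name": entry["name"],
                "children": [build(c) for c in entry["child_ids"]]}
    return build(root_id)

tree = to_tree(flat, "/m/root")
print(json.dumps(tree, indent=2))
```

The resulting nested JSON can be loaded directly by the adapted visualization code in place of its original `flare.json` data file.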