Musically motivated CNNs
Waveform-based deep learning for speech denoising & source separation
Freesound Datasets: we are building the ImageNet for audio AI – visit our website!
Former academic projects
- Improving cochlear implant users' music perception @ German Hearing Center
Music appreciation remains rather poor for many cochlear implant (CI) users because of their limited pitch perception. Simple musical structures with a clear rhythm or beat, however, are well perceived by CI users. Re-mixing the music makes it possible to simplify the signal so that it is better suited to implantees, but the multitrack recordings needed to generate a re-mix are not always accessible. To overcome this limitation, we proposed using source separation techniques to estimate the multitrack recordings. We conducted perceptual studies with separations based on non-negative matrix factorization, and we provided additional results using deep recurrent neural networks as the source separation algorithm – see our JASA article.
- Drums Transcription @ IRCAM (Paris)
We studied novel audio event detection methods using non-negative matrix deconvolution and the Itakura-Saito divergence. We proposed a new approach for handling background sounds and introduced a new detection criterion based on estimating the perceptual presence of the target class sources. Experimental results for drum detection in polyphonic music and drum solos demonstrate the beneficial effects of the proposed extensions – see our ICASSP paper.
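For readers unfamiliar with the Itakura-Saito divergence mentioned above, here is a minimal sketch of its standard definition, d(x|y) = x/y - log(x/y) - 1, summed over the bins of two power spectra. The function name `itakura_saito` and the toy spectra are illustrative assumptions; this is not the detection system from the paper, only the cost function it builds on.

```python
import math

def itakura_saito(x, y):
    """Sum of element-wise Itakura-Saito divergences d(x|y) = x/y - log(x/y) - 1.

    x and y are equal-length sequences of strictly positive values,
    for example the power-spectrum bins of an observed frame (x)
    and of its model approximation (y).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    total = 0.0
    for xi, yi in zip(x, y):
        if xi <= 0 or yi <= 0:
            raise ValueError("the divergence is only defined for positive values")
        ratio = xi / yi
        total += ratio - math.log(ratio) - 1.0
    return total

# Toy spectra (hypothetical values, for illustration only).
observed = [1.0, 2.0, 4.0]
model = [2.0, 4.0, 8.0]

# The divergence depends only on the ratio x/y, so rescaling both
# spectra by the same gain leaves it unchanged (scale invariance).
quiet = itakura_saito(observed, model)
loud = itakura_saito([100.0 * v for v in observed], [100.0 * v for v in model])
print(quiet, loud)
```

Because of this scale invariance, low-energy and high-energy spectral components contribute on an equal footing, which is one common reason the divergence is used with audio spectra.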
Former non-academic projects
- Chord Profiles – Summer project
A SIMPLE framework where ALL binary chord profiles are properly defined – available on GitHub.
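As an illustration of what a binary chord profile is, the sketch below builds a 12-element vector with a 1 for every pitch class that belongs to the chord and a 0 elsewhere. It is not the code from the GitHub repository: the names `PITCH_CLASSES`, `CHORD_INTERVALS` and `chord_profile` are hypothetical, and only four triad types are covered here.

```python
# Pitch classes in semitone order, starting from C.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Intervals in semitones above the root (illustrative subset of chord types).
CHORD_INTERVALS = {
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
    "dim": (0, 3, 6),
    "aug": (0, 4, 8),
}

def chord_profile(root, quality):
    """Return a 12-element binary list marking the chord's pitch classes."""
    root_index = PITCH_CLASSES.index(root)
    profile = [0] * 12
    for interval in CHORD_INTERVALS[quality]:
        # Wrap around the octave so that, e.g., A minor reaches C and E.
        profile[(root_index + interval) % 12] = 1
    return profile

print(chord_profile("C", "maj"))  # [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
print(chord_profile("A", "min"))  # [1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
```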
- How big is the smallest pitch difference between 2 consecutive tones that a human listener can detect? @ Master in Sound & Music Computing
We answered this question with a perceptual test. We concluded that the smallest pitch difference between 2 consecutive tones that a musician can detect (3.18 mels) is smaller than the smallest pitch difference perceived by a non-musician (11.41 mels). Further conclusions are listed here.
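To get a feel for these thresholds, a difference in mels can be converted to a difference in Hz around a reference tone. The sketch below assumes the widely used formula mel = 2595 · log10(1 + f/700) and a hypothetical 440 Hz reference tone; the study may have used a different mel definition and different reference frequencies, so the printed values are only indicative.

```python
import math

def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (2595 * log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_step_in_hz(reference_hz, delta_mel):
    """Hz difference reached by moving delta_mel mels above reference_hz."""
    return mel_to_hz(hz_to_mel(reference_hz) + delta_mel) - reference_hz

REFERENCE_HZ = 440.0  # hypothetical reference tone, not taken from the study

for label, delta in (("musician", 3.18), ("non-musician", 11.41)):
    step = mel_step_in_hz(REFERENCE_HZ, delta)
    print(f"{label}: {delta} mels is about {step:.2f} Hz above {REFERENCE_HZ} Hz")
```

Since the mel scale is non-linear, the same mel difference corresponds to a larger Hz difference at higher reference frequencies.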