ICASSP 2022 paper: “On loss functions and evaluation metrics for music source separation”

During his internship at Dolby, Enric ran an exhaustive evaluation of various loss functions for music source separation. After evaluating those losses both objectively and subjectively, we recommend training with the following spectrogram-based losses: L2freq, SISDRfreq, LOGL2freq, or LOGL1freq, potentially combined with phase-sensitive objectives and adversarial regularizers.
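As a rough illustration of what a spectrogram-based loss like LOGL2freq can look like, here is a minimal PyTorch sketch: an L2 distance between log-compressed STFT magnitudes. The function name, STFT parameters, and epsilon are my own assumptions for the example, not the exact formulation evaluated in the paper.

```python
import torch

def log_l2_freq(est, ref, n_fft=2048, hop=512, eps=1e-8):
    """Sketch of a spectrogram-domain log-L2 loss (LOGL2freq-style).
    `est` and `ref` are (batch, time) waveforms; all hyperparameters
    here are illustrative assumptions."""
    window = torch.hann_window(n_fft, device=est.device)
    # Magnitude spectrograms of the estimate and the reference
    E = torch.stft(est, n_fft, hop, window=window, return_complex=True).abs()
    R = torch.stft(ref, n_fft, hop, window=window, return_complex=True).abs()
    # L2 distance between log-compressed magnitudes
    return torch.mean((torch.log(E + eps) - torch.log(R + eps)) ** 2)
```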

Link to the paper on arXiv!

Preprint: Upsampling layers for music source separation

To consolidate the ideas we introduced in our previous paper, we benchmarked a large set of upsampling layers for music source separation: different transposed and subpixel convolution setups, different interpolation upsamplers (including two novel layers based on stretch and sinc interpolation), and different wavelet-based upsamplers (including a novel learnable wavelet layer). A toy sketch of two of these upsampler families follows.
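For intuition, here is a minimal PyTorch sketch contrasting two of the upsampler families mentioned above: a learned transposed convolution and a fixed interpolation followed by a convolution. Module names, kernel sizes, and padding choices are illustrative assumptions, not the configurations benchmarked in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransposedConvUpsampler(nn.Module):
    """Learned upsampling via transposed convolution; kernel/stride
    mismatches in such layers are a known source of artifacts."""
    def __init__(self, channels, scale=2):
        super().__init__()
        self.up = nn.ConvTranspose1d(channels, channels,
                                     kernel_size=2 * scale, stride=scale,
                                     padding=scale // 2)

    def forward(self, x):  # x: (batch, channels, time)
        return self.up(x)  # -> (batch, channels, time * scale)

class InterpolationUpsampler(nn.Module):
    """Fixed linear interpolation followed by a convolution, a simple
    example of the interpolation-based alternatives."""
    def __init__(self, channels, scale=2):
        super().__init__()
        self.scale = scale
        self.conv = nn.Conv1d(channels, channels, kernel_size=9, padding=4)

    def forward(self, x):  # x: (batch, channels, time)
        x = F.interpolate(x, scale_factor=self.scale, mode='linear',
                          align_corners=False)
        return self.conv(x)
```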

Check out our project website and our paper on arXiv!

WASPAA 2021 paper: “Adversarial auto-encoding for packet loss concealment”

PLAAE (packet loss adversarial auto-encoder) is our proposal for non-autoregressive packet loss concealment. The goal is to reconstruct missing speech packets until a new (real) packet is received in a video call. Our end-to-end, non-autoregressive adversarial auto-encoder especially shines at long-term predictions beyond 60 ms. The paper has been accepted for presentation at WASPAA 2021! Check out our arXiv pre-print.
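To make the non-autoregressive idea concrete, here is a toy stand-in written in PyTorch: the whole missing segment is predicted in a single forward pass from past context, rather than sample by sample. The architecture, layer sizes, and the 16 kHz / 60 ms framing below are purely illustrative assumptions and not PLAAE's actual model.

```python
import torch
import torch.nn as nn

class NonAutoregressivePLC(nn.Module):
    """Toy non-autoregressive packet-loss-concealment generator:
    maps past context samples to the missing gap in one shot.
    Sizes assume 16 kHz audio (100 ms context, 60 ms gap)."""
    def __init__(self, context_samples=1600, gap_samples=960):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(context_samples, 512),
            nn.ReLU(),
            nn.Linear(512, gap_samples),
        )

    def forward(self, context):   # context: (batch, context_samples)
        return self.net(context)  # (batch, gap_samples), filled in one pass
```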

Slides: Towards building an artistic discourse around music AI

On Thursday, May 13th, from 17:00–19:00 (CET), I'll take part in the workshop ‘Exploring connections between AI and Music’. The live-streamed event is free to watch and marks the presentation of the AI and Music Festival as its first activity (more information here). To prepare for it, I reviewed previous work by music AI artists and researchers. This slide deck contains a summary of how I perceive the current music AI scene.