Preprint: “GASS – Generalizing Audio Source Separation with Large-scale Data”
By Jordi Pons in Conferences, Deep learning. October 15, 2023
This paper summarises the latest work I did at Dolby. We study a single general audio source separation (GASS) model trained, in a supervised fashion on a large-scale dataset, to separate speech, music, and sound events. Check it out on arXiv and listen to our demos!
WASPAA 2023 paper: “CLIPSonic – Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models” 01 Jul, 2023