This paper summarises the latest work I did at Dolby. We study a single general audio source separation (GASS) model, trained in a supervised fashion on a large-scale dataset, that separates speech, music, and sound events.
Check it out on arXiv and listen to our demos!
