Weighted and Multi-Task Loss for Rare Audio Event Detection

H. Phan, M. Krawczyk-Becker, T. Gerkmann, A. Mertins


We present two loss functions tailored for rare audio event detection in audio streams. The weighted loss is designed to tackle the common issue of imbalanced data in background/foreground classification, while the multi-task loss enables the networks to simultaneously model the class distribution and the temporal structures of the target events for recognition. We study the proposed loss functions with deep neural networks (DNNs) and convolutional neural networks (CNNs) coupled with state-of-the-art phase-aware signal enhancement. Experiments on the DCASE 2017 challenge data show that our system with the proposed losses significantly outperforms not only the DCASE 2017 baseline but also our own baseline, which has a similar network architecture and a standard loss function.
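The abstract does not give the form of the weighted loss, so the following is only a minimal sketch of one common way to counter background/foreground imbalance: a class-weighted binary cross-entropy in which errors on the rare foreground class cost more than errors on the background. The function name `weighted_cross_entropy` and the weights `w_background` and `w_foreground` are illustrative assumptions, not the authors' formulation or values.

```python
import math


def weighted_cross_entropy(probs, labels, w_background=1.0, w_foreground=10.0):
    """Mean class-weighted binary cross-entropy (illustrative sketch).

    probs  : predicted foreground probabilities, one per example
    labels : 1 for foreground (target event), 0 for background
    The default weights are assumptions chosen for illustration only.
    """
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            # Penalise missed foreground examples more heavily.
            total += -w_foreground * math.log(p)
        else:
            total += -w_background * math.log(1.0 - p)
    return total / len(labels)


if __name__ == "__main__":
    probs = [0.5, 0.5]
    labels = [1, 0]
    # With equal weights this reduces to the standard cross-entropy.
    print(weighted_cross_entropy(probs, labels, 1.0, 1.0))
    # A larger foreground weight raises the cost of the same prediction.
    print(weighted_cross_entropy(probs, labels, 1.0, 10.0))
```

With both weights set to 1 the function reduces to the ordinary cross-entropy, which makes it easy to compare against a standard-loss baseline of the kind the abstract mentions.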
Original language: English
Title of host publication: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Number of pages: 5
Publication date: 01.04.2018
ISBN (Print): 978-153864658-8
Publication status: Published - 01.04.2018
Event: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing - Calgary Telus Convention Center, Calgary, Canada
Duration: 15.04.2018 - 20.04.2018

