Weighted and Multi-Task Loss for Rare Audio Event Detection

H. Phan, M. Krawczyk-Becker, T. Gerkmann, A. Mertins


In this paper, we present two loss functions tailored for rare audio event detection in audio streams. The weighted loss is designed to tackle the common issue of imbalanced data in background/foreground classification, while the multi-task loss enables the networks to simultaneously model the class distribution and the temporal structure of the target events for recognition. We study the proposed loss functions with deep neural networks (DNNs) and convolutional neural networks (CNNs) coupled with state-of-the-art phase-aware signal enhancement. Experiments on the DCASE 2017 challenge data show that our system with the proposed losses significantly outperforms not only the DCASE 2017 baseline but also our own baseline, which has a similar network architecture and a standard loss function.
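The abstract does not give the exact form of the weighted loss, so the following is only a generic illustration of how class weighting can counter background/foreground imbalance. It is a minimal sketch of a class-weighted cross-entropy, not the authors' formulation; the function name `weighted_cross_entropy` and the weight values are assumptions chosen for the example.

```python
import math

def weighted_cross_entropy(probs, labels, weights):
    """Mean class-weighted cross-entropy (illustrative, not the paper's loss).

    probs:   per-example class probabilities, e.g. [[p_bg, p_fg], ...]
    labels:  integer class indices (0 = background, 1 = foreground)
    weights: per-class penalty weights (hypothetical values)
    """
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        # Errors on a class are scaled by that class's weight.
        total += -weights[y] * math.log(p[y] + eps)
    return total / len(labels)

# Three background frames and one rare foreground frame.
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
labels = [0, 0, 0, 1]

plain = weighted_cross_entropy(probs, labels, [1.0, 1.0])
weighted = weighted_cross_entropy(probs, labels, [1.0, 3.0])
```

With a larger weight on the foreground class, the single misclassified foreground frame contributes more to the loss than it does under uniform weights, which is the general effect a weighted loss aims for when one class is rare.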
Title: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publisher: IEEE
ISBN (Print): 978-153864658-8
Publication status: Published - 01.04.2018
Event: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing - Calgary Telus Convention Center, Calgary, Canada
Duration: 15.04.2018 to 20.04.2018

