Abstract
We present two loss functions tailored for rare audio event detection in audio streams. The weighted loss is designed to tackle the common issue of imbalanced data in background/foreground classification, while the multi-task loss enables the networks to simultaneously model the class distribution and the temporal structure of the target events for recognition. We study the proposed loss functions with deep neural networks (DNNs) and convolutional neural networks (CNNs) coupled with state-of-the-art phase-aware signal enhancement. Experiments on the DCASE 2017 challenge data show that our system with the proposed losses significantly outperforms not only the DCASE 2017 baseline but also our own baseline, which has a similar network architecture and a standard loss function.
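The abstract does not give the exact form of the weighted loss, so the following is only a minimal sketch of the general idea: a generic class-weighted binary cross-entropy in which errors on the rare foreground (event) frames are penalized more heavily than errors on background frames. The function name `weighted_cross_entropy`, the parameters `fg_weight` and `bg_weight`, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
import math


def weighted_cross_entropy(probs, labels, fg_weight, bg_weight=1.0, eps=1e-12):
    """Mean class-weighted binary cross-entropy (illustrative, not the paper's loss).

    probs  -- predicted foreground probabilities, one per frame
    labels -- 1 for foreground (event) frames, 0 for background frames
    """
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and of equal length")
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            total += -fg_weight * math.log(p)
        else:
            total += -bg_weight * math.log(1.0 - p)
    return total / len(probs)


# Hypothetical toy stream: one event frame among nine background frames.
labels = [0] * 9 + [1]
probs = [0.1] * 10  # a model that always leans towards "background"

unweighted = weighted_cross_entropy(probs, labels, fg_weight=1.0)
weighted = weighted_cross_entropy(probs, labels, fg_weight=9.0)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

With equal weights, a model that always predicts background is penalized only mildly, because the single missed event frame is averaged against nine easy background frames. Raising the foreground weight makes the same miss dominate the loss, which is the effect a weighted loss relies on when the data are imbalanced.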
| Original language | English |
|---|---|
| Title of host publication | 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
| Number of pages | 5 |
| Volume | 2018-April |
| Publisher | IEEE |
| Publication date | 01.04.2018 |
| Pages | 336-340 |
| ISBN (Print) | 978-153864658-8 |
| DOIs | |
| Publication status | Published - 01.04.2018 |
| Event | 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, Calgary Telus Convention Center, Calgary, Canada. Duration: 15.04.2018 → 20.04.2018 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 9: Industry, Innovation, and Infrastructure
Fingerprint
Dive into the research topics of 'Weighted and Multi-Task Loss for Rare Audio Event Detection'. Together they form a unique fingerprint.