Abstract
This paper proposes an approach for the efficient automatic joint detection and localization of single-channel acoustic events using random forest regression. The audio signals are decomposed into multiple densely overlapping superframes annotated with event class labels and their displacements to the temporal starting and ending points of the events. Using the displacement information, a multivariate random forest regression model is learned for each event category to map each superframe to continuous estimates of the onset and offset locations of the events. In addition, two classifiers are trained using random forest classification to classify superframes of background and different event categories. During testing, based on the detection of category-specific superframes using the classifiers, the learned regressor provides the estimates of the onset and offset locations in time of the corresponding event. While posing event detection and localization as a regression problem is novel, the quantitative evaluation on the ITC-Irst database of highly variable acoustic events shows the efficiency and potential of the proposed approach.
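The superframe annotation described in the abstract can be illustrated with a minimal sketch. It covers only the displacement bookkeeping, not the random forests themselves. Everything named here is an assumption made for illustration and does not come from the paper: the function names, the superframe length and hop, the `door_knock` label, the use of the superframe center as its reference time, and the simple averaging of per-superframe estimates.

```python
from typing import List, Optional, Tuple

# (center time, label, displacement to onset, displacement to offset)
Annotation = Tuple[float, str, Optional[float], Optional[float]]


def superframe_centers(duration: float, length: float, hop: float) -> List[float]:
    """Center times (in seconds) of overlapping superframes that fit in the signal."""
    count = int((duration - length) / hop + 1e-9) + 1
    return [i * hop + length / 2.0 for i in range(count)]


def annotate(centers: List[float], onset: float, offset: float,
             label: str) -> List[Annotation]:
    """Label each superframe and attach its displacements to the event boundaries.

    Superframes whose center lies outside the event are marked as background
    and carry no displacement targets (an assumption of this sketch).
    """
    annotations: List[Annotation] = []
    for t in centers:
        if onset <= t <= offset:
            annotations.append((t, label, t - onset, offset - t))
        else:
            annotations.append((t, "background", None, None))
    return annotations


def locate(predictions: List[Tuple[float, float, float]]) -> Tuple[float, float]:
    """Combine per-superframe displacement estimates into one onset/offset pair.

    Each prediction is (center time, displacement to onset, displacement to offset).
    Plain averaging is used here as a stand-in for whatever aggregation a
    trained regressor's outputs would actually receive.
    """
    onsets = [t - d_on for t, d_on, _ in predictions]
    offsets = [t + d_off for t, _, d_off in predictions]
    return sum(onsets) / len(onsets), sum(offsets) / len(offsets)


# Hypothetical example: a 2 s signal with one event from 0.6 s to 1.4 s.
centers = superframe_centers(duration=2.0, length=0.5, hop=0.1)
annotations = annotate(centers, onset=0.6, offset=1.4, label="door_knock")
event_frames = [(t, d_on, d_off) for t, lab, d_on, d_off in annotations
                if lab != "background"]
estimated_onset, estimated_offset = locate(event_frames)
```

In this sketch the ground-truth displacements stand in for regressor outputs, so the recovered boundaries match the annotated event. In a trained system, the displacement pairs would instead be predicted per superframe by the category-specific regression model.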
Original language | English |
---|---|
Title | Proc. 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014) |
Number of pages | 5 |
Place of publication | Singapore |
Publisher | International Speech Communication Association (ISCA) |
Publication date | 1 Sep 2014 |
Pages | 2524-2528 |
Publication status | Published - 1 Sep 2014 |
Event | 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages - Max Atria at Singapore Expo, Singapore, Singapore. Duration: 14 Sep 2014 → 18 Sep 2014. Conference number: 108771 |