A Voting-based Technique for Acoustic Event-Specific Detection

Huy Phan, Alfred Mertins

Abstract

Acoustic event detection has been an active research topic in recent years. However, building an acoustic event detection system remains a challenging task. The difficulty stems from large intra-class variations in temporal scale and sound, non-stationary background noise, and, especially, the overlapping nature of events.

Several works have attempted to address the problem. In general, these employ simple frame-level representations and a variety of classification algorithms. Typically, individual events are modelled as Hidden Markov Models (HMMs), and a speech recognition framework is employed to detect them [4]. Audio segments can also be characterized by Gaussian population histograms derived from a Gaussian Mixture Model (GMM), with detection performed as a classification task using GMMs [5]. In another work, Support Vector Machines (SVMs) are applied directly to feature vectors derived from audio signals [2].

In this work we introduce the novel concept of the acoustic superframe and show how event detection can be accomplished by recognizing superframes with a simple but efficient class-specific voting scheme. We employ random forests [3] to model the event superframes. After individual event superframes are detected, the detection hypotheses for the events are obtained by majority voting over all superframes. The evaluation on the UPC-TALP database from the CLEAR 2006 challenge [1] shows that our approach outperforms the best system submitted to that challenge.
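The voting step can be illustrated with a minimal sketch. It is not the authors' class-specific voting scheme, which the abstract does not specify. It assumes each superframe has already been given a label by some classifier (for example a random forest, omitted here), smooths those labels with a sliding-window majority vote, and merges runs of equal labels into event hypotheses. The function names, the `window` and `hop_s` parameters, and the `"bg"` background label are all hypothetical.

```python
from collections import Counter


def majority_vote(labels, window=5):
    """Smooth per-superframe labels with a sliding-window majority vote.

    Assumption: `labels` holds one predicted class per superframe, in time order.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        # The most frequent label in the neighbourhood wins the vote.
        smoothed.append(Counter(labels[lo:hi]).most_common(1)[0][0])
    return smoothed


def to_events(labels, hop_s, background="bg"):
    """Merge runs of identical labels into (label, onset_s, offset_s) tuples.

    Assumption: superframes are spaced `hop_s` seconds apart, and runs of the
    hypothetical `background` label are not reported as events.
    """
    events = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != background:
                events.append((labels[start], start * hop_s, i * hop_s))
            start = i
    return events


# Usage with made-up predictions: the isolated "bg" inside the "door" run is voted away.
predicted = ["bg", "bg", "door", "door", "bg", "door", "door", "bg", "bg", "bg"]
smoothed = majority_vote(predicted, window=3)
events = to_events(smoothed, hop_s=0.5)
```

With the made-up labels above, the vote removes the single spurious background superframe, so one continuous event hypothesis is produced instead of two fragments.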
Original language: English
Number of pages: 2
Publication status: Published - 01.03.2014
Event: 40th Annual German Congress on Acoustics - Oldenburg, Germany
Duration: 10.03.2014 - 13.03.2014
http://pub.dega-akustik.de/DAGA_2014/data/index.html

Conference

Conference: 40th Annual German Congress on Acoustics
Abbreviated title: DAGA 2014
Country/Territory: Germany
City: Oldenburg
Period: 10.03.14 - 13.03.14
