We present an efficient approach to acoustic scene classification that exploits the structure of the class labels. Given a set of class labels, a category taxonomy is learned automatically by collectively optimizing a clustering of the labels into multiple meta-classes arranged in a tree structure. An acoustic scene instance is then embedded into a low-dimensional feature representation consisting of the likelihoods that it belongs to each of the meta-classes. We demonstrate state-of-the-art results on two acoustic scene classification datasets: DCASE 2013 and LITIS Rouen.
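The label-tree embedding described above can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the optimization, so the split heuristic here (seeding two meta-classes with the two farthest class centroids) and the distance-based softmax used as a likelihood are assumptions chosen for illustration. The function names `split_labels`, `build_tree`, and `embed`, and the toy class centroids, are likewise hypothetical.

```python
import numpy as np

def split_labels(labels, centroids):
    """Split a label set into two meta-classes (assumed heuristic, not the paper's)."""
    pts = np.stack([centroids[c] for c in labels])
    # Pairwise distances between class centroids.
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    # Seed the two meta-classes with the two farthest classes.
    i, j = np.unravel_index(np.argmax(d), d.shape)
    left = [c for k, c in enumerate(labels) if d[k, i] <= d[k, j]]
    right = [c for c in labels if c not in left]
    if not right:  # degenerate case: identical centroids, split in half
        mid = len(labels) // 2
        return labels[:mid], labels[mid:]
    return left, right

def build_tree(labels, centroids):
    """Return one (left, right) meta-class pair per split node of the label tree."""
    if len(labels) < 2:
        return []
    left, right = split_labels(labels, centroids)
    return ([(left, right)]
            + build_tree(left, centroids)
            + build_tree(right, centroids))

def embed(x, tree, centroids):
    """Embed x as the likelihoods of belonging to each meta-class in the tree."""
    feats = []
    for left, right in tree:
        mu_l = np.mean([centroids[c] for c in left], axis=0)
        mu_r = np.mean([centroids[c] for c in right], axis=0)
        # Softmax over negative squared distances stands in for a trained classifier.
        logits = -np.array([np.sum((x - mu_l) ** 2), np.sum((x - mu_r) ** 2)])
        p = np.exp(logits - logits.max())
        feats.extend(p / p.sum())
    return np.array(feats)

# Toy class centroids (made-up values, not taken from any dataset).
centroids = {
    "bus": np.array([0.0, 0.0]),
    "car": np.array([0.0, 1.0]),
    "park": np.array([10.0, 0.0]),
    "forest": np.array([10.0, 1.0]),
}
tree = build_tree(list(centroids), centroids)
emb = embed(np.array([0.1, 0.1]), tree, centroids)
print(tree[0])    # top-level meta-classes
print(emb.shape)  # two likelihoods per split node
```

With C classes, a binary tree of this kind has C - 1 split nodes, so the embedding has 2(C - 1) dimensions regardless of the size of the original feature vector. In practice each split would use a classifier trained on real features rather than the centroid-distance stand-in used here.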
|Title of host publication||Proceedings of the 2016 ACM on Multimedia Conference|
|Number of pages||5|
|Place of Publication||New York, NY, USA|
|Publication status||Published - 01.10.2016|
|Event||24th ACM Multimedia Conference - Amsterdam, Netherlands|
|Event duration||15.10.2016 → 19.10.2016|
|Conference number||124107|