On using the auditory image model and invariant-integration for noise robust automatic speech recognition

F. Müller, A. Mertins

Abstract

Commonly used feature extraction methods for automatic speech recognition (ASR) incorporate only rudimentary psychoacoustic findings. Several works have shown that physiologically more faithful auditory processing during the feature extraction stage can enhance the robustness of an ASR system in noisy environments. The “auditory image model” (AIM) is one such more sophisticated computational model. In this work, we show how invariant integration can be applied in the feature space given by the AIM, and we analyze the performance of the resulting features under noisy conditions on the Aurora-2 task. Furthermore, we show that previously presented features based on power normalization and invariant integration benefit from the AIM-based integration features when the feature vectors are combined.
Original language: English
Title of host publication: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Number of pages: 4
Publisher: IEEE
Publication date: 01.03.2012
Pages: 4905-4908
ISBN (Print): 978-1-4673-0045-2
ISBN (Electronic): 978-1-4673-0046-9
Publication status: Published - 01.03.2012
Event: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing - Kyoto, Japan
Duration: 25.03.2012 to 30.03.2012
Conference number: 93091
