A study of auditory-filterbank based preprocessing for the generation of Warping-Invariant Features

Jan Rademacher, Alfred Mertins

Abstract

Auditory filterbanks have a long history in the preprocessing stage of automatic speech recognition systems, the most prominent example being mel-frequency cepstral coefficients (MFCCs). In this paper, we study the usefulness of auditory-filterbank analyses as a preprocessor for the generation of frequency-warping invariant features. The results indicate that gammatone-filterbank analyses following the equivalent rectangular bandwidth (ERB) scale yield the most robust feature sets. The performance improvements are most significant when the vocal tract lengths in the training and test sets differ, which is important when, for example, children's speech is to be recognized with a system that was mainly trained on adult data.
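
As a rough illustration of what "following the ERB scale" can mean in practice, the sketch below places filterbank center frequencies at equal steps on an ERB-rate axis. The abstract does not state which ERB formula, frequency range, or channel count the authors used; the widely cited Glasberg and Moore approximation and all function names and parameter values here are assumptions chosen for illustration only.

```python
import math

# Assumption: Glasberg & Moore style ERB approximation; the abstract does not
# say which variant of the ERB scale the paper uses.

def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (number of ERBs)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at center frequency f_hz."""
    return 24.7 * (0.00437 * f_hz + 1.0)

def erb_center_frequencies(f_min, f_max, n_channels):
    """Center frequencies equally spaced on the ERB-rate axis (hypothetical helper)."""
    if n_channels < 2:
        raise ValueError("n_channels must be at least 2")
    e_min = hz_to_erb_rate(f_min)
    e_max = hz_to_erb_rate(f_max)
    step = (e_max - e_min) / (n_channels - 1)
    return [erb_rate_to_hz(e_min + i * step) for i in range(n_channels)]

if __name__ == "__main__":
    # Illustrative values only; not taken from the paper.
    for fc in erb_center_frequencies(100.0, 4000.0, 8):
        print(f"fc = {fc:8.1f} Hz, ERB = {erb_bandwidth(fc):6.1f} Hz")
```

Because the ERB-rate axis is approximately logarithmic at higher frequencies, a multiplicative scaling of the frequency axis appears there roughly as a shift along the channel index, which is the kind of structure that warping-invariant features can exploit.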
Original language: English
Pages: 1-5
Number of pages: 5
Publication status: Published - 01.05.2006
Event: Speech Recognition and Intrinsic Variation (SRIV2006) - Toulouse, France
Duration: 20.05.2006 to 20.05.2006

Conference

Conference: Speech Recognition and Intrinsic Variation (SRIV2006)
Country/Territory: France
City: Toulouse
Period: 20.05.06 to 20.05.06
