Frequency-Warping Invariant Features for Automatic Speech Recognition

Alfred Mertins, Jan Rademacher

Abstract

Based on the well-known relationship between vocal tract length (VTL) variation and linear frequency warping, we present a method for generating vocal tract length invariant (VTLI) features. These features are computed as translation invariant, correlation-type features in a log-frequency domain. In phoneme classification and recognition experiments on the TIMIT database, their discrimination capabilities and robustness to mismatches between training and test conditions turned out to be considerably better than for Mel-frequency cepstral coefficients (MFCCs). The best results are obtained when VTLI features and MFCCs are combined.
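
The principle stated in the abstract can be illustrated with a small sketch. It is not the feature extraction defined in the paper: the toy spectrum made of Gaussian "formant" bumps, the function names, the warping factor, and the use of a plain autocorrelation as the correlation-type feature are all assumptions chosen for illustration. The sketch shows that a linear frequency warping f -> alpha * f becomes a translation by log(alpha) on a log-frequency axis, and that autocorrelation values computed along that axis are therefore (nearly) unchanged by the warping.

```python
import numpy as np


def log_freq_axis(f_min, f_max, n_bins):
    """Frequencies spaced uniformly in log-frequency (hypothetical helper)."""
    return np.geomspace(f_min, f_max, n_bins)


def toy_spectrum(freqs, formants, bandwidth=0.08):
    """Toy magnitude spectrum: one Gaussian bump per formant.

    The bumps have a fixed width in log-frequency, so scaling all formants
    by a common factor shifts the whole spectrum along the log axis.
    This is an illustrative assumption, not a speech model.
    """
    log_f = np.log(freqs)[:, None]
    centers = np.log(np.asarray(formants, dtype=float))[None, :]
    return np.exp(-0.5 * ((log_f - centers) / bandwidth) ** 2).sum(axis=1)


def autocorrelation_features(x, max_lag):
    """r[d] = sum_k x[k] * x[k + d] for d = 0..max_lag (translation invariant)."""
    n = len(x)
    return np.array([np.dot(x[: n - d], x[d:]) for d in range(max_lag + 1)])


freqs = log_freq_axis(100.0, 8000.0, 512)
formants = np.array([500.0, 1500.0, 2500.0])  # assumed formant positions in Hz
alpha = 1.2                                   # assumed VTL warping factor

x = toy_spectrum(freqs, formants)                 # reference speaker
x_warped = toy_spectrum(freqs, alpha * formants)  # linearly warped speaker

r = autocorrelation_features(x, max_lag=64)
r_warped = autocorrelation_features(x_warped, max_lag=64)

# The spectra differ, but the correlation-type features agree closely.
print("max spectrum difference:", np.max(np.abs(x - x_warped)))
print("max feature difference: ", np.max(np.abs(r - r_warped)))
```

The invariance holds in this sketch because the spectral content stays well inside the analysed band after warping; content shifted across the band edges would break it.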

Original language: English
Title of host publication: 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings
Number of pages: 4
Publisher: IEEE
Publication date: 01.12.2006
Pages: 1025-1028
Article number: 1661453
ISBN (Print): 978-142440469-8
Publication status: Published - 01.12.2006
Event: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing - Toulouse, France
Duration: 14.05.2006 - 19.05.2006
Conference number: 69350
