Frequency-Warping Invariant Features for Automatic Speech Recognition

Alfred Mertins, Jan Rademacher

Abstract

Based on the well-known relationship between vocal tract length (VTL) variation and linear frequency warping, we present a method for generating vocal tract length invariant (VTLI) features. These features are computed as translation-invariant, correlation-type features in a log-frequency domain. In phoneme classification and recognition experiments on the TIMIT database, their discrimination capabilities and robustness to mismatches between training and test conditions proved considerably better than those of Mel-frequency cepstral coefficients (MFCCs). The best results are obtained when VTLI features and MFCCs are combined.
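The idea in the abstract can be illustrated with a small sketch. A linear warping f -> alpha * f becomes a shift by log(alpha) on a log-frequency axis, so any feature that ignores translation along that axis also ignores the warping. The example below uses plain autocorrelation as a stand-in for the "correlation-type" features; that choice, the function name `autocorr_features`, and the toy envelope are assumptions made for illustration, not the paper's actual feature definition.

```python
import numpy as np

def autocorr_features(log_freq_spectrum, max_lag):
    """Return the autocorrelation along the log-frequency axis for lags 0..max_lag.

    Assumption: plain autocorrelation stands in for the "correlation-type"
    features named in the abstract; the paper's exact definition may differ.
    """
    x = np.asarray(log_freq_spectrum, dtype=float)
    full = np.correlate(x, x, mode="full")  # lags -(N-1) .. (N-1)
    zero_lag = len(x) - 1                   # index of lag 0 in `full`
    return full[zero_lag:zero_lag + max_lag + 1]

# Hypothetical spectral envelope sampled on a log-frequency grid.
pattern = np.array([0.1, 0.5, 1.0, 0.7, 0.3, 0.2, 0.6, 0.4, 0.1])

# Two "speakers": the same envelope, translated by 5 bins. On a log-frequency
# axis, warping f -> alpha * f is a shift by log(alpha), so a shift of a few
# bins mimics a different vocal tract length.
speaker_a = np.zeros(64)
speaker_a[10:10 + len(pattern)] = pattern
speaker_b = np.zeros(64)
speaker_b[15:15 + len(pattern)] = pattern

feat_a = autocorr_features(speaker_a, max_lag=8)
feat_b = autocorr_features(speaker_b, max_lag=8)

print("raw spectra equal:", np.allclose(speaker_a, speaker_b))
print("features equal:   ", np.allclose(feat_a, feat_b))
```

In this toy setting the invariance is exact because the shifted envelope stays entirely inside the analyzed band. With real spectra, part of the envelope moves across the band edges under warping, so the invariance would hold only approximately.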

Original language: English
Title of host publication: 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings
Number of pages: 4
Publisher: IEEE
Publication date: 1 December 2006
Pages: 1025-1028
Article number: 1661453
ISBN (Print): 978-142440469-8
Publication status: Published - 1 December 2006
Event: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing - Toulouse, France
Duration: 14 May 2006 to 19 May 2006
Conference number: 69350
