Abstract
Based on the well-known relationship between vocal tract length (VTL) variation and linear frequency warping, we present a method for generating vocal tract length invariant (VTLI) features. These features are computed as translation-invariant, correlation-type features in a log-frequency domain. In phoneme classification and recognition experiments on the TIMIT database, their discrimination capabilities and robustness to mismatches between training and test conditions proved considerably better than those of Mel-frequency cepstral coefficients (MFCCs). The best results are obtained when VTLI features and MFCCs are combined.
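The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's feature extraction: the two-peak `toy_spectrum`, the grid sizes, and the use of a plain autocorrelation as the "correlation-type" feature are all assumptions made for illustration. The sketch only shows that a linear warping of the frequency axis by a factor `alpha` becomes a shift on a log-frequency axis, and that an autocorrelation over that axis is unaffected by the shift.

```python
import math

def toy_spectrum(freq_hz):
    """Hypothetical spectrum: two formant-like bumps, Gaussian in log-frequency."""
    u = math.log(freq_hz)
    peak1 = math.exp(-0.5 * ((u - math.log(700.0)) / 0.08) ** 2)
    peak2 = 0.6 * math.exp(-0.5 * ((u - math.log(1800.0)) / 0.08) ** 2)
    return peak1 + peak2

def autocorrelation(x, max_lag):
    """Linear (non-circular) autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

# Frequency grid that is uniform in log-frequency (assumed range 100 Hz to 8 kHz).
N_BINS = 512
LOG_LO, LOG_HI = math.log(100.0), math.log(8000.0)
STEP = (LOG_HI - LOG_LO) / (N_BINS - 1)
freqs = [math.exp(LOG_LO + k * STEP) for k in range(N_BINS)]

# Pick a warping factor that corresponds to a whole number of log-frequency bins,
# so the shift is exact on this grid.
SHIFT_BINS = 12
alpha = math.exp(SHIFT_BINS * STEP)

original = [toy_spectrum(f) for f in freqs]
warped = [toy_spectrum(alpha * f) for f in freqs]  # linearly warped frequency axis

feat_original = autocorrelation(original, 64)
feat_warped = autocorrelation(warped, 64)

# The spectra differ clearly, while the autocorrelation features agree
# up to floating-point error.
max_spectrum_diff = max(abs(a - b) for a, b in zip(original, warped))
max_feature_diff = max(abs(a - b) for a, b in zip(feat_original, feat_warped))
print("alpha:", alpha)
print("max spectrum difference:", max_spectrum_diff)
print("max feature difference:", max_feature_diff)
```

The invariance is exact here only because the toy peaks lie far from the edges of the grid and the shift is a whole number of bins; with real speech spectra, boundary effects and interpolation would make it approximate.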
Original language | English |
---|---|
Title of host publication | 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings |
Number of pages | 4 |
Publisher | IEEE |
Publication date | 01.12.2006 |
Pages | 1025-1028 |
Article number | 1661453 |
ISBN (Print) | 978-142440469-8 |
DOIs | |
Publication status | Published - 01.12.2006 |
Event | 2006 IEEE International Conference on Acoustics, Speech and Signal Processing - Toulouse, France; Duration: 14.05.2006 → 19.05.2006; Conference number: 69350 |