Abstract
Vocal tract length normalization (VTLN) is commonly applied utterance-wise with a warping function that assumes a linear dependence between the vocal tract length and the location of the formants. In this work we propose a data-driven method for enhancing the performance of systems that already use standard VTLN. The method uses elastic registration to estimate optimal non-parametric transformations that further reduce inter-speaker variability. Results show that the proposed method can raise the performance of monophone systems to that of a triphone system.
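To make the linear-warping assumption mentioned in the abstract concrete, the following is a minimal sketch of a purely linear frequency warp. It is not the elastic registration method of the paper. The function names, the `alpha` warping factor, and the 8 kHz Nyquist default are assumptions chosen for illustration only.

```python
import numpy as np

def linear_vtln_warp(freqs_hz, alpha, nyquist_hz=8000.0):
    """Scale every frequency by the (hypothetical) warping factor alpha.

    Results are clipped to the range [0, nyquist_hz].
    """
    freqs = np.asarray(freqs_hz, dtype=float)
    return np.clip(alpha * freqs, 0.0, nyquist_hz)

def warp_spectrum(magnitudes, alpha):
    """Resample one magnitude spectrum along the linearly warped axis.

    Output bin k takes the value found at source position k / alpha,
    using linear interpolation between neighbouring bins.
    """
    mags = np.asarray(magnitudes, dtype=float)
    bins = np.arange(len(mags), dtype=float)
    return np.interp(bins / alpha, bins, mags)

# Illustrative formant locations (assumed values, not data from the paper).
formants = [500.0, 1500.0, 2500.0]
warped_formants = linear_vtln_warp(formants, alpha=1.1)
```

Under this assumption every formant moves by the same factor, which is the kind of restriction a non-parametric transformation would relax.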
Original language | English |
---|---|
Title of host publication | Proc. Interspeech-2012 |
Number of pages | 4 |
Place of Publication | Portland, USA |
Publication date | 01.09.2012 |
Pages | 1362-1365 |
ISBN (Print) | 978-162276759-5 |
Publication status | Published - 01.09.2012 |
Event | 13th Annual Conference of the International Speech Communication Association 2012, Portland, United States. Duration: 09.09.2012 → 13.09.2012. Conference number: 97207 |
Title | Enhancing Vocal Tract Length Normalization with Elastic Registration for Automatic Speech Recognition |
Projects
- Invariant features for automatic speech recognition based on complex models of speech production and auditory perception (finished)
01.01.11 → 31.12.13
Project: DFG Projects › DFG Individual Projects