Vocal Tract Length Invariant Features for Automatic Speech Recognition

Alfred Mertins, Jan Rademacher

Abstract

The effects of vocal tract length (VTL) variation are often approximated by linear frequency warping of short-time spectra. Based on this relationship, we present a method for generating vocal tract length invariant features. These new features are computed as translation-invariant, correlation-type features in a log-frequency domain. In phoneme classification experiments, their discrimination capabilities turned out to be considerably better than those of Mel-frequency cepstral coefficients (MFCCs). The best results are obtained when VTL-invariant (VTLI) features and MFCCs are combined. The superiority of the combined feature set and its resilience to VTL variations are also shown for word recognition, using the TIDIGITS corpus and the HTK recognizer.
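The core idea in the abstract can be illustrated with a toy example. A linear warping of the frequency axis, f -> a*f, becomes a translation on a log-frequency axis, since log(a*f) = log(a) + log(f), and correlation-type quantities computed along that axis do not change under translation. The sketch below is only an illustration under stated assumptions, not the feature extraction used in the paper: the function names, the toy spectrum, and the use of a circular (wrap-around) shift are all hypothetical. With real, band-limited spectra the shift is not circular, so the invariance would hold only approximately.

```python
def circular_autocorrelation(x):
    """Return r[d] = sum_k x[k] * x[(k + d) % n] for d = 0 .. n-1."""
    n = len(x)
    return [sum(x[k] * x[(k + d) % n] for k in range(n)) for d in range(n)]


def circular_shift(x, s):
    """Shift x by s bins with wrap-around.

    Assumption: this stands in for the effect of VTL warping on a
    log-frequency axis, idealized as a pure circular translation.
    """
    n = len(x)
    return [x[(k - s) % n] for k in range(n)]


# Toy spectral envelope sampled on a hypothetical log-frequency grid.
log_spectrum = [0.0, 1.0, 3.0, 2.0, 0.5, 0.0, 0.0, 0.0]

# The same envelope translated by two bins (a different "speaker").
shifted = circular_shift(log_spectrum, 2)

features = circular_autocorrelation(log_spectrum)
features_shifted = circular_autocorrelation(shifted)

# The two spectra differ, but their correlation features agree.
max_difference = max(abs(a - b) for a, b in zip(features, features_shifted))
print("spectra identical:", shifted == log_spectrum)
print("max feature difference:", max_difference)
```

The autocorrelation discards the absolute position of the envelope along the axis while keeping the relative spacing of its peaks, which is why the translation drops out. It also discards other information, which is consistent with the abstract's observation that combining the invariant features with MFCCs gave the best results.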

Original language: English
Pages: 33-37
Number of pages: 5
DOIs
Publication status: Published - 01.12.2005
Event: 2005 IEEE Automatic Speech Recognition and Understanding Workshop - Cancun, Mexico
Duration: 27.11.2005 to 01.12.2005
Conference number: 68918

Conference

Conference: 2005 IEEE Automatic Speech Recognition and Understanding Workshop
Short title: ASRU 2005
Country/Territory: Mexico
City: Cancun
Period: 27.11.05 to 01.12.05

UN SDGs

This output contributes to the following Sustainable Development Goal(s)

  1. SDG 9: Industry, Innovation and Infrastructure
