Abstract
This work presents a feature-extraction method based on the theory of invariant integration. The invariant-integration features (IIFs) are derived from an extended time period, and their computation has very low complexity. Recognition experiments show that the presented features outperform cepstral coefficients computed with a mel filterbank (MFCCs) or a gammatone filterbank (GTCCs) in both matched and mismatched training-test conditions. Even without any speaker adaptation, the presented features yield higher accuracies than MFCCs combined with vocal tract length normalization (VTLN) in matched training-test conditions. It is also shown that IIFs can be successfully combined with additional speaker-adaptation methods to further increase accuracy. In addition to standard MFCCs, contextual MFCCs are introduced; their performance lies between that of MFCCs and IIFs.
| Original language | English |
|---|---|
| Journal | Speech Communication |
| Volume | 53 |
| Issue number | 6 |
| Pages (from - to) | 830-841 |
| Number of pages | 12 |
| ISSN | 0167-6393 |
| DOIs | |
| Publication status | Published - 1 July 2011 |
Funding
This work has been supported by the German Research Foundation under Grant No. ME1170/2-1.
UN SDGs
This output contributes to the following Sustainable Development Goal(s)
- SDG 9 – Industry, Innovation and Infrastructure
Fingerprint
Explore the research topics of "Contextual invariant-integration features for improved speaker-independent speech recognition". Together they form a unique fingerprint.
Projects
- 1 Finished
- Invariante Merkmale für die automatische Spracherkennung (Invariant Features for Automatic Speech Recognition)
Mertins, A. (Principal Investigator (PI))
1 Jan 2007 → 31 Dec 2011
Project: DFG Individual Projects › DFG Individual Research Grants (Sachbeihilfen)