Analysis and implementation of the speaker adaptation techniques : MAP, MLLR, and MLED

Fanner, Robert M.

Files

fanner_analysis_2002.pdf (52.87 MB)

Date

2002-12

Authors

Fanner, Robert M.

Publisher

Stellenbosch : Stellenbosch University

Abstract

ENGLISH ABSTRACT: The topic of this thesis is speaker adaptation, in which speaker-independent speech models are adapted to match an individual speaker more closely using a small amount of data from that speaker. Three speaker adaptation methods, MAP, MLLR and MLED, are critically evaluated and compared. Two novel extensions of the MLED adaptation method are introduced, derived and evaluated. The first incorporates explicit modelling of the mean speaker model in the speaker space into the MLED framework. The second extends MLED to use basis vectors that model inter-class variance for classes of speech models, instead of basis vectors that model inter-speaker variance. The effect of two types of feature vector, PLP cepstra and LPCCs, on the performance of speaker adaptation is evaluated to determine which feature vector is optimal for speaker-independent systems and their adaptation.
AFRIKAANS ABSTRACT (English translation): The subject of this thesis is speaker adaptation, that is, the modification of a speaker-independent speech model to bring it closer to a speaker-dependent model for an individual, given a small amount of speech data from that individual. The following speaker adaptation methods are evaluated: MAP, MLLR and MLED. Two new extensions of the MLED method are described, derived and evaluated. The first incorporates explicit modelling of the mean speaker model of the speaker space into the MLED method. The second extension uses basis vectors for MLED that are derived from the inter-class variance between a set of speaker classes instead of the inter-speaker variance. The effect of two types of feature vector, PLP cepstra and LPCCs, on the performance of speaker adaptation methods is investigated, so that the optimal type of feature vector for speaker-independent models and their adaptation can be found.
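
The adaptation idea summarised in the abstract, moving a speaker-independent model towards an individual speaker with a small amount of data, can be illustrated with the widely used MAP update for a single Gaussian mean. The sketch below is not taken from the thesis: the function name `map_adapt_mean`, the prior weight `tau` and the toy data are assumptions made for illustration, and a real system would obtain the per-frame occupation weights from an alignment of the adaptation data against the model.

```python
def map_adapt_mean(prior_mean, frames, weights, tau=10.0):
    """Illustrative MAP update of one Gaussian mean (hypothetical helper).

    prior_mean: speaker-independent mean, a list of floats
    frames:     feature vectors from the target speaker
    weights:    occupation weight of this Gaussian for each frame
    tau:        prior weight (assumed value); larger tau trusts the prior more
    """
    dim = len(prior_mean)
    occupancy = sum(weights)
    # Accumulate the occupation-weighted sum of the adaptation frames.
    weighted_sum = [0.0] * dim
    for frame, weight in zip(frames, weights):
        for d in range(dim):
            weighted_sum[d] += weight * frame[d]
    # Interpolate between the prior mean and the adaptation data.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


# Toy usage: two frames at [2, 4], each fully assigned to this Gaussian.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 4.0], [2.0, 4.0]], [1.0, 1.0], tau=2.0)
print(adapted)  # [1.0, 2.0]
```

With no adaptation frames the function returns the prior mean unchanged, and as more data is seen the estimate moves towards the mean of the speaker's own data, which is the general behaviour expected of MAP adaptation.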

Description

Thesis (MScEng)--University of Stellenbosch, 2002.

Keywords

Automatic speech recognition, Speech processing systems, Dissertations -- Electronic engineering, Theses -- Electronic engineering

URI

http://hdl.handle.net/10019.1/52653

Collections

Masters Degrees (Electrical and Electronic Engineering)
