Investigation of the impact of high frequency transmitted speech on speaker recognition

Pool, Jan (2002-04)

Thesis (MScEng)--Stellenbosch University, 2002.

Some digitised pages may appear illegible due to the condition of the original hard copy.

ENGLISH ABSTRACT: Speaker recognition systems have evolved to a point where near-perfect performance can be obtained under ideal conditions, even if the system must distinguish between a large number of speakers. Under adverse conditions, such as when high noise levels are present or when the transmission channel distorts the speech, the performance is often unsatisfactory. This project investigated the performance of a popular speaker recognition system, which uses Gaussian mixture models, on speech transmitted over a high-frequency channel. Initial experiments demonstrated very unsatisfactory results for the baseline system. We investigated a number of robust techniques, and implemented and applied some of them in an attempt to improve the performance of the speaker recognition system. The techniques we tested showed only slight improvements. We also investigated the effects of a high-frequency channel and single-sideband modulation on the speech features used by speech processing systems. The effects that can distort the features, and therefore reduce the performance of speech systems, were identified. One of the effects that can greatly affect the performance of a speech processing system is noise. We investigated some speech enhancement techniques and, as a result, developed a new statistically based speech enhancement technique that employs hidden Markov models to represent the clean speech process.

AFRIKAANS SUMMARY (translated): Speaker recognition systems have reached a point where near-perfect results can be expected under ideal conditions, even when the system must distinguish between a large number of speakers. When non-ideal conditions are present, such as high noise levels or a transmission channel that distorts the speech, the results are usually unsatisfactory. The project investigates the performance of a popular speaker recognition system, which uses Gaussian mixture models, on speech sent over a high-frequency transmission channel. Initial experiments using a basic system did not yield good results. We investigated a number of robust techniques, and implemented and tested a few of them in an attempt to improve the results of the speaker recognition system. The techniques we tested showed only slight improvement. The study also investigated the effects that the high-frequency channel and single-sideband modulation have on speech feature vectors. The effects that can distort the speech feature vectors, and thus lower the performance of speech systems, were identified. One of the effects that has a large influence on the performance of speech systems is noise. We investigated speech enhancement methods, and this led to the development of a statistically based speech enhancement technique that uses hidden Markov models to represent the clean speech process.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/52895