Evaluation of modern large-vocabulary speech recognition techniques and their implementation

Swart, Ranier Adriaan (2009-03)

Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2009.

Thesis

In this thesis we studied large-vocabulary continuous speech recognition. We considered the components necessary to realise a large-vocabulary speech recogniser and how systems such as Sphinx and HTK solved the problems facing such a system. Hidden Markov Models (HMMs) have been a common approach to acoustic modelling in speech recognition in the past. HMMs are well suited to modelling speech, since they are able to model both its stationary nature and temporal effects. We studied HMMs and the algorithms associated with them. Since incorporating all knowledge sources as efficiently as possible is of the utmost importance, the N-Best paradigm was explored along with some more advanced HMM algorithms. The way in which sounds and words are constructed has been studied extensively in the past. Context dependency on the acoustic level and on the linguistic level can be exploited to improve the performance of a speech recogniser. We considered some of the techniques used in the past to solve the associated problems. We implemented and combined some chosen algorithms to form our system and reported the recognition results. Our final system performed reasonably well and will form an ideal framework for future studies on large-vocabulary speech recognition at the University of Stellenbosch. Many avenues of research for future versions of the system were considered.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/4050