Efficient Decoding of High-order Hidden Markov Models
Most speech recognition and language identification engines are based on hidden Markov models (HMMs). Higher-order HMMs are known to be more powerful than first-order HMMs, but have not been widely used because of their complexity and computational demands. The main objective of this dissertation was to develop a more time-efficient method of decoding high-order HMMs than the standard Viterbi decoding algorithm currently in use. We proposed, implemented and evaluated two decoders based on the Forward-Backward Search (FBS) paradigm, which incorporate information obtained from low-order HMMs. The first decoder is based on time-synchronous Viterbi-beam decoding where we wish to base our state pruning on the complete observation sequence. The second decoder is based on time-asynchronous A* search. The choice of heuristic is critical to the A* search algorithms and a novel, task-independent heuristic function is presented. The experimental results show that both these proposed decoders result in more time-efficient decoding of the fully-connected, high-order HMMs that were investigated. Three significant facts have been uncovered. The first is that conventional forward Viterbi-beam decoding of high-order HMMs is not as computationally expensive as is commonly thought. The second (and somewhat surprising) fact is that backward decoding of conventional, high-order left-context HMMs is significantly more expensive than the conventional forward decoding. By developing the right-context HMM, we showed that the backward decoding of a mathematically equivalent right-context HMM is as expensive as the forward decoding of the left-context HMM. The third fact is that the use of information obtained from low-order HMMs significantly reduces the computational expense of decoding high-order HMMs. The comparison of the two new decoders indicate that the FBS-Viterbi-beam decoder is more time-efficient than the A* decoder. The FBS-Viterbi-beam decoder is not only simpler to implement, it also requires less memory than the A* decoder. We suspect that the broader research community regards the Viterbi-beam algorithm as the most efficient method of decoding HMMs. We hope that the research presented in this dissertation will result in renewed investigation into decoding algorithms that are applicable to high-order HMMs.