Hidden Markov model for acoustic syllabification and phoneme class labelling in continuous speech
The use of a constrained hidden Markov model for the segmentation of syllables and syllable classes in continuous speech is described. The implementation combines phonological knowledge, speaker-adaptable features, vector quantization, and hidden Markov modeling. Syllable classes are represented as hidden Markov model states, and phonotactic constraints are used to define the allowable state connections. A robust observation sequence is obtained from the decision planes of two single-layer neural networks, which make voiced/unvoiced and (syllabic) nucleus/nonnucleus classifications, respectively. Speaker-independent tests, conducted on a broadcast-quality database, yielded a syllable class recognition rate of 96.5% and a syllable detection rate of 97.1%.
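The core idea of the abstract can be illustrated with a small sketch: syllable classes become HMM states, phonotactic constraints zero out forbidden transitions, and Viterbi decoding recovers the most likely class sequence from a vector-quantized observation stream. The state inventory, transition mask, probabilities, and four-symbol codebook below are all invented toy values, not the paper's actual model.

```python
import numpy as np

# Toy syllable-structure classes standing in for the paper's syllable classes.
STATES = ["onset", "nucleus", "coda"]

# Phonotactic constraint mask: 1 = transition allowed, 0 = forbidden.
# E.g. an onset must lead into a nucleus; a coda cannot follow a coda.
allowed = np.array([
    [1, 1, 0],   # onset   -> onset | nucleus
    [1, 1, 1],   # nucleus -> onset | nucleus | coda
    [1, 1, 0],   # coda    -> onset | nucleus
])

# Transition probabilities, zeroed where phonotactics forbid the move,
# then renormalized so each row sums to 1. (Toy values.)
A = np.array([
    [0.2, 0.8, 0.0],
    [0.3, 0.4, 0.3],
    [0.6, 0.4, 0.0],
]) * allowed
A = A / A.sum(axis=1, keepdims=True)

# Emission probabilities over a tiny 4-symbol VQ codebook, standing in for
# features derived from the voiced/unvoiced and nucleus/nonnucleus
# decision planes. (Toy values.)
B = np.array([
    [0.60, 0.30, 0.05, 0.05],  # onset
    [0.05, 0.10, 0.60, 0.25],  # nucleus
    [0.40, 0.40, 0.10, 0.10],  # coda
])

pi = np.array([0.9, 0.1, 0.0])  # an utterance starts at an onset or nucleus


def viterbi(obs):
    """Most likely syllable-class sequence for a VQ observation sequence."""
    T, N = len(obs), len(STATES)
    delta = np.zeros((T, N))          # best path score ending in each state
    psi = np.zeros((T, N), dtype=int)  # backpointers
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A  # N x N candidate transition scores
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return [STATES[s] for s in reversed(path)]
```

Because forbidden transitions carry zero probability, any decoded path automatically respects the phonotactic constraints; syllable boundaries can then be read off wherever the path re-enters an onset or nucleus state.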