Automatic discovery of subword units and pronunciations for automatic speech recognition using TIMIT
Date
2010-11
Authors
Goussard, George
Niesler, Thomas
Publisher
PRASA
Abstract
We address the automatic generation of acoustic subword
units and an associated pronunciation dictionary for speech recognition.
The speech audio is first segmented into phoneme-like units by detecting
points at which the spectral characteristics of the signal change abruptly.
These audio segments are subsequently subjected to agglomerative
clustering in order to group similar acoustic segments. Finally, the
orthography is iteratively aligned with the resulting transcription in terms
of audio clusters in order to determine pronunciations of the training
words. The approach is evaluated by applying it to two subsets of the
TIMIT corpus, both of which have a closed vocabulary. It is found that,
when vocabulary words occur often in the training set, the proposed
technique delivers performance that is close to, but lower than, that of a system
based on the TIMIT phonetic transcriptions. When vocabulary words
are not repeated often in the training set, the best system is able to
outperform its counterpart based on the TIMIT phonetic transcriptions,
although recognition performance in both cases is poor.
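
The first two stages described in the abstract (segmenting at abrupt spectral changes, then agglomeratively clustering the segments) can be illustrated with a minimal sketch. The abstract does not specify the features, the change-detection criterion, or the linkage rule, so everything below is an assumption made for illustration: frames are plain feature vectors, a boundary is declared wherever the Euclidean distance between consecutive frames exceeds a fixed threshold, each segment is represented by its mean vector, and clusters are merged by average linkage. The function names are hypothetical and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_boundaries(frames, threshold):
    """Return the frame indices at which a new segment starts.

    Assumption: a boundary is any frame whose distance to the previous
    frame exceeds `threshold`. Index 0 always starts the first segment.
    """
    boundaries = [0]
    for i in range(1, len(frames)):
        if euclidean(frames[i - 1], frames[i]) > threshold:
            boundaries.append(i)
    return boundaries


def segment_means(frames, boundaries):
    """Represent each segment by the mean of its frames (an assumption)."""
    ends = boundaries[1:] + [len(frames)]
    means = []
    for start, end in zip(boundaries, ends):
        segment = frames[start:end]
        dim = len(segment[0])
        means.append([sum(f[d] for f in segment) / len(segment) for d in range(dim)])
    return means


def agglomerative_cluster(points, n_clusters):
    """Naive average-linkage agglomerative clustering.

    Starts with one cluster per point and repeatedly merges the closest
    pair until `n_clusters` remain. Returns one cluster label per point.
    """
    clusters = [[i] for i in range(len(points))]

    def average_linkage(c1, c2):
        total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy input: six 2-D "frames" forming three segments, two of which are similar.
frames = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [0.0, 0.1], [0.1, 0.1]]
boundaries = find_boundaries(frames, threshold=1.0)
labels = agglomerative_cluster(segment_means(frames, boundaries), n_clusters=2)
```

On this toy input the first and third segments fall into the same cluster, so the utterance would be transcribed as a sequence of cluster labels in which one unit recurs. Label sequences of this kind are what the final stage would align with the orthography; that alignment step is not sketched here.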
Description
Both authors from Stellenbosch University.
Proceedings of the twenty-first annual symposium of the Pattern Recognition Association of South Africa (PRASA), Stellenbosch, South Africa, November 2010.
Keywords
Automatic subword unit discovery, Automatic speech recognition, TIMIT
Citation
Goussard, GW & Niesler, TR 2010. Automatic discovery of subword units and pronunciations for automatic speech recognition using TIMIT. Proceedings of the twenty-first annual symposium of the Pattern Recognition Association of South Africa (PRASA), Stellenbosch, South Africa, November 2010.