Overview: Previous sections of the tutorial explained how to extract features from speech and how to use them to build acoustic models. The speech recognizer uses these models to determine the phonemes that make up words. The language model gives the recognizer additional knowledge by specifying the order in which those words are likely to occur. Early speech recognition systems relied on isolated word recognition, in which the speaker was required to pause after each word. Modern recognizers can decode continuous speech, which consists of sequences of words that are not necessarily separated by pauses. Two popular language models used to build these recognizers are the Network model and the N-gram model. The ISIP software supports development of both. Continue to Section 6.1.1 for a further theoretical overview.

Contents: