6.3.4 N-grams: Decoding Using an N-gram

Now that you have a basic understanding of N-grams, let's use them in a couple of decoding examples. The results from these examples will be used later in the scoring examples. Here, we will use N-gram decoding to decode TIDigits utterances using acoustic models from two different levels of training. The first example uses context-independent acoustic models from the short-pause training level. The second example uses state-tied, context-dependent cross-word acoustic models.

Monophones

Go to the directory:
Notice the message, "no transcription database file was specified." The transcription database will be used later to score the results.

Open the parameter file params_ngram_phone.sof. It should look similar to the parameter files used in Section 4. One important difference is the config_file parameter: the file it names contains information vital to the N-gram decoding procedure. Open the configuration file config_sp.sof. The nsymbol_model parameter names the file that lists the unigrams, bigrams, and trigrams along with their probabilities.

Go back to the parameter file params_ngram_sp.sof and notice that the results will be stored in the database file specified by the output_file parameter. This is the file that will later be used, along with the transcription database, to score the results.

State-Tied Cross-Word Context-Dependent Phones

From within the same directory, run the command:
    isip106_[1]: isip_recognize -param ./params_ngram.sof -list ../../lists/identifiers_test.sof -verbose all

    Command: isip_recognize -parameter_file ./params_ngram.sof -list ../../lists/identifiers_test.sof -verbose all
    Version: 1.21 (not released) 2003/04/09 19:47:51

    loading audio database: $ISIP_TUTORIAL/section_0./databas./audio_database.sof
    *** no transcription database file was specified ***
    loading front-end: $ISIP_TUTORIAL/section_0./database/frontend.sof
    loading configuration file: $ISIP_TUTORIAL/section_06/exp/ngram/config_sp.sof
    loading language model: $ISIP_TUTORIAL/section_06/exp/ngram/lm_model_init.sof
    loading acoustic model: $ISIP_TUTORIAL/section_06/exp/ngram/ac_model_init.sof

    opening the output file: $ISIP_TUTORIAL/section_06/exp/ngram/ngram_sp.db

    processing file 1 (ah_111a): $ISIP_TUTORIAL/section_06/features/ah_111a.sof
      hyp: ONE ONE ONE
      score: -12034.0302734375 frames: 138

    processing file 2 (ah_1a): $ISIP_TUTORIAL/section_06/features/ah_1a.sof
      hyp: ONE
      score: -7914.1611328125 frames: 79
    .....

There isn't much difference between this parameter file, params_ngram_xword.sof, and the parameter file used for the last experiment. The acoustic and language model files are, of course, different. The context_mode parameter has been added and set to CROSS_SYMBOL because we are decoding with cross-word models. The contents of the configuration file, config_xword, have also been changed to include the context dependency parameters.

Again, an output database file has been created containing the results of this experiment. This file will be used later to score the results. You will notice a significant difference between the scores of the short-pause trained monophones and those of the state-tied cross-word context-dependent models.
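
To compare the scores from the two experiments side by side, it can help to pull the hypothesis, score, and frame count for each utterance out of the verbose output. The following Python snippet is a minimal sketch of that, not part of the ISIP toolkit. It assumes the output layout shown in the listing above (other versions of isip_recognize may print it differently). The names parse_decoder_log, ENTRY_RE, and SAMPLE_LOG are hypothetical, and the per-frame score is only an illustrative way to compare utterances of different lengths, not the scoring procedure used later in this tutorial.

```python
import re
from typing import Dict, List

# Matches the text that follows "processing file " for one utterance.
# The layout is taken from the listing above; treating it as stable
# across isip_recognize versions is an assumption.
ENTRY_RE = re.compile(
    r"(\d+) \(([^)]+)\):.*?"
    r"hyp:\s*(.*?)\s*score:\s*(-?\d+(?:\.\d+)?)\s*frames:\s*(\d+)"
)


def parse_decoder_log(text: str) -> List[Dict]:
    """Return one record per 'processing file' entry found in the log text."""
    # Collapse all whitespace so it does not matter whether 'hyp:',
    # 'score:' and 'frames:' appear on one line or on several.
    flat = " ".join(text.split())
    records = []
    for chunk in flat.split("processing file ")[1:]:
        match = ENTRY_RE.match(chunk)
        if match is None:
            continue  # skip entries that do not follow the assumed layout
        index, utt_id, hyp, score, frames = match.groups()
        n_frames = int(frames)
        records.append({
            "index": int(index),
            "id": utt_id,
            "hyp": hyp.split(),
            "score": float(score),
            "frames": n_frames,
            # Illustrative normalization only (not the tutorial's scoring).
            "score_per_frame": float(score) / n_frames if n_frames else float("nan"),
        })
    return records


# The two entries shown in the listing above.
SAMPLE_LOG = """\
processing file 1 (ah_111a): $ISIP_TUTORIAL/section_06/features/ah_111a.sof
hyp: ONE ONE ONE
score: -12034.0302734375 frames: 138
processing file 2 (ah_1a): $ISIP_TUTORIAL/section_06/features/ah_1a.sof
hyp: ONE
score: -7914.1611328125 frames: 79
"""

if __name__ == "__main__":
    for rec in parse_decoder_log(SAMPLE_LOG):
        print(rec["id"], " ".join(rec["hyp"]), rec["score"], rec["frames"])
```

Saving the verbose output of each run to a text file and passing its contents to parse_decoder_log would give two lists of records that can be matched by utterance identifier.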