5.4.1 Word Internal CD Models: Triphone Generation

The previous experiments explained how to train context-independent acoustic models. Our production system also supports context-dependent acoustic models. Context-dependent models take the surrounding phonetic context into account, which generally makes recognition results more accurate. This topic is explained in more detail in Section 4.2.5. The experiments that follow explain how to train word-internal context-dependent phones.

Word-internal models model surrounding context within words but do not cross word boundaries. Consider the sentence, "The boy ran." The figure below shows how word-internal models will be used to model this sentence. Notice that the acoustic unit at the beginning and ending of each word is a biphone instead of a triphone. This type of modeling is necessary since word-internal models cannot cross word boundaries.

To begin, we must first generate a context list. The context list contains the triphones seen in the training transcription database. In other words, the list is constructed by examining the transcriptions of the training data and breaking them down into their corresponding triphone units. Go to the directory:
Command: isip_recognize -param params_context.sof -list $ISIP_TUTORIAL/databases/lists/identifiers_train.sof -verbose all

Version: 1.23 (not released) 2003/05/21 23:10:45
loading audio database: $ISIP_TUTORIAL/databases/db/tidigits_audio_db.sof
*** no symbol graph database file was specified ***
loading transcription database: $ISIP_TUTORIAL/databases/db/tidigits_trans_word_db.sof
loading front-end: $ISIP_TUTORIAL/recipes/frontend.sof
loading language model: $ISIP_TUTORIAL/models/winternal_phone_models/lm_winternal_jsgf.sof
loading statistical model pool: $ISIP_TUTORIAL/models/winternal_phone_models/smp_winternal.sof
loading configuration file: $ISIP_TUTORIAL/sections/s05/s05_04_p01/config.sof
processing file 1 (ae_12a): $ISIP_TUTORIAL/databases/sof_8k/train/ae_12a.sof
retrieving annotation graph for identifier: ae_12a, level: word
transcription: ONE TWO
processing file 2 (ae_1a): $ISIP_TUTORIAL/databases/sof_8k/train/ae_1a.sof
retrieving annotation graph for identifier: ae_1a, level: word
transcription: ONE
....

Open the parameter file used in this step, params_context.sof, and notice the two parameters:
algorithm = "CONTEXT_GENERATION";
implementation = "SYMBOL_GENERATION";
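
As a rough illustration of what the context list described above contains, the following Python sketch expands each word's phone sequence into word-internal units and collects the distinct units seen in a set of transcriptions. It is not part of the toolkit and does not reproduce what isip_recognize does internally. The function names, the small lexicon, the pronunciations, and the left-center+right unit notation are all assumptions made for this example.

```python
def word_internal_units(phones):
    """Expand one word's phone sequence into word-internal units.

    Context never crosses the word boundary, so the first and last
    phones of a word become biphones and only interior phones become
    full triphones. A one-phone word stays a plain monophone.
    Notation (an assumption for this sketch): left-center+right.
    """
    units = []
    last = len(phones) - 1
    for i, center in enumerate(phones):
        unit = center
        if i > 0:
            # A left neighbor exists inside the same word.
            unit = phones[i - 1] + "-" + unit
        if i < last:
            # A right neighbor exists inside the same word.
            unit = unit + "+" + phones[i + 1]
        units.append(unit)
    return units


def build_context_list(transcriptions, lexicon):
    """Collect the distinct units seen across word-level transcriptions."""
    contexts = set()
    for sentence in transcriptions:
        for word in sentence.split():
            contexts.update(word_internal_units(lexicon[word]))
    return sorted(contexts)


# Hypothetical pronunciations, used only for illustration.
LEXICON = {
    "THE": ["dh", "ah"],
    "BOY": ["b", "oy"],
    "RAN": ["r", "ae", "n"],
}

if __name__ == "__main__":
    for word in "THE BOY RAN".split():
        print(word, word_internal_units(LEXICON[word]))
    print(build_context_list(["THE BOY RAN"], LEXICON))
```

Under these assumed pronunciations, the example sentence yields biphones at the word edges (dh+ah, dh-ah, b+oy, b-oy, r+ae, ae-n) and a single triphone (r-ae+n), which is the pattern described for word-internal models.
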
Command: isip_recognize -param params_generate.sof -verbose all

Version: 1.23 (not released) 2003/05/21 23:10:45
loading audio database: $ISIP_TUTORIAL/databases/db/tidigits_audio_db.sof
*** no symbol graph database file was specified ***
loading transcription database: $ISIP_TUTORIAL/databases/db/tidigits_trans_word_db.sof
loading front-end: $ISIP_TUTORIAL/recipes/frontend.sof
loading language model: $ISIP_TUTORIAL/models/winternal_phone_models/lm_winternal_jsgf.sof
loading statistical model pool: $ISIP_TUTORIAL/models/winternal_phone_models/smp_winternal.sof
loading configuration file: $ISIP_TUTORIAL/sections/s05/s05_04_p01/config.sof
....

As mentioned before, the algorithm parameter for this step is also CONTEXT_GENERATION. The implementation parameter, however, is MODEL_GENERATION. For this step, we are generating the word-internal language model file and the new statistical model pool.

Now that the acoustic and language model files have been created and seeded, we can continue with training in the next section.