
5.4.1 Word Internal CD Models: Triphone Generation

The previous experiments explained how to train context-independent acoustic models. Our production system also supports context-dependent acoustic models, which take the surrounding phonetic context of each phone into account and generally produce more accurate results. This topic is explained in more detail in Section 4.2.5. The experiments that follow explain how to train word-internal context-dependent phones. Word-internal models capture the surrounding context within a word but do not cross word boundaries. Consider the sentence, "The boy ran." The figure below shows how word-internal models will be used to model this sentence.

Notice that the acoustic unit at the beginning and end of each word is a biphone instead of a triphone. This type of modeling is necessary because word-internal models cannot cross word boundaries.
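The expansion described above can be sketched in a few lines of Python. This is only an illustration, not the code used by isip_recognize: the function name word_internal_units, the pronunciations in the lexicon, and the left-center+right notation for context units are all assumptions made for this example.

```python
def word_internal_units(phones):
    """Expand one word's phone sequence into word-internal context units.

    The first and last phones of a word only see context on one side,
    so they become biphones; phones in the middle become triphones.
    A one-phone word has no word-internal context and stays a monophone.
    """
    units = []
    for i, phone in enumerate(phones):
        # Left context exists only if this is not the first phone of the word.
        left = phones[i - 1] + "-" if i > 0 else ""
        # Right context exists only if this is not the last phone of the word.
        right = "+" + phones[i + 1] if i < len(phones) - 1 else ""
        units.append(left + phone + right)
    return units


# Hypothetical pronunciations, for illustration only.
lexicon = {
    "THE": ["dh", "ah"],
    "BOY": ["b", "oy"],
    "RAN": ["r", "ae", "n"],
}

for word in "THE BOY RAN".split():
    print(word, word_internal_units(lexicon[word]))
```

With these assumed pronunciations, only RAN is long enough to contain a true triphone (r-ae+n); the two-phone words THE and BOY are modeled entirely with biphones.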

To begin, we must first generate a context list. The context list contains the triphones seen in the training transcription database. In other words, the list is constructed by examining the transcriptions of the training data and breaking them down into their corresponding triphone units. Go to the directory:

$ISIP_TUTORIAL/sections/s05/s05_04_p01/

From this directory, run the following command:

isip_recognize -param params_context.sof -list $ISIP_TUTORIAL/databases/lists/identifiers_train.sof -verbose all

Expected Output:

Command: isip_recognize -param params_context.sof -list $ISIP_TUTORIAL/databases/lists/identifiers_train.sof -verbose all
Version: 1.23 (not released) 2003/05/21 23:10:45
  
  loading audio database: $ISIP_TUTORIAL/databases/db/tidigits_audio_db.sof
  
  *** no symbol graph database file was specified ***
  
  loading transcription database: $ISIP_TUTORIAL/databases/db/tidigits_trans_word_db.sof
  
  loading front-end: $ISIP_TUTORIAL/recipes/frontend.sof
  
  loading language model: $ISIP_TUTORIAL/models/winternal_phone_models/lm_winternal_jsgf.sof
  
  loading statistical model pool: $ISIP_TUTORIAL/models/winternal_phone_models/smp_winternal.sof
  
  loading configuration file: $ISIP_TUTORIAL/sections/s05/s05_04_p01/config.sof
  
  processing file 1 (ae_12a): $ISIP_TUTORIAL/databases/sof_8k/train/ae_12a.sof
  
  retrieving annotation graph for identifier: ae_12a, level: word
  
  transcription: ONE TWO 
  
  processing file 2 (ae_1a): $ISIP_TUTORIAL/databases/sof_8k/train/ae_1a.sof
  
  retrieving annotation graph for identifier: ae_1a, level: word
  
  transcription: ONE 

  ....
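Conceptually, the pass over the training transcriptions shown in this output collects every distinct context unit that occurs in the data. The Python sketch below illustrates the idea using the two transcriptions from the log. It is not the isip_recognize implementation: the helper names, the digit pronunciations, and the left-center+right notation are assumptions made for this example.

```python
def word_internal_units(phones):
    """Expand one word's phones into word-internal biphones and triphones."""
    units = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] + "-" if i > 0 else ""
        right = "+" + phones[i + 1] if i < len(phones) - 1 else ""
        units.append(left + phone + right)
    return units


def build_context_list(transcriptions, lexicon):
    """Return the sorted set of context units seen in the transcriptions."""
    seen = set()
    for transcription in transcriptions:
        for word in transcription.split():
            # Each word is expanded on its own, so no unit spans two words.
            seen.update(word_internal_units(lexicon[word]))
    return sorted(seen)


# Hypothetical pronunciations, for illustration only.
lexicon = {"ONE": ["w", "ah", "n"], "TWO": ["t", "uw"]}

# The two word-level transcriptions shown in the expected output.
transcriptions = ["ONE TWO", "ONE"]

context_list = build_context_list(transcriptions, lexicon)
print(context_list)
```

Because the units are stored in a set, the second occurrence of ONE adds nothing new: a context list records which units were seen, not how often they were seen.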

Open the parameter file used in this step, params_context.sof, and notice the two parameters, algorithm and implementation.

For the two steps in this section, the algorithm parameter will be set to CONTEXT_GENERATION. For this particular step, the implementation parameter is SYMBOL_GENERATION since we are generating a list of context symbols. The context list generated by the step above can now be used to generate the triphone model file. From the same directory, run the command:

isip_recognize -param params_generate.sof -verbose all

Expected Output:

Command: isip_recognize -param params_generate.sof -verbose all 
Version: 1.23 (not released) 2003/05/21 23:10:45
  
  loading audio database: $ISIP_TUTORIAL/databases/db/tidigits_audio_db.sof
  
  *** no symbol graph database file was specified ***
  
  loading transcription database: $ISIP_TUTORIAL/databases/db/tidigits_trans_word_db.sof
  
  loading front-end: $ISIP_TUTORIAL/recipes/frontend.sof
  
  loading language model: $ISIP_TUTORIAL/models/winternal_phone_models/lm_winternal_jsgf.sof
  
  loading statistical model pool: $ISIP_TUTORIAL/models/winternal_phone_models/smp_winternal.sof
  
  loading configuration file: $ISIP_TUTORIAL/sections/s05/s05_04_p01/config.sof
  
  ....

As mentioned before, the algorithm parameter for this step is also CONTEXT_GENERATION. The implementation parameter, however, is MODEL_GENERATION. For this step, we are generating the word-internal language model file and the new statistical model pool.

Now that the acoustic and language model files have been created and seeded, we can continue with training in the next section.
