
5.4.2 Word Internal CD Models: Triphone Training

Now that the language and acoustic model files have been created, the models must be trained again. For a detailed discussion of reestimation, revisit Section 5.2.2. To train the new models, four passes of Baum-Welch reestimation will be applied to the training data.

Go to the directory:

$ISIP_TUTORIAL/sections/s05/s05_04_p02/

From this directory, run the command:

isip_recognize -param params_train.sof -list $ISIP_TUTORIAL/research/isip/databases/lists/identifiers_train.sof -verbose all

Expected Output:

  Command: isip_recognize -param params_train.sof -list $ISIP_TUTORIAL/research/isip/databases/lists/identifiers_train.sof -verbose all
  Version: 1.23 (not released) 2003/05/21 23:10:45
  
  loading audio database: $ISIP_TUTORIAL/research/isip/databases/db/tidigits_audio_db.sof
  
  *** no symbol graph database file was specified ***
  
  loading transcription database: $ISIP_TUTORIAL/research/isip/databases/db/tidigits_trans_word_db.sof
  
  loading front-end: $ISIP_TUTORIAL/recipes/frontend.sof
  
  loading language model: $ISIP_TUTORIAL/models/winternal_phone_models/lm_winternal_jsgf_gen.sof
  
  loading statistical model pool: $ISIP_TUTORIAL/models/winternal_phone_models/smp_winternal_gen.sof
  
  loading configuration file: $ISIP_TUTORIAL/sections/s05/s05_04_p02/config.sof
  
  starting iteration: 0
  
  processing file 1 (ae_12a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_12a.sof
  
  retrieving annotation graph for identifier: ae_12a, level: word
  
  transcription: ONE TWO 
  
  average utterance probability: -70.569262448515573, number of frames: 110
  
  processing file 2 (ae_1a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/train/ae_1a.sof
  
  retrieving annotation graph for identifier: ae_1a, level: word
  
  transcription: ONE 
  
  ...
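To check that training behaves as expected across passes, the log can be summarized per iteration. The following is a minimal sketch in Python, not part of the ISIP toolkit. It assumes the output has been captured as text and that every pass prints the same "starting iteration" and "average utterance probability" lines shown above; the function name summarize_log is hypothetical.

```python
import re
from collections import defaultdict

# Line formats copied from the expected output above. That every later
# pass prints the same two kinds of lines is an assumption.
ITER_RE = re.compile(r"starting iteration:\s*(\d+)")
SCORE_RE = re.compile(
    r"average utterance probability:\s*(-?\d+(?:\.\d+)?),"
    r"\s*number of frames:\s*(\d+)"
)


def summarize_log(lines):
    """Return {iteration: (utterances, total frames, mean reported score)}."""
    stats = defaultdict(lambda: {"utterances": 0, "frames": 0, "score_sum": 0.0})
    iteration = None
    for line in lines:
        match = ITER_RE.search(line)
        if match:
            iteration = int(match.group(1))
            continue
        match = SCORE_RE.search(line)
        if match and iteration is not None:
            entry = stats[iteration]
            entry["utterances"] += 1
            entry["frames"] += int(match.group(2))
            entry["score_sum"] += float(match.group(1))
    # Plain (unweighted) mean of the values the tool reports per utterance.
    return {
        it: (e["utterances"], e["frames"], e["score_sum"] / e["utterances"])
        for it, e in sorted(stats.items())
    }


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize_log.py < train.log
    for it, (utts, frames, mean) in summarize_log(sys.stdin).items():
        print(f"iteration {it}: {utts} utterances, {frames} frames, mean score {mean:.3f}")
```

If the mean score does not generally improve from one iteration to the next, that is worth investigating before moving on.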

Open the parameter file params_train.sof (the file passed to isip_recognize above) and note the following parameters.

Since the context symbols we are dealing with are word-internal, the context_mode parameter is set to SYMBOL_INTERNAL. This tells isip_recognize not to extend phone contexts across word boundaries. The transcription database used for this step contains transcriptions at the word level. Why don't we use phone-level transcriptions? The word-to-phone definitions are contained in the language model, which allows us to use word-level transcriptions while training. Another important parameter is the configuration parameter. The configuration file defines the left and right context lengths of the context-dependent models.
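To make the idea of word-internal context concrete, the sketch below expands word-level transcriptions into context-dependent phone labels without crossing word boundaries. It is illustrative only and does not reproduce how isip_recognize builds its models. The pronunciations, the left-phone+right label notation, and the context length of one phone on each side are all assumptions made for this example.

```python
# Hypothetical pronunciations for illustration; the real word-to-phone
# definitions live in the language model file.
LEXICON = {
    "ONE": ["w", "ah", "n"],
    "TWO": ["t", "uw"],
}


def word_internal_triphones(phones):
    """Label each phone with its neighbors inside the same word only."""
    labels = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] if i > 0 else None
        right = phones[i + 1] if i + 1 < len(phones) else None
        if left and right:
            labels.append(f"{left}-{phone}+{right}")  # full triphone
        elif left:
            labels.append(f"{left}-{phone}")  # word-final: left context only
        elif right:
            labels.append(f"{phone}+{right}")  # word-initial: right context only
        else:
            labels.append(phone)  # single-phone word: no context
    return labels


def expand_transcription(words, lexicon=LEXICON):
    """Expand a word-level transcription one word at a time."""
    return [word_internal_triphones(lexicon[word]) for word in words]


print(expand_transcription(["ONE", "TWO"]))
# [['w+ah', 'w-ah+n', 'ah-n'], ['t+uw', 't-uw']]
```

Note that the final phone of ONE gets no right context from TWO: the context stops at the word boundary, which is what distinguishes word-internal models from cross-word models.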

Once the word-internal models have been trained, we can move on to the state-tying process in the next section.
