HTK Tutorials

TI Digits Short: Word Internal Triphones - State Tying

In this portion of training we continue working with our word-internal (wint) models. Because there are so many triphones, it is computationally inefficient to train each one individually. Most triphones have similar characteristics, though, so it makes sense to tie their states together and train each group of tied triphones at once. To decide which states to tie, we build a decision tree. The tree is built from a series of binary questions (i.e., the answer to each question is yes or no) that determine which cluster a triphone is assigned to. As more and more triphones are assigned to each cluster, additional branches are created, so in the end we have a top-down tree structure. This gives a more comprehensive state tying than we have done previously. We begin by generating the file tree.hed, which controls how the decision tree is built.
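
As a rough illustration of how a single yes/no question splits a set of triphones, the sketch below uses a shell glob pattern in place of an HTK question. The triphone names and the pattern '*+n' (is the right context "n"?) are hypothetical examples chosen for illustration; they are not taken from tree_ques.hed.

```shell
#!/bin/sh
# answer_question PATTERN TRIPHONE...
# Prints "yes NAME" when the triphone name matches the glob pattern,
# and "no NAME" otherwise. One question therefore yields two groups.
answer_question() {
  pattern=$1
  shift
  for t in "$@"; do
    case $t in
      $pattern) echo "yes $t" ;;
      *)        echo "no $t" ;;
    esac
  done
}

# Hypothetical triphones sharing the centre phone "ay".
answer_question '*+n' n-ay+n f-ay+v
# prints:
# yes n-ay+n
# no f-ay+v
```

Each group can then be split again by a further question, which is what produces the top-down tree.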

Procedure


  1. The first step is to create the file tree.hed, which dictates how the decision tree is built. Many pieces go into this file, and we will only touch on the more important ones. To begin, we need a list of all possible phone combinations so that they can later be assigned to the tied clusters. The first command below makes a list of monophones without "sil" or "sp", and the third generates the list of all possible phone combinations with CreateFullList.pl. The questions used to cluster the phones are found in tree_ques.hed and are concatenated onto our tree.hed file. In the line "mkclscript TB 750 mono_ex_spsil.list >> tree.hed", 750 is the threshold that controls when new clusters are formed. When the tree is actually built in the next step, the resulting list of tied models is saved in tiedlist.

    • From the directory isip/exp/htk_tutorial/train type:
    
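    		# Monophone list without sil and sp
    		egrep -v 'sil|sp' monophones1 > mono_ex_spsil.list
    		export list=triphones1
    		# Every possible phone combination
    		perl CreateFullList.pl monophones0 > fulllist
    		# RO: outlier threshold 200.0; load the statistics file "stats"
    		echo 'RO 200.0 stats' > tree.hed
    		echo 'TR 0' >> tree.hed
    		# Clustering questions
    		cat tree_ques.hed >> tree.hed
    		echo 'TR 12' >> tree.hed
    		# TB (tree-based clustering) commands with threshold 750
    		mkclscript TB 750 mono_ex_spsil.list >> tree.hed
    		echo 'TR 1' >> tree.hed
    		# AU: synthesize models for fulllist; CO: compact and write tiedlist; ST: save the trees
    		echo 'AU "fulllist"' >> tree.hed
    		echo 'CO "tiedlist"' >> tree.hed
    		echo 'ST "trees_wint"' >> tree.hed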
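
The output of mkclscript is not shown in this tutorial. Assuming it behaves like the mkclscript.prl script that accompanies the HTK Book tutorial, it writes one TB command per emitting state of each phone. The sketch below mimics that behaviour; the macro names (for example ST_ah_2_), the exact item-list layout and the choice of states 2 to 4 are assumptions, so compare against the TB lines in your own tree.hed.

```shell
#!/bin/sh
# emit_tb THRESHOLD PHONE...
# Prints one TB command per assumed emitting state (2, 3, 4) of each phone.
# Each command pools that state across the monophone and all of its
# left-, right- and two-sided context variants.
emit_tb() {
  threshold=$1
  shift
  for phone in "$@"; do
    for state in 2 3 4; do
      printf 'TB %s "ST_%s_%s_" {("%s","*-%s+*","%s+*","*-%s").state[%s]}\n' \
        "$threshold" "$phone" "$state" \
        "$phone" "$phone" "$phone" "$phone" "$state"
    done
  done
}

# "ah" is a hypothetical phone name used only for illustration.
emit_tb 750 ah
# first line printed:
# TB 750 "ST_ah_2_" {("ah","*-ah+*","ah+*","*-ah").state[2]}
```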
    		
  2. Before we begin our re-estimations we have to run HHEd again, as in the earlier cloning step, except this time we use our tree.hed file instead of mktri.hed.

    • From the same directory as the previous step type:

      HHEd -H hmm20/macros -H hmm20/hmmdefs -M hmm21 tree.hed triphones1
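
Once HHEd has written tiedlist, a quick way to see how much tying took place is to compare the number of logical models with the number of distinct physical models. The sketch below assumes the standard HTK model-list layout, in which each line holds a logical name optionally followed by the physical model it shares; the function name count_models and the sample names are hypothetical.

```shell
#!/bin/sh
# count_models [FILE]
# Prints "<logical> <physical>": the number of non-empty lines, and the
# number of distinct physical models. A line with two fields maps a
# logical name to a physical one; a single-field line is its own model.
# Reads standard input when no file is given.
count_models() {
  awk 'NF > 0 { logical++; physical[(NF > 1) ? $2 : $1] = 1 }
       END { n = 0; for (k in physical) n++; print logical + 0, n }' "$@"
}

# Hypothetical five-line list in which two logical models are tied.
printf '%s\n' 'n-ay+n' 'f-ay+v n-ay+n' 'ay-n' 'ay-v ay-n' 'w-ah+n' | count_models
# prints: 5 3
```

On the real file this would be run as "count_models tiedlist".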

  3. Now we're ready to begin our final round of re-estimations, this time using the tiedlist created in the previous step.

    • From the same directory as above type:

      HERest -C config_wint -I ../data_preparation/trans/train_trans_wintri.mlf -t 250.0 150.0 3000.0 -S train_list2.list -H hmm21/macros -H hmm21/hmmdefs -M hmm22 tiedlist

      HERest -B -C config_wint -I ../data_preparation/trans/train_trans_wintri.mlf -t 250.0 150.0 3000.0 -S train_list2.list -H hmm22/macros -H hmm22/hmmdefs -M hmm23 tiedlist

      HERest -B -C config_wint -I ../data_preparation/trans/train_trans_wintri.mlf -t 250.0 150.0 3000.0 -S train_list2.list -H hmm23/macros -H hmm23/hmmdefs -M hmm24 tiedlist

      HERest -B -C config_wint -I ../data_preparation/trans/train_trans_wintri.mlf -t 250.0 150.0 3000.0 -S train_list2.list -H hmm24/macros -H hmm24/hmmdefs -M hmm25 tiedlist
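
The four passes differ only in their input and output directories, so they can be generated in a loop. The sketch below is a dry run that only prints the commands. It assumes each output directory has to exist before HERest writes to it, hence the mkdir lines, and it passes -B on every pass, whereas the first command as listed above omits it. The function name herest_pass is hypothetical.

```shell
#!/bin/sh
# herest_pass SRC DST
# Prints the HERest command that re-estimates the models in hmmSRC
# and writes the result to hmmDST.
herest_pass() {
  printf 'HERest -B -C config_wint -I %s -t 250.0 150.0 3000.0 -S train_list2.list -H hmm%s/macros -H hmm%s/hmmdefs -M hmm%s tiedlist\n' \
    ../data_preparation/trans/train_trans_wintri.mlf "$1" "$1" "$2"
}

# Passes hmm21 -> hmm22 -> hmm23 -> hmm24 -> hmm25.
src=21
while [ "$src" -le 24 ]; do
  dst=$((src + 1))
  echo "mkdir -p hmm$dst"
  herest_pass "$src" "$dst"
  src=$dst
done
```

Piping the printed lines to sh would execute them; review the output first.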

