TI Digits Short: Word Internal Triphones - State Tying
In this portion of training we continue working with our word-internal (wint) models. We tie states
together because the number of triphones is so large that it is computationally inefficient to train
each one individually. Most triphones have similar characteristics, though, so it makes sense to train a
group of tied triphones all at once. The decision tree is built from a series of binary questions (i.e. the
answer to each question is yes or no) that determine which cluster a triphone is assigned to. As more and
more triphones are assigned to a cluster, additional branches are eventually created. In the
end we have a top-down tree structure.
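To make the idea of a binary question concrete, here is a minimal sketch that answers one yes/no question about a triphone's left context and sorts a few names into the two branches. The question ("is the left context a nasal?") and the phone names are hypothetical and are not taken from tree_ques.hed. The real clustering also chooses among many questions based on the training statistics, which this sketch does not attempt.

```shell
#!/bin/sh
# Sketch only: one yes/no question applied to triphone names.
# The question and the phone names are hypothetical examples.

# Returns 0 (yes) if the left context is a nasal, 1 (no) otherwise.
left_is_nasal() {
    case "$1" in
        m-*|n-*|ng-*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Print each triphone prefixed with the branch it falls into.
split_by_question() {
    for tri in "$@"; do
        if left_is_nasal "$tri"; then
            echo "yes $tri"
        else
            echo "no $tri"
        fi
    done
}

split_by_question n-ay+n w-ah+n s-eh+v ay-n
```

Each further question splits one of the resulting groups again, which is how the branches of the tree grow.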
Now we will discuss how to build the decision tree that we use for more comprehensive state tying
than we have done previously. We start with how to generate the file tree.hed, which controls how
the decision tree is made.
Procedure
- The first step is to create the file tree.hed, which dictates how the decision tree is built. Many
pieces go into this file and we will only touch on some of the more important ones. To
begin, we need a list of all possible phone combinations so that they can later be assigned to the tied
clusters. The first command below makes a list of monophones without "sil" or "sp", and
CreateFullList.pl then generates the list of all possible phone combinations. The questions used to
cluster the phones are found in tree_ques.hed and are concatenated to our tree.hed file. In the line
"mkclscript TB 750 mono_ex_spsil.list >> tree.hed", 750 is the threshold that controls when new
clusters are formed: a split is made only if it increases the log likelihood by more than this value.
When we actually build the tree, the resulting list of tied models is saved in tiedlist.
- From the directory isip/exp/htk_tutorial/train type:
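# List the monophones, excluding sil and sp
egrep -v 'sil|sp' monophones1 > mono_ex_spsil.list
export list=triphones1
# Generate the list of all possible phone combinations
perl CreateFullList.pl monophones0 > fulllist
# RO: set the outlier threshold and load the statistics file "stats"
echo 'RO 200.0 stats' > tree.hed
echo 'TR 0' >> tree.hed
# Append the clustering questions
cat tree_ques.hed >> tree.hed
echo 'TR 12' >> tree.hed
# Append the TB (tree-based clustering) commands with threshold 750
mkclscript TB 750 mono_ex_spsil.list >> tree.hed
echo 'TR 1' >> tree.hed
# AU: synthesize models for the entries in fulllist
echo 'AU "fulllist"' >> tree.hed
# CO: compact the model set and write tiedlist; ST: save the trees
echo 'CO "tiedlist"' >> tree.hed
echo 'ST "trees_wint"' >> tree.hed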
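Before running HHEd it can help to check that tree.hed was assembled as intended. The following is a minimal sketch, assuming that the questions from tree_ques.hed are lines beginning with "QS" and that mkclscript writes lines beginning with "TB". Both are assumptions about files not shown here, and summarize_hed is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: summarize an HHEd edit script such as tree.hed.
# Assumes question lines start with "QS" and clustering lines with "TB".

summarize_hed() {
    hed=$1
    questions=$(grep -c '^QS ' "$hed")
    clusters=$(grep -c '^TB ' "$hed")
    first=$(head -n 1 "$hed" | cut -d' ' -f1)
    echo "questions=$questions tb_commands=$clusters first_command=$first"
}

# Only report when tree.hed is present in the current directory.
if [ -f tree.hed ]; then
    summarize_hed tree.hed
fi
```

If either count is zero, or the first command is not RO, one of the echo, cat, or mkclscript steps above probably did not write what was expected.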
- Before we begin our re-estimations we have to run HHEd again, as in the earlier cloning step, except
this time we use our tree.hed file instead of mktri.hed.
- From the same directory as the previous step type:
HHEd -H hmm20/macros -H hmm20/hmmdefs -M hmm21 tree.hed triphones1
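One way to see how much the tying reduced the model set is to count logical and physical models in tiedlist. This is a minimal sketch, assuming each line of the list holds either a single model name or a logical name followed by the physical model it is tied to. count_models is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count logical vs. physical models in an HTK model list.
# Assumes one entry per line: "logical" or "logical physical".

count_models() {
    list=$1
    # Every non-empty line names one logical model.
    logical=$(awk 'NF > 0' "$list" | wc -l | tr -d ' ')
    # The last field is the physical model; count the distinct ones.
    physical=$(awk 'NF > 0 { print $NF }' "$list" | sort -u | wc -l | tr -d ' ')
    echo "logical=$logical physical=$physical"
}

# Only report when tiedlist is present in the current directory.
if [ -f tiedlist ]; then
    count_models tiedlist
fi
```

A physical count well below the logical count indicates that many triphones now share a model.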
- Now we're ready to begin our final round of re-estimations, this time using the tiedlist created
in the previous step.
- From the same directory as above type:
HERest -C config_wint -I ../data_preparation/trans/train_trans_wintri.mlf -t 250.0 150.0 3000.0 -S train_list2.list -H hmm21/macros -H hmm21/hmmdefs -M hmm22 tiedlist
HERest -B -C config_wint -I ../data_preparation/trans/train_trans_wintri.mlf -t 250.0 150.0 3000.0 -S train_list2.list -H hmm22/macros -H hmm22/hmmdefs -M hmm23 tiedlist
HERest -B -C config_wint -I ../data_preparation/trans/train_trans_wintri.mlf -t 250.0 150.0 3000.0 -S train_list2.list -H hmm23/macros -H hmm23/hmmdefs -M hmm24 tiedlist
HERest -B -C config_wint -I ../data_preparation/trans/train_trans_wintri.mlf -t 250.0 150.0 3000.0 -S train_list2.list -H hmm24/macros -H hmm24/hmmdefs -M hmm25 tiedlist
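The four passes differ only in the directory numbers and in the -B flag, so they can also be written as a loop. The sketch below mirrors the commands above under that reading. DRY_RUN, run, and reestimate are hypothetical names introduced here, and by default the commands are only printed so they can be checked before anything is executed. The output directories are assumed to exist already.

```shell
#!/bin/sh
# Sketch: the four HERest passes (hmm21 -> hmm25) as a loop.
# DRY_RUN is a hypothetical switch: 1 (default) prints, 0 executes.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# reestimate FIRST LAST: re-estimate from hmmFIRST up to hmmLAST.
reestimate() {
    first=$1
    last=$2
    src=$first
    while [ "$src" -lt "$last" ]; do
        dst=$((src + 1))
        # As in the commands above, only the first pass omits -B.
        if [ "$src" -eq "$first" ]; then bin=""; else bin="-B"; fi
        run HERest $bin -C config_wint \
            -I ../data_preparation/trans/train_trans_wintri.mlf \
            -t 250.0 150.0 3000.0 -S train_list2.list \
            -H "hmm$src/macros" -H "hmm$src/hmmdefs" -M "hmm$dst" tiedlist || return 1
        src=$dst
    done
}

reestimate 21 25
```

Setting DRY_RUN=0 in the environment makes the loop execute the commands, stopping at the first pass that fails.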