TI-Digits Short: Monophones

TI Digits Short: Monophones - Realignment

Now that we've included sp in our acoustic models, we need to realign the training data. A word can have several different pronunciations (especially now that we put "sp" at the end of each word), so for each utterance we need to pick the pronunciation that best matches the acoustic data. Once everything is properly aligned, we again use the EM algorithm to train our models.

Procedure

Create a new folder "aligned" in the isip/exp/htk_tutorial/train directory. The realignment output and its logfile are written there.
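As a minimal sketch, the folder can be created from the command line. The TRAIN_DIR variable is only a convenience introduced here; it assumes the tutorial tree sits below the current directory, so adjust it to wherever isip/exp/htk_tutorial/train lives on your system.

```shell
# TRAIN_DIR is a helper variable for this sketch (not part of the tutorial scripts).
TRAIN_DIR="${TRAIN_DIR:-isip/exp/htk_tutorial/train}"

# -p creates any missing parent folders and does nothing if "aligned" already exists.
mkdir -p "$TRAIN_DIR/aligned"
```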

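HTK's tool HVite performs the realignment using the Viterbi algorithm. "-o SWT" is an output formatting option that leaves scores, word names, and times out of the output labels (more info can be found in the HTK book). "-b SENT-END" names the dictionary word used as the sentence boundary, config0 describes the format of the data, hmm10's macros and hmmdefs are the input models used for the realignment, and aligned/aligned.mlf is our new file containing the aligned transcriptions. The remaining arguments are also required inputs: -I gives the word-level transcriptions, -S gives the list of training files, and the last two arguments are the dictionary and the phone list, monophones1. The output of the command is redirected to the logfile aligned/HVite_log.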
- From the directory isip/exp/htk_tutorial/train type (all one line):
  
  HVite -A -D -T 1 -l '*' -o SWT -b SENT-END -C config0 -H hmm10/macros -H hmm10/hmmdefs -i aligned/aligned.mlf -m -t 250.0 150.0 3000.0 -y lab -a -I ../data_preparation/trans/train_trans.mlf -S train_list.list ../data_preparation/dictionary/dict monophones1 > aligned/HVite_log
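Before moving on, it can be useful to check how many utterances actually made it into aligned.mlf. The sketch below relies on the MLF layout, in which every utterance entry starts with a quoted label-file name such as "*/name.lab". The two function names are made up for this example, and the training list is assumed to hold one feature file per line.

```shell
# Count utterance entries in an MLF: each entry starts with a quoted file name.
count_mlf_entries() {
    grep -c '^"' "$1"
}

# Count the non-empty lines (one feature file each) in a training list.
count_list_entries() {
    grep -c . "$1"
}

# Report coverage only if both files are present in the current directory.
if [ -f aligned/aligned.mlf ] && [ -f train_list.list ]; then
    echo "aligned $(count_mlf_entries aligned/aligned.mlf) of $(count_list_entries train_list.list) utterances"
fi
```

If the two counts differ, some utterances were not aligned, which is what the next step deals with.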

Now that the data is realigned, we generate a new list of training data that leaves out any corrupted files. Our perl script, new_train_list.pl, builds this list, train_list2.list, from the logfile created by the realignment, HVite_log.
- From the directory isip/exp/htk_tutorial/train type:
  
  new_train_list.pl aligned/HVite_log train_list2.list
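For readers without the perl script, here is a rough sketch of a different way to get a similar list. It is not how new_train_list.pl works (that script reads HVite_log); instead it keeps only the training files that have an entry in aligned.mlf. It assumes that HVite leaves out utterances it could not align, that entries look like "*/name.lab", and that base names are unique. The function name filter_train_list is made up for this example.

```shell
# filter_train_list MLF LIST
# Print the lines of LIST whose base name (without extension) has an entry in MLF.
filter_train_list() {
    mlf="$1"
    list="$2"
    names=$(mktemp)
    # Pull "name" out of entry lines of the form "*/name.lab".
    sed -n 's|^".*/\([^/"]*\)\.lab"$|\1|p' "$mlf" | sort -u > "$names"
    while IFS= read -r path; do
        base=$(basename "$path")
        base="${base%.*}"          # strip the feature-file extension
        # Keep the file only if an aligned transcription exists for it.
        if grep -qxF -- "$base" "$names"; then
            printf '%s\n' "$path"
        fi
    done < "$list"
    rm -f "$names"
}
```

Under these assumptions, a call such as filter_train_list aligned/aligned.mlf train_list.list > train_list2.list would write the reduced list.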

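Finally, we run four more iterations of the EM algorithm to train the acoustic models, but now we use our newly aligned transcriptions, aligned.mlf, instead of train_trans_phones1.mlf. Each iteration reads the models written by the previous one (hmm10 through hmm13) and writes to the next folder (hmm11 through hmm14). The last iteration also writes a statistics file, hmm14/stats, through the -s option.
- From the directory isip/exp/htk_tutorial/train type (each command all on one line):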
  
  HERest -B -A -D -T 1 -C config0 -I aligned/aligned.mlf -t 250.0 150.0 3000.0 -S train_list2.list -H hmm10/macros -H hmm10/hmmdefs -M hmm11 monophones1
  
  HERest -B -A -D -T 1 -C config0 -I aligned/aligned.mlf -t 250.0 150.0 3000.0 -S train_list2.list -H hmm11/macros -H hmm11/hmmdefs -M hmm12 monophones1
  
  HERest -B -A -D -T 1 -C config0 -I aligned/aligned.mlf -t 250.0 150.0 3000.0 -S train_list2.list -H hmm12/macros -H hmm12/hmmdefs -M hmm13 monophones1
  
  HERest -B -A -D -T 1 -C config0 -I aligned/aligned.mlf -t 250.0 150.0 3000.0 -s hmm14/stats -S train_list2.list -H hmm13/macros -H hmm13/hmmdefs -M hmm14 monophones1
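The four commands differ only in the folder numbers and the -s option on the last pass, so they can also be generated by a small loop. This is a sketch, not part of the tutorial scripts: the function name herest_pass and the DRY_RUN switch are made up here, and the mkdir step assumes that the output folders hmm11 through hmm14 need to exist before HERest writes to them. By default the commands are only printed.

```shell
# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 runs them.
DRY_RUN="${DRY_RUN:-1}"

# herest_pass N: one re-estimation pass from hmmN to hmm(N+1).
herest_pass() {
    src="hmm$1"
    dst="hmm$(( $1 + 1 ))"
    # Only the final pass (into hmm14) writes the statistics file.
    stats=""
    if [ "$dst" = "hmm14" ]; then
        stats=" -s $dst/stats"
    fi
    cmd="HERest -B -A -D -T 1 -C config0 -I aligned/aligned.mlf -t 250.0 150.0 3000.0${stats} -S train_list2.list -H $src/macros -H $src/hmmdefs -M $dst monophones1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
    else
        # Assumption: the output folder must exist before HERest runs.
        mkdir -p "$dst" && $cmd
    fi
}

# Passes hmm10 -> hmm11 -> hmm12 -> hmm13 -> hmm14; stop at the first failure.
for i in 10 11 12 13; do
    herest_pass "$i" || break
done
```

Run it from isip/exp/htk_tutorial/train and compare the printed commands with the ones above before setting DRY_RUN=0.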
