TI Digits Short: Creating Test Transcriptions and Final Results
We have now completed all of the acoustic model training and are ready to evaluate our models. To
do this, we use the acoustic models to generate a set of transcriptions for our test data. We then compare
these transcriptions to the true transcriptions to calculate a word error rate (WER) for the system.
The HTK tool HVite generates the transcriptions for the test data, and HResults reports the evaluation
in a clear and concise format.
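For reference, the WER mentioned above is conventionally computed from the alignment counts that HResults prints. The symbols below follow the HTK convention: N is the number of words in the reference transcriptions, and D, S, and I are the numbers of deletions, substitutions, and insertions. Reading WER as 100% minus the reported accuracy is the usual convention and is assumed here, not something stated in this tutorial.

```latex
\begin{align*}
\%\text{Corr} &= \frac{N - D - S}{N} \times 100\% \\
\text{Acc}    &= \frac{N - D - S - I}{N} \times 100\% \\
\text{WER}    &= \frac{S + D + I}{N} \times 100\% = 100\% - \text{Acc}
\end{align*}
```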
Monophone Decoding Procedure
- We first use HVite to generate transcriptions for our test data. We pass our acoustic models
(hmmdefs and macros), the list of test data (test_list.list), the language model (wdnet), the
pronunciation dictionary (dict), and the list of models (monophones1) as inputs.
We must also supply a word insertion penalty and a weight for the language model, given by the
values after "-p" and "-s" respectively. These parameters can be tuned to improve
results for specific tasks, but it is generally better to test your system without tuning first. The output
transcriptions are saved as "mono_results.mlf".
** For this experiment the language model does not affect our results, because the digit
sequences are random: no digit is more likely than any other to follow a given digit. In other tasks
consisting of regular conversational speech, the language model plays a much larger role.
- From the directory isip/exp/htk_tutorial/decode type (all one line):
HVite -H ../train/hmm14/macros -H ../train/hmm14/hmmdefs -S
../train/test_list.list -l '*' -i mono_results.mlf -w ../data_preparation/grammar/wdnet
-p -0.0 -s 10.0 ../data_preparation/dictionary/dict ../train/monophones1
- With our newly created test transcriptions, we now use the HTK tool HResults to evaluate the
system. In this step, we pass the true transcriptions (test_trans.mlf), the list of monophones used
for the acoustic models (monophones1, since sp is included), and our generated results (mono_results.mlf).
Each "-e ???" option tells HResults to ignore the label that follows it, so "SENT-START" and "SENT-END"
are not included in the evaluation.
- From the directory isip/exp/htk_tutorial/decode type (all one line from the command line):
HResults -c -h -t -e '???' 'SENT-END' -e '???' 'SENT-START' -I
../data_preparation/trans/test_trans.mlf ../train/monophones1 mono_results.mlf
Word Internal Triphone Decoding Procedure
- We follow exactly the same procedure for decoding the triphone models as we did for the monophones; only
a few of the files change. We specify that we are using word-internal (wint) models by including config_wint, we pass
the updated triphone models in hmm25 as inputs, and we use the list of tied-state triphones (tiedlist) rather
than monophones1. The output transcriptions are saved as "wint_results.mlf".
- From the directory isip/exp/htk_tutorial/decode type (each command all on one line):
HVite -C ../train/config_wint -H ../train/hmm25/macros -H ../train/hmm25/hmmdefs -S
../train/test_list.list -l '*' -i wint_results.mlf -w ../data_preparation/grammar/wdnet
-p -0.0 -s 10.0 ../data_preparation/dictionary/dict ../train/tiedlist
HResults -c -h -t -e '???' 'SENT-END' -e '???' 'SENT-START' -I
../data_preparation/trans/test_trans.mlf ../train/tiedlist wint_results.mlf
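Because the monophone and triphone runs differ only in the model directory, the model list, and the optional config file, the two command pairs can be generated from one template. The sketch below is one possible way to do that; print_decode_cmds is a hypothetical helper, not part of HTK or of this tutorial. It only prints the commands (a dry run), and it assumes the directory layout and option values used in the monophone commands above. The "???" label is quoted so the shell cannot expand it as a filename pattern.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not an HTK tool).
# Prints the HVite and HResults command lines for one model set.
# Usage: print_decode_cmds NAME HMMDIR HMMLIST [CONFIG]
print_decode_cmds() {
    name=$1      # prefix for the output file, e.g. mono or wint
    hmmdir=$2    # directory holding macros and hmmdefs
    hmmlist=$3   # model list: monophones1 or tiedlist
    config=${4:-}  # optional HTK config file (used for the wint models)

    cfg_opt=""
    if [ -n "$config" ]; then
        cfg_opt="-C $config "
    fi

    # Recognition step: same options as the commands in the text.
    printf '%s\n' "HVite ${cfg_opt}-H $hmmdir/macros -H $hmmdir/hmmdefs -S ../train/test_list.list -l '*' -i ${name}_results.mlf -w ../data_preparation/grammar/wdnet -p -0.0 -s 10.0 ../data_preparation/dictionary/dict $hmmlist"

    # Scoring step: compare the generated MLF against the true transcriptions.
    printf '%s\n' "HResults -c -h -t -e '???' 'SENT-END' -e '???' 'SENT-START' -I ../data_preparation/trans/test_trans.mlf $hmmlist ${name}_results.mlf"
}

# Dry run for both model sets described above.
print_decode_cmds mono ../train/hmm14 ../train/monophones1
print_decode_cmds wint ../train/hmm25 ../train/tiedlist ../train/config_wint
```

Run from isip/exp/htk_tutorial/decode, the printed lines can be inspected and then pasted into the shell, which keeps the two runs identical apart from the three inputs that are meant to differ.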