4.4.2 Forced Alignment: Example
Forced alignment can be a very useful debugging tool. Knowing where the
system thinks each word is spoken in an utterance gives us information
that we can use to improve the performance of the system.
In this experiment, we will use forced alignment to generate a
transcription database for our test data that contains the start and
stop times of each word spoken. Our original transcription database for
the test data does not provide this information.
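To make the idea concrete, the following Python sketch shows how word
boundaries can be recovered when the word sequence is already known. It
is a toy illustration only, not the algorithm that isip_recognize
implements: the function name force_align, the table of per-frame
log-likelihoods, and the simplification of one model per word are all
assumptions made for this example.

```python
def force_align(loglik):
    """Toy forced alignment by dynamic programming (Viterbi).

    loglik[t][k] is the log-likelihood of frame t under the k-th word of
    the known transcription (hypothetical input for this sketch).
    Words must appear in order and each word covers at least one frame.
    Returns (total_score, [(start_frame, end_frame), ...]) per word.
    """
    if not loglik:
        raise ValueError("no frames to align")
    num_frames, num_words = len(loglik), len(loglik[0])
    if num_frames < num_words:
        raise ValueError("need at least one frame per word")

    neg_inf = float("-inf")
    best = [[neg_inf] * num_words for _ in range(num_frames)]
    # entered[t][k] is True when word k starts at frame t on the best path
    entered = [[False] * num_words for _ in range(num_frames)]
    best[0][0] = loglik[0][0]  # the path must start in the first word

    for t in range(1, num_frames):
        for k in range(num_words):
            stay = best[t - 1][k]
            advance = best[t - 1][k - 1] if k > 0 else neg_inf
            if advance > stay:
                best[t][k] = advance + loglik[t][k]
                entered[t][k] = True
            else:
                best[t][k] = stay + loglik[t][k]

    # Backtrace from the last frame of the last word to find boundaries.
    spans = [None] * num_words
    k, end = num_words - 1, num_frames - 1
    for t in range(num_frames - 1, 0, -1):
        if entered[t][k]:
            spans[k] = (t, end)
            end = t - 1
            k -= 1
    spans[0] = (0, end)
    return best[num_frames - 1][num_words - 1], spans


# Six frames, two words: the first three frames fit word 0 better,
# the last three fit word 1 better.
example = [[-1.0, -5.0]] * 3 + [[-5.0, -1.0]] * 3
total_score, word_spans = force_align(example)
print(total_score, word_spans)
```

The spans are frame indices. Converting them to start and stop times
requires the frame duration used by the front-end, which is not stated
in this section.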
Go to the directory:
$ISIP_TUTORIAL/sections/s04/s04_04_p02/
Run the command:
isip_recognize -param params_align.sof -list $ISIP_TUTORIAL/research/isip/databases/lists/identifiers_test.sof -verbose all
Expected output:
Command: isip_recognize -parameter_file params_align.sof -list /ftp/pub/research/isip/projects/speech/software/tutorials/production/fundamentals/current/examples/research/isip/databases/lists/identifiers_test.sof -verbose all
Version: 1.23 (not released) 2003/05/21 23:10:45
loading audio database: $ISIP_TUTORIAL/research/isip/databases/db/tidigits_audio_db_test.sof
*** no symbol graph database file was specified ***
loading transcription database: $ISIP_TUTORIAL/research/isip/databases/db/tidigits_trans_word_test_db.sof
loading front-end: $ISIP_TUTORIAL/recipes/frontend.sof
loading language model: $ISIP_TUTORIAL/models/ci_phone_models/compare/lm_phone_jsgf_8mix.sof
loading statistical model pool: $ISIP_TUTORIAL/models/ci_phone_models/compare/smp_phone_8mix.sof
*** no configuration file was specified ***
opening the output file: $ISIP_TUTORIAL/sections/s04/s04_04_p02/results.out
processing file 1 (ah_111a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/test/ah_111a.sof
retrieving annotation graph for identifier: ah_111a, level: word
transcription: ONE ONE ONE
hyp: ONE ONE ONE
score: -9122.6484375 frames: 138
processing file 2 (ah_1a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/test/ah_1a.sof
retrieving annotation graph for identifier: ah_1a, level: word
transcription: ONE
hyp: ONE
score: -5187.28173828125 frames: 79
....
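When using this output for debugging, it can help to normalize each
score by the number of frames so that utterances of different lengths
can be compared. The Python sketch below does this for the two
utterances shown above. It assumes the log has exactly the line layout
shown in this section ("processing file N (id): ..." followed later by
"score: ... frames: ..."); the function name per_frame_scores is
hypothetical and is not part of the ISIP tools.

```python
import re

# Patterns based only on the log excerpt shown in this section.
FILE_RE = re.compile(r"^processing file \d+ \((?P<id>[^)]+)\):")
SCORE_RE = re.compile(
    r"^score:\s*(?P<score>-?\d+(?:\.\d+)?)\s+frames:\s*(?P<frames>\d+)"
)


def per_frame_scores(log_text):
    """Map each utterance identifier to its score divided by frame count."""
    results = {}
    current_id = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = FILE_RE.match(line)
        if match:
            current_id = match.group("id")
            continue
        match = SCORE_RE.match(line)
        if match and current_id is not None:
            frames = int(match.group("frames"))
            if frames > 0:
                results[current_id] = float(match.group("score")) / frames
            current_id = None
    return results


sample_log = """\
processing file 1 (ah_111a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/test/ah_111a.sof
transcription: ONE ONE ONE
hyp: ONE ONE ONE
score: -9122.6484375 frames: 138
processing file 2 (ah_1a): $ISIP_TUTORIAL/research/isip/databases/sof_8k/test/ah_1a.sof
transcription: ONE
hyp: ONE
score: -5187.28173828125 frames: 79
"""

scores = per_frame_scores(sample_log)
for identifier, value in scores.items():
    print(identifier, round(value, 2))
```

For the two utterances above, the per-frame scores come out at roughly
-66.1 and -65.7. An utterance whose per-frame score is much lower than
the rest may be worth inspecting for a transcription or alignment
problem.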
The parameter file params_align.sof differs in a few significant ways
from the decoding parameter files we've used earlier in this section.
First, the algorithm parameter has been changed from DECODE to
FORCED_ALIGNMENT to indicate that we are using forced alignment for this
experiment. Second, forced alignment needs the true transcriptions of
the test data, so we add a new parameter, transcription_database, and
set it to the path of the correct transcription database. Finally, we
add a parameter that indicates the level of the transcriptions. This
parameter, transcription_level, is set to "word" since we're using word
transcriptions.