5.6.2 Parallel Training: Word Models

Parallel training follows a procedure very similar to training on a single processor, but there are a few significant differences to keep in mind. First, the isip_run utility is used instead of isip_recognize. Next, you must specify which machines you wish to divide the task among by listing their network addresses in a machine list. You will also need two more lists: an accumulator list and an identifier list, which specify where the intermediate output will be stored. Examples of these lists are shown below. Note that the number of lines in the accumulator, identifier, and machine lists must match.

Machine List
Accumulator List
Identifier List
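The example list files themselves are not reproduced on this page, so the listings below are only a sketch of what they might contain. The machine names are taken from the expected output further down; the directory /home/user/tutorial and the file names under it are placeholders, and the exact layout expected by isip_run (shown here as one entry per line) may differ.

A machine list gives the network address of each machine to use:

    isip217.isip.msstate.edu
    isip218.isip.msstate.edu
    isip220.isip.msstate.edu
    isip221.isip.msstate.edu

The accumulator list gives a full path for the intermediate accumulator output of each entry:

    /home/user/tutorial/lists/accumulator_0.sof
    /home/user/tutorial/lists/accumulator_1.sof
    /home/user/tutorial/lists/accumulator_2.sof
    /home/user/tutorial/lists/accumulator_3.sof

The identifier list names the corresponding identifier files in the same way:

    /home/user/tutorial/lists/identifier_0.sof
    /home/user/tutorial/lists/identifier_1.sof
    /home/user/tutorial/lists/identifier_2.sof
    /home/user/tutorial/lists/identifier_3.sof

Judging from the expected output below, the run on this page used eight entries, listing each machine twice; what matters is simply that all three lists have the same number of lines.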


An important observation to make from these files is that full paths to the accumulators and identifiers are given in the accumulator and identifier lists. This is necessary because the other machines on the network have no way of resolving relative paths to these files.

An example of parallel training of word models follows. Note that for this example to work, you must change the names in the machine list to the network names of computers on the network you are connected to.

The command below is an example execution of parallel training.
    isip_run -param $PWD/params_partrain_wm.sof -list identifiers.sof -id lists/identifier_list.sof -ma lists/machine_list.sof -acc lists/accumulator_list.sof -verbose brief
Expected Output:
 Version: 1.3 (not released) 2002/09/25 00:21:14
 cannot find attributes for machine: isip221.isip.msstate.edu using default values
 cannot find attributes for machine: isip221.isip.msstate.edu using default values
 cannot find attributes for machine: isip220.isip.msstate.edu using default values
 cannot find attributes for machine: isip220.isip.msstate.edu using default values
 cannot find attributes for machine: isip218.isip.msstate.edu using default values
 cannot find attributes for machine: isip218.isip.msstate.edu using default values
 cannot find attributes for machine: isip217.isip.msstate.edu using default values
 cannot find attributes for machine: isip217.isip.msstate.edu using default values

 starting iteration: 0
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 child process on isip217.isip.msstate.edu has finished...
 waiting on child processes...
 child process on isip218.isip.msstate.edu has finished...
 child process on isip218.isip.msstate.edu has finished...

 child process on isip221.isip.msstate.edu has finished...
 waiting on child processes...
 child process on isip217.isip.msstate.edu has finished...
 waiting on child processes...
 child process on isip221.isip.msstate.edu has finished...
 waiting on child processes...
 waiting on child processes...
 child process on isip220.isip.msstate.edu has finished...
 waiting on child processes...
 child process on isip220.isip.msstate.edu has finished...
 waiting on child processes...
 parent process has finished...

 starting iteration: 1
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...
 waiting on child processes...

...(the output continues similarly through four iterations)

 child process on isip221.isip.msstate.edu has finished...
 child process on isip220.isip.msstate.edu has finished...
 waiting on child processes...
 parent process has finished...

The output begins with messages saying that attributes cannot be found for the listed machines. This is simply because this example does not use the option for listing machine attributes in a file, so default values are used. The parent process then prints "waiting on child processes..." at periodic intervals while it waits for the other machines to finish their computations; how many of these messages you see will vary with the speed of your computers and your network. To monitor all the commands issued by the program during training, use "-verbose DETAILED" instead of "-verbose BRIEF"; to suppress the output, use "-verbose NONE". The trained models are written to the files named in the parameter file (lm_model_update.sof and ac_model_update.sof).
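Once all iterations have completed, you can confirm that the updated models were written, for example with a quick directory check (this assumes the default output names mentioned above and that the parameter file writes them to the current directory):

    ls -l lm_model_update.sof ac_model_update.sof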

The isip_run command is invoked much like the isip_recognize command. The parameter file is specified with the $PWD shell variable so that its full path is passed along, for the reasons explained above.
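As a concrete illustration (the directory /home/user/tutorial is a placeholder), the shell expands $PWD to the current working directory before isip_run ever sees the argument:

    echo $PWD/params_partrain_wm.sof
    /home/user/tutorial/params_partrain_wm.sof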

Look at the parameter file params_partrain_wm.sof. Notice that it is identical in format to the parameter files used for training on a single CPU; in fact, the only real difference with parallel training is specifying the other computers that will share the computation.

Once again, before you run these examples, make sure you change the machine names in the machine list to the names of machines on your own network. If you alter the number of machines, also change the number of identifier and accumulator filenames in the identifier and accumulator lists to match.
   