Downloads


Downloads:
  • (11/12/01) Multiple-CPU Eval Package (v1.4.2): We have included a simple utility that allows you to strip silence from feature files. We also fixed a minor bug in the command line interface that prevented lm_scale from being changed from its default value. This latter bug fix will not in any way affect your previous results.

  • (10/31/01) Short Training Set Definition (v1.4.1): In this update to v1.4, we have included file lists that define the two 7,138-utterance training sets: filtered (Training Set #1) and multi-condition (Training Set #2). There were no other substantial changes to the file lists.
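
A quick sanity check on a downloaded file list is to confirm that it has the expected number of entries. The sketch below is only an illustration: it assumes each list holds one entry per line, and the helper name check_list_size is hypothetical, not part of the release.

```shell
#!/usr/bin/env bash
# check_list_size LIST EXPECTED
# Succeeds only if LIST contains exactly EXPECTED non-empty lines.
# Assumption: the file list stores one utterance per line.
check_list_size() {
    local actual
    # grep -c . counts the lines that contain at least one character.
    actual=$(grep -c . "$1")
    if [ "$actual" -eq "$2" ]; then
        echo "ok: $1 has $actual entries"
    else
        echo "mismatch: $1 has $actual entries, expected $2" >&2
        return 1
    fi
}
```

For the multi-condition set, a call might look like "check_list_size train_7138_v1.4.multi.list 7138".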

  • (10/23/01) Multiple-CPU Eval Package (v1.4.1): We fixed a bug involving mixture generation during training for mixture orders greater than one. Experiments involving mixture orders equal to one are unaffected.

  • (10/22/01) Utterance Endpoints (v1.0): This release contains utterance endpoints (start/stop times in seconds) for all data sets used in our evaluations. Future experiments will be performed on the speech-only portions of the files. The data was automatically endpointed using our best WSJ recognition system, and the utterances were padded with 200 msec of silence.
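
To make the padding arithmetic concrete, here is a minimal sketch of how a padded speech window could be computed from a pair of endpoints. It assumes that padding means moving each endpoint outward by 0.2 seconds and clamping to the file boundaries. The release does not spell out this detail, and the helper name pad_endpoints is hypothetical.

```shell
#!/usr/bin/env bash
# pad_endpoints START STOP DURATION [PAD]
# Prints the padded start and stop times in seconds.
# Assumption: each endpoint moves outward by PAD (default 0.2 s, i.e. 200 msec)
# and is clamped so the window stays inside [0, DURATION].
pad_endpoints() {
    awk -v s="$1" -v e="$2" -v d="$3" -v p="${4:-0.2}" 'BEGIN {
        s -= p; if (s < 0) s = 0    # do not start before the file begins
        e += p; if (e > d) e = d    # do not run past the end of the file
        printf "%.2f %.2f\n", s, e
    }'
}

# Speech from 0.15 s to 3.40 s in a 3.50 s file; both ends get clamped.
pad_endpoints 0.15 3.40 3.50   # prints: 0.00 3.50
```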

  • (10/19/01) Multiple-CPU Eval Package (v1.4): This package allows users to easily run complete experiments using multiple cpus. In addition to the parameters available in the single-cpu scripts, the user can specify which computers are to be used for training and testing. Please be sure that you have already installed the ISIP prototype system (v5.11) before running this application. To install this package, follow the instructions below. Detailed instructions are included in the release's AAREADME.text file.

    • tar xzvf asr_wsj_tutorial_v1.4.tar.gz
    • cd asr_wsj_tutorial_v1.4
    • source <install directory for v5.11>/ISIP_ENV.sh
    • ./configure --prefix=.
    • make
    • make install
    • source ISIP_WSJ_ENV.sh
    • wsj_run -help

    The interface for this package is largely the same as the single-cpu scripts, except that we have added arguments to specify which cpus are to be used. A typical command line for training will look like this:

      wsj_run -num_features 13 -mode acc -feature_format htk -mixtures 1 -train_mfc_list ./train_1792_v1.4.multi.list -cpus_train isip216 isip210 isip210 isip211

    A typical command line for testing will look like this:

      wsj_run -num_features 13 -mode acc -feature_format htk -lm devel -test_state_beam_pruning 200 -test_model_beam_pruning 150 -test_word_beam_pruning 150 -test_mfc_list ./devtest_0030_v1.4.list -models_path ../exp_085 -cpus_test isip210 isip211 isip212 isip212

    The options "-cpus_train" and "-cpus_test" tell the utility which computers to use for training and testing, respectively. If a machine has multiple processors, you can repeat its name N times to use N of them.
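
The repeated-name convention can be generated rather than typed by hand. The following is a small sketch, not part of the package: the "host:N" notation and the function name expand_cpus are assumptions made for illustration, and only the expanded list is something wsj_run would see.

```shell
#!/usr/bin/env bash
# expand_cpus HOST[:N] ...
# Prints each host name N times (once if ":N" is omitted), separated by spaces.
# The "host:N" shorthand is hypothetical; it is not wsj_run syntax.
expand_cpus() {
    local spec host count i out=""
    for spec in "$@"; do
        host="${spec%%:*}"
        if [ "$host" = "$spec" ]; then
            count=1                 # no ":N" suffix means a single processor
        else
            count="${spec##*:}"
        fi
        i=0
        while [ "$i" -lt "$count" ]; do
            out="$out${out:+ }$host"
            i=$((i + 1))
        done
    done
    printf '%s\n' "$out"
}

# One dual-processor machine and two single-processor machines.
expand_cpus isip210:2 isip211 isip212   # prints: isip210 isip210 isip211 isip212
```

The output can then be placed after "-cpus_train" or "-cpus_test" on the wsj_run command line, for example through command substitution.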

    For benchmarking purposes, we have conducted a few experiments on several short sets:

    Train List                     Test List                          No. of    No. CPUs  No. CPUs  WER
                                                                      Mixtures  (Train)   (Test)
    train_1792_v1.4.filtered.list  devtest_0030_v1.4.list (filtered)  1         1         1         26.9%
    train_1792_v1.4.filtered.list  devtest_0030_v1.4.list (filtered)  1         2         2         26.9%
    train_1792_v1.4.filtered.list  devtest_0030_v1.4.list (filtered)  1         10        6         28.2%
    train_1792_v1.4.multi.list     devtest_0030_v1.4.list (filtered)  1         1         1         42.2%
    train_1792_v1.4.multi.list     devtest_0030_v1.4.list (filtered)  1         2         2         43.1%
    train_1792_v1.4.multi.list     devtest_0030_v1.4.list (filtered)  1         10        6         42.2%
    train_7138_v1.4.multi.list     eval01_0166_v1.4.list              4         14        7         23.0%


    All tests were conducted using the ETSI front end and the clean filtered data. The clean filtered dev test set can be downloaded here. Due to numerical precision issues, the results will fluctuate very slightly depending on the number of processors used. These fluctuations are statistically insignificant.
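
For readers comparing their own numbers against the table, word error rate is conventionally defined as the number of substitutions, deletions, and insertions divided by the number of reference words. The sketch below shows only that arithmetic. The function name wer is hypothetical, the error counts are made-up illustrative values, and it is an assumption that the package's scoring software follows this standard definition.

```shell
#!/usr/bin/env bash
# wer SUBS DELS INS REF_WORDS
# Prints the word error rate as a percentage with one decimal place.
# Fails if the reference word count is not positive.
wer() {
    awk -v s="$1" -v d="$2" -v i="$3" -v n="$4" 'BEGIN {
        if (n <= 0) exit 1
        printf "%.1f%%\n", 100 * (s + d + i) / n
    }'
}

# Illustrative counts only: 12 substitutions, 3 deletions, 5 insertions
# against 100 reference words.
wer 12 3 5 100   # prints: 20.0%
```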

  • (10/12/01) Short Training Set Definition (v1.4): A new evaluation set was added that is half the size of the original evaluation set. This will significantly speed up the overall evaluation process since there are 14 noise conditions for each sample frequency.

  • (10/09/01) Short Training Set Definition (v1.3.1): This release contains three new lists that facilitate correlating the short set definition with the noise conditions contained on the Aurora CDs that were released through ELRA.

  • (10/05/01) Short Training Set Definition (v1.3): This release contains a new short training set definition that includes data from 1/4 of each of the 14 noise conditions being evaluated. Results on this set will be available shortly.

  • (10/02/01) ETSI Frontend (v2.0): An implementation of the ETSI standard MFCC-based front end that is being used in the WSJ baseline experiments. Please see the files AAREADME.text and Readme included in the distribution for detailed instructions on how to build the software and extract features from audio data.

  • (09/28/01) Single-CPU Eval Package (v1.3): This version adds a new option that makes it easy to run a fast 1-mixture experiment. You can use this option to do quick evaluations on the short set. Instructions on how to run an evaluation can be obtained by typing "wsj_run -help". For many of you, the key command will be:

      wsj_run -num_features 39 -feature_format htk -mixtures 1 -lm devel -train_mfc_list train.list -test_mfc_list test.list
    Performance on the 30-utterance short dev test set should be 25.5% WER.

  • (09/27/01) Short Training Set Definition (v1.2): This release contains a new short training set definition that includes 1/4 of the entire WSJ training data (1,785 utterances). Performance on the 30-utterance short dev test set with a 1-mixture cross-word triphone system is 25.5% WER.

  • (09/24/01) Single-CPU Eval Package (v1.2): This package is a minor update of v1.1. The scoring software was modified to include special preprocessing of the transcriptions. This change only affects performance on the development test set, which contains special lexical items such as ".PERIOD".

  • (09/16/01) Single-CPU Eval Package (v1.1): This package demonstrates how to build a complete recognition system using a single processor. The user can specify the dimension of the feature vector, the type of features, file lists, and other relevant parameters as arguments. Please be sure that you have already installed the ISIP prototype system before running this application. Installation instructions are given below. Detailed instructions are included in the release's AAREADME.text file.

    • tar xzvf asr_wsj_tutorial_v1.1.tar.gz
    • cd asr_wsj_tutorial_v1.1
    • source <install directory for v5.11>/ISIP_ENV.sh
    • ./configure [--prefix=/<install directory>]
    • cd <install directory>
    • make
    • make install
    • source ISIP_WSJ_ENV.sh
    • wsj_run -help

  • (09/14/01) Baseline Recognition System (v5.11): Feature extraction has been integrated into the decoder utility. For more information, please see the v5.11 release. Installation and verification instructions are given below. Detailed instructions are included in the release's AAREADME.text file.

    • tar xzvf isip_proto_v5.11.tar.gz
    • ./configure [--prefix=/<install directory>]
    • cd <install directory>
    • source ISIP_ENV.sh
    • make
    • make install
    • make test

  • (08/29/01) Single-CPU Evaluation Package (v1.0): This package demonstrates how to build a complete recognition system from scratch. It trains 16-mixture, context-dependent, cross-word triphone models and decodes using the baseline system parameters on a single processor. It accepts as arguments the number of features, a training list, and a test list. Please be sure that you have already installed the ISIP prototype system before running this application. Installation instructions are given below.

    • tar xzvf asr_wsj_tutorial_v1.0.tar.gz
    • cd asr_wsj_tutorial_v1.0
    • source <install directory for v5.10>/ISIP_ENV.sh
    • ./configure [--prefix=/<install directory>]
    • cd <install directory>
    • make
    • make install
    • test_suite.sh number_of_features train_mfc.list test_mfc.list

  • (08/29/01) Sample Decoding Package (v1.2): This release includes the final tuned system to be used as the baseline evaluation system. This system was tuned using the SI-84 training database and the 330 utterances from the Nov'92 development test set. Please be sure that you have already installed the ISIP prototype system before running this sample decoding experiment. Installation instructions are given below.

    • tar xzvf aurora_release_v1.2.tar.gz
    • cd aurora_release_v1.2
    • source <install directory for v5.10>/ISIP_ENV.sh
    • ./configure [--prefix=/<install directory>]
    • cd <install directory>
    • make
    • make install
    • make test

  • (08/16/01) WSJ Subset (v1.1): A short data set that contains 415 training utterances and 30 dev test utterances. This set is designed to produce results that are indicative of what you will get when you process the full training and dev test sets. It has been designed to match the gross statistics of the larger set. It should take about 12 hours to train and about 8 hours to decode on a single 800 MHz Pentium III processor.

  • (08/15/01) WSJ Subset (v1.0): A short data set that contains 415 training utterances and 30 dev test utterances. This set is designed to produce results that are indicative of what you will get when you process the full training and dev test sets. It has been designed to match the gross statistics of the larger set. It should take about 12 hours to train and about 8 hours to decode on a single 800 MHz Pentium III processor.

  • (08/13/01) Sample Decoding Package (v1.1): This is a sample package that decodes five WSJ Eval'92 Test utterances. It also has an integrated verification mechanism. Please be sure that you have already installed the ISIP prototype system before running this sample decoding experiment. Installation and running instructions are given below.

    • tar xzvf aurora_release_v1.1.tar.gz
    • cd aurora_release_v1.1
    • source <install directory for v5.10>/ISIP_ENV.sh
    • ./configure [--prefix=/<install directory>]
    • cd <install directory>
    • make
    • make install
    • make test


  • (07/27/01) Sample Decoding Package (v1.0): This is a sample package that decodes five WSJ Eval'92 Test utterances. Please be sure that you have already installed the ISIP prototype system before running this sample decoding experiment. Installation and running instructions are given below.

    • tar xzvf aurora_release_v1.0.tar.gz
    • cd aurora_release_v1.0
    • source <install directory for v5.10>/ISIP_ENV.sh
    • ./configure [--prefix=/<install directory>]
    • cd <install directory>
    • make
    • make install
    • make test


  • (07/27/01) Baseline Recognition System (v5.10): This is an enhanced version of our prototype system that can read binary HTK-formatted features. We have also added the capability to compute delta and acceleration coefficients on the fly at run time. Installation and verification instructions are given below. Detailed instructions are included in the release's AAREADME.text file.

    • tar xzvf isip_proto_v5.10.tar.gz
    • ./configure [--prefix=/<install directory>]
    • cd <install directory>
    • source ISIP_ENV.sh
    • make
    • make install
    • make test