Aurora Front End Evaluations

What's New:

(11/12/01) Multiple-CPU Eval Package (v1.4.2): We have included a simple utility that allows you to strip silence from feature files. We also fixed a minor bug in the command-line interface that prevented lm_scale from being changed from its default value. This fix does not affect your previous results.
(10/31/01) Short Training Set Definition (v1.4.1): In this update to v1.4, we have included file lists that define the two 7,138-utterance training sets: filtered (Training Set #1) and multi-condition (Training Set #2). There were no other substantial changes to the file lists.
(10/23/01) Multiple-CPU Eval Package (v1.4.1): We fixed a bug involving mixture generation during training for mixture orders greater than one. Experiments involving mixture orders equal to one are unaffected.
(10/22/01) Utterance Endpoints (v1.0): This release contains utterance endpoints (start/stop times in secs) for all data sets used in our evaluations. Future experiments will be performed on the speech-only portions of the files. The data was automatically endpointed using our best WSJ recognition system, and the utterances were padded with 200 msec of silence.
(10/19/01) Multiple-CPU Eval Package (v1.4): This package allows users to easily run complete experiments using multiple CPUs. In addition to the parameters available in the single-CPU scripts, the user can specify which computers are to be used for training and testing.
(10/12/01) Short Training Set Definition (v1.4): A new evaluation set was added that is half the size of the original evaluation set. This will significantly speed up the overall evaluation process since there are 14 noise conditions for each sample frequency.
(10/09/01) Short Training Set Definition (v1.3.1): This release contains three new lists that facilitate correlating the short set definition with the noise conditions contained on the Aurora CDs that were released through ELRA.
(10/05/01) Short Training Set Definition (v1.3): This release contains a new short training set definition that includes data from 1/4 of each of the 14 noise conditions being evaluated. Results on this set will be available shortly.
(10/02/01) ETSI Frontend (v2.0): An implementation of the ETSI standard MFCC-based front end that is being used in the WSJ baseline experiments. Please see the files AAREADME.text and Readme included in the distribution for detailed instructions on how to build the software and extract features from audio data.
(09/28/01) Single-CPU Eval Package (v1.3): This version adds a new option that makes it easy to run a fast 1-mixture experiment. You can use this option to do quick evaluations on the short set.
(09/27/01) Short Training Set Definition (v1.2): This release contains a new short training set definition that includes 1/4 of the entire WSJ training data (1,785 utterances). Performance on the 30-utterance short dev test set with a 1-mixture cross-word triphone system is 25.5% WER.
(09/24/01) Single-CPU Eval Package (v1.2): This package is a minor update of v1.1. The scoring software was modified to include special preprocessing of the transcriptions. This change only affects performance on the development test set, which contains special lexical items such as ".PERIOD".
(09/16/01) Single-CPU Eval Package (v1.1): This package demonstrates how to build a complete recognition system using a single processor. The user can specify the dimension of the feature vector, the type of features, file lists, and other relevant parameters as arguments.
(09/14/01) Baseline Recognition System (v5.11): Feature extraction has been integrated into the decoder utility. For more information, please see the v5.11 release.
(08/29/01) Single-CPU Evaluation Package (v1.0): This package demonstrates how to build a complete recognition system from scratch. This package trains 16-mixture, context-dependent, cross-word triphone models and decodes using the baseline system parameters on a single processor. It accepts as arguments the number of features, a training list, and a test list.
(08/29/01) Sample Decoding Package (v1.2): This release includes the final tuned system to be used as the baseline evaluation system. This system was tuned using the SI-84 training database, and the 330 utterances from the Nov'92 development test set.
(08/16/01) WSJ Subset (v1.1): A short data set that contains 415 training utterances and 30 dev test utterances. This set is designed to produce results that are indicative of what you will get when you process the full training and dev test sets. It has been designed to match the gross statistics of the larger set. It should take about 12 hours to train and about 8 hours to decode on a single 800 MHz Pentium III processor.
(08/15/01) WSJ Subset (v1.0): A short data set that contains 415 training utterances and 30 dev test utterances. This set is designed to produce results that are indicative of what you will get when you process the full training and dev test sets. It has been designed to match the gross statistics of the larger set. It should take about 12 hours to train and about 8 hours to decode on a single 800 MHz Pentium III processor.
(08/13/01) Sample Decoding Package (v1.1): We have added a verification mechanism that is automatically run during installation.
(07/27/01) Sample Decoding Package (v1.0): An easy-to-use example demonstrating how to run the baseline recognition system.
(07/27/01) Baseline Recognition System (v5.10): Our prototype recognition system, which now processes HTK features directly.
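
Note on the Utterance Endpoints (v1.0) entry of 10/22/01: the following is a minimal sketch, not the released utility, of how start/stop times in seconds might be padded by 200 msec and then used to keep only the speech portion of a sequence of feature frames. It assumes that the padding is applied on each side of the speech boundaries and that the frame shift is 10 msec; neither detail is stated above. The function names pad_endpoints, frame_range, and strip_silence are hypothetical.

```python
def pad_endpoints(start, stop, duration, pad=0.2):
    """Widen a (start, stop) segment by `pad` seconds per side.

    Assumption: padding is applied on both sides and clamped to the
    file boundaries [0, duration]. All times are in seconds.
    """
    if not 0.0 <= start <= stop <= duration:
        raise ValueError("expected 0 <= start <= stop <= duration")
    return max(0.0, start - pad), min(duration, stop + pad)


def frame_range(start, stop, frame_shift=0.01):
    """Convert times in seconds to (first, last) frame indices.

    Assumption: a fixed 10 msec frame shift; `last` is exclusive.
    """
    first = int(round(start / frame_shift))
    last = int(round(stop / frame_shift))
    return first, last


def strip_silence(frames, start, stop, frame_shift=0.01):
    """Return only the frames that fall inside [start, stop)."""
    first, last = frame_range(start, stop, frame_shift)
    return frames[first:last]


if __name__ == "__main__":
    # Hypothetical 3-second file with speech from 0.5 s to 2.0 s.
    start, stop = pad_endpoints(0.5, 2.0, duration=3.0)
    frames = list(range(300))  # stand-in for 300 feature vectors
    speech = strip_silence(frames, start, stop)
    print(len(speech), speech[0], speech[-1])
```

Clamping to the file boundaries keeps the padded segment valid when speech begins or ends within 200 msec of the edge of the recording.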