| 
 Downloads:
 
 (09/30/05)
       
     isip_proto_v5.18_creare: This release has the same functionality as v5.17,
     except that the likelihood scores for the one-best output are time-normalized.
     
       
 To install this package, follow the instructions below.
 
 
 
	 tar xzvf isip_proto_v5.18_creare.tar.gz
        cd isip_proto_v5.18_creare
        ./configure
        gmake
        gmake install
        source ISIP_ENV.sh
        A typical command line for training crossword 8-mixture triphone
       models will look like this:
 
 
 
         cd <some_exp_train_directory>;
	 wsj_run -mixtures 8 -model_type xwrd_triphone \
	 -train_mfc_list all_mfcc_features.list \
	 -split_threshold 10 -merge_threshold 10 -num_occ_threshold 50 \
	 -cpus_train redeye
 
 
 A typical command line for testing will look like this:
 
 
 
         cd <some_exp_test_directory>;
	 wsj_run -mixtures 8 -model_type xwrd_triphone \
	 -test_mfc_list all_mfcc_features.list \
	 -cpus_test redeye \
	 -models_path <path_to_some_exp_train_directory>
 The above command line produces 1-best output with normalized
	likelihood scores at the phone level. If word-level normalized likelihood
	scores are required, add "-align_mode word" to the command line used
	for decoding. The optimum threshold on the DET plot was found to be
	-69.81, i.e., anything above -69.81 can be considered less
	likely than anything below the threshold.
 
 
 (09/23/05)
       
     Models for Bravo data: These are 8-mixture crossword triphone models
     that were trained on the 499 utterances from the Bravo data set. If an
     experiment has been run previously and you want to replace the old models
     with these models, replace the following directory:
     $ISIP_WSJ/exp/train/baum_welch/xwrd_tri/final_models. When tested on the
     same utterances, the WER is 0.3%. The features were provided by Creare.
     
       
 To install this package, follow the instructions below.
 
 
 
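	 tar xzvf bravo_final_models.tar.gz
        # move any old models aside so the copy replaces the directory instead of nesting inside it
        mv $ISIP_WSJ/exp/train/baum_welch/xwrd_tri/final_models $ISIP_WSJ/exp/train/baum_welch/xwrd_tri/final_models.old
        cp -rf final_models $ISIP_WSJ/exp/train/baum_welch/xwrd_tri/final_models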
        
 
 (09/23/05)
       
     isip_proto_v5.17: This release has the same functionality as v5.16, except
     that the bug that caused "nan" and "inf" values to appear as confidence
     scores has been fixed.
     
       
 To install this package, follow the instructions below.
 
 
 
	 tar xzvf isip_proto_v5.17_creare.tar.gz
        cd isip_proto_v5.17_creare
        ./configure
        gmake
        gmake install
        source ISIP_ENV.sh
        
 
 (05/10/05)
       
     Production System (r00_n11_t03): Production System release
     with the endpoint detection utility. This utility can operate in
     two modes: 1) "signal_only" (default): In this mode the utility
     writes only the endpointed data to the output files. 2) "all": In
     this mode the utility chops the entire utterance into smaller
     segments and saves them to files.
     
       
 To install this package, follow the instructions below.
 
 
 
	  tar xzvf isip_r00_n11_t03.tar.gz
	  cd isip_r00_n11_t03
	  ./configure [--prefix=/<install directory>] [--with-audiofile-prefix=/<audiofile install directory>] [--with-sphere-prefix=/<sphere install directory>] [--with-sctk-prefix=/<sctk install directory>]
	  source ISIP_BASE_ENV.sh
	  make depend
	  make install
        
 
     (03/30/05)
     
     isip_proto_v5.16 :
     In this release, we have added the capability to compute and
     output the average posterior score per frame for each link in
     the lattice (word graph). The average posterior score per
     frame for each word in the 1-best hypothesis is also computed.
     To install this package, follow the instructions given below. Detailed
     instructions are included in the release's AAREADME.text file.
     
     
 
 
        tar xzvf isip_proto_v5.16_creare.tar.gz
        cd isip_proto_v5.16_creare
        ./configure
        gmake
        gmake install
        source ISIP_ENV.sh
      The instructions for computing the posterior score
     assume that the acoustic models have already been generated using
     the Multiple-CPU ASR Tutorial (v5.0) package. See the instructions
     with the Multiple-CPU ASR Tutorial (v5.0) release on how to train
     the models.
     
     Once the acoustic models are trained, the same directory setup
     that is created by the Multiple-CPU ASR Tutorial (v5.0) is used for
     lattice generation and then for computing posterior scores
     from these lattices.
 
 Steps to generate lattices:
 
 
 
       
       1) Download the output_lattice.list file and move it to the
	    $ISIP_WSJ/exp/decode/baum_welch/xwrd_tri/grammar_decoding/lists/
	    directory, where $ISIP_WSJ is a shell environment
	    variable that points to the Multiple-CPU ASR Tutorial (v5.0).
 
       2) Download the params_lattice.text file and move it to the
	    $ISIP_WSJ/exp/decode/baum_welch/xwrd_tri/grammar_decoding/
	    directory.
 
       3) Generate lattices using the following command line:
	     trace_projector -p \
	     $ISIP_WSJ/exp/decode/baum_welch/xwrd_tri/grammar_decoding/params_lattice.text
 
 Steps to generate posterior scores using the lattices generated
      in the previous step:
 
       1) Download the input_lattice_posterior.list file and move it to the
	    $ISIP_WSJ/exp/decode/baum_welch/xwrd_tri/grammar_decoding/lists/
	    directory.
 
       2) Download the output_lattice_posterior.list file and move it to the
	    $ISIP_WSJ/exp/decode/baum_welch/xwrd_tri/grammar_decoding/lists/
	    directory.
 
       3) Download the output_posterior.list file and move it to the
	    $ISIP_WSJ/exp/decode/baum_welch/xwrd_tri/grammar_decoding/lists/
	    directory.
 
       4) Download the params_lattice_posterior.text file and move it to the
	    $ISIP_WSJ/exp/decode/baum_welch/xwrd_tri/grammar_decoding/
	    directory.
 
       5) Generate posterior scores using the following command line:
	     trace_projector -p \
	     $ISIP_WSJ/exp/decode/baum_welch/xwrd_tri/grammar_decoding/params_lattice_posterior.text
 The posterior score per frame for each word is output as the
      third column in the one-best hypotheses given by the
      $ISIP_WSJ/exp/decode/baum_welch/xwrd_tri/grammar_decoding/lists/output_posterior.list
      list. A sample output hypothesis may look like this file.
      Experiments on the FAA data showed empirically that words with
      an average posterior score per frame greater than a threshold
      of around -68 can be considered true with high confidence.
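
 As a rough illustration of this thresholding, the following Python sketch
 separates the words of a one-best hypothesis by their score. It relies only
 on what is stated above (the score is the third whitespace-separated
 column). The meaning of the other columns, the sample lines, and the
 function name are assumptions made for illustration, and -68 is the
 approximate FAA value quoted above.

```python
# Hypothetical sketch: split one-best hypothesis entries by posterior score.
# Assumption: each line is whitespace-separated and the third column is the
# average posterior score per frame; other columns are passed through as-is.

THRESHOLD = -68.0  # approximate value reported above for the FAA data


def split_by_confidence(lines, threshold=THRESHOLD):
    """Return (confident, doubtful) lists of (columns, score) pairs."""
    confident, doubtful = [], []
    for line in lines:
        columns = line.split()
        if len(columns) < 3:
            continue  # skip blank or malformed lines
        try:
            score = float(columns[2])
        except ValueError:
            continue  # third column is not a number (e.g. a header line)
        target = confident if score > threshold else doubtful
        target.append((columns, score))
    return confident, doubtful


if __name__ == "__main__":
    # Made-up sample lines; the first two columns are placeholders.
    sample = ["0 BRAVO -61.20", "1 DELTA -75.90", "2 ECHO -67.99"]
    confident, doubtful = split_by_confidence(sample)
    for columns, score in confident:
        print("accept", columns, score)
    for columns, score in doubtful:
        print("review", columns, score)
```

 The threshold is a parameter because the value above was found on the FAA
 data and may need to be re-estimated for other data.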
 
 
 
     (03/24/05)
          
     isip_questions.text: This file is used during state tying, where a
decision-tree-based framework is used to cluster phonetically similar
sounds. This download consists of:
 a) a master isip_questions_master.text file, which consists of questions
corresponding to the 41 monophones in the monophones_master.text file.
 b) the monophones_master.text file.
 c) an isip_questions_29.text file, which contains the phonetic
question sets for the 29 monophones in the monophones_29.text file.
 d) the monophones_29.text file.
 
 How to create the isip_questions.text file?
 
 This file can be easily obtained from the master questions file
(isip_questions_master.text), the reduced monophone
set (monophones_#.text file) and the full monophones_master.text file,
which has 41 monophones. "diff" the monophones in the reduced set and
the full monophone set, and remove the missing monophones from the
master isip_questions file to create the new reduced
isip_questions.text file. If no monophones remain for
a particular set of phonetic questions, those questions can be
removed from the file.
 
 
 
        tar xzvf isip_question_files.tar.gz
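
 The "diff and remove" procedure described above can be sketched in Python
 as follows. This is only an illustration: the layout of the
 isip_questions files is not modelled, a question set is represented as a
 hypothetical mapping from a question name to the monophones it covers, and
 the toy monophones and question names are made up.

```python
# Hypothetical sketch of the "diff and prune" procedure described above.
# Assumption: a question set is modelled as a mapping from a question name to
# the monophones it covers. The on-disk format of the isip_questions files is
# NOT modelled here.


def prune_questions(questions, full_set, reduced_set):
    """Drop monophones missing from the reduced set; drop empty questions."""
    missing = set(full_set) - set(reduced_set)  # the "diff" step
    pruned = {}
    for name, phones in questions.items():
        kept = [p for p in phones if p not in missing]
        if kept:  # a question with no monophones left is removed entirely
            pruned[name] = kept
    return pruned


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the released files.
    full = {"aa", "b", "d", "iy", "k"}
    reduced = {"aa", "b", "iy"}
    questions = {"vowel": ["aa", "iy"], "stop": ["b", "d", "k"], "velar": ["k"]}
    print(prune_questions(questions, full, reduced))
```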
      
 
     (03/02/05)
     
     isip_proto_v5.16_beta : This package is an upgraded version of prototype system v5.15. It adds a feature that is not present in v5.15: computing the posterior score of every word in the word graph. These posteriors can be used as a confidence estimate.
     
     
 
 
        tar xzvf isip_proto_v5.16_beta.tar.gz
        cd isip_proto_v5.16_beta
        ./configure
        make
        make install
        source ISIP_ENV.sh
      A typical command line for generating lattices would look like this:
 
 
 
       
       trace_projector -p params_lattice.text
       
 The above command line generates the lattices that are used as input for the posterior computation. The command line for computing the posterior lattices is as follows:
 
 trace_projector -p  params_lattice_posterior.text
 
 The parameter files for the two command lines above look very similar, but there are three main differences:
 1) There is a new parameter called compute_posterior. By default compute_posterior is set to 'no', but for posterior generation it should be set to 'yes'.
 2) The input_lattice list for posterior computation is the output_lattice list used by the first command line.
 3) The output_lattice list for posterior computation points to the files into which the lattices, along with the posterior scores, are written.
 
 
 (01/31/05)
     
     lexicon_and_monophone_files.tar.gz (v1.0) :
     This tar package contains the following:
 1) 'monophones.text' file which has all the monophones corresponding to
these 18 words:
 Bravo,
   Delta,
   Echo,
   FoxTrot,
   Golf,
   Hotel,
   India,
   Juliet,
   Kilo,
   Mike,
   November,
   Oscar,
   Papa,
   Quebec,
   Tango,
   Victor,
   Whiskey,
   Yankee.
 
 2) 'lexicon.text' file, which has the monophone mappings for the above words.
 
 3) 'master_lexicon.text' file, which contains the monophone mappings for
   around 30,000 words.
 
 4) 'create_triphones.pl' Perl script, which uses the monophones as
   input to generate an all_xwrd_triphones.list file.
 Command: 
   perl create_triphones.pl notags_monophones.text > all_xwrd_triphones.list
 
 5) 'notags_monophones.text' file, which contains just the monophones from
   the monophones.text file, without the comments. This file is used by
   the Perl script to generate the all_xwrd_triphones.list file.
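
 To give an idea of what generating a crossword triphone list from a
 monophone list involves, here is a minimal Python sketch. It is not the
 logic of create_triphones.pl: it assumes triphones are written in the
 common "left-center+right" notation and simply enumerates every
 combination, whereas the real script may treat silence or short-pause
 symbols differently. The toy monophones are made up.

```python
# Hypothetical sketch of cross-word triphone enumeration.
# Assumptions: triphones use "left-center+right" notation and every
# combination of the input monophones is wanted. The real create_triphones.pl
# may handle silence or short-pause symbols differently.
from itertools import product


def all_triphones(monophones):
    """Return every left-center+right combination of the given monophones."""
    phones = sorted(set(monophones))
    return [f"{left}-{center}+{right}"
            for left, center, right in product(phones, repeat=3)]


if __name__ == "__main__":
    toy = ["b", "r", "aa"]  # toy set for illustration
    triphones = all_triphones(toy)
    print(len(triphones))  # number of combinations for three monophones
    print(triphones[:3])
```

 The list grows with the cube of the number of monophones, which is one
 reason to start from the reduced monophone set rather than the master set.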
 
 
 
 (12/08/04)
     
     gen_trans_with_sp.pl (v1.0) :
     This script generates the monophone transcription files for the
     corresponding word transcription files. The transcriptions are
     generated with 'sp' at word boundaries. To create the
     'no sp' monophone transcription file, just remove the sp from the file
     created by the script.
     
     
 A typical command line for generating the monophone transcriptions will look like this:
 
 gen_trans_with_sp.pl lexicon.text all_word_transcription.text output_file
 
 Note: the all_word_transcription.text file in this case does not
     contain the utterance IDs, i.e., it contains just the word
     transcriptions.
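
 The expansion this script performs can be pictured with the short Python
 sketch below. It is not gen_trans_with_sp.pl itself: the lexicon layout
 ("WORD phone phone ..."), the placement of 'sp' between adjacent words, the
 function names, and the toy pronunciations are all assumptions made for
 illustration.

```python
# Hypothetical sketch of word-to-monophone transcription expansion.
# Assumptions: each lexicon line is "WORD phone phone ...", each transcription
# holds only words (no utterance id), and "sp" goes between adjacent words.


def load_lexicon(lines):
    """Map each word to its list of monophones."""
    lexicon = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            lexicon[fields[0]] = fields[1:]
    return lexicon


def to_monophones(words, lexicon, with_sp=True):
    """Expand a list of words into monophones, optionally joined by 'sp'."""
    phones = []
    for index, word in enumerate(words):
        if with_sp and index > 0:
            phones.append("sp")
        phones.extend(lexicon[word])  # KeyError flags an out-of-lexicon word
    return phones


if __name__ == "__main__":
    # Toy pronunciations for illustration; not copied from lexicon.text.
    lexicon = load_lexicon(["GOLF g aa l f", "KILO k iy l ow"])
    print(" ".join(to_monophones(["GOLF", "KILO"], lexicon)))
    print(" ".join(to_monophones(["GOLF", "KILO"], lexicon, with_sp=False)))
```

 Passing with_sp=False corresponds to the 'no sp' variant mentioned above.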
 
 
 
     (11/07/04)
     
     Multiple-CPU ASR Tutorial (v5.0): This package is used to run
     recognition experiments on the FAA data. Training and testing on the
     same data, with the state-tying thresholds given in the command line
     below, yields a word error rate of 0.8%. Decoding is
     performed using a loop grammar.
     
     
 
 
        tar xzvf asr_va_tutorial_v5.0.tar.gz
        cd asr_package
        cd asr_va_tutorial_v5.0   
        source <install directory for v5.15>ISIP_ENV.sh	      
        ./configure --prefix=.
        make
        make install
        source ISIP_WSJ_ENV.sh
        wsj_run -help
      A typical command line for training crossword 8-mixture triphone
       models will look like this:
 
 
 
         cd <some_exp_train_directory>;
	 wsj_run -mixtures 8 -model_type xwrd_triphone \
	 -train_mfc_list all_mfcc_features.list \
	 -split_threshold 10 -merge_threshold 10 -num_occ_threshold 50 \
	 -cpus_train isip218 isip218
 
 If the test data is going to be unseen during training, then it is
	 recommended to use a num_occ_threshold of 400 and merge and split
	 thresholds of around 20. These thresholds were found by cross-validating
	 on the FAA data.
 A typical command line for testing will look like this:
 
 
 
         cd <some_exp_test_directory>;
	 wsj_run -mixtures 8 -model_type xwrd_triphone \
	 -test_mfc_list all_mfcc_features.list \
	 -cpus_test isip218 isip218 \
	 -models_path <path_to_some_exp_train_directory>
 
 The above command line generates output files that contain
       the triphone hypotheses. If required, the triphone results can be
       converted to their monophone equivalents using
       'convert_tri_to_mono', a utility newly added to this package.
       It is a Perl script that is installed along with
       the other utilities in the package.
 
 The command line to convert the triphone results to monophones is as
       follows:
 
 convert_tri_to_mono <triphone output filename> <monophone output filename>
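
 Conceptually, the conversion keeps only the center phone of each triphone
 label. The Python sketch below illustrates the idea. It is not the
 convert_tri_to_mono script: it assumes labels in the common
 "left-center+right" notation and does not model the layout of the
 hypothesis files.

```python
# Hypothetical sketch of triphone-to-monophone label conversion.
# Assumption: triphone labels use "left-center+right" notation and
# context-independent labels (e.g. "sil") carry no "-" or "+".


def triphone_to_monophone(label):
    """Return the center phone of a triphone label."""
    center = label.split("-", 1)[-1]  # drop the left context, if any
    return center.split("+", 1)[0]    # drop the right context, if any


if __name__ == "__main__":
    for label in ["k-iy+l", "sil", "g+aa", "l-f"]:
        print(label, "->", triphone_to_monophone(label))
```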
 
 
 
     (10/05/04)
     
     Models trained on the segmented FAA data (Prototype):
     The models trained on the segmented FAA data can be downloaded from here.
     This tar package contains the entire train directory. The extracted
     train directory must replace the old train directory in your $ISIP_WSJ/exp
     directory.
     
     
 
 
        tar xzvf models_faa.tar.gz
      
 
     (10/05/04)
     
     Segmented FAA features (mfcc):
     The segmented FAA features can be downloaded by clicking the link above.
     
 
 
        tar xzvf segmented_faa_features.tar.gz
      
 
     (10/05/04)
     
     Segmented FAA raw data:
     The segmented FAA data can be downloaded by clicking the link above.
     
 
 
        tar xzvf segmented_faa_raw.tar.gz
      
 
 (09/27/04)
     
     DET curve plotting package:
     This package is provided by NIST (National Institute of Standards and
     Technology) for plotting DET curves. It has been slightly modified
     to suit specific requirements. This software requires Matlab.
     
     
 
 
        tar xzvf det_package.tar.gz
      
 
 (09/27/04)
     
     gen_wer.pl (v1.0) :
     This is a scoring script that post-processes the lattices generated using
     the prototype system. Please be sure you have Perl installed on your system.
     
 A typical command line for scoring a lattice will look like this:
 
 gen_wer.pl lattices_path output_path delta_value format_level alignment_file
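
 For readers unfamiliar with the WER figures quoted on this page, the Python
 sketch below computes a word error rate from a reference and a hypothesis
 string using the textbook edit-distance definition. It is not the logic of
 gen_wer.pl, which works on lattices and takes the arguments shown above;
 the function name and the example strings are made up for illustration.

```python
# Hypothetical sketch: word error rate via Levenshtein alignment.
# This is the textbook definition, not the logic of gen_wer.pl.


def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / len(reference)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] = edit distance between ref[:i-1] and hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("BRAVO DELTA ECHO GOLF", "BRAVO DELTA GOLF"))
```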
 
 
 
 (07/09/04)
       
       Multiple-CPU ASR Tutorial (v4.0):
       The fourth release of a package that is a modified version of
       the Aurora scripts. This package is primarily meant for word spotting
       experiments.
       Please be sure that you have already installed the
       ISIP prototype system (v5.14) before running this application.
       To install this package, follow the instructions below.
       Detailed instructions are included in the release's AAREADME.text
       and the INSTRUCTIONS.text files.
       
       
 
 
	  tar xzvf asr_va_tutorial_v4.0.tar.gz
	  cd asr_va_tutorial_v4.0
	  source <install directory for v5.14>ISIP_ENV.sh	      
	  ./configure --prefix=.
	  make
	  make install
	  source ISIP_WSJ_ENV.sh
	  wsj_run -help
        A typical command line for decoding 1-mixture monophone models will
       look like this:
 
 
 
         cd <exp_directory>;
	 wsj_run -mixtures 1 -model_type monophone \
	 -decode_mode grammar_decoding -align_mode phone \
	 -test_mfc_list test_1247_v4.0_mfc.list \
	 -cpus_test isip218 isip218 -models_path .
 
 
 (11/07/03)
       
       Multiple-CPU ASR Tutorial (v3.0):
       The third release of a package that is a modified version of
       the Aurora scripts. This package supports a few new features
       including ngram decoding that can be used to generate the
       alignments for the unseen phrases. The ngram decoding is based
       on our Switchboard language model. Note that the decoding will
       require about 700 MB of main memory because of the large
       vocabulary size.
       Please be sure that you have already installed the
       ISIP prototype system (v5.14) before running this application.
       To install this package, follow the instructions below.
       Detailed instructions are included in the release's AAREADME.text file.
       
       
 
 
	  tar xzvf asr_va_tutorial_v3.0.tar.gz
	  cd asr_va_tutorial_v3.0
	  source <install directory for v5.14>ISIP_ENV.sh	      
	  ./configure --prefix=.
	  make
	  make install
	  source ISIP_WSJ_ENV.sh
	  wsj_run -help
        A typical command line for training crossword 4-mixture triphone
       models will look like this:
 
 
 
         cd <some_exp_train_directory>;
	 wsj_run -mixtures 4 -model_type xwrd_triphone \
	 -train_mfc_list train_1249_v3.0_mfc.list \
	 -cpus_train isip218 isip218
 A typical command line for testing (generating alignments) will
       look like this:
 
 
 
         cd <some_exp_test_directory>;
	 wsj_run -mixtures 4 -model_type xwrd_triphone \
	 -decode_mode bigram_decoding -align_mode word \
	 -test_mfc_list devtest_364_v3.0_mfc.list \
	 -cpus_test isip218 isip218 \
	 -models_path <path_to_some_exp_train_directory>
 
 
 (11/07/03)
       
       Multiple-CPU ASR Tutorial (v2.0):
       The second release of a package that is a modified version of
       the previous version. It supports training cross-word models
       and network decoding for the 1249 pre-transcribed speech files
       provided in the Creare Phase 02 data.
       Please be sure that you have already installed the
       ISIP prototype system (v5.14) before running this application.
       To install this package, follow the instructions below.
       Detailed instructions are included in the release's AAREADME.text file.
       
       
 
 
	  tar xzvf asr_va_tutorial_v2.0.tar.gz
	  cd asr_va_tutorial_v2.0
	  source <install directory for v5.14>ISIP_ENV.sh	      
	  ./configure --prefix=.
	  make
	  make install
	  source ISIP_WSJ_ENV.sh
	  wsj_run -help
        A typical command line for training crossword 4-mixture triphone
       models will look like this:
 
 
 
         cd <some_exp_train_directory>;
	 wsj_run -mixtures 4 -model_type xwrd_triphone \
	 -train_mfc_list train_1249_v2.0_mfc.list \
	 -cpus_train isip218 isip218
 A typical command line for testing (generating alignments) will
       look like this:
 
 
 
         cd <some_exp_test_directory>;
	 wsj_run -mixtures 4 -model_type xwrd_triphone -align_mode phone \
	 -test_mfc_list devtest_1249_v2.0_mfc.list \
	 -cpus_test isip218 isip218 \
	 -models_path <path_to_some_exp_train_directory>
 This should result in a WER of 2.0%.
 
 
 (11/07/03)
       
       Frontend for Waveform Audio Files (v1.0): A frontend
       that converts audio files in WAV format, sampled at 8 kHz,
       into MFCC features. An AAREADME.text file included in the
       package provides detailed instructions.
       
 
 (11/07/03)
       
       Production System (r00_n11_t02): Production System release
       that supports both reading and writing the WAV format with ADPCM
       compression, as supported by SGI's audiofile library.
       
 To install this package, follow the instructions below.
 
 
 
	  tar xzvf isip_r00_n11_t02.tar.gz
	  cd isip_r00_n11_t02
	  ./configure [--prefix=/<install directory>] [--with-audiofile-prefix=/<audiofile install directory>] [--with-sphere-prefix=/<sphere install directory>] [--with-sctk-prefix=/<sctk install directory>]
	  source ISIP_BASE_ENV.sh
	  make depend
	  make install
        
 
 (10/30/03)
       
       Forced Alignments (v1.0): The forced alignments of the
       data collection phase_02.
       
 
 (09/29/03)
       
       Verification System (v1.0): The first release of a verification
       toolkit based on the production system. An AAREADME file
       included in the release provides detailed instructions.
       
 
 (08/06/03)
       
       Production System (r00_n11_t01): Production System release
       that supports both reading and writing in NIST's Sphere format
       and the formats supported by SGI's audiofile library.
       
 To install this package, follow the instructions below.
 
 
 
	  tar xzvf isip_r00_n11_t01.tar.gz
	  cd isip_r00_n11_t01
	  ./configure [--prefix=/<install directory>] [--with-audiofile-prefix=/<audiofile install directory>] [--with-sphere-prefix=/<sphere install directory>] [--with-sctk-prefix=/<sctk install directory>]
	  source ISIP_BASE_ENV.sh
	  make depend
	  make install
        
 
 (07/24/03)
       
       Production System (r00_n11_t00): Production System release
       that supports NIST's Sphere format and the formats supported by
       SGI's audiofile library.
       
 To install this package, follow the instructions below.
 
 
 
	  tar xzvf isip_r00_n11_t00.tar.gz
	  cd isip_r00_n11_t00
	  ./configure [--prefix=/<install directory>] [--with-audiofile-library=/<audiofile lib directory>] [--with-audiofile-includes=/<audiofile include directory>] [--with-sp-library=/<sphere lib directory>] [--with-sp-includes=/<sphere include directory>]  
	  source ISIP_BASE_ENV.sh
	  make depend
	  make install
        
 
 (07/07/03)
       
       MFCC Features (v00): A recipe for converting 16 kHz raw
       files to MFCC features stored in raw files.
       
 
 (06/26/03)
       
       Monophone Tutorial Overview (v01): This contains a brief
       synopsis of each step required to train and evaluate a
       single-mixture context-independent (monophone) system
       implemented in the Multiple-CPU ASR Tutorial (v1.0) package.
       
 
 (06/25/03)
       
       Monophone Training Overview (v00): This page gives an
       overview of the recipe used in the monophone training process
       implemented in the Multiple-CPU ASR Tutorial (v1.0) package.
       
 
 (06/17/03)
       
       Creare Phase-01 Set (v1.0): This package contains the
       training set, test set and devtest definitions. These
       definitions will allow you to train and decode.
       
 
 (06/16/03)
       
       Multiple-CPU ASR Tutorial (v1.0):
       The first release of a package that is a modified version of the
       Aurora scripts, and supports a few new features including
       network decoding.
       Please be sure that you have already installed the
       ISIP prototype system (v5.14) before running this application.
       To install this package, follow the instructions below.
       Detailed instructions are included in the release's AAREADME.text file.
       
       
 
 
	  tar xzvf asr_va_tutorial_v1.0.tar.gz
	  cd asr_va_tutorial_v1.0
	  source <install directory for v5.14>ISIP_ENV.sh	      
	  ./configure --prefix=.
	  make
	  make install
	  source ISIP_WSJ_ENV.sh
	  wsj_run -help
        A typical command line for training single mixture monophone
       models will look like this:
 
 
 A typical command line for testing will look like this:
 
 
 
         cd <some_exp_test_directory>;
	 wsj_run -test_mfc_list devtest_255_v1.0_mfc.list \
	 -cpus_test isip206 isip207 isip208 isip209 \
	 -models_path <path_to_some_exp_train_directory>
 This should result in a WER of 3.4%.
 
 
 (06/04/03)
       
       Project Bibliography (v00): The list below gives a good
       overview of various approaches that might be relevant to this
       project.
  
 |