Steps for building a speech recognizer to recognize telephone numbers :
------------------------------------------------------------------------

1. First, record the speech on the DAT machine. This gives you the audio
   file in raw format.
   Command : narecord -s 8000 test.raw

2. Then convert this raw file to a wav file, because this is the format
   the feature extraction program accepts.
   Command : /ftp/pub/resources/courses/ece_8993_speech/homework/1998/problem_07/balakrishnama/scripts/raw_to_nist.sh

3. Once you have the .wav file (although the script is named raw_to_nist,
   its output is in wav format), all you have to do is extract the features
   from this file. Our decoder requires these features as input.
   Command 1 : feature/cparam -m -w 25 -p 12 -d -g -e -H NIST data/four_digit/test.wav data/four_digit/test.mfcc
   Command 2 : feature/cview -h -n 39 data/four_digit/test.mfcc > data/four_digit/test.dat

4. The program cview prints the values in a different format from the one
   our decoder requires. The decoder needs 39-dimensional features, so
   convert the cview output to that format with a script. (You will also
   have to use some macros, because the output contains other things such
   as column numbers.)
   The data you obtain from cview will be of the format :
   [1] 9.3333 [2] 5.4444 [3] 4.55555
   [4] 3.4444 [5] 4.5555 [6] 4.56778
   ......
   ......

   You need to strip all of the [..] numbers, which are frame numbers,
   and retain just the feature values, all on one line. Save the result
   to a file; this becomes your input.text.

5. Finally, use the trace_projection decoder to recognize the speech and
   obtain the output (speech-to-text file).
   Command : nice -19 /ftp/pub/resources/courses/ece_8993_speech/homework/1998/utilities/decoder/trace_projection/bin/i386_SunOS_5.6/trace_projector -p data/input_files/params.text -n 5 -c 3 -g 2 -demo

   Before running the decoder, you have to prepare your params.text file.
   Please refer to :
   /ftp/pub/resources/courses/ece_8993_speech/homework/1998/problem_07/balakrishnama/data/input_files/params.text

   Most of the files come from our main decoder version, but these are
   the files we need to prepare for our experiment :
   grammar.text
   lexicon.text
   input.text    - this is ready from step (4)
   
   For grammar.text and lexicon.text, please refer to my directory.

6. If you are running the decoder in -demo mode, then after issuing the
   command you need to key in the number of frames. You can find the
   number of frames in your input file by running wc on input.text.
   Once you key in the number of frames, the output is written to the
   output file named in params.text.

7. That's about it! The output may or may not match what you spoke,
   but silences and sp may be easily recognized. This is because our
   system is trained on telephone data, so it will not accurately
   recognize data obtained from a DAT machine.
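
The format conversion in step 4 and the frame count in step 6 can be
sketched in shell as below. This is only a rough sketch, not the script
referred to in step 4. The function names strip_cview and count_frames are
hypothetical, and the sketch assumes that the decoder expects one frame per
line with 39 values each, so that the line count of input.text equals the
number of frames. If your input.text is laid out differently, adjust the
grouping accordingly.

```shell
#!/bin/sh
# Hypothetical helpers; they are not part of the course utilities.

# strip_cview [DIM]
# Reads cview-style text on stdin, removes the bracketed "[n]" numbers,
# and regroups the remaining values DIM per line (default 39).
# Assumption: one DIM-value line per frame is the layout the decoder wants.
strip_cview() {
    dim="${1:-39}"
    sed 's/\[[0-9]*\]//g' |
    awk -v dim="$dim" '
        {
            for (i = 1; i <= NF; i++) {
                if (n % dim) printf " "        # separator inside a line
                printf "%s", $i
                n++
                if (n % dim == 0) printf "\n"  # line is full, start a new one
            }
        }
        END { if (n % dim) printf "\n" }       # terminate a partial last line
    '
}

# count_frames FILE
# Prints the number of lines in FILE, which equals the number of frames
# under the one-frame-per-line assumption above.
count_frames() {
    wc -l < "$1" | tr -d ' '
}
```

With these definitions loaded, something like
"strip_cview 39 < data/four_digit/test.dat > input.text" followed by
"count_frames input.text" would produce the input file and the number to
key in at the -demo prompt.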


All the commands used :
-----------------------

# Record the audio from the DAT machine in raw format
narecord -s 8000 test.raw

# convert the raw file to wav file using the script
scripts/raw_to_nist.sh

# To obtain the mfcc file from the wav file, to get the feature values to be
# used as input for the decoder
feature/cparam -m -w 25 -p 12 -d -g -e -H NIST data/four_digit/test.wav data/four_digit/test.mfcc
 
# To print the mfcc feature values to a file
feature/cview -h -n 39 data/four_digit/test.mfcc > data/four_digit/test.dat

# Strip the [..] numbers from test.dat and save the result as input.text
# (see step 4)

# To run the decoder
nice -19 /ftp/pub/resources/courses/ece_8993_speech/homework/1998/utilities/decoder/trace_projection/bin/i386_SunOS_5.6/trace_projector -p data/input_files/params.text -n 5 -c 3 -g 2 -demo