3.3.3 Rapid Prototyping: Absolute MFCC Computations

While the Fourier transform provides a valuable method for analyzing the frequency spectrum of a digital signal, additional processing is needed to extract the features a speech recognizer requires. The Mel-Frequency Cepstral Coefficient (MFCC) representation is one such method: it further analyzes the Fast Fourier Transform (FFT) of the speech signal. The value of the method is attributed to its similarity to the functioning of the human auditory system.

MFCCs use a mathematical transformation called the cepstrum, which is the inverse Fourier transform of the log-spectrum of the speech signal. The logarithmic nature of the technique is significant because the human auditory system perceives sound on a logarithmic scale above certain frequencies.
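To make the definition concrete, the cepstrum of one N-sample frame x[n] can be written in discrete form as below. The third line is one widely used approximation of the mel scale (f in Hz), which is roughly linear at low frequencies and logarithmic above about 1 kHz. These formulas are a general sketch; the symbols are chosen here for illustration, and the constants used by the toolkit itself may differ.

```latex
X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \qquad k = 0, \dots, N-1

c[n] = \frac{1}{N} \sum_{k=0}^{N-1} \log\lvert X[k] \rvert \, e^{j 2\pi k n / N}, \qquad n = 0, \dots, N-1

m(f) = 2595 \, \log_{10}\!\left(1 + \frac{f}{700}\right)
```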

A state-of-the-art speech recognizer can be built using 12 MFCCs plus the first- and second-order derivatives of those coefficients. This section shows how to use Transform Builder to create the signal flow graph shown below, which computes the 12 basic MFCCs.
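As a rough illustration of how 12 coefficients grow into a 36-dimensional feature vector, the Python sketch below appends first- and second-order derivatives to each frame. It assumes a regression-style derivative over a window of two frames on each side, a common choice that is not necessarily what the toolkit uses; the names `deltas`, `add_derivatives`, and `width` are hypothetical.

```python
def deltas(frames, width=2):
    """Regression-based time derivative of a sequence of feature vectors.

    Assumption: a +/- `width` frame regression window; other recognizers
    may use a different window or a simple difference.
    """
    num_frames = len(frames)
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(num_frames):
        vec = []
        for d in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                # Clamp indices at the edges by repeating the first/last frame.
                ahead = frames[min(t + n, num_frames - 1)][d]
                behind = frames[max(t - n, 0)][d]
                acc += n * (ahead - behind)
            vec.append(acc / denom)
        out.append(vec)
    return out


def add_derivatives(frames, width=2):
    """Append first- and second-order derivatives to each feature vector."""
    first = deltas(frames, width)
    second = deltas(first, width)
    return [c + d1 + d2 for c, d1, d2 in zip(frames, first, second)]
```

With 12 coefficients per frame, each vector returned by `add_derivatives` has 12 + 12 + 12 = 36 elements.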

[Figure: signal flow graph: Input -> Window -> Spectrum -> Cepstrum -> Output]
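The stages of the signal flow graph (Input, Window, Spectrum, Cepstrum, Output) can be mimicked outside Transform Builder with a short Python sketch. This is only an illustration under stated assumptions: it uses a Hamming window, a naive DFT in place of an FFT, and a plain cepstrum with no mel filter bank, so its numbers will not match the toolkit's MFCC output. The frame length, hop size, and all function names are hypothetical.

```python
import cmath
import math


def hamming(n_samples):
    """Hamming window coefficients (an assumed window type)."""
    if n_samples == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n_samples - 1))
            for i in range(n_samples)]


def dft(x):
    """Naive O(N^2) discrete Fourier transform; an FFT yields the same values."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def real_cepstrum(frame, floor=1e-10):
    """Inverse DFT of the log-magnitude spectrum of one frame."""
    n = len(frame)
    # Floor the magnitude so that log() never sees zero.
    log_mag = [math.log(max(abs(v), floor)) for v in dft(frame)]
    # The log-magnitude spectrum of a real signal is real and even,
    # so the inverse DFT is real; keep only the real part.
    return [sum(log_mag[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def split_frames(signal, frame_len, hop):
    """Split the signal into overlapping frames of frame_len samples."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def cepstral_features(signal, frame_len=64, hop=32, num_coeffs=12):
    """Input -> Window -> Spectrum -> Cepstrum -> Output, one vector per frame."""
    window = hamming(frame_len)
    features = []
    for frame in split_frames(signal, frame_len, hop):
        windowed = [s * w for s, w in zip(frame, window)]
        cep = real_cepstrum(windowed)
        # Keep coefficients 1..num_coeffs, skipping c[0] (the mean log magnitude).
        features.append(cep[1:num_coeffs + 1])
    return features
```

Calling `cepstral_features(samples)` on a list of float samples returns one 12-element list per frame.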

See Section 3.5 to learn how to build a complete front end using these absolute MFCCs and their first- and second-order derivatives. For a more detailed theoretical description of MFCCs, see our on-line workshop notes.

This section is organized as follows:


Configuration: View individual component configurations.

Click on any of the components below to view their configuration.

[Figure: Section 3.3.3 component configuration diagram]

Once you have configured each object, save the configuration to a file as explained in Section 3.3.1. Compare your new recipe file to this one: recipe_freq_compare.sof

Verification: Verify the previous step using this exercise.

From the directory $ISIP_TUTORIAL/sections/s03/s03_03_p03/, run the command:

    isip_transform -param recipe_freq.sof -type text -suffix _freq -verbos BRIEF speech.sof

    Output:
    
    total recipes 1
    loading parameter file: recipe_freq.sof
      processing file 1: speech.sof
        processing pfile 1: recipe_freq.sof
    isip_transform: processed 1 file(s), attempted 1 file(s).
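
If you want to script this check, the final summary line of the output can be parsed to confirm that every attempted file was processed. The Python sketch below is one hedged way to do that; `parse_summary` and `run_transform` are hypothetical helper names, and whether the tool writes the summary to stdout or stderr is an assumption, so both are searched.

```python
import re
import subprocess

# Matches the summary line shown in the output above.
SUMMARY = re.compile(r"processed (\d+) file\(s\), attempted (\d+) file\(s\)")


def parse_summary(output):
    """Return (processed, attempted) from the tool output, or None if absent."""
    match = SUMMARY.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def run_transform(command):
    """Run the command (a list of arguments) and return its parsed summary."""
    result = subprocess.run(command, capture_output=True, text=True)
    # Assumption: the summary may appear on stdout or stderr, so search both.
    return parse_summary(result.stdout + result.stderr)


# Example (requires the ISIP tools on PATH, run from the tutorial directory):
# run_transform(["isip_transform", "-param", "recipe_freq.sof", "-type", "text",
#                "-suffix", "_freq", "-verbos", "BRIEF", "speech.sof"])
```

A result in which the two numbers are equal indicates that no file failed.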
                
The feature measurements are stored in the file speech_freq.sof. Compare your feature file to speech_freq_compare.sof.
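
Because the feature file is written as text in this exercise, the comparison can be done with any diff tool. The Python sketch below shows one possible approach using the standard library; the function names are hypothetical, and it compares the files line by line as plain text, so even a harmless difference in number formatting will be reported.

```python
import difflib
from pathlib import Path


def diff_text_files(path_a, path_b):
    """Return unified-diff lines between two text files (empty if identical)."""
    lines_a = Path(path_a).read_text().splitlines()
    lines_b = Path(path_b).read_text().splitlines()
    return list(difflib.unified_diff(lines_a, lines_b,
                                     fromfile=str(path_a), tofile=str(path_b),
                                     lineterm=""))


def report_differences(path_a, path_b):
    """Print a short report and return True when the files match."""
    diff = diff_text_files(path_a, path_b)
    if not diff:
        print("files match")
        return True
    print("\n".join(diff))
    return False
```

For this exercise the call would be `report_differences("speech_freq.sof", "speech_freq_compare.sof")`.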

Note the format of the file. The top portion contains header information from the configuration, such as the number of features extracted, the frame duration, and the sample frequency. The data values appear in the latter part of the file. Because these Sof files are text files, you can view them in any text editor or other program that displays text. Normally, however, we will output feature files as binary files.
   