3.2.2 Signal Flow Graphs: Frequency Domain Analysis

You have now viewed signal flow graph representations for extracting energy, an important time-domain feature needed by the speech recognizer. As discussed, the recognizer also needs frequency-domain features. The signal flow graph below corresponds to the block diagram for the frequency domain example given in Section 3.1.3. Note that the block previously labeled Spectrum is shown as a component labeled Spec. This component represents the algorithm used for frequency spectrum analysis, such as a Fourier Transform. Click on any of the components in the graph for further details.

While the Fourier Transform provides a valuable method for analyzing the frequency spectrum of a signal, additional methods are needed to fully measure the features required by a speech recognizer. Mel-Frequency Cepstrum Coefficients (MFCCs) are an example of a method that further analyzes the Fast Fourier Transform of the speech signal. The value of the method is attributed to its similarity to the functioning of the human auditory system. MFCCs use a mathematical transformation called the cepstrum, which computes the inverse Fourier transform of the log spectrum of the speech signal. The logarithmic nature of the technique is significant because the human auditory system perceives sound on a logarithmic scale above certain frequencies.

The signal flow graph below includes a component labeled Ceps for cepstral analysis. Click on any of the components of the graph for further description. See the workshop notes on signal processing for a more detailed theoretical description of MFCCs.
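The cepstrum computation described above (the inverse Fourier transform of the log spectrum) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation behind the Spec or Ceps components: the function names (dft, idft, real_cepstrum, hz_to_mel), the use of the magnitude spectrum, and the small floor value are all choices made for this example. A naive transform stands in for the Fast Fourier Transform to keep the sketch dependency-free, and the mel formula shown is one commonly used variant. A full MFCC front end would also apply a mel-spaced filter bank, which is omitted here.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, O(N^2); adequate for short frames."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform (includes the 1/N scaling)."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log-magnitude spectrum of one frame of samples.

    `floor` is an assumed guard that keeps log() away from zero magnitudes.
    """
    spectrum = dft(frame)
    log_magnitude = [math.log(max(abs(c), floor)) for c in spectrum]
    # The log-magnitude spectrum of a real frame is symmetric,
    # so the inverse transform is real up to rounding error.
    return [c.real for c in idft(log_magnitude)]


def hz_to_mel(f_hz):
    """One commonly used mel-scale formula: near-linear at low frequencies,
    logarithmic at higher ones."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


if __name__ == "__main__":
    # Hypothetical test frame: a 200 Hz tone plus a weaker 1200 Hz tone,
    # 64 samples at an assumed 8 kHz sampling rate.
    rate = 8000
    frame = [
        math.sin(2 * math.pi * 200 * t / rate)
        + 0.5 * math.sin(2 * math.pi * 1200 * t / rate)
        for t in range(64)
    ]
    print(real_cepstrum(frame)[:8])
    print(hz_to_mel(1000.0))
```

As a sanity check on the definition, a frame holding a single unit impulse has a flat spectrum of magnitude 1, so its log spectrum and therefore its cepstrum are zero everywhere. In practice the naive transforms would be replaced by a Fast Fourier Transform routine, which produces the same values far more quickly for realistic frame lengths.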