PROBLEM 4:
|
FEATURE ANALYSIS
|
The goal of this assignment is to teach you the fundamentals of
the signal processing component of a speech recognition system.
We will focus on two techniques:
mel frequency-spaced cepstral coefficients (MFCCs) and
filter bank amplitudes (FBAs).
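As a rough illustration of how the two feature types relate, the sketch below computes log mel filter bank amplitudes for one frame and then applies a cosine transform to obtain cepstral coefficients. It is a generic textbook-style version and not the front-end specified for this assignment: the function names, the mel formula constants, the 24 filters, and the 12 coefficients are all assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    # One common mel-scale formula (an assumption, not the course's definition).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters with centers equally spaced on the mel scale."""
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2))
    bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    fbank = np.zeros((n_filters, len(bin_freqs)))
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        # The smaller of the two ramps, floored at zero, is a unit-height triangle.
        fbank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fbank

def frame_features(frame, fs, n_filters=24, n_ceps=12):
    """Return (log filter bank amplitudes, cepstral coefficients) for one frame."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    fba = np.log(mel_filterbank(n_filters, n_fft, fs) @ power + 1e-10)
    # A DCT-II of the log filter bank outputs gives the cepstral coefficients.
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    dct = np.cos(np.pi * k * (n + 0.5) / n_filters)
    return fba, dct @ fba
```

With this unnormalized transform, coefficient 0 is simply the sum of the log filter bank amplitudes, which gives one easy sanity check on an implementation.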
-
Generate a synthetic vowel by summing three sine waves of equal
amplitude at frequencies of 500, 1500, and 2500 Hz.
Process this signal through a standard MFCC front-end (details will
follow in class), and through a front-end that uses log spectral
amplitudes directly, followed by a global principal components
analysis. Show that the results make sense by comparing them
against theoretically predicted values.
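A minimal sketch of the signal generation and a first sanity check is given here. The 8 kHz sampling rate, the 0.5 s duration, the 256-sample Hamming-windowed frame, and all function names are assumptions, since the assignment does not specify them. The check relies on the theoretical prediction that the log spectrum of this signal has its three strongest peaks at the component frequencies.

```python
import numpy as np

FS = 8000                          # assumed sampling rate in Hz
DURATION = 0.5                     # assumed signal length in seconds
FREQS = (500.0, 1500.0, 2500.0)    # component frequencies from the problem

def synth_vowel(fs=FS, duration=DURATION, freqs=FREQS):
    """Sum equal-amplitude sine waves at the given frequencies."""
    t = np.arange(int(fs * duration)) / fs
    return sum(np.sin(2 * np.pi * f * t) for f in freqs)

def log_spectrum(frame, fs=FS):
    """Log magnitude spectrum (dB) of one Hamming-windowed frame."""
    windowed = frame * np.hamming(len(frame))
    mag = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    return freqs, 20 * np.log10(mag + 1e-12)

signal = synth_vowel()
freqs, spec_db = log_spectrum(signal[:256])   # one 32 ms frame at 8 kHz

# Find local maxima of the log spectrum, then keep the three strongest.
is_peak = (spec_db[1:-1] > spec_db[:-2]) & (spec_db[1:-1] > spec_db[2:])
peak_idx = np.where(is_peak)[0] + 1
top3 = peak_idx[np.argsort(spec_db[peak_idx])[-3:]]
peak_freqs = np.sort(freqs[top3])
print(peak_freqs)
```

With these assumed settings the bin spacing is 31.25 Hz, so each of the three frequencies falls exactly on an FFT bin. Other frame lengths will spread each peak across neighboring bins.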
-
Generate a bandpass filter with the above center frequencies
(500, 1500, and 2500 Hz) and 200 Hz bandwidths. Process Gaussian
white noise through this filter. Perform feature analysis on this
data using the two front-ends above, and show that the results
make sense.
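One possible way to build the filtered-noise signal is sketched here. It reads the filter as three parallel passbands whose outputs are summed, which is an assumption about the intended design. The windowed-sinc FIR method, the 255 taps, the 8 kHz sampling rate, and the function name are likewise illustrative choices. The final lines compare average power inside and outside the passbands as a quick check that the noise has been shaped as expected.

```python
import numpy as np

FS = 8000                            # assumed sampling rate in Hz
CENTERS = (500.0, 1500.0, 2500.0)    # center frequencies from the problem
BANDWIDTH = 200.0                    # bandwidth of each passband in Hz

def bandpass_fir(center, bandwidth, fs, n_taps=255):
    """Windowed-sinc FIR bandpass, built as a difference of two low-pass filters."""
    n = np.arange(n_taps) - (n_taps - 1) / 2

    def lowpass(cutoff):
        fc = cutoff / fs                       # cutoff in cycles per sample
        return 2 * fc * np.sinc(2 * fc * n)    # ideal low-pass impulse response

    lo, hi = center - bandwidth / 2, center + bandwidth / 2
    return (lowpass(hi) - lowpass(lo)) * np.hamming(n_taps)

# Sum the three bandpass responses into a single multi-band filter.
taps = sum(bandpass_fir(c, BANDWIDTH, FS) for c in CENTERS)

rng = np.random.default_rng(0)
noise = rng.standard_normal(4 * FS)            # 4 s of Gaussian white noise
filtered = np.convolve(noise, taps, mode="same")

# Averaged periodogram over non-overlapping 256-sample frames.
frames = filtered[: len(filtered) // 256 * 256].reshape(-1, 256)
psd = np.mean(np.abs(np.fft.rfft(frames * np.hamming(256), axis=1)) ** 2, axis=0)
freqs = np.fft.rfftfreq(256, d=1.0 / FS)

in_band = np.zeros(freqs.shape, dtype=bool)
for c in CENTERS:
    in_band |= np.abs(freqs - c) <= BANDWIDTH / 2
band_ratio = psd[in_band].mean() / psd[~in_band].mean()
print(band_ratio)
```

Averaging over many frames matters here: a single frame of noise gives a very ragged spectrum, so comparisons against predicted values are more meaningful on the averaged estimate.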
-
Compute MFCC features on a typical set of SWB files, and determine
whether the diagonal covariance matrix approximation used in
variance-weighting is justified.
Details on the locations of data, etc., will follow shortly.
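One way to quantify how close a covariance matrix is to diagonal is to normalize it to a correlation matrix and look at the size of the off-diagonal entries. The sketch below is a hedged illustration of that idea. The function names are hypothetical, and the two feature matrices are synthetic stand-ins (independent and deliberately mixed Gaussian data), not SWB features.

```python
import numpy as np

def correlation_matrix(features):
    """features: (n_frames, n_dims) array. Returns the (n_dims, n_dims) correlations."""
    cov = np.cov(features, rowvar=False)
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)

def mean_abs_offdiag(corr):
    """Average absolute correlation over all off-diagonal entries."""
    mask = ~np.eye(corr.shape[0], dtype=bool)
    return np.abs(corr[mask]).mean()

# Synthetic placeholder data: 5000 frames of 12-dimensional features.
rng = np.random.default_rng(0)
independent = rng.standard_normal((5000, 12))    # uncorrelated dimensions
mixing = rng.standard_normal((12, 12))
correlated = independent @ mixing                # dimensions mixed together

r_ind = mean_abs_offdiag(correlation_matrix(independent))
r_cor = mean_abs_offdiag(correlation_matrix(correlated))
print(r_ind, r_cor)
```

A value near zero suggests that a diagonal approximation loses little, while a large value indicates substantial correlation between feature dimensions. Where to draw the line is a judgment call that the analysis should argue for.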