homework #5


Linear Prediction Analysis


EE 8993: Fundamentals of Speech Recognition


March 11, 1999


submitted to:

Dr. Joseph Picone


submitted by:

Suresh Balakrishnama


Institute for Signal and Information Processing
Department of Electrical and Computer Engineering
Mississippi State University
MS 39762, USA
Email: balakris@isip.msstate.edu
1.  Introduction
Linear prediction analysis has been among the most popular methods for extracting spectral information from speech, and it is an important method for finding the shape of a spectrum. In linear prediction, the signal is modeled as a linear combination of its past values and of the present and past values of a hypothetical input to a system whose output is the given signal. Each continuous-time signal $s(t)$ is sampled to obtain a discrete-time signal $s(nT)$, also known as a time series, where $n$ is an integer variable and $T$ is the sampling interval.
2.  Problem Description 
Implement a capability to plot a signal's FFT spectrum, and the gain-matched spectrum produced by a linear prediction model. The tool must read speech from a binary file (assume 16 bit linear sampling), and allow the user to select the following:
	sample frequency of the signal
	preemphasis constant
	window duration in secs
	center time for the window in secs
	a rectangular or Hamming window
	the linear prediction order

You can approach this problem in one of two ways:
	implement the signal processing in matlab and figure out how to read binary files into matlab
	implement everything in C++ (preferred)

In the latter case, the interface should be something like this:
        my_prog 8000.0 0.95 0.03 28.7 1 10 foo.raw | xmgr -source stdin
The net result should be a plot of the signal spectrum computed using the following parameters:
     fs = argv[1] (8 kHz)
     preemphasis = argv[2]
     window_duration = argv[3]
     center time of the window = argv[4]
     hamming window = argv[5] (1 = yes)
     lp_order = argv[6]

and plotted on a log amplitude vs. linear frequency scale. The spectrum of the corresponding linear prediction model should be plotted as well. Xmgr accepts multiple sets of data, so simply print your xy points for both plots to stdout, with the second set separated by a newline, and xmgr will take care of the rest.
You can use a DFT to compute the spectrum of the signal, or a zero-stuffed FFT. The important thing is to use only window_duration × fs samples of real data (note that window_duration is specified in secs); for example, a 0.03 sec window at 8 kHz contains 240 samples.
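The window selection described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names `read_raw16` and `extract_window` are hypothetical, the raw file is assumed to use the host machine's byte order, and window positions that fall outside the file are assumed to be zero.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Read a headerless file of 16-bit linear samples.
// Assumption: the file's byte order matches the host machine's.
std::vector<std::int16_t> read_raw16(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    std::vector<std::int16_t> samples;
    std::int16_t s = 0;
    while (in.read(reinterpret_cast<char*>(&s), sizeof s)) samples.push_back(s);
    return samples;
}

// Cut out round(duration_s * fs) samples centred on center_s seconds.
// Positions outside the signal are left at zero (an assumption; the
// assignment does not say how the edges should be handled).
std::vector<double> extract_window(const std::vector<std::int16_t>& signal,
                                   double fs, double center_s, double duration_s) {
    if (fs <= 0.0 || duration_s <= 0.0)
        throw std::invalid_argument("fs and duration must be positive");
    const long n_win = std::lround(duration_s * fs);
    const long start = std::lround(center_s * fs) - n_win / 2;
    const long n_sig = static_cast<long>(signal.size());
    std::vector<double> frame(static_cast<std::size_t>(n_win), 0.0);
    for (long i = 0; i < n_win; ++i) {
        const long idx = start + i;
        if (idx >= 0 && idx < n_sig) frame[i] = signal[idx];
    }
    return frame;
}
```

With fs = 8000 and a 0.03 sec window, `extract_window` returns 240 samples, which can then be zero-stuffed to whatever transform length is convenient.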
For example,
my_prog.exe 8000.0 0.95 0.03 3.0 12 | xmgr -source stdin
should produce a signal and LP model spectrum for a 30 msec window of the signal centered at 3 secs. The LP analysis will be of order 12. A preemphasis filter $1 - 0.95z^{-1}$ is applied to the data. For most of you, this should be a useful tool to have around. Feel free to pull the LP analysis software off the net. The main thing is to get the visualization component working, and to understand gain matching of the two spectra.
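The preemphasis and windowing steps might look like the sketch below. The function names are hypothetical, the sample before the start of the frame is assumed to be zero, and the Hamming window uses the common 0.54/0.46 definition; a rectangular window corresponds to skipping `apply_hamming`.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Preemphasis filter 1 - alpha * z^-1:  y[n] = x[n] - alpha * x[n-1].
// Assumption: x[-1] is taken to be zero.
std::vector<double> preemphasize(const std::vector<double>& x, double alpha) {
    std::vector<double> y(x.size());
    double prev = 0.0;
    for (std::size_t n = 0; n < x.size(); ++n) {
        y[n] = x[n] - alpha * prev;
        prev = x[n];
    }
    return y;
}

// Multiply the frame in place by a Hamming window
// w[n] = 0.54 - 0.46 * cos(2 * pi * n / (N - 1)).
void apply_hamming(std::vector<double>& x) {
    const std::size_t N = x.size();
    if (N < 2) return;  // a one-sample frame is left unchanged
    const double pi = std::acos(-1.0);
    for (std::size_t n = 0; n < N; ++n) {
        x[n] *= 0.54 - 0.46 * std::cos(2.0 * pi * static_cast<double>(n) /
                                       static_cast<double>(N - 1));
    }
}
```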
The resulting plots will typically have about a 60 dB dynamic range for studio quality data.
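Gain matching can be illustrated with a short sketch. It assumes an all-pole model $H(z) = G / A(z)$ with $A(z) = 1 + \sum_{k=1}^{p} a_k z^{-k}$ (the sign convention of Makhoul's tutorial [1]) and takes the squared gain $G^2$ as an input; in that tutorial $G^2$ is set to the final prediction error, which makes the model spectrum directly comparable to the power spectrum of the same windowed frame. The function name, the bin layout, and the numerical floor are hypothetical choices.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Power spectrum in dB of the model G^2 / |A(e^{jw})|^2 at num_bins
// frequencies evenly spaced from 0 to fs/2 (w from 0 to pi).
// a[0] must be 1 and a[1..p] are the predictor coefficients of
// A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p.
std::vector<double> lp_spectrum_db(const std::vector<double>& a, double gain_sq,
                                   std::size_t num_bins) {
    const double pi = std::acos(-1.0);
    const double floor_power = 1e-12;  // hypothetical floor, keeps log10 finite
    std::vector<double> db(num_bins);
    for (std::size_t m = 0; m < num_bins; ++m) {
        const double w = (num_bins > 1)
            ? pi * static_cast<double>(m) / static_cast<double>(num_bins - 1)
            : 0.0;
        std::complex<double> A(0.0, 0.0);
        for (std::size_t k = 0; k < a.size(); ++k)
            A += a[k] * std::polar(1.0, -w * static_cast<double>(k));
        const double power = gain_sq / std::max(std::norm(A), floor_power);
        db[m] = 10.0 * std::log10(std::max(power, floor_power));
    }
    return db;
}
```

Printing these values against frequency as a second data set, after the signal spectrum in dB, gives the two curves on the same log amplitude scale.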
3.  Description of Algorithms
One of the most powerful models currently in use is that in which a signal $s_n$ is considered to be the output of some system with some unknown input $u_n$ such that the following relation holds:
$$s_n = -\sum_{k=1}^{p} a_k s_{n-k} + G \sum_{l=0}^{q} b_l u_{n-l} \tag{1}$$
where $b_0 = 1$ and $a_k$, $1 \le k \le p$, $b_l$, $1 \le l \le q$, and the gain $G$ are the parameters of the hypothesized system. The output $s_n$ in Equation (1) is a linear function of past outputs and of present and past inputs. In other words, the signal $s_n$ is predictable from linear combinations of past outputs and inputs, which is why the approach is called linear prediction. When the input is unknown, the predicted value is a linear combination of previous values of the signal only. The prediction coefficients are chosen so as to minimize the linear prediction error. For a speech signal $s_n$, the predicted value is given by
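$$\tilde{s}_n = -\sum_{k=1}^{p} a_k s_{n-k} \tag{2}$$
and the prediction error is given by
$$e_n = s_n - \tilde{s}_n = s_n + \sum_{k=1}^{p} a_k s_{n-k} \tag{3}$$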
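According to Parseval's theorem, the energy of the error is the same in the time domain and in the frequency domain, so an error that is small in the time domain is also small in the frequency domain. The error is minimized by finding the optimal values of the coefficients $a_k$. To explain the computation of $a_k$, consider the short-time prediction error, summed over the analysis window:
$$E = \sum_{n} e_n^2 \tag{4}$$
$$E = \sum_{n} \left( s_n + \sum_{k=1}^{p} a_k s_{n-k} \right)^2 \tag{5}$$
The error can be minimized with respect to $a_i$ for each $1 \le i \le p$ by differentiating $E$ and setting the result equal to zero:
$$\frac{\partial E}{\partial a_i} = 2 \sum_{n} \left( s_n + \sum_{k=1}^{p} a_k s_{n-k} \right) s_{n-i} = 0, \quad 1 \le i \le p \tag{6}$$
Rearranging terms, we get
$$\sum_{k=1}^{p} a_k \sum_{n} s_{n-k} s_{n-i} = -\sum_{n} s_n s_{n-i}, \quad 1 \le i \le p \tag{7}$$
Equation (7) is known as the linear prediction equation, and the $a_k$ are known as linear prediction coefficients or predictor coefficients.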
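In the autocorrelation method the frame is treated as zero outside the analysis window, so the sums of lagged products in the linear prediction equations depend only on the lag difference and can be computed once as autocorrelation values $R(i) = \sum_{n} s_n s_{n-i}$. A minimal sketch, assuming an already windowed frame and a hypothetical function name:

```cpp
#include <cstddef>
#include <vector>

// Unnormalized short-time autocorrelation R[0..p] of a windowed frame:
// R[i] = sum over n of s[n] * s[n - i], with s taken as zero outside the frame.
std::vector<double> autocorrelation(const std::vector<double>& s, std::size_t p) {
    std::vector<double> R(p + 1, 0.0);
    for (std::size_t i = 0; i <= p; ++i)
        for (std::size_t n = i; n < s.size(); ++n)
            R[i] += s[n] * s[n - i];
    return R;
}
```

For the frame {1, 2, 3} and p = 2 this gives R = {14, 8, 3}.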
Levinson-Durbin's Recursion Method
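The L-D recursion is a recursive-in-model-order solution of the autocorrelation equations. The solution for the desired order-$p$ model is built successively from lower-order models, beginning with the 0th-order predictor, which has no predictor coefficients at all. This method uses the autocorrelation coefficients to determine the prediction coefficients and the reflection coefficients. The prediction coefficients can be computed using the following equations:

$$\begin{aligned}
k_i &= -\frac{1}{E_{i-1}} \left[ R(i) + \sum_{j=1}^{i-1} a_j^{(i-1)} R(i-j) \right] \\
a_i^{(i)} &= k_i \\
a_j^{(i)} &= a_j^{(i-1)} + k_i \, a_{i-j}^{(i-1)}, \quad 1 \le j \le i-1 \\
E_i &= \left( 1 - k_i^2 \right) E_{i-1}
\end{aligned} \tag{8}$$
with $E_0 = R(0)$ and $1 \le i \le p$, where the superscript $(i)$ indicates the current iteration, $(i-1)$ indicates the previous iteration, and $p$ is both the total number of iterations and the order of the prediction. $E_i$ is the error term, $R(i)$ is the autocorrelation coefficient, $k_i$ is the reflection coefficient, and $a_j^{(i)}$ is the predictor coefficient.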

DFT Method
As with the prediction coefficients, there are many algorithms for calculating the DFT coefficients. However, since the focus here is on the linear prediction task and not on the DFT, no fast implementation was used; the DFT is computed directly from its definition, Equation (9).

$$X(k) = \sum_{n=0}^{N-1} x(n) \, e^{-j 2 \pi k n / N} \tag{9}$$
with $0 \le k \le N-1$, where $k$ indicates the current iteration and $N$ is both the total number of iterations and the order of the DFT.
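A direct evaluation of the DFT definition might look like the following sketch. The function names and the small floor used before taking the logarithm are hypothetical; frames shorter than the transform length are treated as zero-stuffed.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Direct O(N^2) DFT of a real frame. If the frame is shorter than N,
// the missing samples are treated as zeros (zero-stuffing).
std::vector<std::complex<double>> dft(const std::vector<double>& x, std::size_t N) {
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> X(N);
    for (std::size_t k = 0; k < N; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t n = 0; n < x.size() && n < N; ++n) {
            const double angle = -2.0 * pi * static_cast<double>(k) *
                                 static_cast<double>(n) / static_cast<double>(N);
            sum += x[n] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        X[k] = sum;
    }
    return X;
}

// Power spectrum in dB, 10 * log10(|X(k)|^2), with a small floor
// (a hypothetical choice) so that empty bins do not produce -infinity.
std::vector<double> power_db(const std::vector<std::complex<double>>& X) {
    std::vector<double> db(X.size());
    for (std::size_t k = 0; k < X.size(); ++k)
        db[k] = 10.0 * std::log10(std::max(std::norm(X[k]), 1e-12));
    return db;
}
```

For a real signal only bins 0 through N/2 need to be plotted, with bin k corresponding to the frequency k * fs / N.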


4.  Results
Figure 1.  Plot showing DFT spectrum of speech file 


5.  Conclusions
The DFT spectrum was obtained, but computing the LP-derived spectrum proved difficult, so a plot comparing the LP-derived spectrum with the DFT spectrum could not be produced. Consequently, the error between the LP-derived and DFT spectra could not be analyzed. In theory, however, this error becomes smaller as the LP model order increases.
6.  References
[1]	J. Makhoul, "Linear Prediction: A Tutorial Review", Proceedings of the IEEE, Vol. 63, April 1975.
[2]	J. Picone, "ECE 8993: Fundamentals of Speech Recognition Lecture Notes", Mississippi State University, May 1998.