--===----=====------=======--------=========--------=======------=====----===--

HW #8: Perform a simple language modeling experiment. For the large text
corpus provided, do the following:

1. Generate a histogram of word unigrams, bigrams, and trigrams. Compute
   the entropy of each distribution and discuss the nature of the
   distributions. Plot the OOV rate as a function of the N most frequent
   words. (A counting and entropy sketch appears after the numbered items.)

2. Select the most frequent 1000 words. Compute the trigram coverage
   using this vocabulary.

3. Partition the data into a kept set and a held-out set. Use 80% of the
   data for the kept set. Build a trigram LM for this data. Compute the
   coverage of this trigram LM for the held-out set. Repeat this for
   three more partitions of the data.

4. From the results of no. 3, suggest a reasonable interpolated LM.
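For the counting and entropy portion of no. 1, a minimal Matlab sketch is
shown below. The filename corpus.txt is a placeholder, and the OOV plot is
not shown - treat this as a starting point, not a complete solution.

  % Count unigrams, bigrams, and trigrams and compute the entropy of each
  % distribution. Assumes a whitespace-delimited text file (placeholder name).
  fid = fopen('corpus.txt', 'r');
  tokens = textscan(fid, '%s');
  fclose(fid);
  w = tokens{1};                          % column cell array of words
  n = numel(w);

  % form bigram and trigram strings by joining adjacent words
  big = strcat(w(1:n-1), {' '}, w(2:n));
  tri = strcat(w(1:n-2), {' '}, w(2:n-1), {' '}, w(3:n));

  for grams = {w, big, tri}
    g = grams{1};
    [u, ~, idx] = unique(g);              % distinct n-grams
    counts = accumarray(idx, 1);          % histogram of n-gram counts
    p = counts / sum(counts);             % relative frequencies
    H = -sum(p .* log2(p));               % entropy in bits
    fprintf('%d distinct n-grams, entropy = %.2f bits\n', numel(u), H);
  end

The histograms can be plotted from the sorted counts vector, and the OOV
curve comes from accumulating the sorted unigram counts over the top N words.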
A completed assignment will include a report.

Target due date: 5/01/98

--===----=====------=======--------=========--------=======------=====----===--

HW #7: Using the ISIP recognizer, build a system that recognizes spoken
telephone numbers. Your system must accommodate 4, 7, and 10 digit strings.
You must use as many constraints on telephone numbers as you can.

For acoustic models, use the ISIP context-dependent phone models currently
packaged as part of the demo. You will need to build your own language
model, and to find a way to interface audio to the system.

A completed assignment will include a report and a (non)real-time demo.

Target due date: 4/15/98

--===----=====------=======--------=========--------=======------=====----===--

HW #6: Using the discrete HMM tool, train the best model you can for the
following data sequence:

HHTTTHHHHTTTTTHHHHHHTTTTTTTHHHHHHHHTTTTTTTTTHHHHHHHHTTTTTTTH
HHHHHTTTTTHHHHTTTHHTHHTTHHHHHHHHHHTHHHHHTHHHHHHHHHHTHHHTHHHH
HHHHHHHHHTHHHHHHHHHTTHHHHHHTHHHTHHHHHTHHHHHHHHTHHHHHHTHTHHHH
HHHHHHHTTHHHHHHHHHHHHHHHTHHHHHTTHHHHHHHHHHTTHHHHHHHHHHHHHTTH
HHHHHHHHHHHTTHHHTHHHHHHHHHHHTHHHHHHTTHHHHHTHHHTHHHHHTHTHHHHH
THTHHHHHHHHHTHTHHHHHHTHTHHHHHTHTTHHHHHHHHHHHHHTHTHHHHHHHHHHH
HTHHHHHHHHHHHHHHHTHHHHHHHHHHHHHHHTHHHHHHHHHHTTHHHHHHHTHHHHHH
HHHHHTHHHHTHHHHHHHHHHTHHHHHHHHHHHHTTHHHHTTHHHHHHHHHHHHHHTTHH

With this model, compute the probability of the following sequences:

HTHTHTHTHTHTHTHTHTHT
HHHHHHHHHHHHHHHHHHHH
TTTTTTTTTTTTTTTTTTTT

Explain the results. (A sketch of the probability computation follows.)
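Once the model is trained, the probability of a test sequence comes from
the forward algorithm. A minimal Matlab sketch for a two-state discrete
HMM is below; the parameter values are placeholders, not a trained model.

  % Forward algorithm for a discrete HMM. A: state transition matrix,
  % B: emission probabilities (rows = states, columns = symbols, 1 = H,
  % 2 = T), pi0: initial state distribution. These values are placeholders,
  % not a trained model - substitute the parameters your HMM tool produces.
  A   = [0.9 0.1; 0.2 0.8];
  B   = [0.8 0.2; 0.3 0.7];
  pi0 = [0.6; 0.4];

  seq = 'HTHTHTHTHTHTHTHTHTHT';
  obs = (seq == 'T') + 1;                 % map H -> 1, T -> 2

  alpha = pi0 .* B(:, obs(1));            % initialization
  for t = 2:length(obs)
    alpha = (A' * alpha) .* B(:, obs(t)); % induction
  end
  fprintf('log P(O | model) = %g\n', log(sum(alpha)));  % log, since tiny

Run the same computation for the other two test sequences and compare the
log probabilities against the structure of the training data.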
Target due date: 3/15/98

--===----=====------=======--------=========--------=======------=====----===--

HW #5: Implement a capability to plot a signal's FFT spectrum, and the
gain-matched spectrum produced by a linear prediction model. The tool must
read speech from a binary file (assume 16-bit linear sampling), and allow
the user to select the following:

- sample frequency of the signal
- preemphasis constant
- window duration in secs
- center time for the window in secs
- a rectangular or hamming window
- the linear prediction order

You can approach this problem one of two ways:

- implement the signal processing in matlab and figure out how to
  manipulate binary files into matlab (a sketch of this route appears at
  the end of this assignment)
- implement everything in C++ (preferred)

In the latter case, the interface should be something like this:

my_prog 8000.0 0.95 0.03 28.7 1 10 foo.raw | xmgr -source stdin

The net result should be a plot of the signal spectrum computed using the
following parameters:

sample frequency       = argv[1] (8 kHz)
preemphasis            = argv[2]
window duration        = argv[3]
center time of window  = argv[4]
window type            = argv[5] (1 = hamming)
lp order               = argv[6]
input file             = argv[7]

and plotted on a log amplitude vs. linear frequency scale. The spectrum of
the corresponding linear prediction model should be plotted as well.

Xmgr accepts multiple sets of data, so simply print your xy points for
both plots to stdout, with the second set separated by a newline, and xmgr
will take care of the rest. You can use a DFT to compute the spectrum of
the signal, or a zero-padded FFT. The important thing is to use only the
number of samples of real data corresponding to the window duration (note
that the window duration is specified in secs). For example,

my_prog 8000.0 0.95 0.03 3.0 1 12 foo.raw | xmgr -source stdin

should produce a signal and LP model spectrum for a 30 msec hamming window
of the signal centered at 3 secs. The LP analysis will be of order 12, and
a preemphasis filter 1 - 0.95z^-1 is applied to the data.

For most of you, this should be a useful tool to have around. Feel free to
pull the LP analysis software off of the net. The main thing is to get the
visualization component working - and to understand gain matching of the
two spectra. The resulting plots will typically have about a 60 dB dynamic
range for studio quality data.
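If you take the Matlab route, a minimal sketch of the core computation is
shown below. It assumes the Signal Processing Toolbox (hamming, lpc); the
filename and parameter values are examples only.

  % FFT spectrum vs. gain-matched LP model spectrum for one analysis window.
  fs = 8000; pre = 0.95; dur = 0.030; center = 3.0; order = 12;

  fid = fopen('foo.raw', 'r');            % 16-bit linear raw samples
  x = fread(fid, inf, 'int16');
  fclose(fid);

  x = filter([1 -pre], 1, x);             % preemphasis: 1 - 0.95z^-1
  nwin = round(dur * fs);                 % window length in samples
  n0 = round(center * fs) - floor(nwin/2);
  seg = x(n0 : n0 + nwin - 1) .* hamming(nwin);

  nfft = 1024;                            % zero-padded FFT
  X = abs(fft(seg, nfft)).^2;             % signal power spectrum

  [a, g] = lpc(seg, order);               % LP polynomial and error variance
  H = g ./ abs(fft(a, nfft)).^2;          % LP model power spectrum; using
                                          % the error variance g as the gain
                                          % gives an approximate gain match

  f = (0:nfft/2 - 1) * fs / nfft;
  plot(f, 10*log10(X(1:nfft/2)), f, 10*log10(H(1:nfft/2)));
  xlabel('frequency (Hz)'); ylabel('amplitude (dB)');

To produce the xmgr input described above instead of a Matlab figure,
print the (frequency, dB) pairs for both curves to stdout, with the second
set separated by a newline.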
Target due date: 2/29/98

--===----=====------=======--------=========--------=======------=====----===--

HW #4: Implement the algorithm described in class to compute the
signal-to-noise ratio using a histogram of the energy distribution.
Validate this design by:

1. Processing the four files below:

   ece_8993_speech/homework/1996/data/710_b_8k.raw
   ece_8993_speech/homework/1996/data/710_s_8k.raw
   ece_8993_speech/homework/1996/data/711_g_8k.raw
   ece_8993_speech/homework/1996/data/712_f_8k.raw

   and comparing your answers to the results from the class of 1996.

   First, plot the average SNR of the four files for the following
   conditions (do a scatter plot):

   - frame duration of 5, 10, 20, and 40 msec
   - window duration of 10, 20, 30, 60 msec

   Use a signal threshold of 80% and a noise threshold of 20%.

   Next, for the best set of parameters above, plot the average SNR as a
   function of the thresholds:

   - signal threshold 80%, 85%, 90%, 95%
   - noise threshold 10%, 15%, 20%, 25%

2. Processing a large chunk of Switchboard:

   /isip/d02/switchboard/data/...

   See Janna Shaffer for an explanation of the format of these files.
   Process only the left channel (the first sample of the two samples
   that make up a stereo sample).

(Have fun with Matlab on this one...)

Target due date: 2/6/98

--===----=====------=======--------=========--------=======------=====----===--

HW #3: Repeat problem no. 2 using linear discriminant analysis (LDA) as
the classification technique. See the 1997 DSP course project for a
mathematical description of LDA. Feel free to use the matlab code that is
available from that project. (A compact sketch of the class-independent
method appears after the notes below.)

For the class-dependent multi-transformation method:
-----------------------------------------------------

Math operations involved in LDA (HW #3):

Let N1 = first data set
    N2 = second data set
    N  = entire data (all sets merged)

Test vectors = [ x1 x2 x3 x4 x5 x6 x7
                 y1 y2 y3 y4 y5 y6 y7
                 z1 z2 z3 z4 z5 z6 z7 ]

Step 1: Find the means:

   Mu  = mean(N)
   Mu1 = mean(N1)
   Mu2 = mean(N2)

Step 2: Find the within-class and between-class covariances:

   cov0 = cov(N)
   cov1 = cov(N1)
   cov2 = cov(N2)

Step 3: Find the new feature matrices:

   new_f1 = inv(cov1) * cov0
   new_f2 = inv(cov2) * cov0

Step 4: Find the eigenvalues and eigenvectors:

   [eigen_vect_1, eigen_val_1] = eig(new_f1)
   [eigen_vect_2, eigen_val_2] = eig(new_f2)

Step 5: Sort by eigenvalue and keep the most significant eigenvectors,
giving [reduced_eig_vect_1] and [reduced_eig_vect_2].

   The eigenvalue in row 1 of eigen_val_1 corresponds to the eigenvector
   in column 1 of eigen_vect_1, and so on for the other rows and columns.
   To get the reduced eigenvector matrix:

   * Copy the eigenvalue matrix to a temporary matrix.
   * Find the largest eigenvalue, grab the corresponding column of the
     eigenvector matrix, and store it in a new matrix.
   * The new matrix consists of the significant eigenvectors. (Keeping
     only the significant eigenvectors avoids problems with singular
     matrices.)

Step 6: Transform the data into the new space. For each set (an i x j
matrix whose rows are feature vectors), accumulate the transformed rows
and average them:

   sum_T_11 = sum_T_12 = sum_T_13 = 0.0   (one accumulator per class;
                                           only two classes shown below)
   T_1 = reduced_eig_vect_1 * transpose(N1(i,:))
   T_11 = transpose(T_1)
   sum_T_11 = sum_T_11 + T_11
   mean_T_1 = sum_T_11 / (number of rows)

   T_2 = reduced_eig_vect_2 * transpose(N2(i,:))
   T_12 = transpose(T_2)
   sum_T_12 = sum_T_12 + T_12
   mean_T_2 = sum_T_12 / (number of rows)

Step 7: Transform the test vectors and calculate the Euclidean distances:

   D_1 = transpose(reduced_eig_vect_1 * transpose(x1)) - mean_T_1
   D_2 = transpose(reduced_eig_vect_2 * transpose(x1)) - mean_T_2

Step 8: Calculate the magnitude of each distance. If

   D_1 = [a1 a2 a3 a4 a5 a6 a7]
   D_2 = [b1 b2 b3 b4 b5 b6 b7]

   then

   dist_1 = sqrt(a1*a1 + a2*a2 + a3*a3 + ...)
   dist_2 = sqrt(b1*b1 + b2*b2 + b3*b3 + ...)

Step 9: Compare the distances dist_1, dist_2, ... The minimum distance
assigns the test vector to that class.

For the class-independent single-transformation method:
--------------------------------------------------------

Math operations involved in this case (HW #3). Do this for three types
of data:

   Case 1: data used in HW #2
   Case 2: /u0/hamaker/text/1998/ece_8993_speech/homework/homework_03/case2_set_2.dat
   Case 3: /u0/hamaker/text/1998/ece_8993_speech/homework/homework_03/case3_set_3.dat

Let N1, N2, N and the test vectors be defined as above, and find the means
and covariances as in Steps 1 and 2 above.

Step 3: Find the new feature matrix - here there is a single transform:

   new_f1 = inv(cov1 + cov2) * transpose(cov0)

Step 4: Find the eigenvalues and eigenvectors:

   [eigen_vect_1, eigen_val_1] = eig(new_f1)

Step 5: Sort by eigenvalue and keep the most significant eigenvectors,
giving [reduced_eig_vect_1], exactly as in Step 5 above.

Step 6: Transform the data into the new space as in Step 6 above, but
using the single reduced eigenvector matrix for both sets:

   T_1 = reduced_eig_vect_1 * transpose(N1(i,:))   ->  mean_T_1
   T_2 = reduced_eig_vect_1 * transpose(N2(i,:))   ->  mean_T_2

Step 7: Transform the test vectors and calculate the Euclidean distances:

   D_1 = transpose(reduced_eig_vect_1 * transpose(x)) - mean_T_1
   D_2 = transpose(reduced_eig_vect_1 * transpose(x)) - mean_T_2

   (where x is the first test vector; do the same for the y and z test
   vectors as well)

Steps 8 and 9 are the same as above: compute dist_1, dist_2, ... and
assign the test vector to the class with the minimum distance.
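As a sanity check on the notes above, here is a compact Matlab sketch of
the class-independent method. Variable names follow the steps; the random
training data is a placeholder so the sketch runs stand-alone, and Step 3
uses the inv(cov1 + cov2) * transpose(cov0) form exactly as given.

  % Class-independent LDA sketch following Steps 1-9 above. N1, N2 hold
  % one feature vector per row; x is one test vector. The random data is
  % a placeholder so the sketch runs stand-alone.
  N1 = randn(100, 2) + 2;
  N2 = randn(100, 2) - 2;
  x  = [0.5 0.5];

  N    = [N1; N2];
  cov0 = cov(N);                            % Step 2
  cov1 = cov(N1);
  cov2 = cov(N2);

  new_f1 = inv(cov1 + cov2) * cov0';        % Step 3, as given in the notes
  [V, D] = eig(new_f1);                     % Step 4
  [vals, ord] = sort(diag(D), 'descend');   % Step 5: most significant first
  V = V(:, ord(1));                         % keep the leading eigenvector

  mean_T_1 = mean(N1 * V);                  % Step 6: the per-row loop
  mean_T_2 = mean(N2 * V);                  % collapses to a matrix multiply

  dist_1 = norm(x * V - mean_T_1);          % Steps 7-8
  dist_2 = norm(x * V - mean_T_2);

  if dist_1 < dist_2, c = 1; else c = 2; end    % Step 9
  fprintf('test vector assigned to class %d\n', c);

Here the projection is written x * V rather than the notes'
transpose(reduced_eig_vect_1 * transpose(x)); the two are transposes of
each other and give the same distances.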
Target due date: 1/30/98

--===----=====------=======--------=========--------=======------=====----===--

HW #2:

1. Define two sets such that the distributions approximate an ellipse and
   a pear shape. The elliptical distribution should stretch from
   lower-left to upper-right at about a 45 degree angle and should be
   longer in that direction than it is wide. Set 2 should look like a
   pear with the stem pointing at -45 degrees. Set 1 should have a mean
   of approximately m1 = (-2, 2) and Set 2 should have a mean of
   approximately m2 = (2, -2). Each set contains 100 points. Plot both
   sets in a single figure.

2. Define a test set of four points:

   x1 = (-1, -1); x2 = (0, 0); x3 = (1/2, 1/2); x4 = (1/2, -1/2);

3. Compute the Euclidean distances:

   d(x1, m1), d(x1, m2); d(x2, m1), d(x2, m2);
   d(x3, m1), d(x3, m2); d(x4, m1), d(x4, m2);

4. Classify each sample point as a member of set 1 or set 2 based on
   minimum Euclidean distance.

5. Draw the decision region on the plot containing set 1 and set 2.

6. Compute the whitening transform for each set, given as:

   T_n = sqrt(inv(E_val)) * transpose(E_vec),

   where E_val and E_vec are the eigenvalues and eigenvectors,
   respectively, of the covariance matrix for set n. (A sketch of steps
   6, 11, and 12 appears at the end of this file.)

7. Transform each set into the transformed space by using its respective
   transform, and plot these sets.

8. Recompute the means of these transformed sets:

   m_xform(n) = mean(T_n * set n)

9. Transform the sample points by T_n and compute the distance from
   m_xform(n) for all `n'. Reclassify the sample points based on minimum
   distance.

10. Draw the new decision region in the plot containing the untransformed
    sets 1 and 2.

11. Compute the covariance of the transformed data for both set 1 and
    set 2 (these should be identity matrices).

12. Plot T_n(x - m_n) for all x in set `n', for n = 1, 2. This should
    approximate a circle.

Target due date: 1/23/98

--===----=====------=======--------=========--------=======------=====----===--

HW #1:

Q1: What is the difference between a random vector and a random process?

Q2: What does wide-sense stationary mean? Strict-sense stationary?

Q3: What does it mean to have an ergodic random process?

Q4: How does this influence our signal processing algorithms?

Target due date: 1/16/98
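Referring back to HW #2 above (see the note at step 6), a minimal Matlab
sketch of the whitening transform in steps 6, 11, and 12. The elliptical
placeholder data stands in for the sets you generate.

  % Whitening-transform sketch for HW #2, steps 6, 11, and 12.
  theta = pi/4;
  R = [cos(theta) -sin(theta); sin(theta) cos(theta)];  % 45 degree rotation
  set1 = (R * diag([2 0.5]) * randn(2, 100))' + repmat([-2 2], 100, 1);

  C = cov(set1);
  [E_vec, E_val] = eig(C);                % eigenvectors and eigenvalues
  T1 = sqrt(inv(E_val)) * E_vec';         % step 6: whitening transform

  % step 12: apply T_n(x - m_n) to every point in the set
  w = (T1 * (set1 - repmat(mean(set1), 100, 1))')';

  disp(cov(w));                           % step 11: should be ~ identity
  plot(w(:,1), w(:,2), '.'); axis equal;  % should approximate a circle

The same T_n applied to the test points gives the reclassification called
for in step 9.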