Figure 1: An example of a speech signal.
Figure 2: Energy histogram of the signal given in Figure 1.
Figure 3: Cdf obtained from the pdf given in Figure 2.
Figure 4: SNR as a function of window and frame size (for file 710_b_8k.raw).
Figure 5: SNR as a function of window and frame size (for file 710_s_8k.raw).
Figure 6: SNR as a function of window and frame size (for file 711_f_8k.raw).
Figure 7: SNR as a function of window and frame size (for file 712_g_8k.raw).
Figure 8: SNR as a function of noise and signal thresholds (20/30 msec frame/window) for file 710_b_8k.raw.
Figure 9: SNR as a function of noise and signal thresholds (20/30 msec frame/window) for file 710_s_8k.raw.
Figure 10: SNR as a function of noise and signal thresholds (20/30 msec frame/window) for file 711_f_8k.raw.
Figure 11: SNR as a function of noise and signal thresholds (20/30 msec frame/window) for file 712_g_8k.raw.

COMPUTING SIGNAL-TO-NOISE RATIO (SNR) USING HISTOGRAM OF ENERGY DISTRIBUTION

Program #4
EE 8993: Speech Processing
Audrey Le
Professor: Dr. Joseph Picone
June 13, 1998

1. PROBLEM DESCRIPTION

In this project we implement an algorithm that calculates the signal-to-noise ratio (SNR) using the histogram of energy distribution method. Several sets of experiments will be performed, and the results of these experiments will be compared to those obtained by the class of 1996.

The speech files used in the experiments consist of one-channel and two-channel data. Both types of data are 16-bit linear data sampled at 8000 Hz. The following files are used in the experiments.

Filename        Type
710_b_8k.raw    one-channel
710_s_8k.raw    one-channel
711_f_8k.raw    one-channel
712_g_8k.raw    one-channel
sw2001.raw      two-channel

Table 1: Files that are used in the experiments.
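As an illustration of the data format, the following minimal sketch decodes headerless 16-bit linear data into one list of samples per channel. The function names `decode_pcm16` and `read_raw` are hypothetical, and little-endian byte order is an assumption, since the report does not state the byte order (pass ">" for big-endian data). Two-channel files are assumed to interleave one sample per channel in turn.

```python
import struct


def decode_pcm16(raw_bytes, num_channels=1, byte_order="<"):
    """Decode headerless 16-bit linear PCM into one list of samples per channel.

    byte_order is "<" (little-endian, assumed here) or ">" (big-endian).
    """
    bytes_per_frame = 2 * num_channels
    # Drop any trailing partial sample frame so the buffer unpacks cleanly.
    usable = len(raw_bytes) - len(raw_bytes) % bytes_per_frame
    count = usable // 2
    samples = struct.unpack(f"{byte_order}{count}h", raw_bytes[:usable])
    # Interleaved layout: sample k belongs to channel k % num_channels.
    return [list(samples[ch::num_channels]) for ch in range(num_channels)]


def read_raw(path, num_channels=1, byte_order="<"):
    """Read a raw file from disk and split it into channels."""
    with open(path, "rb") as f:
        return decode_pcm16(f.read(), num_channels, byte_order)
```

Under these assumptions, a one-channel file would be read with `read_raw("710_b_8k.raw")` and the Switchboard file with `read_raw("sw2001.raw", num_channels=2)`, which returns a list with one entry per channel.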
The one-channel data can be obtained from www.isip.msstate.edu/resources/courses/ece_8993_speech/homework/1996/data. The two-channel data is located at isip/d00/switchboard/data/20/2001. The one-channel data has the following format:

    <chan 0 byte 0> <chan 0 byte 1> etc...

while the two-channel data uses an interleaved format:

    <chan 0 byte 0> <chan 0 byte 1> <chan 1 byte 0> <chan 1 byte 1> etc...

The first set of experiments calculates the SNR of the above raw speech files as a function of the frame and window durations, using signal and noise thresholds of 80% and 20%, respectively. The frame and window durations used are given below:

- frame duration of 5, 10, 20, and 40 msec
- window duration of 10, 20, 30, and 60 msec

In the next set of experiments, the SNR is calculated as a function of the signal and noise thresholds using the best set of parameters obtained from the first experiment. The thresholds are given as follows:

- signal threshold of 80%, 85%, 90%, and 95%
- noise threshold of 10%, 15%, 20%, and 25%

In the third set of experiments, the SNR is calculated for a chunk of Switchboard data using the best set of parameters obtained from the first and second experiments. The Switchboard file chosen for this experiment is sw2001.raw.

2. ALGORITHM

The signal-to-noise ratio, abbreviated SNR, is a measure of signal quality. It is defined as the ratio of the energy of the desired signal to the energy of the noise. There are many variations of the signal-to-noise calculation. In this project, we implement the calculation of SNR using the histogram of energy distribution method. In this method, the energy of the signal is calculated on a frame-by-frame basis, and a histogram of the computed energy values is compiled. Given a signal that looks like the one in Figure 1, the energy histogram might look like the one in Figure 2. The SNR can be defined as:

    SNR = 10 \log_{10} \left( \frac{E_s}{E_n} \right)    (1)

where $E_s$ is the signal energy and $E_n$ is the noise energy.
Equation (1) can be alternatively expressed as:

    (2)

If we can compute the cumulative distribution function (cdf), we can use Equation (2) to get the SNR. The cdf is obtained by accumulating the probability density function (pdf), and the pdf is obtained by normalizing the histogram. Returning to our example, the cdf of the energy of the signal given in Figure 1 might look like the one in Figure 3. From the cdf, we can "read off" the signal level $E_s$ and the noise level $E_n$.

The following steps summarize the calculation of the SNR using the histogram of energy distribution.

1. Get a frame of data and the window centered on it. If the window falls out of range, zero-pad the window.

2. Pre-emphasize the window of data using Equation (3).

    y(n) = x(n) - a \, x(n-1)    (3)

   where $y(n)$ is the pre-emphasized value, $x(n)$ is the signal value, $x(n-1)$ is the previous signal value, and $a$ is the pre-emphasis constant. Pre-emphasis is used to cancel DC bias in the signal.

3. Apply a Hamming window to the window of data using Equation (4).

    x_w(n) = x(n) \left[ 0.54 - 0.46 \cos\left( \frac{2 \pi n}{N - 1} \right) \right], \quad 0 \le n \le N - 1    (4)

   where $x_w(n)$ is the Hamming-windowed value, $x(n)$ is the signal value, and $N$ is the total number of samples in the window.

4. Calculate the energy of the window using Equation (5). Save it to calculate the histogram later on.

    E = \sum_{n=0}^{N-1} x^2(n)    (5)

   where $E$ is the energy value of that window, $x(n)$ is the signal value, and $N$ is the total number of samples in the window.

5. Repeat steps 1-4 until the end of the file.

6. Compute the histogram of the energy distribution.

7. Compute the pdf from the histogram.

8. Compute the cdf from the pdf.

9. Compute the SNR using Equation (2).

3. EXPERIMENTAL RESULTS AND DISCUSSION

In this project, three sets of experiments were conducted. The results were compared with those obtained by the class of 1996.

In the first set of experiments, we wanted to determine whether the frame or window duration has any effect on the SNR. The SNR was calculated for the files given in Section 1 with the frame and window durations varied while the other parameters were kept constant.
The frame duration was varied from 5 msec to 40 msec, and the window duration was varied from 10 msec to 60 msec, with the noise and signal thresholds held at 20% and 80%, respectively. The results are given in Table 2-Table 17 and plotted in Figure 4-Figure 7. We can see that the frame duration does not have a major effect on the SNR values. Likewise, the window duration generally has little effect on the SNR values; the one notable exception is 710_s_8k.raw, whose SNR drops from about 31 dB to about 27 dB at a window of 60 msec (Table 9). However, the combination of these two parameters has a greater effect on the SNR values than either parameter alone. In addition, we can see that the files used for the experiments have varying degrees of signal quality: 710_s_8k.raw has the best signal quality of the four files tested and 711_f_8k.raw has the worst.

In the second set of experiments, we chose the best parameters obtained from the first experiment, that is, the set of parameters that gives the best SNR for the files tested, and determined whether the noise and signal thresholds have an effect on the SNR. A frame size of 20 msec and a window size of 30 msec gave the highest SNR for two of the four files tested (710_b_8k.raw and 710_s_8k.raw). These values were used while the noise and signal thresholds were varied from 0.10 to 0.25 and from 0.80 to 0.95, respectively. The results are given in Table 18-Table 21. We can see that, in general, the SNR increases as the noise and signal thresholds increase. This occurs because the threshold values that we have chosen have not reached the nominal noise and signal levels. However, this behavior does not hold in every case: for 710_s_8k.raw, the SNR at thresholds of 0.25 and 0.95 is lower than at 0.20 and 0.90 (Table 19). This indicates that there is not much signal energy above the 0.90 signal threshold to offset the increase in the noise threshold. Figure 8-Figure 11 show scatter plots of SNR as a function of the noise and signal thresholds.

In the third set of experiments, we evaluated the SNR of the Switchboard file sw2001.raw using the best parameters from the previous two experiments.
The file sw2001.raw was evaluated using a frame of 20 msec, a window of 30 msec, a noise threshold of 0.20, and a signal threshold of 0.90. The SNRs for the two channels were found to be 36.022926 dB and 30.418713 dB.

file            window (msec)   frame (msec)   SNR (dB)
710_b_8k.raw    10              5              16.244802
710_b_8k.raw    10              10             15.866564
710_b_8k.raw    10              20             15.450395
710_b_8k.raw    10              40             15.442480

Table 2: SNR as a function of frame duration with respect to a window of 10 msec for 710_b_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
710_b_8k.raw    20              5              16.414530
710_b_8k.raw    20              10             16.588209
710_b_8k.raw    20              20             16.439322
710_b_8k.raw    20              40             15.943925

Table 3: SNR as a function of frame duration with respect to a window of 20 msec for 710_b_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
710_b_8k.raw    30              5              16.596828
710_b_8k.raw    30              10             16.552950
710_b_8k.raw    30              20             16.691147
710_b_8k.raw    30              40             15.986185

Table 4: SNR as a function of frame duration with respect to a window of 30 msec for 710_b_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
710_b_8k.raw    60              5              16.460327
710_b_8k.raw    60              10             16.393894
710_b_8k.raw    60              20             16.294994
710_b_8k.raw    60              40             16.275549

Table 5: SNR as a function of frame duration with respect to a window of 60 msec for 710_b_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
710_s_8k.raw    10              5              30.861126
710_s_8k.raw    10              10             31.472828
710_s_8k.raw    10              20             31.183504
710_s_8k.raw    10              40             30.692507

Table 6: SNR as a function of frame duration with respect to a window of 10 msec for 710_s_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
710_s_8k.raw    20              5              31.318970
710_s_8k.raw    20              10             31.384382
710_s_8k.raw    20              20             30.903053
710_s_8k.raw    20              40             30.940752

Table 7: SNR as a function of frame duration with respect to a window of 20 msec for 710_s_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
710_s_8k.raw    30              5              31.395359
710_s_8k.raw    30              10             31.369610
710_s_8k.raw    30              20             31.507902
710_s_8k.raw    30              40             31.188627

Table 8: SNR as a function of frame duration with respect to a window of 30 msec for 710_s_8k.raw.
file            window (msec)   frame (msec)   SNR (dB)
710_s_8k.raw    60              5              26.745348
710_s_8k.raw    60              10             26.933214
710_s_8k.raw    60              20             27.142965
710_s_8k.raw    60              40             28.054449

Table 9: SNR as a function of frame duration with respect to a window of 60 msec for 710_s_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
711_f_8k.raw    10              5              9.119938
711_f_8k.raw    10              10             9.402864
711_f_8k.raw    10              20             9.066444
711_f_8k.raw    10              40             10.096010

Table 10: SNR as a function of frame duration with respect to a window of 10 msec for 711_f_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
711_f_8k.raw    20              5              9.788088
711_f_8k.raw    20              10             10.13671
711_f_8k.raw    20              20             9.911191
711_f_8k.raw    20              40             9.977847

Table 11: SNR as a function of frame duration with respect to a window of 20 msec for 711_f_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
711_f_8k.raw    30              5              9.949892
711_f_8k.raw    30              10             10.145644
711_f_8k.raw    30              20             9.931570
711_f_8k.raw    30              40             9.815060

Table 12: SNR as a function of frame duration with respect to a window of 30 msec for 711_f_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
711_f_8k.raw    60              5              9.900345
711_f_8k.raw    60              10             9.969599
711_f_8k.raw    60              20             9.951087
711_f_8k.raw    60              40             9.900935

Table 13: SNR as a function of frame duration with respect to a window of 60 msec for 711_f_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
712_g_8k.raw    10              5              10.397092
712_g_8k.raw    10              10             10.233241
712_g_8k.raw    10              20             10.007339
712_g_8k.raw    10              40             10.456532

Table 14: SNR as a function of frame duration with respect to a window of 10 msec for 712_g_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
712_g_8k.raw    20              5              10.545899
712_g_8k.raw    20              10             10.391139
712_g_8k.raw    20              20             10.244164
712_g_8k.raw    20              40             10.392828

Table 15: SNR as a function of frame duration with respect to a window of 20 msec for 712_g_8k.raw.

file            window (msec)   frame (msec)   SNR (dB)
712_g_8k.raw    30              5              10.493260
712_g_8k.raw    30              10             10.442673
712_g_8k.raw    30              20             10.301928
712_g_8k.raw    30              40             10.326337

Table 16: SNR as a function of frame duration with respect to a window of 30 msec for 712_g_8k.raw.
file            window (msec)   frame (msec)   SNR (dB)
712_g_8k.raw    60              5              10.462374
712_g_8k.raw    60              10             10.472594
712_g_8k.raw    60              20             10.447512
712_g_8k.raw    60              40             10.516697

Table 17: SNR as a function of frame duration with respect to a window of 60 msec for 712_g_8k.raw.

file            noise/signal thresholds   SNR (dB)
710_b_8k.raw    0.10/0.80                 17.236237
710_b_8k.raw    0.15/0.85                 17.959270
710_b_8k.raw    0.20/0.80                 16.691147
710_b_8k.raw    0.20/0.90                 19.948442
710_b_8k.raw    0.25/0.95                 22.077831

Table 18: SNR as a function of noise and signal thresholds for 710_b_8k.raw.

file            noise/signal thresholds   SNR (dB)
710_s_8k.raw    0.10/0.80                 31.507902
710_s_8k.raw    0.15/0.85                 32.866573
710_s_8k.raw    0.20/0.80                 31.507902
710_s_8k.raw    0.20/0.90                 34.465740
710_s_8k.raw    0.25/0.95                 32.752346

Table 19: SNR as a function of noise and signal thresholds for 710_s_8k.raw.

file            noise/signal thresholds   SNR (dB)
711_f_8k.raw    0.10/0.80                 10.525761
711_f_8k.raw    0.15/0.85                 11.181015
711_f_8k.raw    0.20/0.80                 9.931570
711_f_8k.raw    0.20/0.90                 12.684457
711_f_8k.raw    0.25/0.95                 13.755866

Table 20: SNR as a function of noise and signal thresholds for 711_f_8k.raw.

file            noise/signal thresholds   SNR (dB)
712_g_8k.raw    0.10/0.80                 11.087926
712_g_8k.raw    0.15/0.85                 12.195965
712_g_8k.raw    0.20/0.80                 10.301928
712_g_8k.raw    0.20/0.90                 13.049687
712_g_8k.raw    0.25/0.95                 13.929175

Table 21: SNR as a function of noise and signal thresholds for 712_g_8k.raw.
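The nine steps of Section 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the program that produced the tables above, so it should not be expected to reproduce those numbers. The function names are hypothetical, and the pre-emphasis constant (0.95), the number of histogram bins (100), and building the histogram over energies expressed in dB are assumptions. The SNR is taken here as the difference between the two levels read off the cdf; if the level at the signal threshold is interpreted as signal plus noise, the noise level would have to be subtracted first.

```python
import math


def frame_energies(samples, sample_rate=8000, frame_ms=20, window_ms=30,
                   preemph=0.95):
    """Steps 1-5: one windowed energy value per frame."""
    frame_len = int(sample_rate * frame_ms / 1000)
    win_len = int(sample_rate * window_ms / 1000)
    hamming = [0.54 - 0.46 * math.cos(2 * math.pi * n / (win_len - 1))
               for n in range(win_len)]
    energies = []
    for start in range(0, len(samples), frame_len):
        # Step 1: window centered on the frame, zero-padded out of range.
        lo = start + frame_len // 2 - win_len // 2
        win = [samples[i] if 0 <= i < len(samples) else 0.0
               for i in range(lo, lo + win_len)]
        # Step 2: pre-emphasis; the first sample of the window is kept as is.
        pre = [win[0]] + [win[n] - preemph * win[n - 1]
                          for n in range(1, win_len)]
        # Steps 3-4: Hamming window, then sum of squares.
        energies.append(sum((w * p) ** 2 for w, p in zip(hamming, pre)))
    return energies


def snr_from_energies(energies, noise_thresh=0.20, signal_thresh=0.80,
                      num_bins=100):
    """Steps 6-9: histogram -> pdf -> cdf -> SNR in dB."""
    # Assumption: histogram the energies in dB, with a floor to avoid log(0).
    db = [10 * math.log10(max(e, 1e-10)) for e in energies]
    lo, hi = min(db), max(db)
    if hi == lo:
        return 0.0
    width = (hi - lo) / num_bins
    hist = [0] * num_bins
    for v in db:
        hist[min(int((v - lo) / width), num_bins - 1)] += 1
    # The pdf is the normalized histogram; the cdf is its running sum.
    cdf, acc = [], 0.0
    for count in hist:
        acc += count / len(db)
        cdf.append(acc)

    def level(thresh):
        # Center of the first bin whose cdf value reaches the threshold.
        for k, c in enumerate(cdf):
            if c >= thresh:
                return lo + (k + 0.5) * width
        return hi

    return level(signal_thresh) - level(noise_thresh)
```

With one channel of samples in a list `x`, the third experiment's settings would correspond to `snr_from_energies(frame_energies(x, frame_ms=20, window_ms=30), noise_thresh=0.20, signal_thresh=0.90)`. Because each level is quantized to a bin center, the result can be identical for nearby thresholds, and it changes with `num_bins`.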