Echo Canceller

Speech data collected over the telephone, such as the SWITCHBOARD conversational speech data corpus, contains echo caused by the process of converting a two-wire signal used in the local loop to a four-wire signal used in network transmission. See system overview for more details.

A speech recognizer could exploit this echo to pick up cues about the identity of the speaker or the channel conditions, which would make its job artificially easier. To eliminate this problem, we need an efficient echo cancellation technology.

We have developed an FIR echo-canceller for this purpose. A detailed description of this technology can be found in the following reference:

Messerschmitt, David; Hedberg, David; Cole, Christopher; Haoui, Amine; Winship, Peter; "Digital Voice Echo Canceller with a TMS32020," in Digital Signal Processing Applications with the TMS320 Family, pp. 415-437, Texas Instruments, Inc., 1986.

This document can be retrieved from the Texas Instruments web site. A copy can be found on this web site as well.

We have deviated from the standard implementation of an LMS echo-canceller in places to address certain problems we faced. Some of the main problems we encountered during the development of the system are:

Double talk: This is the condition in which both speakers talk simultaneously. If we adapt the FIR filter coefficients during double talk, the filter will diverge, causing "blips" in the output. This can be avoided with an efficient voice activity detector (VAD): when the VAD detects near-end speech, the adaptation process is suspended, which avoids the divergence problem.
Complex echo: The echo-canceller performs poorly in some cases of double talk, failing to cancel the far-end speech effectively. We attribute this to the possible presence of complex echo patterns.
Residual error suppression: Because of non-linearities in the echo path of the telephone network, the maximum achievable suppression is limited to about 40 dB. The usual suggestion is therefore to zero the output whenever the return signal power falls below a threshold based on the reference signal power. This, however, creates choppiness in the background. To make the background more uniform, we decided instead to set the output to a scaled version of the reference signal when the near-end signal is not present.
Length of the filter: Unfortunately, the length of the FIR filter has to depend on the maximum echo delay in the data set being used. For international telephone conversations, the round-trip delay is typically an order of magnitude greater than for domestic calls. We would like our system to choose the filter length automatically from the maximum round-trip delay the user specifies. A further unanswered question is the relationship between the adaptation rate constant and the filter length. From the experiments we performed, there seems to be an inverse relationship between the two quantities.
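
The double-talk handling and the filter-length choice described above can be sketched as follows. This is a minimal illustration rather than the distributed code: the class name LmsEchoCanceller, the helper taps_for_delay, and the normalized step size (an NLMS variant of the LMS update) are assumptions made for the example, and the VAD decision is simply passed in as a flag.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of FIR taps needed to span the maximum
// round-trip delay (in milliseconds) at the given sampling rate.
std::size_t taps_for_delay(double max_delay_ms, double sample_rate_hz) {
    return static_cast<std::size_t>(
        std::ceil(max_delay_ms * sample_rate_hz / 1000.0));
}

// Illustrative adaptive FIR echo canceller (assumed design, not the
// distributed implementation).
class LmsEchoCanceller {
public:
    LmsEchoCanceller(std::size_t num_taps, double mu)
        : w_(std::max<std::size_t>(num_taps, 1), 0.0),
          x_(w_.size(), 0.0),
          mu_(mu) {}

    // far_end:  reference sample (the signal that causes the echo)
    // near_end: return sample (near-end speech plus echo)
    // adapt:    pass false while the VAD reports near-end speech, so the
    //           coefficients are frozen during double talk
    // Returns the echo-cancelled sample.
    double process(double far_end, double near_end, bool adapt) {
        // Shift the reference delay line and insert the newest sample.
        for (std::size_t i = x_.size() - 1; i > 0; --i) {
            x_[i] = x_[i - 1];
        }
        x_[0] = far_end;

        // Estimate the echo and measure the reference energy.
        double echo_estimate = 0.0;
        double energy = 1e-9;  // small constant avoids division by zero
        for (std::size_t i = 0; i < w_.size(); ++i) {
            echo_estimate += w_[i] * x_[i];
            energy += x_[i] * x_[i];
        }
        const double error = near_end - echo_estimate;

        // Coefficient update with a step normalized by reference energy
        // (assumption: NLMS rather than a fixed-step LMS update).
        if (adapt) {
            const double step = mu_ * error / energy;
            for (std::size_t i = 0; i < w_.size(); ++i) {
                w_[i] += step * x_[i];
            }
        }
        return error;
    }

    const std::vector<double>& taps() const { return w_; }

private:
    std::vector<double> w_;  // filter coefficients
    std::vector<double> x_;  // most recent reference samples, newest first
    double mu_;              // adaptation rate constant
};
```

For example, assuming an 8 kHz sampling rate and a 32 ms maximum round-trip delay, taps_for_delay(32.0, 8000.0) gives 256 taps. Longer filters generally call for re-tuning the adaptation rate constant, in line with the inverse relationship noted above.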

This program is easy to use. After you download, compile and link the code, simply type:

ec.exe < input_file > output_file

The input signal must be 16-bit interleaved stereo data. The output signal will be the same. You can download the following from our site:

Tar File: download the C++ implementation as a gzip-compressed tar archive.
Source Code: view the C++ source code distribution.
Example Data: some example data to verify your implementation.
System Overview: a system overview in PDF format.
TI DSP Application Note: an excellent application note describing the theory and implementation of an LMS echo canceller. The implementation included here is based on this application note and references it heavily.