4.1.1 Overview: Bayes' Rule
In this section, we describe how to use the recognition utility,
isip_recognize, to implement the search portion of the speech
recognition problem and to produce the overall most probable
transcription of the input utterance. Decoding the spoken words
requires search algorithms that narrow the space of possibilities.
Because the complexity of an optimal or exhaustive search is
prohibitive for speech recognition, suboptimal search techniques
are vital to the decoding process.
The primary search algorithm used in our software is a
time-synchronous Viterbi beam search, which is essentially a
breadth-first search algorithm. Section 4.1.2 describes search
algorithms in more detail. Note that the terms "decoding" and
"recognition" are often used interchangeably.
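To make the beam search idea concrete, here is a minimal sketch of a
time-synchronous Viterbi beam search in Python. It is not the
isip_recognize implementation: the fully connected state space, the
transition and emission scores, and the beam width are hypothetical
stand-ins for the models and parameters the decoder actually loads.

    def viterbi_beam_search(emission_logprobs, trans_logprobs, beam_width=10.0):
        """Decode the most likely state sequence through a fully connected HMM.

        emission_logprobs[t][s]: log-likelihood of frame t in state s
        trans_logprobs[i][j]:    log-probability of moving from state i to state j
        beam_width:              hypotheses scoring more than this far below the
                                 best hypothesis at a frame are pruned
        """
        num_frames = len(emission_logprobs)
        num_states = len(trans_logprobs)

        # Start with one hypothesis per state (a uniform initial distribution
        # is assumed here to keep the sketch short).
        active = {s: (emission_logprobs[0][s], [s]) for s in range(num_states)}

        for t in range(1, num_frames):
            extended = {}
            # Breadth-first: every surviving hypothesis is extended by one frame.
            for s, (score, path) in active.items():
                for s_next in range(num_states):
                    cand = (score + trans_logprobs[s][s_next]
                            + emission_logprobs[t][s_next])
                    # Viterbi recursion: keep only the best path into each state.
                    if s_next not in extended or cand > extended[s_next][0]:
                        extended[s_next] = (cand, path + [s_next])

            # Beam pruning: discard hypotheses far below the current best score.
            best = max(score for score, _ in extended.values())
            active = {s: h for s, h in extended.items() if h[0] >= best - beam_width}

        best_score, best_path = max(active.values(), key=lambda h: h[0])
        return best_path, best_score

Because every surviving hypothesis is extended frame by frame in
lockstep, all scores compared during pruning correspond to the same
amount of input, which is what makes the time-synchronous,
breadth-first organization attractive.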
The search algorithm essentially integrates the constraints imposed
by the language model with the probabilities computed by the acoustic
models, using a probabilistic framework based on Bayes' Rule:
P(W|A) = P(A|W) P(W) / P(A)

where

  P(A|W) : acoustic model (hidden Markov models, mixture of Gaussians)
  P(W)   : language model (finite state machines, N-grams)
  P(A)   : acoustics (ignored during maximization)
We can ignore the term P(A), which represents the likelihood of the
acoustic channel, because it is constant with respect to the word
sequence being searched. This reduces the task of finding the most
probable word sequence to a maximization (or optimization) of:

  W* = argmax_W P(A|W) P(W)
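The maximization above can be read directly as a scoring rule. The
following sketch, with hypothetical acoustic_logprob and
language_logprob functions standing in for the acoustic and language
models, ranks a list of candidate word sequences by
log P(A|W) + log P(W); maximizing this sum of logs is equivalent to
maximizing the product P(A|W) P(W).

    def recognize(candidates, acoustic_logprob, language_logprob):
        """Return the candidate word sequence W maximizing P(A|W) * P(W).

        P(A) is the same for every candidate, so it can safely be ignored.
        Scores are combined in log space to avoid numerical underflow.
        """
        return max(candidates,
                   key=lambda W: acoustic_logprob(W) + language_logprob(W))

In practice the set of candidate word sequences is far too large to
enumerate explicitly; the beam search sketched earlier applies this
same scoring rule incrementally instead.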
The goal of the language model, represented in the term P(W),
is to constrain the number of allowable word sequences. The
role of the language model is described in detail in
Section 6.
The acoustic model provides a way of computing a probability for
each feature vector. This is described in detail in
Section 5.
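As a rough illustration of what "a probability for each feature
vector" means, here is a sketch of the log-likelihood of a single
feature vector under a diagonal-covariance Gaussian mixture, the kind
of distribution listed above. The weights, means, and variances are
hypothetical placeholders for parameters estimated during training
(Section 5).

    import math

    def gmm_log_likelihood(x, weights, means, variances):
        """Return log p(x) for feature vector x under a diagonal-covariance GMM.

        weights[k]:   mixture weight of component k (weights sum to 1)
        means[k]:     mean vector of component k
        variances[k]: per-dimension variances of component k
        """
        log_terms = []
        for w, mu, var in zip(weights, means, variances):
            # Log of a diagonal Gaussian: sum the per-dimension log-densities.
            log_gauss = sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
                            for xi, m, v in zip(x, mu, var))
            log_terms.append(math.log(w) + log_gauss)

        # Log-sum-exp over mixture components for numerical stability.
        max_term = max(log_terms)
        return max_term + math.log(sum(math.exp(t - max_term) for t in log_terms))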
The recognition component, which is often referred to as a
decoder,
implements a variety of search algorithms in an attempt to find
this optimal word sequence in the most efficient manner.
For a detailed discussion of Bayes' Rule,
see this lecture on the
noisy communication channel model
from our on-line
speech recognition course notes.
Next, let's review the
search process
in a little more detail.