Decoder Strategies


Neeraj Deshmukh

Institute for Signal and Information Processing
Mississippi State University
Phone: (601) 325-8335 Fax: (601) 325-3149
Email: deshmukh@isip.msstate.edu
URL: www.isip/research/isip/publications/seminars/1997/isip_decoder_strategies/

Abstract:

Perhaps the most challenging problem in state-of-the-art large vocabulary continuous speech recognition (LVCSR) is finding the most likely hypotheses (word sequences) for an unknown utterance, given the speech signal, the acoustic models and the language model for the task. The number of possible hypotheses is far too large for an exhaustive search, so sub-optimal techniques are needed to generate the most probable hypotheses with reasonable efficiency and accuracy. This problem is referred to as search or decoding. In this seminar we will present an overview of the decoding strategies commonly used in LVCSR. Time-synchronous (breadth-first) techniques such as the Viterbi algorithm and state-synchronous (i.e. depth-first or best-first) methods such as the A* stack decoder will be reviewed, along with extensions to forward-backward multipass search algorithms, N-best search and other hybrid decoders. The relative merits of each algorithm will also be discussed.
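
As a rough illustration of the time-synchronous idea mentioned above, the following is a minimal sketch of Viterbi search with optional beam pruning over a toy discrete HMM. It is not the decoder presented in the seminar: the function name `viterbi_beam`, the two-state model, its probabilities and the `beam` parameter are all hypothetical and chosen only to keep the example small. A real LVCSR decoder searches a far larger network built from acoustic and language models.

```python
import math

def viterbi_beam(obs, states, log_init, log_trans, log_emit, beam=float("inf")):
    """Time-synchronous Viterbi search over a toy discrete HMM (illustrative only).

    Every hypothesis is advanced by one observation per step. A hypothesis whose
    log score falls more than `beam` below the best score at that frame is pruned,
    which is what makes the search sub-optimal but cheaper.
    """
    # active maps each state to (log score, best state path ending in that state)
    active = {s: (log_init[s] + log_emit[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        best = max(score for score, _ in active.values())
        survivors = {s: v for s, v in active.items() if v[0] >= best - beam}
        nxt = {}
        for prev, (score, path) in survivors.items():
            for s in states:
                cand = score + log_trans[prev][s] + log_emit[s][o]
                # Keep only the best-scoring path into each state (Viterbi recombination)
                if s not in nxt or cand > nxt[s][0]:
                    nxt[s] = (cand, path + [s])
        active = nxt
    return max(active.values(), key=lambda v: v[0])

def to_log(table):
    """Convert a nested dict of positive probabilities to log probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}

# Hypothetical toy model: two states, three observation symbols (assumed values).
states = ["A", "B"]
log_init = {"A": math.log(0.6), "B": math.log(0.4)}
log_trans = to_log({"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}})
log_emit = to_log({"A": {"x": 0.5, "y": 0.4, "z": 0.1},
                   "B": {"x": 0.1, "y": 0.3, "z": 0.6}})

full_score, full_path = viterbi_beam(["x", "y", "z"], states, log_init, log_trans, log_emit)
beam_score, beam_path = viterbi_beam(["x", "y", "z"], states, log_init, log_trans, log_emit, beam=1.0)
print(full_path, math.exp(full_score))
print(beam_path, math.exp(beam_score))
```

With an infinite beam the search is exact for this model. Narrowing the beam discards low-scoring hypotheses at each frame, so the result can never score higher than the exact one and may score lower, which is the efficiency versus accuracy trade-off the abstract refers to.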