N-GRAM DECODING AND LATTICE GENERATION


Jie Zhao
Institute for Signal and Information Processing
Mississippi State University
Phone/Fax: 601-325-8335/3149 Email: zhao@isip.msstate.edu

ABSTRACT:

N-Gram decoding and lattice generation are among the new features being added to the ISIP public domain speech recognition system. Compared to processing a lattice, N-Gram decoding is far more resource-intensive; hence, great care must be taken in designing the data structures and algorithmic flow.
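As background, an N-Gram language model approximates the probability of a word sequence as a product of conditional probabilities over fixed-length word histories. This standard equation is added here for reference; it does not appear in the original abstract:

  P(w_1, \ldots, w_M) \approx \prod_{i=1}^{M} P(w_i \mid w_{i-N+1}, \ldots, w_{i-1})

During decoding, each distinct (N-1)-word history defines a separate language model state, which is one principal source of the resource demands mentioned above.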
In N-Gram decoding, the decoder generally outputs only the most likely hypothesis, i.e., the 1-best. In lattice generation, the decoder instead generates an N-best list of hypotheses, which is then converted into a lattice (word graph) format. The generated lattice can be used in a second pass of decoding (typically referred to as lattice rescoring), in which the search space is restricted to the paths generated in the N-best pass through the N-Gram language model. Lattice rescoring makes it possible to test new acoustic models or language models at greatly reduced computational cost.
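To illustrate one way an N-best list can be folded into a word graph, the following minimal C++ sketch merges hypotheses by sharing common word prefixes. This is a simplified illustration under stated assumptions, not the ISIP system's actual implementation: a production lattice generator would also merge common suffixes and attach time and score information to each arc, and all names below (Lattice, addHypothesis) are hypothetical.

// Hypothetical sketch (C++17): converting an N-best list into a word
// graph by merging common prefixes. Not the ISIP system's actual API.
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct Lattice {
    // arcs[from] maps a word label to the destination node
    std::vector<std::map<std::string, int>> arcs;
    int addNode() { arcs.emplace_back(); return (int)arcs.size() - 1; }
};

// Merge one hypothesis (a word sequence) into the lattice, reusing
// existing arcs so that common prefixes map onto a single path.
void addHypothesis(Lattice& lat, const std::vector<std::string>& words) {
    int node = 0;                        // node 0 is the start node
    for (const auto& w : words) {
        auto it = lat.arcs[node].find(w);
        if (it != lat.arcs[node].end()) {
            node = it->second;           // reuse the existing arc
        } else {
            int next = lat.addNode();    // create a new arc and node
            lat.arcs[node][w] = next;
            node = next;
        }
    }
}

int main() {
    Lattice lat;
    lat.addNode();                       // create the start node
    for (const std::string line : {"the cat sat", "the cat sang", "a cat sat"}) {
        std::istringstream iss(line);
        std::vector<std::string> words;
        for (std::string w; iss >> w; ) words.push_back(w);
        addHypothesis(lat, words);
    }
    // Print the resulting word graph as "from --word--> to" triples.
    for (int n = 0; n < (int)lat.arcs.size(); ++n)
        for (const auto& [w, to] : lat.arcs[n])
            std::cout << n << " --" << w << "--> " << to << "\n";
    return 0;
}

In this sketch the first two hypotheses share the arcs for "the cat", so the resulting graph encodes all three word sequences with fewer arcs than three separate paths would require.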
In this talk, we will present the concepts behind N-Gram language models, N-Gram decoding, and lattice generation, and describe how these features were efficiently implemented in the ISIP recognition system.