N-GRAM DECODING
- Two special data structures: Ngram and Ngram_node.
- The basic data members of the Ngram class are
| Hash_table** ngram_table_d; |
| int_4 ngram_order_d;        |
- The basic data members of the Ngram_node class are
| int_4 history_length_d;     |
| Word** history_d;           |
| Word* current_word_d;       |
| float_4 gramscore_d;        |
| float_4 back_off_d;         |
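Putting the members above together, the two classes might be declared roughly as follows. The typedefs and the Word and Hash_table stand-ins are minimal assumptions, not the toolkit's real headers:

```cpp
#include <string>
#include <unordered_map>

// Placeholder typedefs and classes standing in for the toolkit's own
// definitions (assumptions, not the real headers):
typedef int int_4;
typedef float float_4;
struct Word { std::string text_d; };
struct Ngram_node;
typedef std::unordered_map<std::string, Ngram_node*> Hash_table;

// One node per N-gram: the word history, the predicted word, the LM
// score for this gram, and its back-off weight.
struct Ngram_node {
    int_4 history_length_d;   // number of words in the history
    Word** history_d;         // the (N-1)-word history
    Word* current_word_d;     // the word this gram predicts
    float_4 gramscore_d;      // LM (log) score of the gram
    float_4 back_off_d;       // back-off weight to shorter histories
};

// The model itself: one hash table per gram order.
struct Ngram {
    Hash_table** ngram_table_d;  // e.g. ngram_table_d[k] holds the (k+1)-grams
    int_4 ngram_order_d;         // the model order N
};
```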
- A hash table is used to store the N-Gram nodes, so that a node can be
accessed quickly and the LM score computed efficiently.
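As a sketch of the lookup, the word history plus the current word can be flattened into a single string key, giving O(1) expected-time access to a node. The toolkit's actual keying scheme is not given here, so make_key and the table layout below are assumptions:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Minimal stand-in for an N-Gram node: LM score plus back-off weight
// (the real class carries more members; see the list above).
struct Node { float gramscore; float back_off; };

// Flatten the history and the current word into one string key so the
// node is found with a single hash lookup (the key format is an
// assumption).
std::string make_key(const std::vector<std::string>& history,
                     const std::string& word) {
    std::string key;
    for (const std::string& h : history) { key += h; key += ' '; }
    key += word;
    return key;
}

// One table per gram order, mirroring ngram_table_d; shown here for
// bigrams only.
std::unordered_map<std::string, Node> bigram_table;
```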
LEXICAL TREE
- In N-Gram decoding, every word (except the sentence-start word) can be
followed by any other word, so for large-vocabulary speech recognition
the lexical tree would be very large.
- A small start lexical tree is built specially for the sentence-start
word.
- A big N-Gram lexical tree is built only once. For all words other than
the sentence-start word, this lexical tree is reused, and the LM scores
are computed on the fly during decoding.
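Computing an LM score on the fly typically follows the standard back-off recursion over a node's gram score and back-off weight (the Ngram_node class above carries both). The toy table, its scores, and the unseen-word floor below are illustrative assumptions, not real model values:

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>
#include <vector>

// Toy back-off trigram table keyed by the full word sequence.
// Each entry holds {log prob, back-off weight} (illustrative values).
struct Entry { double logprob; double backoff; };
std::map<std::vector<std::string>, Entry> table = {
    {{"the"},               {-1.0, -0.5}},
    {{"cat"},               {-2.0, -0.4}},
    {{"sat"},               {-2.5,  0.0}},
    {{"the", "cat"},        {-0.7, -0.3}},
    {{"the", "cat", "sat"}, {-0.2,  0.0}},
};

// Standard back-off recursion: if the full gram exists, return its
// score; otherwise add the history's back-off weight and retry with a
// history shortened by its oldest word.
double lm_score(std::vector<std::string> history, const std::string& word) {
    std::vector<std::string> gram = history;
    gram.push_back(word);
    auto it = table.find(gram);
    if (it != table.end()) return it->second.logprob;
    if (history.empty()) return -99.0;  // unseen-unigram floor (assumption)
    double bow = 0.0;
    auto h = table.find(history);
    if (h != table.end()) bow = h->second.backoff;
    history.erase(history.begin());  // drop the oldest history word
    return bow + lm_score(history, word);
}
```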