SRILM Manual Pages
Programs
These are the top-level executables that are currently part of SRILM:
- ngram-count: count N-grams and estimate language models
- ngram-merge: merge N-gram counts
- ngram: apply N-gram language models
- ngram-class: induce word classes from N-gram statistics
- disambig: disambiguate text tokens using an N-gram model
- hidden-ngram: tag hidden events between words
- nbest-lattice: rescore N-best lists and lattices
- nbest-mix: interpolate N-best posterior probabilities
- segment: segment text using an N-gram language model
- segment-nbest: rescore and segment N-best lists using N-gram language models
Utility Scripts
Additional tools implemented as scripts:
- training-scripts: miscellaneous conveniences for language model training
- lm-scripts: manipulate N-gram language models
- ppl-scripts: manipulate perplexities
- pfsg-scripts: create and manipulate finite-state networks
- nbest-scripts: rescore and evaluate N-best lists
File Formats
Some of the data formats used by SRILM:
- ngram-format: ARPA backoff N-gram models
- classes-format: word class definitions
- pfsg-format: Decipher(TM) probabilistic finite-state grammars
- nbest-format: N-best hypothesis lists
LM Library Classes
These are some of the basic classes of the SRILM library.
Note that this list is woefully incomplete, as this part of the documentation
has largely yet to be written.
- LM: generic language model
- Vocab: vocabulary indexing for SRILM
- Prob: probabilities for SRILM
- File: wrapper for stdio streams
Last updated 2006/07/11 00:24:06 by stolcke@speech.sri.com