This is a discrete HMM toolkit that we hope will be useful in learning
about the fundamental properties of HMMs. The current release contains
a program run from the command line. Future releases will provide a
Java-based web interface, including a GUI that will help visualize
various aspects of HMM theory.
The code closely parallels the theory presented in:
J.R. Deller, Jr., J.G. Proakis, and J.H.L. Hansen,
Discrete-Time Processing of Speech Signals,
Macmillan, 1993, ISBN: 0-02-328301-7.
The toolkit is written entirely in C++ (developed with GNU's gcc
compiler) and should compile fairly easily on most machines. To build
the code, change to the src directory and run make. You will need a
make utility that supports GNU's make extensions (we recommend GNU's
make).
The binary found in the bin directory, hmm.exe, was compiled on
a Sun SPARC running Solaris 2.4. The binary supports four major
modes of operation:
-
Training
To use the training mode, first create a file containing sequences
of symbols. For example, to emulate coin toss experiments, you might
create a file of random sequences of heads and tails:
HHHHTHTHTHHHT
THTHTHHTHTHHT
...
You can train an HMM on this data by using the following command line:
hmm.exe -train -K 2 -S 2 -P 2 -viterbi file.data file.model
The arguments are described in the help message (hmm.exe -help).
The above line generates a 2-state HMM with a 2-symbol codebook,
using two passes of training based on the Viterbi algorithm
(Baum-Welch training is also available).
The input data is contained in file.data; the output model will be
found in file.model.
-
Generation
Once you have a model, you can generate data from the model
using the command line:
hmm.exe -generate -L 100 file.data file.model
Here, the model in file.model is loaded, and 100 random
sequences are generated from the model. These can be used
to train a new model, or evaluate an existing model.
-
Testing
Given a model, you can evaluate a data set:
hmm.exe -test -viterbi file.data file.model
This will compute the probability of the data given the model.
This allows you to compare various models on the same data,
or investigate the effects of various training and testing
algorithms.
-
Update
Given a model and a data set, you can update a model:
hmm.exe -update -viterbi -P 10 file.data file.model
In this mode, the model contained in file.model is loaded,
10 passes of reestimation of the parameters are performed,
and the model in file.model is replaced (beware, this
overwrites the file).
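To make the generation and testing modes more concrete, here is a
minimal C++ sketch of what they do conceptually for a discrete HMM:
draw a symbol sequence from a model, then score a sequence with the
Viterbi (best-path) log probability. It is not the toolkit's source
code and does not read or write its file formats. The names Hmm,
example_coin_model, sample, to_coin_string, and viterbi_log_prob, as
well as the model parameters, are hypothetical and chosen only for
illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <random>
#include <string>
#include <vector>

// Hypothetical discrete HMM (illustration only, not the toolkit's structure).
struct Hmm {
    std::vector<double> initial;                  // initial[i] = P(first state = i)
    std::vector<std::vector<double>> transition;  // transition[i][j] = P(next = j | current = i)
    std::vector<std::vector<double>> emission;    // emission[i][k] = P(symbol k | state i)
};

// Assumed example: state 0 is a fair coin, state 1 favors heads (symbol 0).
Hmm example_coin_model() {
    return Hmm{{0.5, 0.5},
               {{0.9, 0.1}, {0.1, 0.9}},
               {{0.5, 0.5}, {0.9, 0.1}}};
}

// Draw one sequence of `length` symbol indices from the model.
std::vector<int> sample(const Hmm& m, std::size_t length, std::mt19937& rng) {
    std::vector<int> symbols;
    if (length == 0) return symbols;
    std::discrete_distribution<int> first(m.initial.begin(), m.initial.end());
    int state = first(rng);
    for (std::size_t t = 0; t < length; ++t) {
        // Emit a symbol from the current state, then move to the next state.
        std::discrete_distribution<int> emit(m.emission[state].begin(),
                                             m.emission[state].end());
        symbols.push_back(emit(rng));
        std::discrete_distribution<int> next(m.transition[state].begin(),
                                             m.transition[state].end());
        state = next(rng);
    }
    return symbols;
}

// Map symbol 0 to 'H' and any other symbol to 'T', as in the coin toss data.
std::string to_coin_string(const std::vector<int>& symbols) {
    std::string s;
    for (int k : symbols) s += (k == 0 ? 'H' : 'T');
    return s;
}

// Log probability of the single best state path together with the data.
// This is the Viterbi score; it is a lower bound on the total log
// probability of the data, which would require the forward algorithm.
double viterbi_log_prob(const Hmm& m, const std::vector<int>& symbols) {
    assert(!symbols.empty());
    const std::size_t S = m.initial.size();
    std::vector<double> best(S), next(S);
    for (std::size_t i = 0; i < S; ++i)
        best[i] = std::log(m.initial[i]) + std::log(m.emission[i][symbols[0]]);
    for (std::size_t t = 1; t < symbols.size(); ++t) {
        for (std::size_t j = 0; j < S; ++j) {
            double top = -std::numeric_limits<double>::infinity();
            for (std::size_t i = 0; i < S; ++i)
                top = std::max(top, best[i] + std::log(m.transition[i][j]));
            next[j] = top + std::log(m.emission[j][symbols[t]]);
        }
        best.swap(next);
    }
    return *std::max_element(best.begin(), best.end());
}
```

Calling sample(example_coin_model(), 13, rng) with a seeded
std::mt19937 and passing the result to to_coin_string yields a line
like those in the coin toss data file; viterbi_log_prob then scores
that sequence against any model of the same shape. Working in log
probabilities avoids numerical underflow on long sequences.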
We hope this is the beginning of a set of fairly easy-to-use tools for
understanding and applying HMMs. This code was written primarily to
support the simple experiments typically found in introductory chapters
on HMMs (coin tosses, selecting colored balls from urns, etc.). It is
not intended to be a full-blown speech recognizer, though we hope one
will follow soon.
Comments, feedback, and bug reports are encouraged. Please send
them to help@isip.piconepress.com.