In this talk we will present the maximum entropy framework from the
ground up: we start with the motivation for using this model and
build our way up to applying the framework to solve a real problem
(estimating the probabilities of a bigram language model). The
probability distributions that result from this method are
exponential in form, containing one factor per constraint that we
place on the data. The ease with which new knowledge can be added to
the modeling paradigm is one of the most compelling reasons to use
maximum entropy models. However, maximum entropy comes with its own
set of problems: the iterative procedure used to estimate the
parameters of the model, generalized iterative scaling, is typically
very expensive. These issues will also be discussed.
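As a concrete illustration of the exponential form and of generalized
iterative scaling (GIS), the following is a minimal sketch on toy
data. The events, feature assignments, and counts are all made up for
illustration; they are not from the talk itself.

```python
import math

# Hypothetical "bigram" events with binary features. feats[x] lists the
# active feature indices for event x; counts are made-up observed counts.
feats = {"the cat": [0, 1], "the dog": [0], "a cat": [1], "a dog": []}
counts = {"the cat": 5, "the dog": 3, "a cat": 1, "a dog": 1}
n_feats, N = 2, sum(counts.values())

# GIS requires every event to activate the same total number of
# features, so pad each event with a standard "slack" feature up to C.
C = max(len(f) for f in feats.values())
slack = n_feats
for x in feats:
    feats[x] = feats[x] + [slack] * (C - len(feats[x]))

# Empirical feature expectations: the constraints the model must match.
emp = [0.0] * (n_feats + 1)
for x, c in counts.items():
    for i in feats[x]:
        emp[i] += c / N

lam = [0.0] * (n_feats + 1)  # one weight (exponential factor) per constraint

def model_probs():
    # Exponential form: p(x) is proportional to exp(sum of active weights).
    w = {x: math.exp(sum(lam[i] for i in fs)) for x, fs in feats.items()}
    Z = sum(w.values())
    return {x: wx / Z for x, wx in w.items()}

def expectations(p):
    # Feature expectations under the current model distribution p.
    e = [0.0] * (n_feats + 1)
    for x, px in p.items():
        for i in feats[x]:
            e[i] += px
    return e

for _ in range(200):
    mod = expectations(model_probs())
    # GIS update: lambda_i += (1/C) * log(empirical_i / model_i)
    lam = [l + math.log(e / m) / C for l, e, m in zip(lam, emp, mod)]

p = model_probs()
```

After the loop, the model's feature expectations match the empirical
ones, which is the fixed point GIS seeks; the cost the abstract
alludes to comes from repeating the expectation computation over the
full event space at every iteration.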
Additional items of interest: