In this talk we will review the motivation and methodology behind
these methods. Most of the time will be spent describing one
popular method, maximum likelihood linear regression (MLLR), an
approach to speaker adaptation. MLLR builds a transform for
the model parameters using linear regression so that the transformed
parameters of each model better represent the new speaker. Estimating
a separate transform for every model in a large-vocabulary continuous
speech recognition (LVCSR) system (particularly one using mixture
models) would require an unreasonable number of additional parameters
and a large amount of training data for full coverage. To address
this problem, only a small number of transforms are built, and each
is tied across many models. MLLR has become a standard feature in most
LVCSR systems and has proven successful in every major
speaker-independent speech recognition task to which it has been
applied.
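
As a rough illustration of the tying idea described above, the sketch
below estimates one affine transform shared by all Gaussian means in a
group and then applies it to those means. It is a simplified sketch, not
the full method: it assumes identity covariances, so the estimate reduces
to occupancy-weighted least squares, and the function names, array
shapes, and toy numbers are all hypothetical.

```python
import numpy as np

def estimate_shared_transform(means, obs, gamma):
    """Estimate one affine transform W = [b | A] shared by all means.

    means: (M, d) speaker-independent Gaussian means tied to this transform
    obs:   (T, d) adaptation frames from the new speaker
    gamma: (T, M) occupation probabilities of each Gaussian for each frame

    Simplifying assumption: identity covariances, so maximizing the
    likelihood reduces to weighted least squares, W = Z G^{-1}.
    """
    n_means = means.shape[0]
    xi = np.hstack([np.ones((n_means, 1)), means])   # extended means [1, mu]
    occ = gamma.sum(axis=0)                          # total occupancy per Gaussian
    Z = (gamma.T @ obs).T @ xi                       # sum of gamma * o_t * xi_m^T
    G = (xi * occ[:, None]).T @ xi                   # sum of gamma * xi_m * xi_m^T
    return np.linalg.solve(G, Z.T).T                 # G is symmetric, so this is Z G^{-1}

def adapt_means(means, W):
    """Apply the shared transform: mu_hat = A mu + b for every tied mean."""
    xi = np.hstack([np.ones((means.shape[0], 1)), means])
    return xi @ W.T

# Toy usage: five 2-D means tied to one transform, one frame per Gaussian.
toy_means = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]])
toy_A = np.array([[1.1, 0.2], [-0.1, 0.9]])
toy_b = np.array([0.5, -0.3])
toy_obs = toy_means @ toy_A.T + toy_b                # frames from the "new speaker"
toy_gamma = np.eye(5)                                # hard frame-to-Gaussian alignment
toy_W = estimate_shared_transform(toy_means, toy_obs, toy_gamma)
print(adapt_means(toy_means, toy_W))
```

With d-dimensional features, each shared transform adds only d*(d+1)
parameters no matter how many means are tied to it, which is why a small
number of transforms can cover a large model set.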
Additional items of interest: