
Hi. I'm Aldebaro Klautau. I recently joined the Federal University of Para (UFPA) in Brazil after receiving my Ph.D. from the University of California at San Diego. I am part of LaPS, the Signal Processing Laboratory at UFPA. LaPS promotes research in digital signal processing (DSP), including speech recognition, image coding, seismic signal processing and DSP techniques for monitoring power systems. LaPS was created in 1993 and is one of the research laboratories of the Electrical and Computer Engineering Department at UFPA. UFPA is the Brazilian public federal university with the largest number of undergraduate students, and is located in Belem, a city close to the Amazon forest in northern Brazil.

Three faculty members and more than 30 students (graduate and undergraduate) are affiliated with the lab, which is funded by government agencies such as CNPq and CAPES, and by companies such as ELETRONORTE and CELPA. Current projects target robust speech recognition, speaker verification based on support vector machines, DSP applied to predictive maintenance of circuit breakers, and image coding based on wavelets.

Our current plans include two things:

development of maximum likelihood linear regression (MLLR) and maximum mutual information estimation (MMIE) for the production system;
hosting the ISIP summer training workshops in Brazil starting in 2004.

A good overview of MLLR and other such adaptation techniques can be found here:

ECE 7000 Lecture (Jon Hamaker)
ECE 8463 Lecture (Joe Picone)
X. Huang, A. Acero, and H.W. Hon, Spoken Language Processing - A Guide to Theory, Algorithm, and System Development, Prentice Hall, Upper Saddle River, New Jersey, USA, ISBN: 0-13-022616-5, 2001.

The current release of the production system supports a single-transform MLLR capability, often referred to as a global mean and variance transform. This is implemented using a function named adapt in a class named HiddenMarkovModel. This class encapsulates all our hidden Markov modeling functionality, including accumulation of likelihoods during training. Our single-transform MLLR implementation has been tested and verified to give results comparable to those published in [1-3]. Our focus will be to extend this implementation to allow multiple transforms to be shared across acoustic models.
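To make the idea of a global mean transform concrete, here is a minimal sketch in Python of how one shared transform is applied to every Gaussian mean, following the formulation in [1]: each mean mu is extended to xi = [1, mu] and the adapted mean is W xi, which is the same as A mu + b with W = [b A]. This is not the adapt function of HiddenMarkovModel; the function names, the matrix W and the toy means below are all hypothetical, and the sketch only applies a given transform (estimating W from adaptation data and the variance transform are left out).

```python
def extend_mean(mean):
    """Return the extended mean vector xi = [1, mu_1, ..., mu_n]."""
    return [1.0] + list(mean)


def apply_mean_transform(W, mean):
    """Adapt one Gaussian mean: mu_hat = W * xi, i.e. A * mu + b.

    W is an n x (n + 1) matrix given as a list of rows; its first
    column is the bias b and the remaining columns form A.
    """
    xi = extend_mean(mean)
    return [sum(w * x for w, x in zip(row, xi)) for row in W]


def adapt_all_means(W, means):
    """Apply the same (global) transform to every Gaussian mean."""
    return [apply_mean_transform(W, m) for m in means]


# Hypothetical 2-dimensional example: b = (0.5, -1.0), A = 2 * identity.
W = [[0.5, 2.0, 0.0],
     [-1.0, 0.0, 2.0]]
means = [[1.0, 2.0], [0.0, 0.0]]  # toy means, not from a real model

adapted = adapt_all_means(W, means)
print(adapted)  # [[2.5, 3.0], [0.5, -1.0]]
```

With a single global transform, the same W is used for every Gaussian in the model. Extending to multiple transforms would amount to keeping one W per group of Gaussians and looking up the right one before calling apply_mean_transform.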

Our second task will be the implementation of a particular approach to discriminative training known as MMIE. In addition to the textbook cited above, here is a useful overview of discriminative approaches to HMM training. More details will follow on this in the fall.

If you want to know more about our work or our lab, feel free to contact me at aldebaro@ufpa.br.

[1] C. J. Leggetter and P. C. Woodland, "Flexible Speaker Adaptation Using Maximum Likelihood Linear Regression," Proceedings of the ARPA Spoken Language Technology Workshop, Barton Creek, 1995.
[2] C. J. Leggetter, Improved Acoustic Modeling for HMMs using Linear Transformations, Ph.D. Thesis, Cambridge University, 1996.
[3] M. Gales and P. C. Woodland, "Variance Compensation Within the MLLR Framework," Technical Report CUED/F-INFENG/TR242, Cambridge University Engineering Department, February 1996.