A Speech Recognition Bibliography

This page contains a collection of research papers, journal publications, dissertations, and theses that we find useful as reference material for speech and digital signal processing research. The references on this page should conform to our standard format. Please submit your suggestions for other links to ies_help@cavs.msstate.edu.

Acoustic Modeling	Language Modeling	Speech Recognition
Dialog Systems	Machine Learning	Miscellaneous Algorithms

Acoustic Modeling:

H. Christensen, Speaker Adaptation of Hidden Markov Models using Maximum Likelihood Linear Regression, M.S. Thesis, Aalborg University, 1996.
L. Deng, G. Ramsay and D. Sun, "Production Models as a Structural Basis for Automatic Speech Recognition," Speech Communication, 1996.
S. Greenberg, "Speaking in Shorthand - A Syllable-Centric Perspective for Understanding Pronunciation Variation," Proceedings of the ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition, 1998.
J. Hillenbrand et al., "Acoustic Characteristics of American English Vowels," Journal of the Acoustical Society of America, vol. 97, pp. 3099-3111, May 1995.
A. Kannan, Robust Estimation of Stochastic Segment Models for Word Recognition, M.S. Thesis, Boston University, 1992.
N. Kumar and A. Andreou, "On Generalizations of Linear Discriminant Analysis," JHU/ECE-96-07, Johns Hopkins University, 1996.
B. Mak and E. Barnard, "Phone Clustering using the Bhattacharyya Distance," Center for Spoken Language Understanding, Oregon Graduate Institute, 1998.
J. Picone, "Signal Modeling Techniques in Speech Recognition," Proceedings of the IEEE, 1993.
M. Schuster, On Supervised Learning from Sequential Data with Applications for Speech Recognition, Nara Institute of Science and Technology, 1999.
A. Stolcke and S. Omohundro, "Best-first Model Merging for Hidden Markov Model Induction," TR-94-003, International Computer Science Institute, 1994.
P. Zhan and A. Waibel, Vocal Tract Length Normalization for Large Vocabulary Continuous Speech Recognition, Carnegie Mellon University, May 1997.

Language Modeling:

S.F. Chen and J. Goodman, "An Empirical Study of Smoothing Techniques for Language Modeling," 1997.
R. Iyer, M. Ostendorf and M. Meteer, "Analyzing and Predicting Language Model Improvements," 1997.
S. Ortmanns, H. Ney and A. Eiden, "Language Model Look-Ahead for Large Vocabulary Speech Recognition," 1997.
S. Ortmanns, H. Ney, A. Eiden and N. Coenen, "Look-Ahead Techniques for Improved Beam Search," 1997.
F.C.N. Pereira and M.D. Riley, "Speech Recognition by Composition of Weighted Finite Automata," 1996.
A. Stolcke, Bayesian Learning of Probabilistic Language Models, Ph.D. Thesis, University of California, Berkeley, 1994.
C. Wooters and A. Stolcke, "Multiple Pronunciation Lexical Modeling in a Speaker Independent Speech Understanding System," International Conference on Spoken Language Processing, 1994.

Speech Recognition:

D. Jurafsky and J. Martin, Speech and Language Processing, 2000.
M.K. Ravishankar, Efficient Algorithms for Speech Recognition, Ph.D. Thesis, Carnegie Mellon University, 1996.
J. Odell, The Use of Context in Large Vocabulary Speech Recognition, Ph.D. Thesis, Cambridge University, 1995.
S.J. Young, "The HTK Hidden Markov Model Toolkit: Design and Philosophy," CUED/F-INFENG/TR.152, Cambridge University, 1994.
S.J. Young, N.H. Russell and J.H.S. Thornton, Token Passing: a Simple Conceptual Model for Connected Speech Recognition Systems, Cambridge University, 1989.
G. Williams, "A Study of the Use and Evaluation of Confidence Measures in Automatic Speech Recognition," CS-98-02, University of Sheffield, 1998.

Dialog Systems:

D. Buhler, W. Minker, J. Haubler, and S. Kruger, "Flexible Multimodal Human-Machine Interaction in Mobile Environments," Proceedings of the 2002 International Conference on Spoken Language Processing (ICSLP-2002), Denver, CO, USA, September 2002.
J. Glass, "Challenges for Spoken Dialogue Systems," Proceedings of the 1999 IEEE ASRU Workshop, Keystone, Colorado, USA, September 1999.
P. Geutner, M. Denecke, U. Meier, M. Westphal, and A. Waibel, "Conversational Speech Systems for On-Board Car Navigation and Assistance," Proceedings of the 1998 International Conference on Spoken Language Processing (ICSLP-98), Sydney, Australia, December 1998.
B. Pellom, W. Ward, J. Hansen, K. Hacioglu, J. Zhang, X. Yu, and S. Pradhan, "University of Colorado Dialog Systems for Travel and Navigation," Proceedings of the 2001 Human Language Technology Conference (HLT-2001), San Diego, California, USA, March 2001.
S. Pradhan and W. Ward, "Estimating Semantic Confidence for Spoken Dialogue Systems," Proceedings of the 2002 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-2002), Orlando, Florida, USA, May 2002.
R. Solsona, E. Fosler-Lussier, H.J. Kuo, A. Potamianos, and I. Zitouni, "Adaptive Language Models for Spoken Dialogue Systems," Proceedings of the 2002 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-2002), Orlando, Florida, USA, May 2002.
B. Pellom, W. Ward, and S. Pradhan, "The CU Communicator: An Architecture for Dialogue Systems," Proceedings of ICSLP, Beijing, China, November 2000.
S. Young, "Talking to Machines (Statistically Speaking)," Proceedings of ICSLP, Denver, CO, USA, pp. 9-16, September 2002.

Machine Learning:

C. Ambroise and G. Govaert, "Spatial Clustering and the EM Algorithm," 1995.
Basic Statistics, Electronic Statistics Textbook, StatSoft, Inc., 1999.
T. Bell, "Source Separation and Learning Non-orthogonal Bases for Signals Using Independent Component Analysis," Proceedings of the Summer Workshop, Center for Language and Speech Understanding, Johns Hopkins University, 1998.
T. Bell, "Independent Component Analysis (ICA)," (papers, code, demos and links).
C.M. Bishop and M.E. Tipping, "Variational Relevance Vector Machines," Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, C. Boutilier and M. Goldszmidt (Eds.), pp. 46-53, Morgan Kaufmann, 2000.
W. Buntine, "Operations for Learning with Graphical Models," Journal of Artificial Intelligence Research, vol. 2, pp. 159-225, December 1994.
C.J.C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121-167, 1998.
T. Dietterich, "Statistical Tests for Comparing Supervised Classification Learning Algorithms," 1997.
M. Forster, Key Concepts in Model Selection, University of Wisconsin, Madison, 1998.
D. Geiger, D. Heckerman, and C. Meek, "Asymptotic Model Selection for Directed Networks with Hidden Variables," MSR-TR-96-07, Microsoft Corporation, 1997.
S. Haykin and E. Moulines, "From Kalman to Particle Filters," presented at the International Conference on Acoustics, Speech, and Signal Processing, Philadelphia, Pennsylvania, USA, April 2005.
D. Heckerman, C. Meek and G. Cooper, "A Bayesian Approach to Causal Discovery," MSR-TR-97-05, Microsoft Corporation, 1997.
D. Heckerman, "A Tutorial on Learning with Bayesian Networks," MSR-TR-95-06, Microsoft Corporation, 1995.
D. MacKay, Probabilistic Data Modelling, Ph.D. Thesis, Cambridge University, 1991.
D. MacKay, "Bayesian Interpolation," Computation and Neural Systems, 1992.
X.-L. Meng and D. van Dyk, "The EM Algorithm - An Old Folk-Song Sung to a Fast New Tune," Journal of the Royal Statistical Society, Series B, vol. 59, no. 3, pp. 511-567, 1997.
J. Oliver and R. Baxter, "MML and Bayesianism: Similarities and Differences," Tech. Report 208, Department of Computer Science, Monash University, 1995.
M.E. Tipping, "Sparse Bayesian Learning and the Relevance Vector Machine," Journal of Machine Learning Research, vol. 1, pp. 211-244, June 2001.
M.E. Tipping, "The Relevance Vector Machine," Neural Information Processing Systems, June 2002.
G. Welch and G. Bishop, "An Introduction to the Kalman Filter," Department of Computer Science, University of North Carolina at Chapel Hill.

Miscellaneous Algorithms:

... first entry goes here ...