Introduction:
Title
Outline
Introduction
Introduction (cont.)
State of the Art
State of the Art (cont.)
Performance Factors
Noise Environment
User Population
Speech Style
Complexity
Decade
Present
In Five Years
Evaluation Metrics:
Evolution
Human Performance
Machine Performance
Evolution of Task
Beyond WER: Named Entity
Named Entity
WER
Beyond WER
Recognition Architectures:
Why so difficult?
Overlap
Theoretic Approach
Bayesian
Approach
Components
Multiple Knowledge Sources
Acoustic Front-end
Acoustic Models
Language Model
Search
Acoustic Modeling:
Feature Extraction
Measurement
Spectral Analysis
Hidden Markov Models
Parameter Estimation
Initialization
Single Gaussian
Two-Way Split
Mixture Distribution
Four-Way Split
Reestimation
Optimizing
Language Modeling:
Wheel of Fortune
N-Grams
Bigrams
Trigrams
Integration of Natural Language
Word-level
Natural Language
Implementation Issues:
Resource Intensive
Requirements
Dynamic Programming-Based Search
Hypothesis
Cross-Word Decoding
Decoding Example
SENT_START
WOULD
GUESS
GUESS
EVERY
REALLY
SAY
THING
SENT_END
Internet-Based Speech Recognition
Technology:
Conversational Speech
Indexing of Broadcast News
Real-Time Translation
Imagine the Future
Human Language Engineering
Future Directions
Challenges
Algorithmic Issues
Conclusion and Future Directions:
References
Trends
Limitations on Applications
Applications on the Horizon
High Tech Heretic
Beulah Arnott
BravoBrava
Reading Pal:
Reading Pal
Child Reads
Errors in Red
Playback
Word Look-up
Listen
Summary:
Goal: Speech Better than Text