| | LIMSI | Philips | SRI |
|---|---|---|---|
| Front-end | 39 features (MFCC, energy) | 33 features (MFCC, energy); VTN; LDA + normalization | 39 features (MFCC, energy) |
| Segmentation | GMMs with 64 Gaussians; Viterbi decoding | Agglomerative clustering; Kullback-Leibler distance; Merge adjacent segments | Phone-tied GMMs; Viterbi search |
| Acoustic Model | Filler words; Gender-independent models; Position dependent; Xword triphones | Laplacian densities; DT clustering for triphones; Gender-independent models; Focus-independent training | Gender-dependent; Genones; Focus-specific adaptation; GMS algorithm |
| Language Model | CU-CMU SLM toolkit; 65K words; 1st pass - bigrams; filler words in trigrams | 75K vocabulary; Phrase models; Adapted remote corpora | 48K lexicon; 1st pass - bigrams; 2nd pass - trigram lattice; Final pass - 5-grams |
| Decoder | Word-graph, bigrams; Trigram pass; Cluster-based MLLR; All passes Xword | Multipass wordgraph; MLLR + Xword trigram | 1st pass WI, ML adapt; 2nd pass WI, lattice adapt; FB pass, N-best lists; Xword models adapt; N-best rescoring |
| Performance | 18.5% WER | 23.1% WER | 26.7% WER |
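
The Performance row reports word error rate (WER). As a rough illustration of what that figure measures, the sketch below assumes the usual definition: the minimum number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical, and this is not the scoring procedure the three sites actually used, which would also involve text normalization steps not shown here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: assumes whitespace tokenization and equal cost (1)
    for substitutions, deletions, and insertions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one substitution and one deletion over six words.
    rate = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"{100 * rate:.1f}% WER")
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.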