• two preliminary experiments on Alphadigits

    • WER decreased from 44.4% to 37.7% when training a monophone system whose monophones were drawn from a set of cross-word models with 12 mixture components

    • achieved a WER of 54.2% when training a monophone system from a flat start using 8 mixture components

  • results are poor for several reasons

    • the number of mixture components is low for a monophone system (typically 32)

    • not enough training iterations (three and two, respectively)

    • forcing silence between every word

    • model mismatch
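• for reference, the WER figures quoted above are the standard Levenshtein-based word error rate; a minimal sketch of the computation (hypothetical example strings, not from the Alphadigits runs):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / #ref words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution or match
    return dp[len(ref)][len(hyp)] / len(ref)

# "too" is a substitution, "four" a deletion: 2 errors over 4 words
print(round(100 * wer("one two three four", "one too three"), 1))  # 50.0
```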