• cache all state output probabilities in memory

  • speeds up reestimation significantly, since these values are accessed multiple times

  • xRT decreased from 6.0 to 3.6 when training monophone models with 8 Gaussian mixture components
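
The caching idea above can be sketched roughly as follows. This is a minimal illustration, not the actual trainer: the class `OutputProbCache`, the function `log_gmm_density`, the diagonal-covariance assumption, and the toy data are all hypothetical. It also assumes that reestimation reads each state output probability several times (for example once each in a forward pass, a backward pass, and statistics accumulation), which is one plausible reading of "accessed multiple times".

```python
import math


def log_gmm_density(frame, weights, means, variances):
    """Log-likelihood of one frame under a diagonal-covariance GMM (assumed form)."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


class OutputProbCache:
    """Hypothetical cache of state output log-probabilities, keyed by (state, frame index)."""

    def __init__(self, frames, states):
        self.frames = frames
        self.states = states   # state name -> (weights, means, variances)
        self.table = {}
        self.evaluations = 0   # number of times a GMM was actually evaluated

    def log_prob(self, state, t):
        key = (state, t)
        if key not in self.table:
            # Cache miss: evaluate the mixture once and keep the result in memory.
            self.evaluations += 1
            self.table[key] = log_gmm_density(self.frames[t], *self.states[state])
        return self.table[key]


# Toy data: 3 one-dimensional frames, 2 states with 2 mixture components each.
frames = [[0.1], [0.4], [1.2]]
states = {
    "s1": ([0.5, 0.5], [[0.0], [1.0]], [[1.0], [1.0]]),
    "s2": ([0.3, 0.7], [[0.5], [1.5]], [[0.5], [2.0]]),
}
cache = OutputProbCache(frames, states)

# Simulate three passes over the data that each read every (state, frame) value.
lookups = 0
for _ in range(3):
    for t in range(len(frames)):
        for s in states:
            cache.log_prob(s, t)
            lookups += 1
```

In this toy run the mixtures are evaluated once per (state, frame) pair while every later lookup is a dictionary read. The trade-off is memory: the table grows with the number of states times the number of frames.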