• computational cost
    • both MMI and MCE are limited to small vocabularies if implemented in their entirety
    • approximations include Viterbi alignment and the N-best paradigm
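
As a rough illustration of the N-best approximation, the sketch below scores one utterance with an MMI-style criterion in which the sum over all competing word sequences is replaced by a sum over an N-best list. The function names and the idea of passing precomputed joint log scores (acoustic plus language model) are assumptions made for this example, and the N-best list is assumed to contain the reference hypothesis.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def mmi_nbest(ref_log_score, nbest_log_scores):
    """MMI-style objective for one utterance (hypothetical helper).

    ref_log_score    : joint log score of the reference transcription
    nbest_log_scores : joint log scores of the N-best hypotheses
                       (assumed to include the reference)
    The full denominator over every word sequence is approximated
    by the N-best list, which keeps the cost proportional to N.
    """
    return ref_log_score - log_sum_exp(nbest_log_scores)

# Reference competes with two close hypotheses.
objective = mmi_nbest(-10.0, [-10.0, -10.5, -12.0])
```

When the reference is in the list, the value is at most zero and approaches zero as the reference dominates its competitors.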

  • optimization techniques
    • choice of optimization method - minimum, gradient descent, Newton's method, etc.
    • step size for parameter updates needs to be determined empirically
    • a parameter-wise step size is often employed
    • choice of loss functions for MCE
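
The points above can be sketched in a few lines. The example assumes a sigmoid loss on a scalar misclassification measure d (a common choice for MCE, though not the only one) and a plain gradient descent update with a separate step size for each parameter. The names `sigmoid_loss`, `sigmoid_loss_grad`, `update`, and the slope parameter `gamma` are hypothetical and chosen only for illustration.

```python
import math

def sigmoid_loss(d, gamma=1.0):
    """Smooth 0/1 loss on the misclassification measure d.

    d > 0 indicates a misclassification, d < 0 a correct decision;
    gamma (an assumed tuning parameter) controls the slope.
    """
    z = gamma * d
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for large negative z
    return e / (1.0 + e)

def sigmoid_loss_grad(d, gamma=1.0):
    """Derivative of the sigmoid loss with respect to d."""
    l = sigmoid_loss(d, gamma)
    return gamma * l * (1.0 - l)

def update(params, grads, step_sizes):
    """One gradient descent step with a parameter-wise step size."""
    return [p - eps * g for p, g, eps in zip(params, grads, step_sizes)]

# Two parameters updated with different, empirically chosen step sizes.
new_params = update([1.0, 2.0], [0.5, 0.5], [0.1, 1.0])
```

The gradient is largest near d = 0, so utterances close to the decision boundary drive most of the update, while confidently correct or badly wrong utterances contribute little.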