An FIR filter whose zeros all lie inside the unit circle is minimum phase. Many systems share a given magnitude response: one is the minimum-phase realization, one is the maximum-phase realization, and the others lie in between. Any non-minimum-phase pole-zero system can be decomposed into the product of a minimum-phase system and an all-pass system: H(z) = Hmin(z) Hap(z), where |Hap(f)| = 1 for all f.
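The factoring of a non-minimum-phase FIR filter into a minimum-phase part and an all-pass part can be sketched numerically. The following is a minimal illustration, not a production routine: the function name `min_phase_allpass_split` and the example coefficients are assumptions made for this sketch. It reflects every zero outside the unit circle to its conjugate reciprocal and rescales the gain so that the magnitude response is unchanged.

```python
import numpy as np

def min_phase_allpass_split(b):
    """Split FIR coefficients b (in powers of z^-1) into a minimum-phase FIR
    and an all-pass factor given as (numerator, denominator) coefficients.
    Hypothetical helper written for illustration only."""
    zeros = np.roots(b).astype(complex)
    outside = np.abs(zeros) > 1.0

    # Reflect each zero z0 outside the unit circle to 1 / conj(z0).
    reflected = zeros.copy()
    reflected[outside] = 1.0 / np.conj(zeros[outside])

    # |1 - z0 e^{-jw}| = |z0| * |1 - (1/conj(z0)) e^{-jw}|, so multiplying the
    # gain by |z0| for every reflected zero keeps |H(f)| unchanged.
    scale = np.prod(np.abs(zeros[outside]))
    b_min = np.real(b[0] * scale * np.atleast_1d(np.poly(reflected)))

    # All-pass factor: the original outside zeros over the reflected zeros.
    ap_num = np.real(np.atleast_1d(np.poly(zeros[outside])))
    ap_den = np.real(scale * np.atleast_1d(np.poly(reflected[outside])))
    return b_min, ap_num, ap_den

# Example (assumed coefficients): zeros at z = 0.5 and z = 2.
b = np.array([1.0, -2.5, 1.0])
b_min, ap_num, ap_den = min_phase_allpass_split(b)
print(b_min)  # approximately [2, -2, 0.5], both zeros now at z = 0.5

# Same magnitude response, sampled on a 64-point frequency grid.
print(np.allclose(np.abs(np.fft.fft(b, 64)), np.abs(np.fft.fft(b_min, 64))))
```

Because H(z) = Hmin(z) Hap(z), convolving `b_min` with `ap_num` should give the same coefficients as convolving `b` with `ap_den`, which is a convenient sanity check on the split.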
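It can be shown that, of all the possible realizations of |H(f)|, the minimum-phase version is the most compact in time: its energy is concentrated closest to n = 0. Define the partial energy of an impulse response h(n) as E(n) = sum_{k=0}^{n} |h(k)|^2.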
Then Emin(n) >= E(n) for all n and for every realization of |H(f)|, where Emin(n) is the value of E(n) for the minimum-phase realization.
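The energy-compaction inequality can be checked on a small example. This sketch assumes the standard partial-energy definition E(n) = sum_{k=0}^{n} |h(k)|^2; the function name `partial_energy` and the three impulse responses are illustrative choices, picked so that all three share the same magnitude response.

```python
import numpy as np

def partial_energy(h):
    """Cumulative energy E(n) = sum_{k=0}^{n} |h(k)|^2 for n = 0 .. len(h)-1."""
    return np.cumsum(np.abs(h) ** 2)

# Three realizations of the same |H(f)| (assumed example values).
h_min = np.array([2.0, -2.0, 0.5])   # both zeros at z = 0.5 (minimum phase)
h_mix = np.array([1.0, -2.5, 1.0])   # zeros at z = 0.5 and z = 2 (mixed phase)
h_max = h_min[::-1]                  # both zeros at z = 2 (maximum phase)

e_min = partial_energy(h_min)        # [4.0, 8.0, 8.25]
e_mix = partial_energy(h_mix)        # [1.0, 7.25, 8.25]
e_max = partial_energy(h_max)        # [0.25, 4.25, 8.25]

# The minimum-phase sequence accumulates its energy earliest.
print(np.all(e_min >= e_mix), np.all(e_mix >= e_max))  # True True
```

All three sequences reach the same total energy (8.25), as Parseval's relation requires for equal magnitude responses; they differ only in how early that energy arrives.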

Why is minimum phase such an important concept in speech processing?

We prefer systems that are invertible:

H(z) H^{-1}(z) = 1

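We would like both the system and its inverse to be stable. The zeros of H(z) become the poles of the inverse, so a non-minimum-phase system, which has zeros outside the unit circle, has no inverse that is both causal and stable.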
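The stability argument can be illustrated for the FIR case, where the causal inverse 1/B(z) is an all-pole filter whose poles are the zeros of B(z). The helper names `inverse_is_stable` and `inverse_impulse_response` and the two test filters are assumptions made for this sketch.

```python
import numpy as np

def inverse_is_stable(b):
    """True if the causal inverse 1/B(z) is stable, i.e. every zero of B(z)
    lies strictly inside the unit circle."""
    return bool(np.all(np.abs(np.roots(b)) < 1.0))

def inverse_impulse_response(b, n):
    """First n samples of the causal inverse g, defined by (b * g)[k] = delta[k].
    Solved by the recursion g[k] = (delta[k] - sum_{m>=1} b[m] g[k-m]) / b[0]."""
    b = np.asarray(b, dtype=float)
    g = np.zeros(n)
    for k in range(n):
        acc = 1.0 if k == 0 else 0.0
        for m in range(1, min(k, len(b) - 1) + 1):
            acc -= b[m] * g[k - m]
        g[k] = acc / b[0]
    return g

b_minphase = [1.0, -0.5]   # zero at z = 0.5, inside the unit circle
b_nonmin = [1.0, -2.0]     # zero at z = 2, outside the unit circle

print(inverse_is_stable(b_minphase))            # True
print(inverse_is_stable(b_nonmin))              # False
print(inverse_impulse_response(b_minphase, 5))  # decays: 1, 0.5, 0.25, ...
print(inverse_impulse_response(b_nonmin, 5))    # grows:  1, 2, 4, 8, 16
```

The inverse of the minimum-phase filter decays geometrically, while the causal inverse of the non-minimum-phase filter grows without bound, which is the instability the text refers to.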

We end with a very simple question: is phase important in speech recognition?