- speech is a two-dimensional process with axes in frequency and
time
  (example utterance: "cut the yards and paint the houses")
- correlation exists across frequency bins: if bin i has high energy,
bins i-1 and i+1 are likely to have high energy too; this can be
learned, but learning it requires significant amounts of data
- correlation exists across time instants: consecutive observations are
correlated
- we want a system that takes correlations in both dimensions into
account
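
The two kinds of correlation listed above can be checked numerically. What follows is a minimal sketch, not a method taken from these notes: it assumes a synthetic stand-in for speech (three drifting sinusoids plus a little noise, loosely imitating formant tracks), and the helper names `toy_signal`, `log_spectrogram` and `neighbour_correlation`, the sampling rate, and the frame sizes are all illustrative choices. It measures how strongly each spectrogram cell correlates with its neighbour along the frequency axis and along the time axis, and compares both against a randomly shuffled spectrogram in which that structure has been destroyed.

```python
import numpy as np

def toy_signal(fs=8000, seconds=1.0, seed=0):
    """Synthetic stand-in for speech (an assumption, not real data)."""
    rng = np.random.default_rng(seed)
    n = int(fs * seconds)
    t = np.arange(n) / n  # normalised time, 0..1
    # (start frequency in Hz, end frequency in Hz, amplitude) per track
    tracks = [(300, 700, 1.0), (1200, 1800, 0.5), (2500, 2300, 0.25)]
    x = 0.01 * rng.standard_normal(n)  # low-level background noise
    for f_start, f_end, amp in tracks:
        freq = f_start + (f_end - f_start) * t
        phase = 2 * np.pi * np.cumsum(freq) / fs  # integrate frequency
        x += amp * np.sin(phase)
    return x

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude short-time spectrum, shape (num_frames, num_bins)."""
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(num_frames)])
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) + 1e-8)

def neighbour_correlation(spec, axis):
    """Pearson correlation between each cell and its neighbour along `axis`."""
    n = spec.shape[axis]
    a = np.take(spec, np.arange(n - 1), axis=axis)
    b = np.take(spec, np.arange(1, n), axis=axis)
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

spec = log_spectrogram(toy_signal())
freq_corr = neighbour_correlation(spec, axis=1)  # bin i vs bin i+1
time_corr = neighbour_correlation(spec, axis=0)  # frame t vs frame t+1

# Baseline: same values, but with all time-frequency structure shuffled away.
shuffled = np.random.default_rng(1).permutation(spec.ravel()).reshape(spec.shape)
shuffled_corr = neighbour_correlation(shuffled, axis=1)

print(f"adjacent frequency bins: {freq_corr:.2f}")
print(f"adjacent time frames:    {time_corr:.2f}")
print(f"shuffled baseline:       {shuffled_corr:.2f}")
```

On a signal like this, both neighbour correlations should come out well above the shuffled baseline, which is the structure a model working in both dimensions is meant to exploit. Real speech would need to be loaded from a recording in place of `toy_signal`.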