This page reports the results of an LDM (linear dynamic model) classification experiment on real speech for 8 phones, using 13-dimensional MFCC features, run in MATLAB.

Setup:
1) 13-dimensional observation vector Y[] and 5-dimensional state vector X[]
2) database: $ISIP/data/research/nonlinear/data/database/mc
3) 8 raw speech files: aa.snd, ae.snd, eh.snd, f.snd, m.snd, n.snd, sh.snd, z.snd
4) sample rate 22050 Hz, 16-bit signed samples, 1 channel.
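
For reference, the dimensions in item 1 fit the usual state-space form of a linear dynamic model. The block below is a sketch in common textbook notation, not taken from the original MATLAB code: the symbols F, H, Q, R, \pi and \Lambda are assumed names for the state transition matrix, observation matrix, noise covariances and initial-state parameters.

```latex
\begin{align}
  x_{t+1} &= F x_t + w_t, & w_t &\sim \mathcal{N}(0, Q), & x_t &\in \mathbb{R}^{5},  \\
  y_t     &= H x_t + v_t, & v_t &\sim \mathcal{N}(0, R), & y_t &\in \mathbb{R}^{13}, \\
  x_1     &\sim \mathcal{N}(\pi, \Lambda)
\end{align}
```

Under this notation F is 5x5, H is 13x5, Q is 5x5 and R is 13x13. EM training estimates these parameters separately for each phone.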

Experiment steps:
1) extract 13-MFCC features from the 8 raw speech files: aa.snd, ae.snd, eh.snd, f.snd, m.snd, n.snd, sh.snd, z.snd. Normalize the feature vectors to the range -1 to 1.
2) train LDM models for [aa], [ae], [eh], [f], [m], [n], [sh], [z] by running 200 EM iterations on each of the 8 speech files. This yields the LDM model parameters for each of the 8 phones.
3) calculate the likelihood of the signal [aa] under each of the LDM models [aa], [ae], [eh], [f], [m], [n], [sh], [z], and classify [aa] as the phone whose model gives the highest likelihood.
4) repeat the same calculation for [ae], [eh], [f], [m], [n], [sh], [z].
5) assemble the likelihood value matrix and analyze the result.
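
Steps 1, 3 and 4 can be sketched as follows. This is a minimal Python/NumPy illustration, not the original MATLAB code. It assumes that normalization is done per feature dimension by linear min-max scaling, and that the likelihood is the standard Kalman-filter (innovations) log-likelihood. All function names and the parameter names F, H, Q, R, x0, P0 are hypothetical. The EM training of step 2 is not shown; the model parameters are taken as given.

```python
import numpy as np

def normalize_features(feats):
    """Scale each feature dimension linearly into [-1, 1] (assumed scheme)."""
    feats = np.asarray(feats, dtype=float)
    lo = feats.min(axis=0)
    hi = feats.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return 2.0 * (feats - lo) / span - 1.0

def ldm_log_likelihood(Y, F, H, Q, R, x0, P0):
    """Log-likelihood of observations Y (T x p) under one LDM.

    x0 and P0 are the mean and covariance of the first state.
    The likelihood is accumulated from Kalman-filter innovations.
    """
    Y = np.asarray(Y, dtype=float)
    x, P = np.asarray(x0, dtype=float), np.asarray(P0, dtype=float)
    p = Y.shape[1]
    total = 0.0
    for y in Y:
        e = y - H @ x                     # innovation
        S = H @ P @ H.T + R               # innovation covariance
        _, logdet = np.linalg.slogdet(S)
        total += -0.5 * (p * np.log(2.0 * np.pi) + logdet
                         + e @ np.linalg.solve(S, e))
        K = np.linalg.solve(S, H @ P).T   # Kalman gain P H^T S^-1 (S, P symmetric)
        x = x + K @ e                     # measurement update
        P = P - K @ H @ P
        x = F @ x                         # time update
        P = F @ P @ F.T + Q
    return total

def classify(Y, models):
    """Score Y against every model; return the best phone and all scores."""
    scores = {name: ldm_log_likelihood(Y, **params)
              for name, params in models.items()}
    return max(scores, key=scores.get), scores
```

Here models would be a dictionary mapping each phone label to its trained parameters, for example models["aa"] = dict(F=..., H=..., Q=..., R=..., x0=..., P0=...). Calling classify once per test signal produces one row of the likelihood matrix.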

Figures (one per phone): EM training, 200 iterations, for [aa], [ae], [eh], [f], [m], [n], [sh], [z].

Experiment result:
1) The 8-phone real speech classification experiment using 13-MFCC features gave a good result: all 8 phones were classified correctly.
2) Here's the likelihood value matrix:
(columns are trained LDM phone models)
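
A matrix of this kind can be read as follows. The sketch assumes each row holds one test signal and that the decision for a row is the column (model) with the highest likelihood. The function name is hypothetical, and the numbers are made-up placeholders for a 3-phone example, NOT the values measured in this experiment.

```python
def classify_from_matrix(phones, loglik):
    """For each row (test signal), pick the column (model) with the highest value."""
    decisions = {}
    for phone, row in zip(phones, loglik):
        best = max(range(len(row)), key=row.__getitem__)
        decisions[phone] = phones[best]
    return decisions

# Placeholder values for illustration only; NOT the experiment's results.
phones = ["aa", "ae", "eh"]
loglik = [
    [-10.0, -25.0, -30.0],  # signal [aa] scored by models aa, ae, eh
    [-22.0, -12.0, -28.0],  # signal [ae]
    [-31.0, -27.0, -11.0],  # signal [eh]
]

decisions = classify_from_matrix(phones, loglik)
accuracy = sum(decisions[p] == p for p in phones) / len(phones)
print(decisions, accuracy)
```

Every phone is classified correctly exactly when the largest entry of each row lies on the diagonal.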

--
May 25, 2007 by Tao.