This page describes a phonetic speech recognition experiment with the LDM (linear dynamic model), run to evaluate the potential of LDMs for speech classification.

Setup:
1) 13-dimensional observation vector Y[] and 13-dimensional state vector X[]
2) features: 12 MFCCs + energy
3) database: 2 speakers, [tm] and [ss]; 3 sounds, /aa/, /m/, and /sh/, with 50 examples of each sound per speaker.
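
For reference, the setup above can be written out as a state-space model. This is only a sketch, assuming the standard linear-Gaussian form of an LDM; the symbols F, H, w_t, v_t, Q, and R are notation introduced here for illustration and do not appear elsewhere on this page.

```latex
\begin{aligned}
x_{t+1} &= F\,x_t + w_t, & w_t &\sim \mathcal{N}(0, Q),\\
y_t     &= H\,x_t + v_t, & v_t &\sim \mathcal{N}(0, R),
\end{aligned}
\qquad x_t \in \mathbb{R}^{13},\quad y_t \in \mathbb{R}^{13}
```

Under this assumption, y_t is the 13-dimensional feature vector (12 MFCCs + energy) of frame t, x_t is the hidden state, and EM training estimates the model parameters separately for each sound.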

Experiment steps:
(1) train model /aa/ using 70 examples: tm_aa1 ... tm_aa35, ss_aa1 ... ss_aa35.
(2) train model /m/ using 70 examples: tm_m1 ... tm_m35, ss_m1 ... ss_m35.
(3) train model /sh/ using 70 examples: tm_sh1 ... tm_sh35, ss_sh1 ... ss_sh35.
(4) test sound /aa/ using 30 examples: tm_aa36 ... tm_aa50, ss_aa36 ... ss_aa50.
(5) test sound /m/ using 30 examples: tm_m36 ... tm_m50, ss_m36 ... ss_m50.
(6) test sound /sh/ using 30 examples: tm_sh36 ... tm_sh50, ss_sh36 ... ss_sh50.
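
The file lists in steps (1) to (6) follow a regular pattern, so they can be generated rather than typed by hand. The sketch below is a minimal illustration in Python. The names `example_names`, `build_split`, and `classify` are hypothetical, and the scoring callables stand in for whatever LDM likelihood routine is actually used. It also assumes that a test example is assigned to the model with the highest log-likelihood, which the steps above do not state explicitly.

```python
SPEAKERS = ["tm", "ss"]
SOUNDS = ["aa", "m", "sh"]


def example_names(speaker, sound, first, last):
    """Return names such as tm_aa1 ... tm_aa35 (range is inclusive)."""
    return [f"{speaker}_{sound}{i}" for i in range(first, last + 1)]


def build_split(n_train=35, n_total=50):
    """Per sound: examples 1..n_train of each speaker go to training,
    examples n_train+1..n_total of each speaker go to testing."""
    train, test = {}, {}
    for sound in SOUNDS:
        train[sound], test[sound] = [], []
        for speaker in SPEAKERS:
            train[sound] += example_names(speaker, sound, 1, n_train)
            test[sound] += example_names(speaker, sound, n_train + 1, n_total)
    return train, test


def classify(example, scorers):
    """Pick the sound whose model scores the example highest.

    `scorers` maps a sound label to a callable returning a log-likelihood;
    these callables are placeholders (assumption), not a real LDM API.
    """
    return max(scorers, key=lambda sound: scorers[sound](example))


train, test = build_split()
for sound in SOUNDS:
    print(sound, len(train[sound]), "train,", len(test[sound]), "test")
```

With the default arguments this yields 70 training and 30 test names per sound, matching the counts in the steps above.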

EM training, 50 iterations: /aa/

EM training, 50 iterations: /m/

EM training, 50 iterations: /sh/

Confusion Matrix /aa/

Confusion Matrix /m/

Confusion Matrix /sh/
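
As a rough illustration of how a confusion matrix and recognition accuracy can be tallied from test decisions, here is a small Python sketch. The function names are hypothetical. The predicted labels are constructed by hand to mirror the outcome reported on this page (30 test examples per sound, all recognized correctly); they are not actual recognizer output.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, labels):
    """Rows are true sounds, columns are recognized sounds."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of examples on the diagonal (correctly recognized)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


labels = ["aa", "m", "sh"]
# Assumption: 30 test examples per sound, every one recognized correctly.
true_labels = [sound for sound in labels for _ in range(30)]
predicted_labels = list(true_labels)

matrix = confusion_matrix(true_labels, predicted_labels, labels)
for label, row in zip(labels, matrix):
    print(f"/{label}/", row)
print("accuracy:", accuracy(matrix))
```

Any misrecognition would show up as a nonzero off-diagonal entry in the row of the true sound.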

Result and Analysis

As Dr. Picone suggested, we increased our database to 50 examples per sound per speaker. All 90 test examples (30 per sound) were recognized correctly, so the recognition accuracy is 100%.
This experiment demonstrates that an LDM can be a good pattern classifier on this sustained-phone task and has strong potential for speech classification.
We are now repeating the speech recognition experiment on the same database using an HMM. It will serve as our sustained-phone baseline.

--
August 23, 2007 by Tao.