LECTURE 27: DECISION TREES

DATA-DRIVEN OPERATION

There are four important operations in constructing a decision tree:

Question selection: choosing a set of questions to categorize your data (some algorithms can derive questions automatically).
Splitting: partitioning data assigned to a node into N groups (N=2 for binary trees).
Growing: expanding the tree to better represent the training data.
Pruning: removing nodes to improve generalization.
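
The splitting step for a binary tree can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name `split_node`, the triphone-style tuples, and the question `is_right_nasal` are all hypothetical and chosen only to make the example concrete.

```python
def split_node(items, question):
    """Partition the data assigned to a node into two groups (N=2).

    `question` is any yes/no predicate on a single item.
    """
    yes = [x for x in items if question(x)]
    no = [x for x in items if not question(x)]
    return yes, no


# Hypothetical data: (left context, phone, right context) triples.
items = [("s", "a", "t"), ("m", "a", "n"), ("k", "a", "m"), ("t", "a", "p")]

# Hypothetical question: is the right context a nasal?
NASALS = {"m", "n", "ng"}


def is_right_nasal(item):
    return item[2] in NASALS


yes, no = split_node(items, is_right_nasal)
```

Growing the tree then amounts to applying a split of this kind repeatedly to the resulting groups, and question selection amounts to choosing which predicate to apply at each node.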

In speech recognition, we operate on continuous-valued feature vectors and evaluate candidate splits using likelihood computations derived directly from HMM training. This is a major reason why decision trees are so popular in speech recognition systems: the quantities the tree needs are already produced by training, so the implementation is very elegant.
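
One way to make a likelihood-based split criterion concrete is sketched below. It assumes each group of feature vectors is modeled by a single diagonal-covariance Gaussian fit by maximum likelihood, and it scores a split by the resulting increase in log-likelihood. The single-Gaussian model, the function names, the variance floor, and the use of raw vectors are assumptions made for illustration; a real system would presumably reuse statistics already accumulated during HMM training.

```python
import math


def gaussian_log_likelihood(vectors, var_floor=1e-6):
    """Log-likelihood of `vectors` under one diagonal-covariance Gaussian
    whose mean and variance are the ML estimates from the same vectors.

    For each dimension this is -0.5 * n * (log(2*pi*var) + 1).
    `var_floor` (an assumed safeguard) avoids log(0) for constant data;
    when it is active the value is only approximate.
    """
    n = len(vectors)
    if n == 0:
        return 0.0
    dim = len(vectors[0])
    total = 0.0
    for d in range(dim):
        column = [v[d] for v in vectors]
        mean = sum(column) / n
        var = max(sum((x - mean) ** 2 for x in column) / n, var_floor)
        total += math.log(2.0 * math.pi * var) + 1.0
    return -0.5 * n * total


def split_gain(yes, no):
    """Increase in log-likelihood from modeling two groups separately
    instead of pooling them under a single Gaussian."""
    parent = yes + no
    return (gaussian_log_likelihood(yes)
            + gaussian_log_likelihood(no)
            - gaussian_log_likelihood(parent))


# Toy one-dimensional data with two well-separated clusters.
good_gain = split_gain([[0.0], [0.2]], [[5.0], [5.2]])  # split along the clusters
bad_gain = split_gain([[0.0], [5.0]], [[0.2], [5.2]])   # split across the clusters
```

Under these assumptions, the question selected at a node would be the one with the largest gain, and a small gain is a natural signal to stop growing.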