THE CART ALGORITHM
The classification and regression tree (CART) algorithm can be summarized
as follows:
- Create a question set consisting of all possible questions about the
measured variables (here, the phonetic context).
- Select a splitting criterion (likelihood).
- Initialization: create a tree with one node containing all the
training data.
- Splitting: find the best question for splitting each terminal node,
then split the one terminal node whose best question yields the
greatest increase in likelihood.
- Stopping: if each leaf node contains data samples from the same class,
or no remaining split satisfies a preset threshold, stop. Otherwise,
continue splitting.
- Pruning: use an independent test set or cross-validation to prune
the tree.
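
The initialization, splitting, and stopping steps above can be sketched in
Python as follows. This is a minimal illustration under several assumptions
that are not part of the original description: each sample is a pair of
(phonetic context, scalar value), each node is modeled by a single
one-dimensional Gaussian, and the names log_likelihood, best_split,
grow_tree, min_gain, min_count, and VAR_FLOOR are all hypothetical. The
stopping rule here uses only thresholds (a minimum likelihood gain and a
minimum number of samples per child), and pruning is left out.

```python
import math

VAR_FLOOR = 1e-3  # assumed variance floor so the log stays finite


def log_likelihood(values):
    """Log-likelihood of values under one Gaussian fitted to them."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    ss = sum((v - mean) ** 2 for v in values)  # sum of squared deviations
    var = max(ss / n, VAR_FLOOR)
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)


def best_split(samples, questions, min_count):
    """Return (gain, name, yes, no) for the best question, or None."""
    base = log_likelihood([v for _, v in samples])
    best = None
    for name, ask in questions.items():
        yes = [s for s in samples if ask(s[0])]
        no = [s for s in samples if not ask(s[0])]
        if len(yes) < min_count or len(no) < min_count:
            continue  # this question leaves too little data in one child
        gain = (log_likelihood([v for _, v in yes])
                + log_likelihood([v for _, v in no]) - base)
        if best is None or gain > best[0]:
            best = (gain, name, yes, no)
    return best


def grow_tree(samples, questions, min_gain=1.0, min_count=2):
    """Greedy growth: repeatedly split the leaf with the largest gain."""
    leaves = [samples]  # initialization: one node holding all the data
    splits = []         # record of (question name, likelihood gain)
    while True:
        candidates = []
        for i, leaf in enumerate(leaves):
            found = best_split(leaf, questions, min_count)
            if found is not None:
                candidates.append((found[0], i, found))
        if not candidates:
            break  # stopping: no leaf can be split any further
        gain, i, (_, name, yes, no) = max(candidates, key=lambda c: c[0])
        if gain < min_gain:
            break  # stopping: best gain is below the preset threshold
        leaves[i:i + 1] = [yes, no]  # replace the leaf with its children
        splits.append((name, gain))
    return leaves, splits


# Toy data: context is (left phone, right phone); values are made up.
data = [(("a", "t"), 1.0), (("a", "t"), 1.2), (("a", "d"), 0.9),
        (("i", "t"), 5.0), (("i", "d"), 5.1), (("i", "t"), 4.8)]
questions = {
    "L=a": lambda ctx: ctx[0] == "a",  # is the left context "a"?
    "R=t": lambda ctx: ctx[1] == "t",  # is the right context "t"?
}
leaves, splits = grow_tree(data, questions)
for name, gain in splits:
    print(name, round(gain, 2))
```

In this toy run the values cluster by left context, so the question about
the left phone gives by far the larger likelihood gain and is chosen first.
A practical system would use multivariate Gaussians over acoustic features
and a much larger question set, and would follow growth with the pruning
step.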