THE CART ALGORITHM

The classification and regression tree (CART) algorithm can be summarized as follows:

  1. Create the question set: all possible questions about the measured variables (here, the phonetic context).

  2. Select a splitting criterion (here, the likelihood of the training data).

  3. Initialization: create a tree with one node containing all the training data.

  4. Splitting: find the best question for splitting each terminal node. Then split the one terminal node whose best question yields the greatest increase in likelihood.

  5. Stopping: stop if every leaf node contains data samples from a single class, or if no remaining split meets a pre-set threshold (for example, a minimum increase in likelihood). Otherwise, return to step 4.

  6. Pruning: use an independent test set or cross-validation to prune the tree.
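
Steps 1 through 5 above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not a reference implementation: each node is modeled by a single one-dimensional Gaussian, questions are simple "does variable X equal value V?" tests, and the names VAR_FLOOR, min_gain, and min_samples are hypothetical parameters introduced only for this example. Pruning (step 6) is left out.

```python
import math

VAR_FLOOR = 1e-3  # assumed variance floor; keeps the log-likelihood finite


def log_likelihood(values):
    """Log-likelihood of `values` under one Gaussian fitted to them (floored variance)."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    sq_dev = sum((v - mean) ** 2 for v in values)
    var = max(sq_dev / n, VAR_FLOOR)
    return -0.5 * (n * math.log(2 * math.pi * var) + sq_dev / var)


def best_split(samples, questions, min_samples):
    """Best question for one node: returns (gain, question, yes, no)."""
    parent = log_likelihood([y for _, y in samples])
    best = (0.0, None, None, None)
    for feature, value in questions:
        yes = [s for s in samples if s[0].get(feature) == value]
        no = [s for s in samples if s[0].get(feature) != value]
        if min(len(yes), len(no)) < min_samples:
            continue  # skip splits that leave too little data on one side
        gain = (log_likelihood([y for _, y in yes])
                + log_likelihood([y for _, y in no]) - parent)
        if gain > best[0]:
            best = (gain, (feature, value), yes, no)
    return best


def make_node(samples):
    return {"samples": samples, "question": None, "yes": None, "no": None}


def grow_tree(samples, min_gain=1.0, min_samples=2):
    # Step 1: every (variable, value) pair seen in the data is a yes/no question.
    questions = sorted({(f, v) for ctx, _ in samples for f, v in ctx.items()})
    # Step 3: a single root node holding all the training data.
    root = make_node(samples)
    leaves = [root]
    while True:
        # Step 4: best question for each leaf, then the leaf with the largest gain.
        scored = [(best_split(leaf["samples"], questions, min_samples), leaf)
                  for leaf in leaves]
        (gain, question, yes, no), leaf = max(scored, key=lambda item: item[0][0])
        # Step 5: stop when no split reaches the likelihood-gain threshold.
        if question is None or gain < min_gain:
            return root
        leaf["question"] = question
        leaf["yes"], leaf["no"] = make_node(yes), make_node(no)
        leaves = [l for l in leaves if l is not leaf] + [leaf["yes"], leaf["no"]]


def collect_leaves(node):
    """Return the terminal nodes of the tree, yes-branch first."""
    if node["question"] is None:
        return [node]
    return collect_leaves(node["yes"]) + collect_leaves(node["no"])


# Toy data: (phonetic context, observed value); the contexts are made up.
toy = [
    ({"left": "m", "right": "a"}, 1.0),
    ({"left": "m", "right": "i"}, 1.2),
    ({"left": "m", "right": "a"}, 0.9),
    ({"left": "t", "right": "i"}, 5.0),
    ({"left": "t", "right": "a"}, 5.1),
    ({"left": "t", "right": "i"}, 4.8),
]
tree = grow_tree(toy)
print(tree["question"], [len(leaf["samples"]) for leaf in collect_leaves(tree)])
```

In this sketch a leaf whose samples are all identical offers no likelihood gain, so the "same class" condition of step 5 is covered by the gain threshold. Scoring every leaf on each pass is wasteful but mirrors the wording of step 4; a practical version would cache the best split of each leaf.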