The nearest neighbor algorithm is similar to the Euclidean distance algorithm
in that the line of discrimination is computed using the following steps:
- Determine the points in the current space that are equal in distance
from the nearest point of each data set.
- These equidistant points form the line of discrimination that
separates the data sets.
- The distance between any two points in the current space is determined
by the following distance formula:
d^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2
- Here is an example of how the nearest neighbor scheme works:
First, select the Two Gaussian data sets from the Patterns
menu. Then select the Nearest Neighbor option under the Algorithms
menu. Initialize the algorithm by selecting Initialize from the
Go menu. To compute the line of discrimination, select the
Next option under the Go menu. This displays the first step of
the process: the data sets appear in both the input plot (top left)
and the output plot (bottom left) of the applet. The process
description box also indicates which step you are currently on and
which algorithm is being used to compute the line of discrimination.
Note that the nearest neighbor algorithm is computationally
expensive, so please be patient if the process takes a while.
- The second step of the process computes the mean of each
data set. The mean of each data set is displayed on the output plot
as a black dot near the corresponding data set. The value of the mean
for each data set, which corresponds to the current scale, is
displayed in the process description box.
- The third step of the process displays the line of
discrimination of the given data sets as determined by the nearest
neighbor algorithm. The classification error for each
data set, along with the total classification error, is also displayed
in the process description box.
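
The nearest neighbor rule and the classification error described above can be sketched in a few lines of Python. This is a minimal illustration, not the applet's actual implementation: the function names (`squared_distance`, `nearest_class`, `classification_error`), the dictionary layout of the data sets, and the sample points are all assumptions made for this example. A point lies on the line of discrimination when its nearest points in the two data sets are equally far away; everywhere else it is assigned to the data set that contains its nearest point.

```python
def squared_distance(p, q):
    """Squared distance d^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 between 2-D points."""
    return (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2


def nearest_class(point, data_sets):
    """Return the label of the data set that contains the point nearest to `point`.

    `data_sets` maps a label to a list of (x, y) tuples (hypothetical layout).
    Comparing squared distances gives the same ordering as comparing distances.
    """
    return min(
        data_sets,
        key=lambda label: min(squared_distance(point, s) for s in data_sets[label]),
    )


def classification_error(test_sets, training_sets):
    """Fraction of misclassified points per data set, plus the total."""
    errors = {}
    wrong_total = 0
    count_total = 0
    for label, points in test_sets.items():
        wrong = sum(1 for p in points if nearest_class(p, training_sets) != label)
        errors[label] = wrong / len(points)
        wrong_total += wrong
        count_total += len(points)
    errors["total"] = wrong_total / count_total
    return errors


# Made-up sample data, chosen only to keep the arithmetic easy to follow.
training = {"A": [(0, 0), (1, 0)], "B": [(4, 0), (5, 0)]}
test = {"A": [(2, 0), (3.5, 0)], "B": [(4.5, 0)]}

print(nearest_class((2, 0), training))
print(classification_error(test, training))
```

Every query compares the point against every stored point, which is why the method becomes slow on large data sets. In this sketch a point exactly on the line of discrimination, such as (2.5, 0) here, is simply assigned to whichever label comes first.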