The Euclidean distance algorithm uses the following procedure to determine
the line of discrimination between the data sets:
- Compute the mean of each data set, which is nothing more than
a simple average of the x and y coordinates.
- Determine the points in the current space that are equidistant
from the means of the data sets.
- These equidistant points form the line of discrimination that
separates the data sets.
- The distance between any two points in the current space is determined
by the following distance formula:
d^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2
- Here is an example of how the Euclidean distance scheme works:
First, select the Two Gaussian data set from the Patterns menu.
Next, select Euclidean Distance under the Algorithms menu.
Initialize this algorithm by selecting Initialize from the Go menu.
To compute the line of discrimination, select the Next option
under the Go menu. This displays the first step of the process:
the data sets appear in both the input plot (top left) and the
output plot (bottom left). The process description box also
indicates the current step and the algorithm that is being used
to compute the line of discrimination.
- The second step of the process computes the mean of each
data set. The means of the data sets are then displayed on the
output plot as black dots. The values of the means of each data
set, which correspond to the current scale, are then displayed in
the process description box.
- The third step of the process displays the line of
discrimination, given the current data sets, as computed by the
Euclidean distance algorithm. The classification errors for each
data set, along with the total classification error, are then
displayed in the process description box.
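
The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used by the tutorial's applet: the function names (mean_point, squared_distance, discrimination_line, classify, error_rates) and the sample points are assumptions made for this example. For two data sets, the points equidistant from the means m1 and m2 satisfy 2(m2 - m1) . p = |m2|^2 - |m1|^2, which is the equation of a straight line.

```python
def mean_point(points):
    """Average the x and y coordinates of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def squared_distance(p, q):
    """Squared Euclidean distance: d^2 = (x2 - x1)^2 + (y2 - y1)^2."""
    return (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2


def discrimination_line(m1, m2):
    """Return (a, b, c) such that a*x + b*y = c holds for every point
    that is equidistant from the two means m1 and m2."""
    a = 2 * (m2[0] - m1[0])
    b = 2 * (m2[1] - m1[1])
    c = (m2[0] ** 2 + m2[1] ** 2) - (m1[0] ** 2 + m1[1] ** 2)
    return a, b, c


def classify(point, means):
    """Return the index of the mean closest to the given point."""
    return min(range(len(means)),
               key=lambda i: squared_distance(point, means[i]))


def error_rates(data_sets):
    """Return the means, the per-set error rates, and the total error rate."""
    means = [mean_point(s) for s in data_sets]
    # A point is misclassified when its nearest mean belongs to another set.
    wrong = [sum(1 for p in s if classify(p, means) != i)
             for i, s in enumerate(data_sets)]
    per_set = [w / len(s) for w, s in zip(wrong, data_sets)]
    total = sum(wrong) / sum(len(s) for s in data_sets)
    return means, per_set, total


# Hypothetical sample data: two small, well-separated clusters.
set_a = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0)]
set_b = [(4.0, 4.0), (4.0, 6.0), (6.0, 4.0), (6.0, 6.0)]

means, per_set, total = error_rates([set_a, set_b])
print("means:", means)
print("line (a, b, c):", discrimination_line(means[0], means[1]))
print("per-set error:", per_set, "total error:", total)
```

Comparing squared distances is enough for classification, since the square root does not change which mean is closest. With more than two data sets the same nearest-mean rule applies, but the boundary becomes a set of line segments rather than a single line.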