PCA, as used here for classification, assigns a sample to a class according to its distance from the class sample means.
Rather than using Euclidean distances in the feature space directly, it uses distances weighted by the covariance matrices.
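The covariance-weighted distance described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the weighting is by the inverse of each class covariance matrix (the Mahalanobis distance), which the text does not state explicitly. The function name `classify_by_weighted_distance` and the toy data are hypothetical.

```python
import numpy as np

def classify_by_weighted_distance(x, class_samples):
    """Return (label, squared distance) of the class whose mean is nearest to x.

    Assumption: "weighted by the covariance matrix" means the squared
    distance d^2 = (x - mean)^T cov^{-1} (x - mean) for each class.
    """
    best_label, best_dist = None, np.inf
    for label, samples in class_samples.items():
        mean = samples.mean(axis=0)           # class sample mean
        cov = np.cov(samples, rowvar=False)   # class covariance (rows are samples)
        diff = x - mean
        # Solve cov * z = diff instead of forming the inverse explicitly.
        dist = float(diff @ np.linalg.solve(cov, diff))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist

# Hypothetical toy data: class "a" is spread widely along the first axis,
# class "b" is compact.
class_samples = {
    "a": np.array([[-4.0, 0.0], [4.0, 0.0], [0.0, -1.0], [0.0, 1.0]]),
    "b": np.array([[5.0, 0.0], [7.0, 0.0], [6.0, -1.0], [6.0, 1.0]]),
}
x = np.array([3.5, 0.0])

label, dist = classify_by_weighted_distance(x, class_samples)

# Plain Euclidean distances to the two class means, for comparison.
euclid = {k: float(np.linalg.norm(x - v.mean(axis=0))) for k, v in class_samples.items()}
```

In this toy case the point is closer to the mean of "b" in Euclidean terms, but the weighted distance favours "a" because "a" varies much more along the first axis.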
The first principal component of a sample vector is its projection onto the direction along which the variance over all samples is largest.
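As a rough sketch of that statement, the direction of largest variance can be taken as the eigenvector of the sample covariance matrix with the largest eigenvalue. The helper name `first_principal_direction` and the data below are hypothetical, chosen only for illustration.

```python
import numpy as np

def first_principal_direction(X):
    """Return (unit direction, variance along it) for samples stored as rows of X."""
    cov = np.cov(X, rowvar=False)           # sample covariance over all samples
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    return eigvecs[:, -1], eigvals[-1]      # eigenvector of the largest eigenvalue

# Hypothetical samples scattered roughly along the line y = x.
X = np.array([[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.0], [4.0, 4.1]])

direction, top_variance = first_principal_direction(X)

# First principal component (score) of each sample: the projection of the
# mean-centred sample onto the first principal direction.
scores = (X - X.mean(axis=0)) @ direction
```

The sign of `direction` is arbitrary, since an eigenvector and its negative span the same axis. The sample variance of `scores` equals `top_variance`.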