The K-Nearest Neighbor technique selects the k neighbors whose preferences are most similar to those of a given customer by computing similarities over their preferences; that is, it uses only the k neighbors that have the highest correlation with the customer. Note that in plain CF, all customers are considered neighbors.
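The neighbor selection described above can be sketched as follows. This is a minimal illustration, not the exact method from the text: it assumes preferences are stored as rows of a ratings matrix and uses Pearson correlation (via `np.corrcoef`) as the similarity measure, since the text only says "correlation".

```python
import numpy as np

def top_k_neighbors(ratings, customer, k):
    """Return the indices of the k customers most correlated with `customer`.

    `ratings` is a (num_customers, num_items) matrix of preference scores;
    `customer` is a row index into it.
    """
    target = ratings[customer]
    sims = []
    for i, row in enumerate(ratings):
        if i == customer:
            continue  # a customer is not their own neighbor
        # np.corrcoef returns the 2x2 correlation matrix of the two rows;
        # the off-diagonal entry is the Pearson correlation
        sims.append((np.corrcoef(target, row)[0, 1], i))
    sims.sort(reverse=True)  # highest correlation first
    return [i for _, i in sims[:k]]
```

For example, with four customers rating three items, asking for the two nearest neighbors of customer 0 returns the two customers whose rating patterns correlate most strongly with customer 0's, excluding any negatively correlated ones.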
The K-Means Clustering method creates k clusters, each of which consists of customers who have similar preferences. In this method we first select k customers as the initial centers of the k clusters, respectively. Then each customer is assigned to the cluster whose center is nearest, that is, the cluster that minimizes the distance between the customer and the cluster center. The distance is the Euclidean distance: the square root of the sum of the squared element-wise differences between the customer's preference vector and the cluster center. One advantage of this measure is that the set of points within a given distance of a center forms a sphere around it.
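The distance measure and the assignment step can be sketched as follows, treating each customer's preferences as a NumPy vector (the function names are illustrative, not from the text):

```python
import numpy as np

def euclidean(a, b):
    # square root of the sum of squared element-wise differences
    return np.sqrt(np.sum((a - b) ** 2))

def assign_clusters(points, centers):
    """Assign each customer (a row of `points`) to its nearest center,
    returning a list of cluster indices."""
    return [min(range(len(centers)), key=lambda j: euclidean(p, centers[j]))
            for p in points]
```

With centers at (0, 0) and (10, 10), a customer at (1, 1) is assigned to the first cluster because its Euclidean distance to (0, 0) is smaller.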
Then, for each cluster, we recompute the mean of the customers who currently belong to it, and this mean becomes the cluster's new center. After finding the new centers, we again compute the distances for each customer, as before, to determine which cluster the customer should belong to. Recalculating the means and recomputing the assignments are repeated until a terminating condition is met. The condition is typically based on how far each new center has moved from its previous position: if every center moved less than a certain distance, we terminate the loop.
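Putting the steps together, the whole procedure can be sketched as below. This is a minimal sketch under a few assumptions not stated in the text: initial centers are k randomly chosen customers, `tol` is the movement threshold for termination, and `max_iter` caps the loop in case it does not converge within the threshold.

```python
import numpy as np

def k_means(points, k, tol=1e-4, max_iter=100, seed=0):
    """Minimal k-means sketch. `points` is an (n, d) array of preference
    vectors; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # initialize centers to k distinct, randomly chosen customers
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # assignment step: each customer goes to the nearest center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: each center becomes the mean of its current members
        # (an empty cluster keeps its old center)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # termination: stop once every center moved less than tol
        if np.all(np.linalg.norm(new_centers - centers, axis=1) < tol):
            centers = new_centers
            break
        centers = new_centers
    return centers, labels
```

On two well-separated groups of customers, the loop converges in a few iterations and assigns each group to its own cluster.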