What is k-Nearest Neighbors?
k-Nearest Neighbors, or kNN, is a classification algorithm that assigns a class to a new data point based on known data points that are similar, or nearby.

What do you mean 'nearby'?

To determine the similarity of data points you can use a few different distance measures. For example the Euclidean distance, which is the square root of the sum of squared differences of the parameters of data points v and w:

$$d(v, w) = \sqrt{\sum_i (v_i - w_i)^2}$$

Or the City Block/Manhattan distance (yes, that's what it's called), which is the sum of the absolute differences of v and w:

$$d(v, w) = \sum_i |v_i - w_i|$$

Or the Cosine distance, which measures the "angle" between two parameter vectors v and w:

$$d(v, w) = 1 - \frac{v \cdot w}{\|v\|\,\|w\|}$$

What does kNN do?

It goes through all the known data points and measures the distance to the new point. Then it assigns the class held by the majority of the k nearest data points to the new measurement. Yes, it goes through all training data every time you run this algorithm! No model is actually created. This is called a lazy learning algorithm.
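To make this concrete, here is a minimal sketch of the idea in plain Python. The function and variable names (euclidean_distance, knn_classify, and so on) are my own, not from any particular library; it just follows the steps described above: compute the distance to every known point, take the k nearest, and vote.

```python
import math
from collections import Counter

def euclidean_distance(v, w):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((vi - wi) ** 2 for vi, wi in zip(v, w)))

def manhattan_distance(v, w):
    """Sum of the absolute differences (City Block distance)."""
    return sum(abs(vi - wi) for vi, wi in zip(v, w))

def cosine_distance(v, w):
    """One minus the cosine of the angle between v and w."""
    dot = sum(vi * wi for vi, wi in zip(v, w))
    norm_v = math.sqrt(sum(vi ** 2 for vi in v))
    norm_w = math.sqrt(sum(wi ** 2 for wi in w))
    return 1 - dot / (norm_v * norm_w)

def knn_classify(new_point, known_points, labels, k=3, distance=euclidean_distance):
    """Assign the majority class of the k nearest known points."""
    # Measure the distance from every known data point to the new point.
    distances = [(distance(p, new_point), label)
                 for p, label in zip(known_points, labels)]
    # Sort by distance and keep the k closest.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Return the most common class among those k neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Example: classify a new 2D point against a tiny labeled data set.
known = [(1.0, 1.0), (1.2, 0.8), (8.0, 9.0), (9.0, 8.5)]
classes = ["red", "red", "blue", "blue"]
print(knn_classify((1.1, 0.9), known, classes, k=3))  # -> "red"
```

Note that there is no training step anywhere in this sketch: every call to knn_classify scans the whole data set again, which is exactly the "lazy" behavior described above.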