K-Nearest Neighbors (KNN) is a simple, yet powerful algorithm used for both classification and regression. It's a non-parametric algorithm, which means it makes no assumptions about the underlying data distribution. Let's dive into the details of the KNN algorithm.
Key Concepts
- Instance: A single data point.
- Training Set: A collection of instances used to train the model.
- Test Set: A collection of instances used to evaluate the performance of the model.
- Distance: A measure of the similarity between two instances.
Steps to Implement KNN
- Select the number of neighbors (k): This is a hyperparameter that you'll need to tune based on your dataset.
- Calculate the distance: For each instance in the test set, calculate the distance to each instance in the training set.
- Find the k nearest neighbors: Sort the distances and select the k nearest neighbors.
- Make a prediction: Assign the class of the majority of the k nearest neighbors.
Example
Suppose you have a dataset of animals, and you want to classify them into two categories: "dog" and "cat". You can use KNN to predict the category of a new animal based on its features.
- Feature 1: Weight
- Feature 2: Height
Here's an example of how you might implement KNN in Python:
# Import necessary libraries
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the dataset
data = [[10, 5], [20, 10], [30, 15], [40, 20], [50, 25]]
labels = ['dog', 'dog', 'cat', 'cat', 'dog']
# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
# Create a KNN classifier
knn = KNeighborsClassifier(n_neighbors=3)
# Train the classifier
knn.fit(X_train, y_train)
# Make predictions
predictions = knn.predict(X_test)
# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2f}")
Further Reading
For more information on KNN and other machine learning algorithms, check out our Machine Learning Tutorials.
KNN Algorithm Diagram