Logistic regression is a widely used machine learning algorithm for binary classification, and it extends to multi-class classification as well. This tutorial introduces its basic concepts and a simple implementation.

Basic Concepts

  • Binary Classification: Logistic regression is used for binary classification problems, where the output variable is categorical and can have only two possible values, such as "yes" or "no", "true" or "false", "1" or "0".
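  • Sigmoid Function: The sigmoid function, also known as the logistic function, is used to model the probability of the positive class. It maps any real-valued number z into the range (0, 1) via sigmoid(z) = 1 / (1 + e^(-z)), where z is a weighted sum of the input features plus a bias term.
  • Cost Function: The cost function used in logistic regression is the logistic loss (also called log loss or binary cross-entropy). It penalizes each predicted probability according to how far it is from the actual label, and confident wrong predictions are penalized most heavily.
  • Gradient Descent: Gradient descent is an iterative optimization algorithm that minimizes the cost function by repeatedly adjusting the model's weights in the direction opposite to the gradient.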
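
The concepts above can be sketched from scratch with NumPy. This is a minimal illustration under stated assumptions, not how scikit-learn trains its model (its LogisticRegression uses other solvers, lbfgs by default). The helper names (sigmoid, log_loss, fit_logistic), the learning rate, the iteration count, and the toy data are all hypothetical choices made for this example.

```python
import numpy as np


def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(y, p, eps=1e-12):
    """Mean logistic loss (binary cross-entropy) for labels y in {0, 1}."""
    p = np.clip(p, eps, 1.0 - eps)  # keep log() away from 0
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def fit_logistic(X, y, lr=0.1, n_iters=1000):
    """Fit weights and bias with batch gradient descent (hypothetical helper)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)               # predicted probabilities
        error = p - y                        # gradient of the loss w.r.t. the scores
        w -= lr * (X.T @ error) / n_samples  # step against the weight gradient
        b -= lr * error.mean()               # step against the bias gradient
    return w, b


# Toy data, made up for illustration: the label is 1 when x0 + x1 > 0
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w, b = fit_logistic(X, y)
p = sigmoid(X @ w + b)

initial_loss = log_loss(y, np.full_like(y, 0.5))  # loss of an uninformed 0.5 guess
final_loss = log_loss(y, p)
train_accuracy = np.mean((p >= 0.5) == y)

print(f"Loss before training: {initial_loss:.3f}")
print(f"Loss after training:  {final_loss:.3f}")
print(f"Training accuracy:    {train_accuracy:.2f}")
```

The loss after training should be lower than the loss of the uninformed 0.5 guess, which is the behavior gradient descent is meant to produce on this convex cost function.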

Implementation

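Here is a simple implementation of logistic regression using Python and scikit-learn. Note that the iris dataset has three classes, so this example is a multi-class problem rather than a binary one: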

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

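# Create a logistic regression model
# (max_iter is raised from the default of 100 to give the solver room to converge on unscaled features)
model = LogisticRegression(max_iter=200)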

# Train the model
model.fit(X_train, y_train)

# Evaluate the model (score() returns the mean accuracy on the test set)
accuracy = model.score(X_test, y_test)
print(f"Accuracy: {accuracy:.2f}")
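
Beyond accuracy, the fitted model can also report the class probabilities it assigns to each sample, which ties back to the probability modelling described under Basic Concepts. The sketch below repeats the setup above so that it runs on its own; the variable names proba and predicted are illustrative, and max_iter=200 is an assumed setting rather than a requirement.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same data and split as in the example above
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# One row per test sample, one column per class; each row sums to 1
proba = model.predict_proba(X_test)

# The predicted label is the class with the highest probability
predicted = model.classes_[np.argmax(proba, axis=1)]

print("Probabilities for the first test sample:", np.round(proba[0], 3))
print("Predicted class for the first test sample:", predicted[0])
```

Inspecting the probabilities is useful when a decision threshold other than "most likely class" is needed, or when the confidence of a prediction matters.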
