Logistic regression is a fundamental algorithm in machine learning, widely used for binary classification tasks. Unlike linear regression, which predicts continuous values, logistic regression outputs the probability that an instance belongs to a particular class.
🔍 Core Concepts
Sigmoid Function
The core of logistic regression is the sigmoid function:
$$ \sigma(z) = \frac{1}{1 + e^{-z}} $$
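As a minimal sketch (assuming NumPy), the formula translates directly into code:

```python
import numpy as np

def sigmoid(z):
    """Map any real number (or array of them) into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))                          # exactly 0.5 at z = 0
print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # large |z| saturates toward 0 or 1
```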
This function maps any real number to a value between 0 and 1, representing the probability of a positive outcome.

Log Loss
During training, the model's weights are fit by minimizing the log loss (binary cross-entropy):
$$ \text{Loss} = -\frac{1}{N} \sum_{i=1}^{N} [y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)] $$
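A direct NumPy translation of this formula (a sketch; predictions are clipped away from 0 and 1 so the logarithms stay finite, as production implementations also do):

```python
import numpy as np

def log_loss(y_true, y_pred, eps=1e-15):
    """Average binary cross-entropy between true labels and predicted probabilities."""
    # Clip so log(0) never occurs.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1, 0, 1, 1])
y_pred = np.array([0.9, 0.1, 0.8, 0.6])
print(log_loss(y_true, y_pred))  # confident, correct predictions yield a low loss
```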
where $ y_i $ is the true label and $ \hat{y}_i $ is the predicted probability for example $ i $.

Decision Boundary
A threshold (typically 0.5) separates classes. If the predicted probability exceeds this, the instance is classified as positive.
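This thresholding step can be sketched by hand (in scikit-learn it happens inside `model.predict`, on top of the probabilities from `predict_proba`):

```python
import numpy as np

def classify(probabilities, threshold=0.5):
    """Label an instance positive (1) when its probability exceeds the threshold."""
    return (np.asarray(probabilities) > threshold).astype(int)

probs = [0.2, 0.5, 0.7, 0.95]
print(classify(probs))                 # default 0.5 cutoff -> [0 0 1 1]
print(classify(probs, threshold=0.9))  # stricter cutoff   -> [0 0 0 1]
```

Raising the threshold trades recall for precision: fewer instances are called positive, but with higher confidence.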
🧪 Example Use Case
- Dataset: Iris flowers (classification of species)
- Goal: Predict whether a flower is "Setosa" based on petal dimensions
- Code Snippet:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Petal length and width (columns 2 and 3); binary target: is the flower Setosa (class 0)?
X, y = load_iris(return_X_y=True)
X, y = X[:, 2:4], (y == 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```
📚 Further Reading
For a deeper dive into related topics, check out our Linear Regression Tutorial.