
Logistic regression predicts a binary response: it models the outcome of a categorical dependent variable as a function of one or more predictor variables. Unlike linear regression (which predicts continuous values), logistic regression outputs probabilities between 0 and 1 using the logistic function (an S-shaped curve).

Key Concepts

1. The Logistic Function (Sigmoid)

The model transforms a linear equation into a probability using the sigmoid function:

P(Y=1) = 1/(1 + e^-(b0 + b1X))

  • P(Y=1): Probability of the event occurring (e.g., defaulting on a loan).
  • b0+b1X: Linear combination of input features (like in linear regression).
  • Output range: 0 to 1 (e.g., P=0.8 → 80% chance of default).
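
As a quick numeric check, here is a minimal sketch of the formula above; the values of b0 and b1 are made up for illustration:

import numpy as np

def sigmoid(z):
    # Logistic (sigmoid) function: maps any real z to (0, 1)
    return 1 / (1 + np.exp(-z))

b0, b1 = -1.0, 2.0                 # hypothetical coefficients
X = np.array([-2.0, 0.0, 2.0])
print(sigmoid(b0 + b1 * X))        # P(Y=1) for each X: [0.0067 0.2689 0.9526]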

2. Odds and Odds Ratio

Odds = P/(1 - P)

As P increases, the odds increase monotonically. The odds ratio (OR) compares the odds at two values of a predictor X; in the logit model, a one-unit increase in X multiplies the odds by e^b1.

OR > 1: an increase in X raises the probability of Y=1.

OR < 1: an increase in X lowers the probability.

ln(odds), the log-odds, ranges from -infinity to +infinity (a numeric sketch follows below).
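
A minimal numeric sketch of the probability/odds/log-odds relationship (values are illustrative):

import numpy as np

p = np.array([0.1, 0.5, 0.8, 0.9])
odds = p / (1 - p)            # odds in favour of the event: [0.111 1. 4. 9.]
log_odds = np.log(odds)       # logit: unbounded, here [-2.197 0. 1.386 2.197]
print(odds, log_odds)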

3. When to use a logit model?

  • When the dependent variable is binary, a non-linear specification of the model is required.
  • It is preferable to fit some kind of sigmoid curve to the observed points.
  • The sigmoid curve looks like the cumulative distribution function (CDF) of a random variable.
  • We can choose a suitable cumulative distribution function (CDF) to represent 0-1 response models.
  • Commonly chosen CDFs for 0-1 response models: the logistic (logit model) and the normal (probit model); the two are compared in the sketch below.
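
The comparison below is a small sketch (assuming SciPy is available) evaluating both candidate CDFs at a few points; each is S-shaped and maps the real line to (0, 1):

import numpy as np
from scipy.stats import logistic, norm

z = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(logistic.cdf(z))    # logit link: 1/(1 + e^-z)
print(norm.cdf(z))        # probit link: standard normal CDF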

4. Specification of the logit model:

Pi = P(Yi = 1) = F(Zi) = 1/(1 + e^-Zi), where F is the logistic CDF and Zi is the linear predictor

Pi/(1 - Pi) = e^Zi (the odds in favour of the event happening)

ln(Pi/(1 - Pi)) = Zi
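
These three identities can be verified numerically; a minimal sketch with arbitrary values of Zi:

import numpy as np

Z = np.array([-2.0, 0.5, 3.0])        # arbitrary linear predictor values
P = 1 / (1 + np.exp(-Z))              # Pi = F(Zi)
odds = P / (1 - P)
print(np.allclose(odds, np.exp(Z)))   # True: Pi/(1 - Pi) = e^Zi
print(np.allclose(np.log(odds), Z))   # True: ln(Pi/(1 - Pi)) = Zi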

5. Interpretation of logistic regression coefficients:

In logistic regression, the coefficients (b0, b1, …, bk) represent the log-odds relationship between each predictor variable (X) and the binary outcome (Y=1 or Y=0). Unlike linear regression coefficients, they need to be transformed to odds ratios (OR) for intuitive interpretation.

The logistic regression equation in log-odds form:

ln(P(Y=1)/(1 - P(Y=1))) = b0 + b1X1 + ⋯ + bkXk

  • b0 (Intercept): Log-odds of Y=1 when all X=0.
  • b1,…,bk: Change in log-odds for a 1-unit increase in the corresponding X, holding other variables constant (see the sketch below).
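
In practice, the fitted coefficients are exponentiated to obtain odds ratios. A minimal sketch using statsmodels on simulated data (the coefficients -0.5, 1.2, -0.8 are made up for illustration):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))                 # two simulated predictors
z = -0.5 + 1.2 * X[:, 0] - 0.8 * X[:, 1]      # true log-odds
y = rng.binomial(1, 1 / (1 + np.exp(-z)))     # simulated binary outcome

result = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
print(result.params)           # b0, b1, b2 on the log-odds scale
print(np.exp(result.params))   # odds ratios: OR > 1 raises the odds of Y=1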

6. Assumptions of Logistic Regression:

✔️ Binary outcome (e.g., 0/1).
✔️ No multicollinearity among predictors (a VIF screen is sketched below).
✔️ Large sample size (for stable estimates).
❌ Assumes the log-odds are linear in the predictors (use decision trees or neural networks for complex non-linear patterns).
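
Multicollinearity can be screened with variance inflation factors (VIFs); a sketch assuming statsmodels and a simulated design matrix:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))       # hypothetical independent predictors
exog = sm.add_constant(X)
vifs = [variance_inflation_factor(exog, i) for i in range(1, exog.shape[1])]
print(vifs)                         # values near 1 suggest little collinearity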

7. Types of Logistic Regression

  1. Binary Logistic Regression
    • Two outcomes (e.g., “Survived” vs. “Died”).
  2. Multinomial Logistic Regression
    • More than two unordered categories (e.g., “Cat,” “Dog,” “Bird”).
  3. Ordinal Logistic Regression
    • Ordered categories (e.g., “Low,” “Medium,” “High” risk).
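
For reference, scikit-learn fits the multinomial case directly; a minimal sketch on simulated three-class data (the data here are random, purely to illustrate the API):

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))
y = rng.integers(0, 3, size=300)    # three unordered classes: 0, 1, 2

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict_proba(X[:3]))     # one probability per class; rows sum to 1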

8. Estimation of the logit model:

The logit model cannot be estimated by the OLS method because it is non-linear in the parameters. Instead, we apply the Maximum Likelihood Estimator (MLE), since we are working with ungrouped data; a sketch of direct likelihood maximization follows below.
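
To make the MLE idea concrete, the sketch below maximizes the Bernoulli log-likelihood of the logit model directly with scipy.optimize; the data and true coefficients are simulated for illustration:

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(400), rng.normal(size=400)])   # intercept + one predictor
beta_true = np.array([-0.5, 1.5])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))

def neg_log_likelihood(beta):
    z = X @ beta
    # log L = sum[y*z - ln(1 + e^z)]; np.logaddexp(0, z) is a stable ln(1 + e^z)
    return -np.sum(y * z - np.logaddexp(0, z))

res = minimize(neg_log_likelihood, x0=np.zeros(2), method="BFGS")
print(res.x)    # MLE estimates, close to beta_true in large samples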

Python Implementation:

Here’s a simple implementation of logistic regression from scratch in Python, using only NumPy:

import numpy as np

class LogisticRegression:
    def __init__(self, learning_rate=0.01, iterations=1000):
        self.learning_rate = learning_rate
        self.iterations = iterations
        self.weights = None
        self.bias = None

    def sigmoid(self, z):
        # Map the linear predictor to a probability in (0, 1)
        return 1 / (1 + np.exp(-z))

    def fit(self, X, y):
        n_samples, n_features = X.shape

        # Initialize weights and bias
        self.weights = np.zeros(n_features)
        self.bias = 0

        # Gradient descent
        for _ in range(self.iterations):
            linear_model = np.dot(X, self.weights) + self.bias
            y_predicted = self.sigmoid(linear_model)

            # Compute gradients
            dw = (1 / n_samples) * np.dot(X.T, (y_predicted - y))
            db = (1 / n_samples) * np.sum(y_predicted - y)

            # Update parameters
            self.weights -= self.learning_rate * dw
            self.bias -= self.learning_rate * db

    def predict_proba(self, X):
        linear_model = np.dot(X, self.weights) + self.bias
        return self.sigmoid(linear_model)

    def predict(self, X):
        # Threshold predicted probabilities at 0.5 to get class labels
        y_predicted_proba = self.predict_proba(X)
        return [1 if i >= 0.5 else 0 for i in y_predicted_proba]

# Example usage
if __name__ == "__main__":
    # Simple dataset (AND gate)
    X = np.array([
        [0, 0],
        [0, 1],
        [1, 0],
        [1, 1]
    ])
    y = np.array([0, 0, 0, 1])

    model = LogisticRegression(learning_rate=0.1, iterations=1000)
    model.fit(X, y)
    predictions = model.predict(X)
    print("Predictions:", predictions)

9. How to test the model fit?

1. Hosmer-Lemeshow Test:

It tests whether the predicted distribution follows the actual distribution; in other words, it checks how close the observed values are to the expected values.
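
A rough sketch of the statistic, assuming NumPy arrays y (actual 0/1 outcomes) and p (predicted probabilities); observations are grouped into deciles of p and the statistic is referred to a chi-square distribution with g - 2 degrees of freedom:

import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(y, p, g=10):
    order = np.argsort(p)
    groups = np.array_split(order, g)    # g groups by predicted probability
    hl = 0.0
    for idx in groups:
        obs = y[idx].sum()               # observed events in the group
        exp = p[idx].sum()               # expected events in the group
        hl += (obs - exp) ** 2 / (exp * (1 - exp / len(idx)))
    return hl, chi2.sf(hl, df=g - 2)     # large p-value suggests adequate fit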

2. Kolmogorov-Smirnov (KS) test:

It assesses how well a model (e.g., logistic regression) separates two classes (e.g., defaulters vs. non-defaulters).

Calculation Steps

  1. Sort predictions in descending order (highest probability to lowest).
  2. Split data into deciles (or percentiles).
  3. Compute cumulative distributions for both classes:
    • CDF of Events (Y=1): % of defaulters in each bin.
    • CDF of Non-Events (Y=0): % of non-defaulters in each bin.
  4. KS Statistic = Maximum difference between the two CDFs.
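
A minimal sketch of these steps (computed per observation rather than per decile, which gives the same maximum), assuming NumPy arrays y and p as before:

import numpy as np

def ks_statistic(y, p):
    order = np.argsort(-p)                   # step 1: sort descending by prediction
    y_sorted = y[order]
    cum_events = np.cumsum(y_sorted) / y_sorted.sum()                # CDF of Y=1
    cum_non_events = np.cumsum(1 - y_sorted) / (1 - y_sorted).sum()  # CDF of Y=0
    return np.max(np.abs(cum_events - cum_non_events))               # step 4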

3. Somer’s D Test:

Somer’s D = ( No of concordant pairs – no of discordant pairs) / total no of pairs

Concordance : concordance is used to assess how well scorecards are separating the good and bad accounts in the development sample. The higher the concordance, the higher the separation of scores between good and bad accounts.
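
A brute-force sketch of Somers’ D over all event/non-event pairs (O(n^2), so suitable only for small samples; ties count as neither concordant nor discordant):

import numpy as np

def somers_d(y, p):
    diff = p[y == 1][:, None] - p[y == 0][None, :]   # every (event, non-event) pair
    concordant = np.sum(diff > 0)    # event scored higher than non-event
    discordant = np.sum(diff < 0)    # event scored lower than non-event
    return (concordant - discordant) / diff.size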

4. Rank Ordering:

Rank ordering is used to validate whether the model differentiates the goods from the bads across the entire population. The population is divided into deciles in descending order of predicted values. A model that rank orders predicts the highest number of goods in the first decile, with counts declining in each subsequent decile.
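
A sketch of a decile-wise rank-ordering table with pandas, assuming arrays y and p as before; decile 1 holds the highest predicted values:

import pandas as pd

def rank_order_table(y, p):
    df = pd.DataFrame({"y": y, "p": p})
    # Rank descending so decile 1 contains the highest predictions
    df["decile"] = pd.qcut(df["p"].rank(method="first", ascending=False),
                           10, labels=False) + 1
    return df.groupby("decile")["y"].agg(["count", "sum", "mean"])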
