February 10, 2026 12:34 pm

When randomized controlled trials (RCTs) aren’t feasible, observational studies step in. But they come with a risk: treatment assignment bias. For example, healthier individuals may be more likely to choose a treatment, skewing the observed effects.

Propensity Score Matching (PSM) helps solve this by mimicking randomization. It matches treated and untreated individuals with similar characteristics (covariates), reducing confounding and balancing groups.

In this post, we’ll explore:

  1. What is Propensity Score Matching?
  2. Steps: Estimation, Matching, and Balance Checking
  3. Python Code Example (using simulated data)
  4. Caveats and Best Practices

What is Propensity Score Matching?

Propensity score = the probability of receiving treatment given observed covariates: e(x) = P(T = 1 | X = x).

Developed by Rosenbaum and Rubin (1983), the idea is simple: match treated and untreated units with similar propensity scores.

If we can match units well, we can estimate causal effects like:

ATT = E[Y₁ − Y₀ | T = 1]

where Y₁ is the outcome if treated, Y₀ if not, and T=1 for treated units.

Step-by-Step Guide to PSM

Step 1: Estimate Propensity Scores

Use logistic regression or machine learning to estimate the probability of treatment based on observed covariates.

Step 2: Match Treated and Untreated Units

Options:

  • Nearest neighbor matching
  • Caliper matching (within a threshold)
  • Kernel or Mahalanobis matching
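Caliper matching can be layered on top of nearest-neighbor matching: compute the nearest-neighbor distances, then discard any pair farther apart than the threshold. A minimal sketch using hypothetical propensity scores (in practice these would come from a fitted model):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(42)

# Hypothetical propensity scores for treated and control units
# (stand-ins for estimates from a fitted logistic regression)
ps_treated = rng.uniform(0.2, 0.8, 50).reshape(-1, 1)
ps_control = rng.uniform(0.0, 1.0, 200).reshape(-1, 1)

# Nearest-neighbor matching on the propensity score
nn = NearestNeighbors(n_neighbors=1).fit(ps_control)
distances, indices = nn.kneighbors(ps_treated)

# Caliper: keep only matches within the distance threshold
caliper = 0.05
within = distances.flatten() <= caliper
matched_idx = indices.flatten()[within]

print(f"Matched {within.sum()} of {len(ps_treated)} treated units")
```

Treated units with no control inside the caliper are simply dropped, which trades sample size for match quality.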

Step 3: Check Covariate Balance

Compare standardized mean differences (SMDs) — the absolute difference in group means divided by the pooled standard deviation — before and after matching. Well-matched samples should have SMDs below 0.1.

Step 4: Estimate Treatment Effects

Use difference in means, regression, or weighted analysis on the matched sample.


🧑‍💻 Python Code Example

Let’s walk through a full code example using synthetic data.

import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors
import matplotlib.pyplot as plt
import seaborn as sns

# Simulate the data
np.random.seed(42)
n = 1000

# Covariates
age = np.random.normal(50, 10, n)
income = np.random.normal(60000, 15000, n)

# Treatment assignment depends on covariates (coefficients kept small
# so propensity scores overlap rather than saturating at 0 or 1)
p = 1 / (1 + np.exp(-0.0001*(income - 60000) + 0.05*(age - 50)))
treatment = np.random.binomial(1, p)

# Outcome depends on treatment and covariates
outcome = 5*treatment + 0.1*income - 0.3*age + np.random.normal(0, 1000, n)

df = pd.DataFrame({'age': age, 'income': income, 'treatment': treatment, 'outcome': outcome})

# Estimate propensity scores via logistic regression

X = df[['age', 'income']]
y = df['treatment']

model = LogisticRegression(max_iter=1000)  # raw-scale covariates can need extra iterations
model.fit(X, y)

df['propensity_score'] = model.predict_proba(X)[:, 1]

# Nearest-neighbor matching on the propensity score (with replacement)
treated = df[df['treatment'] == 1]
control = df[df['treatment'] == 0]

# Fit Nearest Neighbors on control units
nn = NearestNeighbors(n_neighbors=1)
nn.fit(control[['propensity_score']])

# Find nearest neighbor for each treated unit
distances, indices = nn.kneighbors(treated[['propensity_score']])
matched_control = control.iloc[indices.flatten()].copy()
matched_treated = treated.reset_index(drop=True).copy()

matched_df = pd.concat([matched_treated, matched_control])

# Check covariate balance via standardized mean differences
def standardized_mean_diff(var):
    treated_mean = matched_treated[var].mean()
    control_mean = matched_control[var].mean()
    pooled_std = np.sqrt((matched_treated[var].var() + matched_control[var].var()) / 2)
    return np.abs(treated_mean - control_mean) / pooled_std

for col in ['age', 'income', 'propensity_score']:
    print(f"SMD for {col}: {standardized_mean_diff(col):.3f}")

# Estimate the ATT as the difference in mean outcomes
att = matched_treated['outcome'].mean() - matched_control['outcome'].mean()
print(f"Estimated ATT: {att:.2f}")

# Visualize propensity score overlap before matching
sns.kdeplot(df[df['treatment']==1]['propensity_score'], label='Treated (All)', color='blue')
sns.kdeplot(df[df['treatment']==0]['propensity_score'], label='Control (All)', color='red')
plt.title("Propensity Score Distribution Before Matching")
plt.legend()
plt.show()
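The script above prints a single point estimate for the ATT. One simple way to attach uncertainty to it (not covered above) is a pairs bootstrap over the matched differences. A minimal sketch on hypothetical matched outcomes — stand-ins for `matched_treated['outcome']` and `matched_control['outcome']` from the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical matched outcomes with a true effect of about 2
treated_outcomes = rng.normal(10, 2, 500)
control_outcomes = rng.normal(8, 2, 500)

# Pairs bootstrap: resample matched-pair differences with
# replacement and recompute the ATT each time
diffs = treated_outcomes - control_outcomes
boot = np.array([
    rng.choice(diffs, size=len(diffs), replace=True).mean()
    for _ in range(2000)
])

att = diffs.mean()
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
print(f"ATT: {att:.2f}, 95% CI: [{ci_low:.2f}, {ci_high:.2f}]")
```

Note that this naive bootstrap ignores the uncertainty from estimating the propensity scores themselves, so it is a rough diagnostic rather than a rigorous standard error.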

Best Practices & Caveats

  1. Unobserved Confounding: PSM only controls for observed variables.
  2. Overlap Assumption: Treated and control groups must have similar propensity score ranges.
  3. Diagnostics Are Key: Always check balance using SMD or visual plots.
  4. Multiple Matches or Calipers: Try matching with replacement or setting a caliper (e.g., 0.05) to improve quality.
  5. Combine With Regression: You can still regress outcomes on covariates post-matching for bias correction.
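Point 5 can be sketched as follows: regress the outcome on treatment plus covariates within the matched sample, so the treatment coefficient absorbs any residual imbalance. A minimal illustration on hypothetical matched data with a known true effect of 5:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 400

# Hypothetical matched sample: one covariate, a treatment
# indicator, and an outcome with a true treatment effect of 5
age = rng.normal(50, 10, n)
treatment = rng.binomial(1, 0.5, n)
outcome = 5 * treatment - 0.3 * age + rng.normal(0, 2, n)

# Regress outcome on treatment and covariates; the treatment
# coefficient is the bias-adjusted effect estimate
X = np.column_stack([treatment, age])
reg = LinearRegression().fit(X, outcome)
print(f"Adjusted treatment effect: {reg.coef_[0]:.2f}")
```

This "doubly robust"-flavored combination tends to be more forgiving than either matching or regression alone.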

Conclusion

Propensity Score Matching (PSM) is a vital technique in the data scientist’s and economist’s toolbox for drawing causal inferences from observational data. By estimating the probability of treatment based on observed covariates and matching similar individuals across treatment groups, PSM reduces bias and simulates the conditions of a randomized experiment.

However, it’s important to remember that PSM is only as strong as the covariates you include. If important confounders are omitted, matching will not correct for the hidden bias. That’s why thoughtful model specification, thorough balance checks, and transparent reporting are essential.

When done properly, PSM can uncover meaningful treatment effects from real-world data—offering powerful insights in healthcare, public policy, economics, and beyond.

In short: Propensity Score Matching won’t give you perfect answers, but it will give you better ones—especially when randomization isn’t an option.

Academic References

  1. Rosenbaum, P. R., & Rubin, D. B. (1983).
    The central role of the propensity score in observational studies for causal effects.
    Biometrika, 70(1), 41–55.
    https://doi.org/10.1093/biomet/70.1.41
  2. Austin, P. C. (2011).
    An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies.
    Multivariate Behavioral Research, 46(3), 399–424.
    https://doi.org/10.1080/00273171.2011.568786
  3. Stuart, E. A. (2010).
    Matching methods for causal inference: A review and a look forward.
    Statistical Science, 25(1), 1–21.
    https://doi.org/10.1214/09-STS313
  4. Guo, S., & Fraser, M. W. (2014).
    Propensity Score Analysis: Statistical Methods and Applications (2nd ed.).
    SAGE Publications.

Online Resources and Tutorials

  1. Harvard T.H. Chan School of Public Health – Causal Inference Book:
    https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/
    (Free textbook by Hernán and Robins on causal inference in health research.)
  2. The Methodology Center at Penn State – PSM Tutorials:
    https://www.methodology.psu.edu/ra/most/psm/
  3. StatsModels Documentation (Python):
    https://www.statsmodels.org/stable/index.html

  • econml – causal inference tools from Microsoft Research
  • causalml – Uber's library for uplift and causal modeling
  • DoWhy – causal inference using explicit structural assumptions
