Here’s a concise explanation of Gini Coefficient, Cumulative Accuracy Profile (CAP), and AUC (Area Under the ROC Curve), along with their relationships:
1. Cumulative Accuracy Profile (CAP)
- What it is: The CAP curve (also called the Lorenz curve in credit risk modeling) evaluates the effectiveness of a classification model (e.g., credit scoring). It compares the cumulative proportion of positive outcomes (e.g., defaults) against the cumulative proportion of observations ranked by model scores.
- How it works:
- X-axis: Cumulative % of observations (ordered by model score from riskiest to safest).
- Y-axis: Cumulative % of actual positive cases (e.g., defaults).
- Perfect Model: A curve that reaches 100% of positives with the fewest possible observations.
- Random Model: A diagonal line (45°).
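The X/Y construction above can be sketched in a few lines of NumPy. The data and scores below are made up purely for illustration:

```python
import numpy as np

def cap_curve(y_true, y_score):
    """Cumulative % of observations (X) vs. cumulative % of positives (Y),
    ordering observations from riskiest (highest score) to safest."""
    order = np.argsort(-np.asarray(y_score))                    # riskiest first
    y_sorted = np.asarray(y_true)[order]
    cum_obs = np.arange(1, len(y_sorted) + 1) / len(y_sorted)   # X-axis
    cum_pos = np.cumsum(y_sorted) / y_sorted.sum()              # Y-axis
    return cum_obs, cum_pos

# Toy data (made up): 1 = default; higher score = riskier
y = [1, 0, 1, 0, 0, 1, 0, 0]
s = [0.9, 0.4, 0.8, 0.2, 0.1, 0.7, 0.3, 0.5]
x, cap = cap_curve(y, s)
# Here the 3 riskiest observations already contain all 3 defaults,
# so the CAP curve reaches 100% of positives at 3/8 of the population.
```

A steeply rising curve like this one hugs the perfect-model line; a model with no ranking power would track the 45° diagonal instead.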
2. AUC (Area Under the ROC Curve)
- What it is: The AUC measures the discriminative power of a binary classifier (e.g., default vs. non-default). It’s derived from the ROC curve, which plots:
- X-axis: False Positive Rate (FPR).
- Y-axis: True Positive Rate (TPR).
- Interpretation:
- AUC = 1: Perfect classifier.
- AUC = 0.5: Random classifier.
- Higher AUC = Better ranking ability.
- AUC is mathematically equivalent to the concordance probability (C-statistic), which measures how well a binary classification model ranks predictions. It answers: “What is the probability that a randomly chosen positive instance is ranked higher than a randomly chosen negative instance?”
- Range: 0 to 1 (higher = better ranking).
- Perfect concordance (1.0): Every positive is ranked above every negative.
- Random concordance (0.5): No ranking power (like flipping a coin).
- Why the two coincide:
- AUC = the probability that a random positive (class = 1) has a higher predicted score than a random negative (class = 0).
- This is exactly the definition of concordance, so the two metrics are the same number.
- Why Use AUC?
- Works well with imbalanced data (unlike accuracy).
- Measures ranking ability (how well the model orders predictions).
- Threshold-independent (evaluates performance across all thresholds).
- Example: AUC = 0.85 means the model has an 85% chance of correctly ranking a random positive case higher than a random negative case.
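The concordance equivalence can be checked directly by brute force: over all positive–negative pairs, count how often the positive gets the higher score. A minimal sketch on made-up data:

```python
import itertools

def auc_by_concordance(y_true, y_score):
    """AUC as the concordance probability: the fraction of (positive, negative)
    pairs in which the positive gets the higher score (ties count half)."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p, n in itertools.product(pos, neg))
    return wins / (len(pos) * len(neg))

# Toy data (made up): 3 positives, 3 negatives, 9 pairs in total
y = [1, 0, 1, 0, 1, 0]
s = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = auc_by_concordance(y, s)   # 6 concordant pairs out of 9 -> 0.666...
```

The O(n²) pair loop is only for exposition; in practice libraries compute the same number from score ranks.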
3. Gini Coefficient
- What it is: A metric derived from the CAP curve or ROC curve, quantifying inequality in prediction power.
- Range: 0 (random) to 1 (perfect).
- Calculation:
- From CAP:
Gini = a_R / a_P
where a_R is the area between the model’s CAP curve and the random (diagonal) line, and a_P is the area between the perfect model’s CAP curve and that same diagonal.
- From AUC:
Gini = 2 × AUC − 1
- Interpretation:
- Gini < 0.2: Poor model.
- Gini 0.2-0.4: Moderate model.
- Gini > 0.4: Strong model.
- Gini > 0.6: Excellent (rare in credit scoring).
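The Gini = 2 × AUC − 1 route can be sketched with AUC computed from the rank-sum (Mann–Whitney) identity. The scores below are made up, and the sketch assumes no tied scores:

```python
import numpy as np

def gini_from_scores(y_true, y_score):
    """Gini = 2*AUC - 1, with AUC from the rank-sum (Mann-Whitney) identity.
    Assumes no tied scores."""
    y = np.asarray(y_true)
    ranks = np.asarray(y_score).argsort().argsort() + 1   # 1-based, ascending
    n_pos = y.sum()
    n_neg = len(y) - n_pos
    auc = (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
    return 2 * auc - 1

# Toy data (made up): AUC = 2/3, so Gini = 1/3 -- a "moderate" model
# on the scale above
gini = gini_from_scores([1, 0, 1, 0, 1, 0], [0.9, 0.8, 0.7, 0.4, 0.3, 0.2])
```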
Key Relationships
- AUC vs. Gini:
- Gini = 2 × AUC − 1
- Example: AUC = 0.8 → Gini = 0.6.
- CAP vs. ROC:
- CAP focuses on actual positives (e.g., defaults), while ROC considers both TPR and FPR.
- Both can be used to derive Gini.
- Use Cases:
- Credit Risk: CAP/Gini are more intuitive (directly shows default capture).
- General ML: AUC is more common (balanced view of TPR/FPR).
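To illustrate that the CAP and ROC routes agree, the sketch below computes Gini both ways on the same made-up data: once as a_R / a_P from trapezoidal areas under the CAP curve, and once as 2 × AUC − 1 from score ranks (no ties assumed):

```python
import numpy as np

# Toy data (made up): 1 = default, higher score = riskier
y = np.array([1, 0, 1, 0, 1, 0])
s = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.2])
n, p = len(y), y.sum()

# Gini from the CAP curve: ratio of areas above the 45-degree diagonal
order = np.argsort(-s)                                    # riskiest first
cap_y = np.concatenate([[0.0], np.cumsum(y[order]) / p])  # Y, incl. origin
cap_x = np.arange(n + 1) / n                              # X, incl. origin
area_model = np.sum((cap_y[1:] + cap_y[:-1]) / 2 * np.diff(cap_x))
a_R = area_model - 0.5                  # model CAP vs. random diagonal
a_P = (1 - p / (2 * n)) - 0.5           # perfect CAP vs. random diagonal
gini_cap = a_R / a_P

# Gini from AUC: 2*AUC - 1 via the rank-sum identity
ranks = s.argsort().argsort() + 1
auc = (ranks[y == 1].sum() - p * (p + 1) / 2) / (p * (n - p))
gini_roc = 2 * auc - 1
# gini_cap and gini_roc are both 1/3 here
```

Both numbers match exactly on this sample, which is the relationship the equation Gini = 2 × AUC − 1 expresses.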
Summary Table
| Metric | Source Curve | Range | Interpretation |
|---|---|---|---|
| AUC | ROC Curve | 0 to 1 (0.5 = random) | Ranking power of the model. |
| Gini | CAP or ROC | 0 to 1 | Inequality in prediction (scaled AUC). |
| CAP Curve | Lorenz-like curve | Visual | Shows model’s default capture rate. |
I have written similar articles that may be helpful; please check them out:
- Evaluating Predictive Model Performance: Key Metrics
- How to Test Performance of the Linear Regression Models
