
# Probability calibration of classifiers in Scikit-learn

When performing classification, you often want to predict not only the class label but also the associated probability, which indicates how confident the prediction is. However, not all classifiers provide well-calibrated probabilities: some are over-confident, others under-confident. A separate calibration of the predicted probabilities is therefore often desirable as a postprocessing step. This example illustrates two different calibration methods and evaluates the quality of the returned probabilities using the Brier score (see https://en.wikipedia.org/wiki/Brier_score).
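The Brier score is the mean squared difference between the predicted probability and the actual 0/1 outcome, so lower is better. The sketch below computes it in plain Python, with optional sample weights; it is meant to mirror what `sklearn.metrics.brier_score_loss` reports, but the helper name `brier_score` and the toy data are illustrative assumptions and not part of the example itself.

```python
def brier_score(y_true, y_prob, sample_weight=None):
    """Weighted mean of (predicted probability - outcome) ** 2."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    weighted_sq_err = sum(w * (p - y) ** 2
                          for y, p, w in zip(y_true, y_prob, sample_weight))
    return weighted_sq_err / sum(sample_weight)


# Toy data (assumed for illustration): labels are a coin flip.
y_true = [0, 0, 1, 1]
overconfident = [0.0, 1.0, 1.0, 0.0]  # always certain, wrong half of the time
hedged = [0.5, 0.5, 0.5, 0.5]         # admits the 50/50 uncertainty

print(brier_score(y_true, overconfident))  # 0.5
print(brier_score(y_true, hedged))         # 0.25
```

On labels that really are a coin flip, predicting 0.5 scores better than confident guesses, which is the situation in the middle cluster of this example.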

Compared are the probabilities estimated by a Gaussian naive Bayes classifier without calibration, with sigmoid calibration, and with non-parametric isotonic calibration. Only the non-parametric model returns probabilities close to the expected 0.5 for most of the samples in the middle cluster, whose labels are heterogeneous. This results in a clearly better Brier score: 0.084 with isotonic calibration, against 0.104 without calibration and 0.109 with sigmoid calibration in the run below.
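To see why an isotonic fit can produce a flat region at 0.5, here is a minimal sketch of the pool-adjacent-violators idea behind isotonic regression: sort by score, then merge neighbouring blocks whenever their mean labels would decrease. This is not scikit-learn's implementation (which, among other things, handles ties and interpolates between points); the names `isotonic_fit` and `isotonic_predict` and the toy scores are assumptions, and the sketch assumes all scores are distinct.

```python
def isotonic_fit(scores, labels):
    """Fit a non-decreasing step function with pool-adjacent-violators.

    Returns a list of (upper_score, value) pairs, one per block.
    Assumes distinct scores (a simplification for this sketch).
    """
    blocks = []  # each block: [sum of labels, count, largest score in block]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def isotonic_predict(model, score):
    """Value of the first block whose upper score is >= score."""
    for upper, value in model:
        if score <= upper:
            return value
    return model[-1][1]


# Toy scores (assumed): clean at both ends, mixed labels in the middle.
scores = [0.05, 0.1, 0.4, 0.45, 0.55, 0.6, 0.9, 0.95]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
model = isotonic_fit(scores, labels)

print(isotonic_predict(model, 0.5))   # 0.5
print(isotonic_predict(model, 0.02))  # 0.0
print(isotonic_predict(model, 0.99))  # 1.0
```

The mixed middle scores are pooled into blocks with mean 0.5, while the clean ends stay at 0 and 1. A single sigmoid has only a slope and an offset to adjust, so it cannot form such a plateau.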


### Version

In [1]:
import sklearn
sklearn.__version__

Out[1]:
'0.18'

### Imports

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go

import numpy as np
from matplotlib import cm

from sklearn.datasets import make_blobs
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import brier_score_loss
from sklearn.calibration import CalibratedClassifierCV
from sklearn.model_selection import train_test_split

Automatically created module for IPython interactive environment


### Calculations

In [3]:
n_samples = 50000
n_bins = 3  # use 3 bins for calibration_curve as we have 3 clusters here

# Generate 3 blobs with 2 classes where the second blob contains
# half positive samples and half negative samples. Probability in this
# blob is therefore 0.5.
centers = [(-5, -5), (0, 0), (5, 5)]
X, y = make_blobs(n_samples=n_samples, n_features=2, cluster_std=1.0,
                  centers=centers, shuffle=False, random_state=42)

y[:n_samples // 2] = 0
y[n_samples // 2:] = 1
sample_weight = np.random.RandomState(42).rand(y.shape[0])

# split train, test for calibration
X_train, X_test, y_train, y_test, sw_train, sw_test = \
    train_test_split(X, y, sample_weight, test_size=0.9, random_state=42)

# Gaussian Naive-Bayes with no calibration
clf = GaussianNB()
clf.fit(X_train, y_train)  # GaussianNB itself does not support sample-weights
prob_pos_clf = clf.predict_proba(X_test)[:, 1]

# Gaussian Naive-Bayes with isotonic calibration
clf_isotonic = CalibratedClassifierCV(clf, cv=2, method='isotonic')
clf_isotonic.fit(X_train, y_train, sw_train)
prob_pos_isotonic = clf_isotonic.predict_proba(X_test)[:, 1]

# Gaussian Naive-Bayes with sigmoid calibration
clf_sigmoid = CalibratedClassifierCV(clf, cv=2, method='sigmoid')
clf_sigmoid.fit(X_train, y_train, sw_train)
prob_pos_sigmoid = clf_sigmoid.predict_proba(X_test)[:, 1]

print("Brier scores: (the smaller the better)")

# Pass the weights by keyword; newer scikit-learn releases require this
clf_score = brier_score_loss(y_test, prob_pos_clf, sample_weight=sw_test)
print("No calibration: %1.3f" % clf_score)

clf_isotonic_score = brier_score_loss(y_test, prob_pos_isotonic,
                                      sample_weight=sw_test)
print("With isotonic calibration: %1.3f" % clf_isotonic_score)

clf_sigmoid_score = brier_score_loss(y_test, prob_pos_sigmoid,
                                     sample_weight=sw_test)
print("With sigmoid calibration: %1.3f" % clf_sigmoid_score)

Brier scores: (the smaller the better)
No calibration: 0.104
With isotonic calibration: 0.084
With sigmoid calibration: 0.109
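
A single score hides where the miscalibration sits. The `n_bins` comment above refers to `calibration_curve`, which groups predictions into bins and compares the mean predicted probability with the observed fraction of positives in each bin. The plain-Python sketch below shows that idea with equal-width bins; the helper `reliability_bins` and the toy data are hypothetical, and scikit-learn's own `sklearn.calibration.calibration_curve` may differ in details such as bin edges.

```python
def reliability_bins(y_true, y_prob, n_bins=3):
    """Per non-empty bin: (mean predicted probability, positive fraction, count)."""
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to the last bin
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(sp / n, sy / n, n) for sp, sy, n in sums if n > 0]


# Toy data (assumed): confident at the ends, mixed in the middle.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.05, 0.1, 0.4, 0.45, 0.55, 0.6, 0.9, 0.95]

for mean_pred, frac_pos, count in reliability_bins(y_true, y_prob, n_bins=3):
    print("mean predicted %.2f, observed %.2f, n=%d"
          % (mean_pred, frac_pos, count))
```

For a well-calibrated model the two numbers in each bin are close; a large gap in one bin points to the region where calibration is needed.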


### Plots

In [4]:
y_unique = np.unique(y)
marker_colors = ['rgba(128,0,128,0.5)', 'rgba(255,0,0,0.5)']
data_plot = []
i = 0
colors = cm.rainbow(np.linspace(0.0, 1.0, y_unique.size))

for this_y, color in zip(y_unique, colors):
    this_X = X_train[y_train == this_y]
    this_sw = sw_train[y_train == this_y]
    trace = go.Scatter(x=this_X[:, 0], y=this_X[:, 1],
                       mode='markers',
                       marker=dict(color=marker_colors[i], size=12,
                                   line=dict(color='black',
                                             width=1)),
                       name="Class %s" % this_y)
    data_plot.append(trace)
    i = i + 1

layout = go.Layout(title='Data',
                   xaxis=dict(zeroline=False, showgrid=False),
                   yaxis=dict(zeroline=False, showgrid=False))
fig = go.Figure(data=data_plot, layout=layout)
py.iplot(fig)

Out[4]:
In [6]:
order = np.lexsort((prob_pos_clf, ))
No_calibration = go.Scatter(y=prob_pos_clf[order],
                            name='No calibration (%1.3f)' % clf_score,
                            mode='lines',
                            line=dict(color='red'))
Isotonic_calibration = go.Scatter(y=prob_pos_isotonic[order],
                                  name='Isotonic calibration (%1.3f)' % clf_isotonic_score,
                                  mode='lines',
                                  line=dict(color='green', width=3))
Sigmoid_calibration = go.Scatter(y=prob_pos_sigmoid[order],
                                 name='Sigmoid calibration (%1.3f)' % clf_sigmoid_score,
                                 mode='lines',
                                 line=dict(color='blue', width=3))
Empirical = go.Scatter(x=np.linspace(0, y_test.size, 51)[1::2],
                       y=y_test[order].reshape(25, -1).mean(1),
                       name='Empirical',
                       mode='lines',
                       line=dict(color='black', width=3))
data = [No_calibration, Isotonic_calibration, Sigmoid_calibration, Empirical]

layout = go.Layout(title="Gaussian naive Bayes probabilities",
                   xaxis=dict(zeroline=False, showgrid=False,
                              title="Instances sorted according to predicted probability (uncalibrated GNB)"),
                   yaxis=dict(zeroline=False, showgrid=False,
                              title="P(y=1)"))

fig = go.Figure(data=data, layout=layout)
py.iplot(fig)

Out[6]:

Author:

     Mathieu Blondel <mathieu@mblondel.org>
     Alexandre Gramfort <alexandre.gramfort@telecom-paristech.fr>
     Balazs Kegl <balazs.kegl@gmail.com>
     Jan Hendrik Metzen <jhm@informatik.uni-bremen.de>

License:

     BSD Style