
Probabilistic Predictions with Gaussian Process Classification in Scikit-learn

This example illustrates the predicted probability of Gaussian process classification (GPC) with an RBF kernel for different choices of the hyperparameters. The first figure shows the predicted probability of GPC with arbitrarily chosen hyperparameters and with the hyperparameters corresponding to the maximum log-marginal-likelihood (LML).

While the hyperparameters chosen by optimizing the LML yield a considerably larger LML, they perform slightly worse according to the log-loss (which the code below computes on the training portion of the data). The figure shows that this is because they exhibit a steep change of the class probabilities at the class boundary (which is good) but have predicted probabilities close to 0.5 far away from the class boundary (which is bad). This undesirable effect is caused by the Laplace approximation used internally by GPC.
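
To make this concrete, here is a minimal illustrative sketch (the probabilities below are invented for illustration, not taken from the figures): a classifier that is correct everywhere but hedges near 0.5 on part of the data pays a noticeably higher log-loss than one that is confidently correct.

from sklearn.metrics import log_loss

y_true = [0, 0, 1, 1]
confident = [0.05, 0.05, 0.95, 0.95]  # sharp probabilities, all correct
hedged = [0.45, 0.45, 0.55, 0.55]     # still all correct, but close to 0.5
print(log_loss(y_true, confident))    # ~0.051
print(log_loss(y_true, hedged))       # ~0.598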

The second figure shows the log-marginal-likelihood for different choices of the kernel's hyperparameters, highlighting with black dots the two hyperparameter choices used in the first figure.
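
For reference (standard background, not part of the original example text), the LML being maximized is

    log p(y | X, theta) = log ∫ p(y | f) p(f | X, theta) df,

where f is the latent GP function. For a non-Gaussian likelihood such as the Bernoulli likelihood of classification this integral is intractable, which is why GPC resorts to the Laplace approximation mentioned above.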

Version

In [1]:
import sklearn
sklearn.__version__
Out[1]:
'0.18.1'

Imports

This tutorial imports accuracy_score and log_loss from sklearn.metrics, and GaussianProcessClassifier and RBF from sklearn.gaussian_process.

In [2]:
import plotly.plotly as py
import plotly.graph_objs as go

import numpy as np
from sklearn.metrics import accuracy_score, log_loss
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

Calculations

In [3]:
# Generate data: 100 one-dimensional points uniform in [0, 5]; the label is 1
# whenever the feature exceeds 2.5. The first 50 points form the training set.
train_size = 50
rng = np.random.RandomState(0)
X = rng.uniform(0, 5, 100)[:, np.newaxis]
y = np.array(X[:, 0] > 2.5, dtype=int)

# Specify Gaussian Processes with fixed and optimized hyperparameters
# (optimizer=None keeps the initial hyperparameters fixed rather than
# maximizing the log-marginal-likelihood during fit)
gp_fix = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0),
                                   optimizer=None)
gp_fix.fit(X[:train_size], y[:train_size])

gp_opt = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0))
gp_opt.fit(X[:train_size], y[:train_size])

print("Log Marginal Likelihood (initial): %.3f"
      % gp_fix.log_marginal_likelihood(gp_fix.kernel_.theta))
print("Log Marginal Likelihood (optimized): %.3f"
      % gp_opt.log_marginal_likelihood(gp_opt.kernel_.theta))

print("Accuracy: %.3f (initial) %.3f (optimized)"
      % (accuracy_score(y[:train_size], gp_fix.predict(X[:train_size])),
         accuracy_score(y[:train_size], gp_opt.predict(X[:train_size]))))
print("Log-loss: %.3f (initial) %.3f (optimized)"
      % (log_loss(y[:train_size], gp_fix.predict_proba(X[:train_size])[:, 1]),
         log_loss(y[:train_size], gp_opt.predict_proba(X[:train_size])[:, 1])))
Log Marginal Likelihood (initial): -17.598
Log Marginal Likelihood (optimized): -3.875
Accuracy: 1.000 (initial) 1.000 (optimized)
Log-loss: 0.214 (initial) 0.319 (optimized)
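
The fitted hyperparameters can be inspected directly on the kernel_ attribute. Note that kernel_.theta stores them in log-transformed form, which is why the LML-landscape code further below wraps its grid values in np.log. A minimal sketch, continuing the session above:

print(gp_opt.kernel_)                # optimized kernel with fitted values
print(np.exp(gp_opt.kernel_.theta))  # the constant kernel's magnitude and the
                                     # RBF length-scale on their original scale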

Plot posteriors

In [4]:
trace1 = go.Scatter(x=X[:train_size, 0], 
                    y=y[:train_size],
                    mode='markers',
                    marker=dict(color='black'),
                    name="Train data")

trace2 = go.Scatter(x=X[train_size:, 0],
                    y=y[train_size:],
                    mode='markers',
                    marker=dict(color='green'),
                    name="Test data")

# Dense grid over the feature range for the probability curves
X_ = np.linspace(0, 5, 100)

trace3 = go.Scatter(x=X_, 
                    y=gp_fix.predict_proba(X_[:, np.newaxis])[:, 1], 
                    mode='lines',
                    line=dict(color='red'),
                    name="Initial kernel: %s" % gp_fix.kernel_)

trace4 = go.Scatter(x=X_,
                    y=gp_opt.predict_proba(X_[:, np.newaxis])[:, 1],
                    mode='lines',
                    line=dict(color='blue'),
                    name="Optimized kernel: %s" % gp_opt.kernel_)

layout = go.Layout(yaxis=dict(title='Class 1 probability'),
                   xaxis=dict(title='Feature'),
                   hovermode='closest')

fig  = go.Figure(data=[trace1, trace2, trace3, trace4], 
                 layout=layout)
In [5]:
py.iplot(fig)
Out[5]:
[Interactive figure: predicted class-1 probability versus the feature, with train (black) and test (green) points and the probability curves for the initial (red) and optimized (blue) kernels.]

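If you would rather not use Plotly's online mode, the same figure can be rendered locally. A minimal sketch, assuming a Jupyter notebook session:

from plotly.offline import init_notebook_mode, iplot

init_notebook_mode()  # inject plotly.js into the notebook
iplot(fig)            # render the figure offline
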
Plot LML landscape

In [6]:
# Evaluate the LML on a grid of hyperparameters: theta0 is the magnitude of
# the constant kernel, theta1 the RBF length-scale.
theta0 = np.logspace(0, 8, 30)
theta1 = np.logspace(-1, 1, 29)
Theta0, Theta1 = np.meshgrid(theta0, theta1)
# log_marginal_likelihood expects log-transformed hyperparameters, hence np.log.
LML = [[gp_opt.log_marginal_likelihood(np.log([Theta0[i, j], Theta1[i, j]]))
        for i in range(Theta0.shape[0])] for j in range(Theta0.shape[1])]
LML = np.array(LML).T

# Black dots mark the two hyperparameter choices used in the first figure
trace1 = go.Scatter(x=[np.exp(gp_fix.kernel_.theta)[0]], 
                    y=[np.exp(gp_fix.kernel_.theta)[1]],
                    mode='markers',
                    marker=dict(color='black'),
                    showlegend=False
                   )

trace2 = go.Scatter(x=[np.exp(gp_opt.kernel_.theta)[0]], 
                    y=[np.exp(gp_opt.kernel_.theta)[1]],
                    mode='markers',
                    marker=dict(color='black'),
                    showlegend=False
                   )
trace3 = go.Heatmap(x=theta0, 
                    y=theta1, 
                    z=LML,
                    colorscale='Jet'
                   )
layout = go.Layout(title="Log-marginal-likelihood",
                   xaxis=dict(type='log', title="Magnitude"),
                   yaxis=dict(type='log', title="Length-scale")
                  )
fig  = go.Figure(data=[trace3, trace1, trace2], 
                 layout=layout)
In [7]:
py.iplot(fig)
Out[7]:
[Interactive figure: heatmap of the log-marginal-likelihood over the magnitude/length-scale grid, with black dots at the two hyperparameter choices.]

License

Authors:

    Jan Hendrik Metzen <jhm@informatik.uni-bremen.de>

License:

    BSD 3 clause