
Gaussian Process Regression with Noise-Level Estimation in Scikit-learn

This example illustrates that a GPR with a sum kernel that includes a WhiteKernel can estimate the noise level of the data. An illustration of the log-marginal-likelihood (LML) landscape shows that there are two local maxima of the LML. The first corresponds to a model with a high noise level and a large length scale, which explains all variation in the data as noise. The second has a smaller noise level and a shorter length scale, attributing most of the variation to the noise-free functional relationship. The second model has the higher likelihood; however, depending on the initial values of the hyperparameters, the gradient-based optimization may converge to the high-noise solution instead. It is therefore important to repeat the optimization several times from different initializations.
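Scikit-learn can handle these restarts for you: `GaussianProcessRegressor` accepts an `n_restarts_optimizer` parameter that re-runs the LML optimization from randomly drawn initial hyperparameters and keeps the best optimum found. A minimal sketch, using the same noisy-sine data generated later in this example:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Noisy sine data, as generated later in this example
rng = np.random.RandomState(0)
X = rng.uniform(0, 5, 20)[:, np.newaxis]
y = 0.5 * np.sin(3 * X[:, 0]) + rng.normal(0, 0.5, X.shape[0])

kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=1.0)
# n_restarts_optimizer re-runs the LML optimization from random
# initializations and keeps the hyperparameters with the best LML
gp = GaussianProcessRegressor(kernel=kernel, alpha=0.0,
                              n_restarts_optimizer=9,
                              random_state=0).fit(X, y)
print(gp.kernel_)                                    # fitted hyperparameters
print(gp.log_marginal_likelihood(gp.kernel_.theta))  # LML at the optimum
```

With restarts enabled, the fitted `WhiteKernel` noise level should land near the true noise variance (0.25 here) rather than at the noise-dominated optimum.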

New to Plotly?

Plotly's Python library is free and open source! Get started by downloading the client and reading the primer.
You can set up Plotly to work in online or offline mode, or in Jupyter notebooks.
We also have a quick-reference cheatsheet (new!) to help you get started!

Version

In [1]:
import sklearn
sklearn.__version__
Out[1]:
'0.18.1'

Imports

This tutorial imports GaussianProcessRegressor, RBF, and WhiteKernel.

In [2]:
import plotly.plotly as py
import plotly.graph_objs as go

import numpy as np
from matplotlib import pyplot as plt
from matplotlib.colors import LogNorm

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

Calculations

In [3]:
rng = np.random.RandomState(0)
X = rng.uniform(0, 5, 20)[:, np.newaxis]
y = 0.5 * np.sin(3 * X[:, 0]) + rng.normal(0, 0.5, X.shape[0])

First Run

In [4]:
kernel = 1.0 * RBF(length_scale=100.0, length_scale_bounds=(1e-2, 1e3)) \
    + WhiteKernel(noise_level=1, noise_level_bounds=(1e-10, 1e+1))
gp = GaussianProcessRegressor(kernel=kernel,
                              alpha=0.0).fit(X, y)
X_ = np.linspace(0, 5, 100)
y_mean, y_cov = gp.predict(X_[:, np.newaxis], return_cov=True)

# Mean prediction
trace1 = go.Scatter(x=X_, y=y_mean,
                    mode='lines',
                    line=dict(color='black'),
                    showlegend=False
                   )

# Lower confidence bound (mean - 1 std); drawn first, no fill
trace2 = go.Scatter(x=X_, y=y_mean - np.sqrt(np.diag(y_cov)),
                    mode='lines',
                    line=dict(color='black'),
                    showlegend=False
                   )

# Upper bound; fill='tonexty' shades down to the previous trace (trace2)
trace3 = go.Scatter(x=X_, y=y_mean + np.sqrt(np.diag(y_cov)),
                    mode='lines',
                    line=dict(color='black'),
                    showlegend=False,
                    fill='tonexty'
                   )

trace4 = go.Scatter(x=X_, y=0.5*np.sin(3*X_), 
                    mode='lines',
                    line=dict(color='red'),
                    showlegend=False 
                   )

trace5 = go.Scatter(x=X[:, 0], y=y,
                    mode='markers',
                    marker=dict(color='red'),
                    showlegend=False
                   )
data = [trace2, trace3, trace1, trace4, trace5]
layout = go.Layout(title="Initial: %s<br>Optimum: %s<br>Log-Marginal-Likelihood: %s"
                          % (kernel, gp.kernel_,
                          gp.log_marginal_likelihood(gp.kernel_.theta)))
fig = go.Figure(data=data, layout=layout)
In [5]:
py.iplot(fig)
Out[5]:

Second Run

In [6]:
kernel = 1.0 * RBF(length_scale=1.0, length_scale_bounds=(1e-2, 1e3)) \
    + WhiteKernel(noise_level=1e-5, noise_level_bounds=(1e-10, 1e+1))
gp = GaussianProcessRegressor(kernel=kernel,
                              alpha=0.0).fit(X, y)
X_ = np.linspace(0, 5, 100)
y_mean, y_cov = gp.predict(X_[:, np.newaxis], return_cov=True)

# Mean prediction
trace1 = go.Scatter(x=X_, y=y_mean,
                    mode='lines',
                    line=dict(color='black'),
                    showlegend=False
                   )

# Lower confidence bound (mean - 1 std); drawn first, no fill
trace2 = go.Scatter(x=X_, y=y_mean - np.sqrt(np.diag(y_cov)),
                    mode='lines',
                    line=dict(color='black'),
                    showlegend=False
                   )

# Upper bound; fill='tonexty' shades down to the previous trace (trace2)
trace3 = go.Scatter(x=X_, y=y_mean + np.sqrt(np.diag(y_cov)),
                    mode='lines',
                    line=dict(color='black'),
                    showlegend=False,
                    fill='tonexty'
                   )

trace4 = go.Scatter(x=X_, y=0.5*np.sin(3*X_), 
                    mode='lines',
                    line=dict(color='red'),
                    showlegend=False 
                   )

trace5 = go.Scatter(x=X[:, 0], y=y,
                    mode='markers',
                    marker=dict(color='red'),
                    showlegend=False
                   )
data = [trace2, trace3, trace1, trace4, trace5]
layout = go.Layout(title="Initial: %s<br>Optimum: %s<br>Log-Marginal-Likelihood: %s"
                          % (kernel, gp.kernel_,
                          gp.log_marginal_likelihood(gp.kernel_.theta)))
fig = go.Figure(data=data, layout=layout)
In [7]:
py.iplot(fig)
Out[7]:
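The two runs converge to different local optima of the LML. As a quick numeric check, one can refit from both initializations and compare the log-marginal-likelihoods at the converged optima; a sketch (the helper `fitted_lml` is introduced here for illustration):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Same noisy-sine data as in the example
rng = np.random.RandomState(0)
X = rng.uniform(0, 5, 20)[:, np.newaxis]
y = 0.5 * np.sin(3 * X[:, 0]) + rng.normal(0, 0.5, X.shape[0])

def fitted_lml(length_scale, noise_level):
    """Fit a GP from the given initial hyperparameters and return
    the log-marginal-likelihood at the converged optimum."""
    kernel = 1.0 * RBF(length_scale=length_scale,
                       length_scale_bounds=(1e-2, 1e3)) \
        + WhiteKernel(noise_level=noise_level,
                      noise_level_bounds=(1e-10, 1e+1))
    gp = GaussianProcessRegressor(kernel=kernel, alpha=0.0).fit(X, y)
    return gp.log_marginal_likelihood(gp.kernel_.theta)

lml_first = fitted_lml(100.0, 1.0)   # first run's initialization
lml_second = fitted_lml(1.0, 1e-5)   # second run's initialization
print(lml_first, lml_second)         # the second value should be larger
```

This makes the point of the example concrete: the low-noise solution reached in the second run has the higher likelihood, even though gradient-based optimization from the first initialization never finds it.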

Plot LML landscape

In [8]:
theta0 = np.logspace(-2, 3, 49)
theta1 = np.logspace(-2, 0, 50)
Theta0, Theta1 = np.meshgrid(theta0, theta1)
LML = [[gp.log_marginal_likelihood(np.log([0.36, Theta0[i, j], Theta1[i, j]]))
        for i in range(Theta0.shape[0])] for j in range(Theta0.shape[1])]
LML = np.array(LML).T

vmin, vmax = (-LML).min(), (-LML).max()
vmax = 50

trace = go.Contour(x=theta0,
                   y=theta1,
                   z=-LML,
                   zmin=vmin, zmax=vmax,  # clip at vmax=50 so the contours stay readable
                   ncontours=50,          # ncontours expects an integer count, not an array
                   contours=dict(coloring='lines')
                  )
layout = go.Layout(title="Log-marginal-likelihood",
                   xaxis=dict(type='log', title="Length-scale",
                              showgrid=False),
                   yaxis=dict(type='log', title="Noise-level",
                              showgrid=False)
                  )
fig = go.Figure(data=[trace], layout=layout)
In [10]:
py.iplot(fig)
Out[10]:

License

Authors:

    Jan Hendrik Metzen <jhm@informatik.uni-bremen.de>

License:

    BSD 3 clause