Show Sidebar Hide Sidebar

Automatic Relevance Determination Regression in Scikit-learn

Fit regression model with Bayesian Ridge Regression.

See Bayesian Ridge Regression for more information on the regressor.

Compared to the OLS (ordinary least squares) estimator, the coefficient weights are slightly shifted toward zeros, which stabilises them.

The histogram of the estimated weights is very peaked, as a sparsity-inducing prior is implied on the weights. The estimation of the model is done by iteratively maximizing the marginal log-likelihood of the observations.

New to Plotly?

Plotly's Python library is free and open source! Get started by downloading the client and reading the primer.
You can set up Plotly to work in online or offline mode, or in jupyter notebooks.
We also have a quick-reference cheatsheet (new!) to help you get started!

Version

In [1]:
import sklearn
sklearn.__version__
Out[1]:
'0.18.1'

Imports

This tutorial imoprts ARDRegression and LinearRegression.

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go

import numpy as np
from scipy import stats

from sklearn.linear_model import ARDRegression, LinearRegression
Automatically created module for IPython interactive environment

Calculations

Generating simulated data with Gaussian weights

In [3]:
# Parameters of the example
np.random.seed(0)
n_samples, n_features = 100, 100
# Create Gaussian data
X = np.random.randn(n_samples, n_features)
# Create weights with a precision lambda_ of 4.
lambda_ = 4.
w = np.zeros(n_features)
# Only keep 10 weights of interest
relevant_features = np.random.randint(0, n_features, 10)
for i in relevant_features:
    w[i] = stats.norm.rvs(loc=0, scale=1. / np.sqrt(lambda_))
# Create noise with a precision alpha of 50.
alpha_ = 50.
noise = stats.norm.rvs(loc=0, scale=1. / np.sqrt(alpha_), size=n_samples)
# Create the target
y = np.dot(X, w) + noise

Fit the ARD Regression

In [4]:
clf = ARDRegression(compute_score=True)
clf.fit(X, y)

ols = LinearRegression()
ols.fit(X, y)
Out[4]:
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

Plot Results

Weights of the model

In [8]:
lw = 2

p1 = go.Scatter(y=clf.coef_,
                mode='lines', 
                line=dict(color='darkblue', width=lw),
                name="ARD estimate")

p2 = go.Scatter(y=ols.coef_, 
                mode='lines', 
                line=dict(color='yellowgreen', width=lw,  dash='dot'), 
                name="Ground truth")

p3 = go.Scatter(y=w, 
                mode='lines', 
                line=dict(color='orange',),
                name="OLS estimate")

layout = go.Layout(title="Weights of the model",
                   xaxis=dict(title="Features"),
                   yaxis=dict(title="Values of the weights")
                  )
fig = go.Figure(data=[p1, p2, p3], layout=layout)
py.iplot(fig)
Out[8]:

Histogram of the weights

In [6]:
p1 = go.Histogram(x=clf.coef_, nbinsx=n_features, 
                  marker=dict(color='navy', 
                              line=dict(color='black', width=1)),
                  showlegend=False)

p2 = go.Scatter(x=clf.coef_[relevant_features], y=5 * np.ones(len(relevant_features)),
                mode='markers',
                marker=dict(color='gold'),
                name="Relevant features")

layout = go.Layout(title="Histogram of the weights",
                   xaxis=dict(title="Features"),
                   yaxis=dict(title="Values of the weights",
                             type='log')
                  )
fig = go.Figure(data=[p1, p2], layout=layout)
py.iplot(fig)
Out[6]:

Marginal log-likelihood

In [7]:
p1 = go.Scatter(y=clf.scores_, 
                mode='lines', 
                line=dict(color='navy', width=lw)
                )

layout = go.Layout(title="Marginal log-likelihood",
                   xaxis=dict(title="Iterations"),
                   yaxis=dict(title="Score")
                  )

fig = go.Figure(data=[p1], layout=layout)
py.iplot(fig)
Out[7]:
Still need help?
Contact Us

For guaranteed 24 hour response turnarounds, upgrade to a Developer Support Plan.