
# Shrinkage Covariance Estimation: Ledoit-Wolf vs OAS and Maximum Likelihood in Scikit-learn

When working with covariance estimation, the usual approach is to use a maximum likelihood estimator, such as sklearn.covariance.EmpiricalCovariance. It is asymptotically unbiased, i.e. it converges to the true (population) covariance when given many observations. However, it can also be beneficial to regularize it in order to reduce its variance; this, in turn, introduces some bias. This example illustrates the simple regularization used in shrunk covariance estimators. In particular, it focuses on how to set the amount of regularization, i.e. how to choose the bias-variance trade-off.
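As a point of reference, the shrunk estimator regularizes the empirical covariance by pulling it toward a scaled identity target via a convex combination. A minimal NumPy sketch of this formula (standalone, following the convention used by sklearn.covariance.ShrunkCovariance, not the estimator class itself):

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.normal(size=(20, 5))

# Maximum likelihood (empirical) covariance
emp_cov = np.cov(X, rowvar=False, bias=True)

# Shrinkage pulls the estimate toward the scaled identity target mu * I,
# with mu = trace(emp_cov) / n_features, via a convex combination.
shrinkage = 0.1
n_features = emp_cov.shape[0]
mu = np.trace(emp_cov) / n_features
shrunk_cov = (1. - shrinkage) * emp_cov + shrinkage * mu * np.eye(n_features)
```

Note that this combination preserves the trace of the empirical covariance while damping the spread of its eigenvalues, which is what reduces the estimator's variance.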

Here we compare 3 approaches:

• Setting the parameter by cross-validating the likelihood on three folds according to a grid of potential shrinkage parameters.
• A closed formula proposed by Ledoit and Wolf to compute the asymptotically optimal regularization parameter (minimizing an MSE criterion), yielding the sklearn.covariance.LedoitWolf covariance estimate.
• An improvement of the Ledoit-Wolf shrinkage, the sklearn.covariance.OAS, proposed by Chen et al. Its convergence is significantly better under the assumption that the data are Gaussian, in particular for small samples.

To quantify estimation error, we plot the likelihood of unseen data for different values of the shrinkage parameter. We also show the choices made by cross-validation and by the LedoitWolf and OAS estimates.

Note that the maximum likelihood estimate corresponds to no shrinkage, and thus performs poorly. The Ledoit-Wolf estimate performs well, as it is close to the optimum and is not computationally costly. In this example, the OAS estimate is a bit further away. Interestingly, both approaches outperform cross-validation, which is significantly more computationally costly.

#### New to Plotly?

You can set up Plotly to work in online or offline mode, or in jupyter notebooks.
We also have a quick-reference cheatsheet (new!) to help you get started!

### Version

In [1]:
import sklearn
sklearn.__version__
Out[1]:
'0.18'

### Imports

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go

import numpy as np
import matplotlib.pyplot as plt
from scipy import linalg

from sklearn.covariance import LedoitWolf, OAS, ShrunkCovariance, \
    log_likelihood, empirical_covariance
from sklearn.model_selection import GridSearchCV
Automatically created module for IPython interactive environment

### Calculations

Generate sample data.

In [3]:
n_features, n_samples = 40, 20
np.random.seed(42)
base_X_train = np.random.normal(size=(n_samples, n_features))
base_X_test = np.random.normal(size=(n_samples, n_features))

# Color samples
coloring_matrix = np.random.normal(size=(n_features, n_features))
X_train = np.dot(base_X_train, coloring_matrix)
X_test = np.dot(base_X_test, coloring_matrix)
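The "coloring" step works because multiplying white (identity-covariance) samples by a matrix C yields samples whose population covariance is C^T C, which is the real_cov used below. A quick standalone sanity check with many samples:

```python
import numpy as np

rng = np.random.RandomState(0)
C = rng.normal(size=(3, 3))            # coloring matrix
Z = rng.normal(size=(200000, 3))       # white samples, identity covariance
X = Z @ C                              # colored samples

true_cov = C.T @ C                     # population covariance of the rows of X
sample_cov = np.cov(X, rowvar=False)   # approaches true_cov as n grows
```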

Compute the likelihood on test data.

In [4]:
# spanning a range of possible shrinkage coefficient values
shrinkages = np.logspace(-2, 0, 30)
negative_logliks = [-ShrunkCovariance(shrinkage=s).fit(X_train).score(X_test)
                    for s in shrinkages]

# under the ground-truth model, which we would not have access to in real
# settings
real_cov = np.dot(coloring_matrix.T, coloring_matrix)
emp_cov = empirical_covariance(X_train)
loglik_real = -log_likelihood(emp_cov, linalg.inv(real_cov))

Compare different approaches to setting the parameter.

In [5]:
# GridSearch for an optimal shrinkage coefficient
tuned_parameters = [{'shrinkage': shrinkages}]
cv = GridSearchCV(ShrunkCovariance(), tuned_parameters)
cv.fit(X_train)

# Ledoit-Wolf optimal shrinkage coefficient estimate
lw = LedoitWolf()
loglik_lw = lw.fit(X_train).score(X_test)

# OAS coefficient estimate
oa = OAS()
loglik_oa = oa.fit(X_train).score(X_test)
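After fitting, the coefficient chosen by each closed formula is exposed as the estimator's shrinkage_ attribute. scikit-learn also provides ledoit_wolf and oas functions that return the shrunk covariance and the coefficient directly, without constructing an estimator object. A minimal sketch:

```python
import numpy as np
from sklearn.covariance import ledoit_wolf, oas

rng = np.random.RandomState(42)
X = rng.normal(size=(20, 40))

# Each helper returns the shrunk covariance matrix together with the
# shrinkage coefficient chosen by its closed formula.
lw_cov, lw_shrinkage = ledoit_wolf(X)
oas_cov, oas_shrinkage = oas(X)
```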

### Plot Results

In [6]:
# range shrinkage curve
shrinkage1 = go.Scatter(x=shrinkages,
                        y=negative_logliks,
                        mode='lines',
                        name="Negative log-likelihood")

shrinkage2 = go.Scatter(x=[min(shrinkages), max(shrinkages)],
                        y=2 * [loglik_real],
                        mode='lines',
                        line=dict(color='red',
                                  dash='dash',
                                  width=1),
                        name="Real covariance likelihood")

lik_max = np.amax(negative_logliks)
lik_min = np.amin(negative_logliks)
ymin = lik_min - 6. * np.log(lik_max - lik_min)
ymax = lik_max + 10. * np.log(lik_max - lik_min)
xmin = shrinkages[0]
xmax = shrinkages[-1]

# LW likelihood
lw_likelihood = go.Scatter(x=[lw.shrinkage_, lw.shrinkage_],
                           y=[ymin, -loglik_lw],
                           mode='lines',
                           line=dict(color='magenta',
                                     width=4),
                           name='Ledoit-Wolf estimate')
# OAS likelihood
oas_likelihood = go.Scatter(x=[oa.shrinkage_, oa.shrinkage_],
                            y=[ymin, -loglik_oa],
                            mode='lines',
                            line=dict(color='purple',
                                      width=4),
                            name='OAS estimate')
# best CV estimator likelihood
cv_estimator = go.Scatter(x=[cv.best_estimator_.shrinkage, cv.best_estimator_.shrinkage],
                          y=[ymin, -cv.best_estimator_.score(X_test)],
                          mode='lines',
                          line=dict(color='cyan',
                                    width=4),
                          name='Cross-validation best estimate')

layout = go.Layout(title="Regularized covariance: likelihood and shrinkage coefficient",
                   xaxis=dict(title='Regularization parameter: shrinkage coefficient',
                              type='log',
                              showgrid=False),
                   yaxis=dict(title='Error: negative log-likelihood on test data',
                              type='log',
                              showgrid=False))

data = [shrinkage1, shrinkage2, lw_likelihood, cv_estimator, oas_likelihood]

fig = go.Figure(data=data, layout=layout)
In [7]:
py.iplot(fig)
Out[7]: