
# Joint Feature Selection with Multi-Task Lasso in Scikit-learn

The multi-task lasso fits multiple regression problems jointly while enforcing that the selected features are the same across tasks. This example simulates sequential measurements: each task is a time instant, and the relevant features vary in amplitude over time while remaining the same set. The multi-task lasso imposes that features selected at one time point are selected at all time points, which makes feature selection by the Lasso more stable.
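
For reference, the objective that the scikit-learn documentation gives for `MultiTaskLasso` can be written as below. Here $X$ is the design matrix, $Y$ holds one column of targets per task, and $W$ is the coefficient matrix with one row per feature; these symbols follow the scikit-learn documentation, not the variable names used in this example's code (where `coef_` is stored with one row per task).

$$
\min_{W} \; \frac{1}{2\, n_{\text{samples}}} \lVert Y - XW \rVert_{\text{Fro}}^{2} + \alpha \lVert W \rVert_{21},
\qquad
\lVert W \rVert_{21} = \sum_{i} \sqrt{\sum_{j} w_{ij}^{2}}
$$

The mixed-norm penalty sums the Euclidean norms of the rows of $W$, so shrinking a row to zero removes that feature from every task at once. This is what ties the selected features together across time points.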

### Version

In [1]:
import sklearn
sklearn.__version__

Out[1]:
'0.18.1'

### Imports

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go
from plotly import tools

import numpy as np
from sklearn.linear_model import MultiTaskLasso, Lasso

Automatically created module for IPython interactive environment


### Calculations

In [3]:
rng = np.random.RandomState(42)

# Generate some 2D coefficients with sine waves with random frequency and phase
n_samples, n_features, n_tasks = 100, 30, 40
n_relevant_features = 5
# One row per task (time instant), one column per feature
coef = np.zeros((n_tasks, n_features))
times = np.linspace(0, 2 * np.pi, n_tasks)
for k in range(n_relevant_features):
    coef[:, k] = np.sin((1. + rng.randn(1)) * times + 3 * rng.randn(1))

X = rng.randn(n_samples, n_features)
Y = np.dot(X, coef.T) + rng.randn(n_samples, n_tasks)

# Fit one independent Lasso per task, then a single MultiTaskLasso on all tasks
coef_lasso_ = np.array([Lasso(alpha=0.5).fit(X, y).coef_ for y in Y.T])
coef_multi_task_lasso_ = MultiTaskLasso(alpha=1.).fit(X, Y).coef_
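
The joint-selection property can be checked numerically without any plotting. The sketch below refits both estimators on a smaller simulated problem and tests whether each feature is selected for all tasks or for none. The problem sizes, the seed, and the helper name `support_per_task` are illustrative assumptions and not part of the original example.

```python
import numpy as np
from sklearn.linear_model import Lasso, MultiTaskLasso

rng = np.random.RandomState(0)

# Smaller problem than the main example (sizes are an illustrative assumption)
n_samples, n_features, n_tasks, n_relevant = 50, 10, 8, 3
coef_true = np.zeros((n_tasks, n_features))
coef_true[:, :n_relevant] = rng.randn(n_tasks, n_relevant)

X = rng.randn(n_samples, n_features)
Y = X @ coef_true.T + rng.randn(n_samples, n_tasks)


def support_per_task(coef):
    """Hypothetical helper: boolean mask of non-zero coefficients, one row per task."""
    return coef != 0


lasso_support = support_per_task(
    np.array([Lasso(alpha=0.5).fit(X, y).coef_ for y in Y.T]))
mtl_support = support_per_task(MultiTaskLasso(alpha=1.0).fit(X, Y).coef_)

# A support is "consistent" if every feature (column) is selected
# either for all tasks or for no task.
mtl_consistent = all(col.all() or not col.any() for col in mtl_support.T)
lasso_consistent = all(col.all() or not col.any() for col in lasso_support.T)

print("MultiTaskLasso support identical across tasks:", mtl_consistent)
print("Lasso support identical across tasks:", lasso_consistent)
```

The multi-task check is expected to hold by construction of the penalty, whereas the independent Lasso fits carry no such guarantee and their result depends on the data and on `alpha`.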


### Plot Results

In [4]:
fig = tools.make_subplots(rows=1, cols=2,
                          print_grid=False)

# Left panel: coefficients from the independent Lasso fits (rows are tasks)
trace1 = go.Heatmap(z=coef_lasso_,
                    colorscale=[[0, 'white'], [1, 'black']],
                    showscale=False)

fig.append_trace(trace1, 1, 1)

# Right panel: coefficients from the joint MultiTaskLasso fit
trace2 = go.Heatmap(z=coef_multi_task_lasso_,
                    colorscale=[[0, 'white'], [1, 'black']],
                    showscale=False)
fig.append_trace(trace2, 1, 2)

fig['layout']['xaxis1'].update(title='Feature')
fig['layout']['xaxis2'].update(title='Feature')

fig['layout'].update(title='Coefficient non-zero location',
                     annotations=[
                         dict(x=0.2, y=.5,
                              xref='paper', yref='paper',
                              text='Lasso', showarrow=False
                              ),
                         dict(x=0.85, y=.5,
                              xref='paper', yref='paper',
                              text='MultiTaskLasso', showarrow=False
                              )
                     ])

In [5]:
py.iplot(fig)

Out[5]:

In [6]:
lw = 2
feature_to_plot = 0

# Time course of one relevant feature: ground truth versus both estimators
p1 = go.Scatter(y=coef[:, feature_to_plot],
                mode='lines',
                line=dict(color='seagreen', width=lw),
                name='Ground truth')
p2 = go.Scatter(y=coef_lasso_[:, feature_to_plot],
                mode='lines',
                line=dict(color='cornflowerblue', width=lw),
                name='Lasso')
p3 = go.Scatter(y=coef_multi_task_lasso_[:, feature_to_plot],
                mode='lines',
                line=dict(color='gold', width=lw),
                name='MultiTaskLasso')

data = [p1, p2, p3]

In [7]:
py.iplot(data)

Out[7]:
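
The line plot compares the estimated time course of one feature against the ground truth. As a rough numerical companion, the sketch below repeats the simulation from the Calculations cell and reports the root-mean-square error of each estimate for that feature. The helper `rmse` is a hypothetical name introduced here for illustration, and no particular ranking of the two estimators is asserted.

```python
import numpy as np
from sklearn.linear_model import Lasso, MultiTaskLasso

rng = np.random.RandomState(42)

# Same simulation setup as the Calculations cell
n_samples, n_features, n_tasks = 100, 30, 40
n_relevant_features = 5
coef = np.zeros((n_tasks, n_features))
times = np.linspace(0, 2 * np.pi, n_tasks)
for k in range(n_relevant_features):
    coef[:, k] = np.sin((1. + rng.randn(1)) * times + 3 * rng.randn(1))

X = rng.randn(n_samples, n_features)
Y = np.dot(X, coef.T) + rng.randn(n_samples, n_tasks)

coef_lasso_ = np.array([Lasso(alpha=0.5).fit(X, y).coef_ for y in Y.T])
coef_multi_task_lasso_ = MultiTaskLasso(alpha=1.).fit(X, Y).coef_


def rmse(estimate, truth):
    """Hypothetical helper: root-mean-square error between two 1D arrays."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))


feature_to_plot = 0
rmse_lasso = rmse(coef_lasso_[:, feature_to_plot], coef[:, feature_to_plot])
rmse_mtl = rmse(coef_multi_task_lasso_[:, feature_to_plot],
                coef[:, feature_to_plot])

print("RMSE over time, Lasso:          %.3f" % rmse_lasso)
print("RMSE over time, MultiTaskLasso: %.3f" % rmse_mtl)
```

Changing `feature_to_plot` to another index below `n_relevant_features` gives the same comparison for the other relevant features.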

Author:

    Alexandre Gramfort <alexandre.gramfort@inria.fr>

License:

    BSD 3 clause