
# Feature Selection Using SelectFromModel and LassoCV in Scikit-learn

Use the SelectFromModel meta-transformer together with LassoCV to select the two features of the Boston dataset with the largest absolute Lasso coefficients.
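Lasso suits feature selection because its L1 penalty can shrink coefficients exactly to zero. A minimal sketch of that effect, assuming the simplified case of orthonormal features, where the Lasso solution reduces to soft-thresholding the least-squares coefficients. The helper `soft_threshold` and the coefficient values are made up for illustration and are not part of scikit-learn.

```python
import numpy as np


def soft_threshold(coefs, alpha):
    """Shrink each coefficient toward zero by alpha, clipping at zero."""
    return np.sign(coefs) * np.maximum(np.abs(coefs) - alpha, 0.0)


# Made-up least-squares coefficients, for illustration only.
ols_coefs = np.array([3.0, -0.2, 0.05, -1.5])

# Coefficients smaller in magnitude than alpha are set exactly to zero;
# the larger ones are kept but shrunk by alpha.
lasso_coefs = soft_threshold(ols_coefs, alpha=0.5)
print(lasso_coefs)
```

The two small coefficients vanish while the two large ones survive, which is the sparsity that SelectFromModel then exploits.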


### Version

In [1]:
import sklearn
sklearn.__version__

Out[1]:
'0.18.1'
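The `load_boston` loader used in this tutorial was deprecated in scikit-learn 1.0 and removed in 1.2, so the cells below run as written only on older releases such as the one shown above. A minimal sketch of a version guard follows; `has_load_boston` is a hypothetical helper, not a scikit-learn function, and it assumes a plain numeric `major.minor` prefix in the version string.

```python
def has_load_boston(version):
    """Return True if this scikit-learn version string predates 1.2.

    Hypothetical helper for illustration; it only inspects the leading
    major and minor numbers, e.g. '0.18.1' -> (0, 18).
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) < (1, 2)


print(has_load_boston("0.18.1"))
print(has_load_boston("1.2.0"))
```

In practice the string passed in would be `sklearn.__version__`.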

### Imports

This tutorial imports load_boston, SelectFromModel and LassoCV.

In [2]:
import plotly.plotly as py
import plotly.graph_objs as go

import numpy as np
from sklearn.datasets import load_boston
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV


### Calculations

In [3]:
# Load the Boston dataset.
boston = load_boston()
X, y = boston['data'], boston['target']

# We use the base estimator LassoCV since the L1 norm promotes sparsity of features.
clf = LassoCV()

# Set a minimum threshold of 0.25
sfm = SelectFromModel(clf, threshold=0.25)
sfm.fit(X, y)
n_features = sfm.transform(X).shape[1]

# Raise the threshold until at most two features remain.
# Note that the attribute can be set directly instead of repeatedly
# refitting the meta-transformer.
while n_features > 2:
    sfm.threshold += 0.1
    X_transform = sfm.transform(X)
    n_features = X_transform.shape[1]
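SelectFromModel keeps a feature when its importance, here the absolute value of its Lasso coefficient, is at least the threshold. The loop above can be sketched with NumPy alone; the `importances` values are made up to stand in for `abs(clf.coef_)` and are not taken from the Boston fit.

```python
import numpy as np

# Made-up absolute coefficient values standing in for abs(clf.coef_).
importances = np.array([0.02, 0.90, 0.31, 0.00, 0.47, 1.20])

threshold = 0.25
mask = importances >= threshold      # features kept at this threshold
while mask.sum() > 2:
    threshold += 0.1                 # same step size as the loop above
    mask = importances >= threshold

selected = np.flatnonzero(mask)      # column indices that survive
print(threshold, selected)
```

Because only the mask is recomputed, no model is refitted between steps, which is why the tutorial can adjust `sfm.threshold` directly.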


### Plot Results

Plot the two selected features of X against each other.

In [4]:
layout = go.Layout(title="Features selected from Boston using SelectFromModel with "
                         "threshold %0.3f." % sfm.threshold,
                   xaxis=dict(title="Feature number 1"),
                   yaxis=dict(title="Feature number 2")
                  )

feature1 = X_transform[:, 0]
feature2 = X_transform[:, 1]

trace = go.Scatter(x=feature1, y=feature2,
                   mode='markers',
                   marker=dict(color='red')
                  )

fig = go.Figure(data=[trace], layout=layout)

In [5]:
py.iplot(fig)

Out[5]:

Author:

    Manoj Kumar <mks542@nyu.edu>

License:

    BSD 3 clause