
# Cross-validation on Digits Dataset Exercise in Scikit-learn

A tutorial exercise using cross-validation with an SVM on the Digits dataset.

This exercise is used in the "Cross-validation generators" part of the "Model selection: choosing estimators and their parameters" section of the scikit-learn guide "A tutorial on statistical-learning for scientific data processing".
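To make the idea of a cross-validation generator concrete, here is a minimal pure-Python sketch of what such a generator yields: pairs of train and test indices, one pair per fold. The function name `k_fold_indices` is hypothetical and not part of scikit-learn. The library's own generators (such as `KFold`) offer more options, and for classifiers `cross_val_score` uses stratified folds by default, which this simplified version does not attempt.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for consecutive, unshuffled folds.

    Hypothetical helper for illustration only; not scikit-learn's implementation.
    """
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop


splits = list(k_fold_indices(n_samples=10, n_splits=3))
for train, test in splits:
    print("train:", train, "test:", test)
```

Every sample lands in exactly one test fold, so each observation is used for validation once and for training in all the other folds.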

### Version

In [1]:
import sklearn
sklearn.__version__

Out[1]:
'0.18.1'

### Imports

This tutorial imports cross_val_score, which fits an estimator on each cross-validation split and returns one score per split.

In [2]:
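print(__doc__)

# plotly.plotly is the online-mode plotting module of the plotly version used here
# (from plotly 4 onward it lives in the separate chart_studio package)
import plotly.plotly as py
import plotly.graph_objs as go

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn import datasets, svm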

Automatically created module for IPython interactive environment


### Calculations

In [3]:
digits = datasets.load_digits()
X = digits.data
y = digits.target

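# Linear-kernel SVM; its regularization parameter C is varied below
svc = svm.SVC(kernel='linear')
C_s = np.logspace(-10, 0, 10)  # 10 values of C from 1e-10 to 1

scores = list()
scores_std = list()
for C in C_s:
    svc.C = C
    # Default cv is used (3-fold in scikit-learn 0.18, the version shown above)
    this_scores = cross_val_score(svc, X, y, n_jobs=1)
    scores.append(np.mean(this_scores))
    scores_std.append(np.std(this_scores))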
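The loop above collects a mean score per value of C but does not pick one. As a rough sketch of that next step, the block below recomputes the mean scores and selects the C with the highest mean using `np.argmax`. The names `mean_scores`, `best_index`, and `best_C` are introduced here for illustration, and `cv=3` is set explicitly as an assumption so the result does not depend on the library's default, which differs between scikit-learn versions.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

digits = datasets.load_digits()
X = digits.data
y = digits.target

C_s = np.logspace(-10, 0, 10)

# Mean cross-validated accuracy for each candidate C.
# cv=3 is an explicit choice for this sketch, not taken from the original cell.
mean_scores = np.array([
    cross_val_score(svm.SVC(kernel='linear', C=C), X, y, cv=3).mean()
    for C in C_s
])

# np.argmax returns the first index of the maximum, so ties go to the smaller C.
best_index = int(np.argmax(mean_scores))
best_C = C_s[best_index]
print("best C: %.3g, mean CV score: %.3f" % (best_C, mean_scores[best_index]))
```

Choosing by the mean alone ignores the spread across folds; the standard deviations stored in `scores_std` in the original loop can help judge whether neighbouring values of C are meaningfully different.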


### Plot Results

In [4]:
# Mean CV score for each value of C
p1 = go.Scatter(x=C_s, y=scores,
                mode='lines',
                showlegend=False,
                )
# Mean plus one standard deviation (upper dashed line)
p2 = go.Scatter(x=C_s,
                y=np.array(scores) + np.array(scores_std),
                showlegend=False,
                mode='lines',
                line=dict(color='blue', dash='dash')
                )
# Mean minus one standard deviation (lower dashed line)
p3 = go.Scatter(x=C_s, y=np.array(scores) - np.array(scores_std),
                showlegend=False,
                mode='lines',
                line=dict(color='blue', dash='dash')
                )
layout = go.Layout(xaxis=dict(type='log',
                              title='Parameter C'),
                   yaxis=dict(title='CV score')
                   )
fig = go.Figure(data=[p1, p2, p3], layout=layout)

In [5]:
py.iplot(fig)

Out[5]: