# One-Class SVM with Non-Linear Kernel (RBF) in Scikit-learn

An example using a one-class SVM for novelty detection.

One-class SVM is an unsupervised algorithm that learns a decision function for novelty detection: classifying new data as similar to or different from the training set.
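
As a minimal sketch of what that decision function looks like in practice: `predict` returns +1 for points judged similar to the training set and -1 for points judged different, following the sign of `decision_function`. The toy data, the seed, the two query points, and the `demo_`-style names are assumptions chosen for illustration; they are not part of the example itself.

```python
import numpy as np
from sklearn import svm

# Assumption: a seeded toy cluster around the origin, for illustration only.
rng = np.random.RandomState(0)
X_demo = 0.3 * rng.randn(200, 2)

demo_clf = svm.OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1)
demo_clf.fit(X_demo)

# One query at the centre of the cluster, one far away from it.
queries = np.array([[0.0, 0.0], [4.0, 4.0]])

# predict: +1 = similar to the training set, -1 = different from it.
labels = demo_clf.predict(queries)
# decision_function: signed score, larger means more typical of the training set.
# ravel() keeps this working on older releases that return a column vector.
scores = demo_clf.decision_function(queries).ravel()

for point, label, score in zip(queries, labels, scores):
    print(point, label, score)
```

The score is useful when a hard +1/-1 label is too coarse, for example to rank new observations from most to least typical.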

### Version

In [1]:
import sklearn
sklearn.__version__
Out[1]:
'0.18.1'

### Imports

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
Automatically created module for IPython interactive environment

### Calculations

In [3]:
x_ = np.linspace(-5, 5, 500)
xx, yy = np.meshgrid(x_, x_)

# Generate train data
X = 0.3 * np.random.randn(100, 2)
X_train = np.r_[X + 2, X - 2]
# Generate some regular novel observations
X = 0.3 * np.random.randn(20, 2)
X_test = np.r_[X + 2, X - 2]
# Generate some abnormal novel observations
X_outliers = np.random.uniform(low=-4, high=4, size=(20, 2))

# fit the model
clf = svm.OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1)
clf.fit(X_train)
y_pred_train = clf.predict(X_train)
y_pred_test = clf.predict(X_test)
y_pred_outliers = clf.predict(X_outliers)
n_error_train = y_pred_train[y_pred_train == -1].size
n_error_test = y_pred_test[y_pred_test == -1].size
n_error_outliers = y_pred_outliers[y_pred_outliers == 1].size
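
The error counts above can also be read as rates. The following is a hedged sketch on a seeded copy of the training data. The seed and the `_demo` names are assumptions for illustration. Counting with a boolean mask gives the same number as the indexing form used above, and dividing by the sample size gives a rate that can be compared with `nu`, which scikit-learn documents as an upper bound on the fraction of training errors.

```python
import numpy as np
from sklearn import svm

# Assumption: seeded data so that the sketch is reproducible.
rng = np.random.RandomState(42)
X_base = 0.3 * rng.randn(100, 2)
X_train_demo = np.r_[X_base + 2, X_base - 2]  # two clusters, 200 points in total

nu = 0.1
clf_demo = svm.OneClassSVM(nu=nu, kernel="rbf", gamma=0.1).fit(X_train_demo)
y_pred_demo = clf_demo.predict(X_train_demo)

# Two equivalent ways to count training points flagged as novel (-1).
n_error_indexing = y_pred_demo[y_pred_demo == -1].size
n_error_mask = int(np.count_nonzero(y_pred_demo == -1))

# Express the count as a rate and print it next to nu for comparison.
train_error_rate = n_error_mask / float(len(X_train_demo))
print("training error rate: %.3f (nu = %.2f)" % (train_error_rate, nu))
```

Because the solver is numerical, the observed rate may sit slightly above or below `nu` on a finite sample; it is best read as a rough target rather than an exact guarantee.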

### Plot Results

Plot the contours of the decision function, the training observations, and the new regular and abnormal observations.

In [4]:
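def matplotlib_to_plotly(cmap, pl_entries):
    # Sample the matplotlib colormap at pl_entries evenly spaced positions
    # and convert each sample to a Plotly [position, 'rgb(r, g, b)'] pair.
    h = 1.0/(pl_entries-1)
    pl_colorscale = []

    for k in range(pl_entries):
        # list() is required on Python 3, where map() returns an iterator;
        # int keeps the string free of NumPy scalar wrappers.
        C = list(map(int, np.array(cmap(k*h)[:3])*255))
        pl_colorscale.append([k*h, 'rgb'+str((C[0], C[1], C[2]))])

    return pl_colorscale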

cmap = matplotlib_to_plotly(plt.cm.PuBu, 5)
In [7]:
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

p1 = go.Contour(x=x_, y=x_, z=Z,
                colorscale=cmap,
                showscale=False)

p2 = go.Scatter(x=X_train[:, 0], y=X_train[:, 1],
                mode='markers',
                marker=dict(color='white',
                            line=dict(color='black', width=1)),
                name="training observations")

p3 = go.Scatter(x=X_test[:, 0], y=X_test[:, 1],
                mode='markers',
                marker=dict(color='blueviolet',
                            line=dict(color='black', width=1)),
                name="new regular observations")

p4 = go.Scatter(x=X_outliers[:, 0], y=X_outliers[:, 1],
                mode='markers',
                marker=dict(color='gold',
                            line=dict(color='black', width=1)),
                name="new abnormal observations")

layout = go.Layout(title="Novelty Detection",
                   xaxis=dict(title=
                              "error train: %d/200 ; errors novel regular: %d/40 ; "
                              "errors novel abnormal: %d/40"
                              % (n_error_train, n_error_test, n_error_outliers)))

fig = go.Figure(data=[p1, p2, p3, p4], layout=layout)
In [8]:
py.iplot(fig)
Out[8]: