
# Comparison of Manifold Learning Methods in Scikit-learn

An illustration of dimensionality reduction on the S-curve dataset with various manifold learning methods.

For a discussion and comparison of these algorithms, see the manifold module page.

For a similar example, where the methods are applied to a sphere dataset, see "Manifold Learning methods on a severed sphere".

Note that the purpose of MDS is to find a low-dimensional representation of the data (here 2D) in which the distances respect well the distances in the original high-dimensional space. Unlike other manifold-learning algorithms, it does not seek an isotropic representation of the data in the low-dimensional space.
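
One way to make the distance-preservation idea above concrete is to compare pairwise distances before and after an embedding. The following is a minimal sketch and not part of the original example: `pairwise_distance_agreement` is a hypothetical helper name, and the Pearson correlation it returns is only one of several possible measures (MDS itself minimizes a stress function rather than this correlation).

```python
import numpy as np


def pairwise_distances(A):
    """Return the Euclidean distances between all unordered pairs of rows of A."""
    A = np.asarray(A, dtype=float)
    diff = A[:, None, :] - A[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep each unordered pair once (strict upper triangle).
    iu = np.triu_indices(len(A), k=1)
    return D[iu]


def pairwise_distance_agreement(X_high, Y_low):
    """Pearson correlation between original and embedded pairwise distances.

    Hypothetical helper for illustration: values near 1 suggest that the
    embedding keeps the distance structure of the original data.
    """
    d_high = pairwise_distances(X_high)
    d_low = pairwise_distances(Y_low)
    return float(np.corrcoef(d_high, d_low)[0, 1])


# Toy check: points that already lie in a plane embedded in 3D, so dropping
# the constant third coordinate leaves every pairwise distance unchanged.
rng = np.random.RandomState(0)
plane = rng.uniform(size=(50, 2))
X_demo = np.column_stack([plane, np.zeros(50)])
agreement = pairwise_distance_agreement(X_demo, plane)
```

With the arrays from this example, the same helper could be called as `pairwise_distance_agreement(X, Y)` after any of the `fit_transform` calls to get a rough, single-number comparison between methods.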

#### New to Plotly?

Plotly's Python library is free and open source! Get started by downloading the client and reading the primer.
You can set up Plotly to work in online or offline mode, or in jupyter notebooks.
We also have a quick-reference cheatsheet (new!) to help you get started!

### Version

In [1]:
import sklearn
sklearn.__version__

Out[1]:
'0.18.1'

### Imports

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go
from plotly import tools

from time import time
import numpy as np

import matplotlib.pyplot as plt
from sklearn import manifold, datasets

Automatically created module for IPython interactive environment


### Calculations

In [3]:
n_points = 1000
X, color = datasets.make_s_curve(n_points, random_state=0)
n_neighbors = 10
n_components = 2
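
To give a feel for what the dataset looks like, here is a hedged NumPy sketch of the usual S-curve parametrization. It is an illustration only: `s_curve_sketch` is a hypothetical name, and scikit-learn's own `make_s_curve` may differ in details such as random-number ordering and optional noise, so this sketch is not expected to reproduce the same points.

```python
import numpy as np


def s_curve_sketch(n_samples, seed=0):
    """Sample points on an S-shaped 2D sheet embedded in 3D (illustrative only)."""
    rng = np.random.RandomState(seed)
    # t is the position along the S; it is also what colors the points.
    t = 3 * np.pi * (rng.uniform(size=n_samples) - 0.5)
    X = np.empty((n_samples, 3))
    X[:, 0] = np.sin(t)                          # x traces the S shape
    X[:, 1] = 2.0 * rng.uniform(size=n_samples)  # y is the width of the sheet
    X[:, 2] = np.sign(t) * (np.cos(t) - 1)       # z flips sign between the two halves
    return X, t


X_sketch, t_sketch = s_curve_sketch(1000)
```

The point of the sketch is that the data has only two intrinsic degrees of freedom (`t` and the `y` coordinate), which is why a 2D embedding can, in principle, unroll it.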


### Plot Results

In [4]:
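def matplotlib_to_plotly(cmap, pl_entries):
    # Sample the matplotlib colormap at pl_entries evenly spaced points
    # and convert each sample to a Plotly colorscale entry.
    h = 1.0 / (pl_entries - 1)
    pl_colorscale = []

    for k in range(pl_entries):
        # Plain ints keep the 'rgb(r, g, b)' string valid; on Python 3,
        # map() returns an iterator that cannot be indexed.
        C = [int(c * 255) for c in cmap(k * h)[:3]]
        pl_colorscale.append([k * h, 'rgb' + str((C[0], C[1], C[2]))])

    return pl_colorscale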

cmap = matplotlib_to_plotly(plt.cm.Spectral, 4)


### Plot Dataset

In [5]:
try:
    p1 = go.Scatter3d(x=X[:, 0], y=X[:, 1], z=X[:, 2],
                      mode='markers',
                      marker=dict(color=color,
                                  colorscale=cmap,
                                  showscale=False,
                                  line=dict(color='black', width=1)))

except Exception:
    # Fall back to a 2D projection if the 3D trace cannot be created.
    p1 = go.Scatter(x=X[:, 0], y=X[:, 2],
                    mode='markers',
                    marker=dict(color=color,
                                colorscale=cmap,
                                showscale=False,
                                line=dict(color='black', width=1)))

layout = dict(margin=dict(l=10, r=10,
                          t=30, b=10))
fig = go.Figure(data=[p1], layout=layout)

In [6]:
py.iplot(fig)

Out[6]:

### Methods: Standard, LTSA, Hessian, Modified

In [7]:
methods = ['standard', 'ltsa', 'hessian', 'modified']
labels = ['LLE', 'LTSA', 'Hessian LLE', 'Modified LLE']
data = []
titles = []

for i, method in enumerate(methods):
    t0 = time()
    # Keyword arguments avoid relying on the positional parameter order.
    Y = manifold.LocallyLinearEmbedding(n_neighbors=n_neighbors,
                                        n_components=n_components,
                                        eigen_solver='auto',
                                        method=method).fit_transform(X)
    t1 = time()
    print("%s: %.2g sec" % (methods[i], t1 - t0))

    trace = go.Scatter(x=Y[:, 0], y=Y[:, 1],
                       mode='markers',
                       marker=dict(color=color,
                                   colorscale=cmap,
                                   showscale=False,
                                   line=dict(color='black', width=1)))
    data.append(trace)
    titles.append("%s (%.2g sec)" % (labels[i], t1 - t0))


standard: 0.15 sec
ltsa: 0.32 sec
hessian: 0.5 sec
modified: 0.43 sec


### Isomap

In [8]:
t0 = time()
Y = manifold.Isomap(n_neighbors=n_neighbors,
                    n_components=n_components).fit_transform(X)
t1 = time()
print("Isomap: %.2g sec" % (t1 - t0))

trace = go.Scatter(x=Y[:, 0], y=Y[:, 1],
                   mode='markers',
                   marker=dict(color=color,
                               colorscale=cmap,
                               showscale=False,
                               line=dict(color='black', width=1)))
data.append(trace)
titles.append("Isomap (%.2g sec)" % (t1 - t0))

Isomap: 0.59 sec


### MDS

In [9]:
t0 = time()
mds = manifold.MDS(n_components=n_components, max_iter=100, n_init=1)
Y = mds.fit_transform(X)
t1 = time()
print("MDS: %.2g sec" % (t1 - t0))

trace = go.Scatter(x=Y[:, 0], y=Y[:, 1],
                   mode='markers',
                   marker=dict(color=color,
                               colorscale=cmap,
                               showscale=False,
                               line=dict(color='black', width=1)))
data.append(trace)

titles.append("MDS (%.2g sec)" % (t1 - t0))

MDS: 1.7 sec


### SpectralEmbedding

In [10]:
t0 = time()
se = manifold.SpectralEmbedding(n_components=n_components,
                                n_neighbors=n_neighbors)
Y = se.fit_transform(X)
t1 = time()
print("SpectralEmbedding: %.2g sec" % (t1 - t0))

trace = go.Scatter(x=Y[:, 0], y=Y[:, 1],
                   mode='markers',
                   marker=dict(color=color,
                               colorscale=cmap,
                               showscale=False,
                               line=dict(color='black', width=1)))
data.append(trace)

titles.append("SpectralEmbedding (%.2g sec)" % (t1 - t0))

SpectralEmbedding: 0.16 sec


### t-SNE

In [11]:
t0 = time()
tsne = manifold.TSNE(n_components=n_components, init='pca', random_state=0)
Y = tsne.fit_transform(X)
t1 = time()
print("t-SNE: %.2g sec" % (t1 - t0))

trace = go.Scatter(x=Y[:, 0], y=Y[:, 1],
                   mode='markers',
                   marker=dict(color=color,
                               colorscale=cmap,
                               showscale=False,
                               line=dict(color='black', width=1)))
data.append(trace)
titles.append("t-SNE (%.2g sec)" % (t1 - t0))

t-SNE: 3.6 sec

In [12]:
fig = tools.make_subplots(rows=2, cols=4,
                          subplot_titles=tuple(titles))

for i in range(0, len(data)):
    # Integer division keeps the row index an int on Python 3.
    fig.append_trace(data[i], (i // 4) + 1, (i % 4) + 1)

fig['layout'].update(title="Manifold Learning with %i points, %i neighbors" % (n_points, n_neighbors),
                     showlegend=False, height=900, hovermode='closest')

This is the format of your plot grid:
[ (1,1) x1,y1 ]  [ (1,2) x2,y2 ]  [ (1,3) x3,y3 ]  [ (1,4) x4,y4 ]
[ (2,1) x5,y5 ]  [ (2,2) x6,y6 ]  [ (2,3) x7,y7 ]  [ (2,4) x8,y8 ]
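
The grid printed above is filled row by row, and each trace needs an integer, 1-based row and column. As a small illustration of that mapping (the helper name `grid_position` is hypothetical and not part of Plotly), the position of the trace at a 0-based index can be computed with `divmod`:

```python
def grid_position(index, n_cols=4):
    """Map a 0-based trace index to a 1-based (row, col) subplot position."""
    row, col = divmod(index, n_cols)
    return row + 1, col + 1


# The example produces seven embeddings, so the last cell (2, 4) stays empty.
positions = [grid_position(i) for i in range(7)]
```

Here `positions` walks through the first row left to right and then the first three cells of the second row.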


In [13]:
py.iplot(fig)

Out[13]:

### License

Author:

    Jake Vanderplas -- <vanderplas@astro.washington.edu>