
# Affinity Propagation Clustering Algorithm in Scikit-learn

### Version

In [1]:
import sklearn
sklearn.__version__

Out[1]:
'0.18'

### Imports

This tutorial imports AffinityPropagation to cluster the data, make_blobs to generate it, and the metrics module to score the result.

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go

from sklearn.cluster import AffinityPropagation
from sklearn import metrics
from sklearn.datasets import make_blobs

Automatically created module for IPython interactive environment


### Calculations

Generate sample data: 300 points drawn from three Gaussian blobs, each with standard deviation 0.5.

In [3]:
centers = [[1, 1], [-1, -1], [1, -1]]
X, labels_true = make_blobs(n_samples=300, centers=centers, cluster_std=0.5,
                            random_state=0)
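
To make the shape of the generated data concrete, the sketch below repeats the call above and inspects what make_blobs returns. The variable names mirror the tutorial's; `counts` and `blob_means` are illustrative names introduced only for this example.

```python
import numpy as np
from sklearn.datasets import make_blobs

centers = [[1, 1], [-1, -1], [1, -1]]
X, labels_true = make_blobs(n_samples=300, centers=centers,
                            cluster_std=0.5, random_state=0)

# X holds one row per sample and one column per feature
print(X.shape)

# labels_true holds the index of the blob each sample was drawn from
counts = np.bincount(labels_true)
print(counts)

# The mean of each blob should lie close to the center it was drawn around
blob_means = np.array([X[labels_true == k].mean(axis=0)
                       for k in range(len(centers))])
print(blob_means.round(2))
```

The samples are split evenly across the three centers, so each blob contributes 100 points.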


Fit Affinity Propagation and score the clustering against the true labels

In [4]:
af = AffinityPropagation(preference=-50).fit(X)
cluster_centers_indices = af.cluster_centers_indices_
labels = af.labels_

n_clusters_ = len(cluster_centers_indices)

print('Estimated number of clusters: %d' % n_clusters_)
print("Homogeneity: %0.3f" % metrics.homogeneity_score(labels_true, labels))
print("Completeness: %0.3f" % metrics.completeness_score(labels_true, labels))
print("V-measure: %0.3f" % metrics.v_measure_score(labels_true, labels))
print("Silhouette Coefficient: %0.3f"
      % metrics.silhouette_score(X, labels, metric='sqeuclidean'))

Estimated number of clusters: 3
Homogeneity: 0.872
Completeness: 0.872
V-measure: 0.872
Silhouette Coefficient: 0.753
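
The number of clusters is not passed to AffinityPropagation directly; it follows from the preference value. As a rough sketch of that relationship, the example below refits the model for a few preference values and reports how many exemplars each fit selects. The helper `count_clusters` and the values -200 and -20 are illustrative choices, not part of the original tutorial. The `random_state` argument is an assumption about the installed version: it exists only in newer scikit-learn releases and should be dropped on 0.18.

```python
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import make_blobs

centers = [[1, 1], [-1, -1], [1, -1]]
X, labels_true = make_blobs(n_samples=300, centers=centers,
                            cluster_std=0.5, random_state=0)


def count_clusters(X, preference):
    """Fit Affinity Propagation and return the number of exemplars found."""
    # random_state is assumed to be available (newer scikit-learn only)
    af = AffinityPropagation(preference=preference, random_state=0).fit(X)
    return len(af.cluster_centers_indices_)


# More negative preferences generally lead to fewer exemplars
for preference in (-200, -50, -20):
    n = count_clusters(X, preference)
    print("preference=%d: %d clusters" % (preference, n))
```

If a fit does not converge, scikit-learn issues a warning and the exemplar list may be empty, so a count of 0 should be read as "did not converge" rather than "no structure".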


### Plot Result

In [5]:
colors = ['blue','green','red','cyan','magenta']
data = []
for k, col in zip(range(n_clusters_), colors):
    # Boolean mask selecting the points assigned to cluster k
    class_members = labels == k
    cluster_center = X[cluster_centers_indices[k]]
    # Points belonging to the cluster
    trace1 = go.Scatter(x=X[class_members, 0],
                        y=X[class_members, 1],
                        showlegend=False,
                        mode='markers', marker=dict(color=col,
                                                    size=10))

    # The exemplar (cluster center), drawn with a larger marker
    trace2 = go.Scatter(x=[cluster_center[0]],
                        y=[cluster_center[1]],
                        showlegend=False,
                        mode='markers', marker=dict(color=col,
                                                    size=14))
    data.append(trace1)
    data.append(trace2)
    # One line from the exemplar to each member of the cluster
    for x in X[class_members]:
        trace3 = go.Scatter(x=[cluster_center[0], x[0]],
                            y=[cluster_center[1], x[1]],
                            showlegend=False,
                            mode='lines', line=dict(color=col,
                                                    width=2))
        data.append(trace3)

layout = go.Layout(title='Estimated number of clusters: %d' % n_clusters_,
                   xaxis=dict(zeroline=False),
                   yaxis=dict(zeroline=False))
fig = go.Figure(data=data, layout=layout)

py.iplot(fig)
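
The plotting loop above adds one line trace per data point, which means roughly 300 extra traces for this dataset. One possible alternative, sketched below, packs all the segments of a cluster into a single pair of coordinate lists, using None entries as gaps between segments. The helper `star_segments` and the toy coordinates are hypothetical and exist only for illustration.

```python
import numpy as np


def star_segments(center, points):
    """Return x and y lists tracing center -> point for every point,
    with a None entry after each segment so the segments stay separate."""
    xs, ys = [], []
    for px, py_ in points:
        xs.extend([float(center[0]), float(px), None])
        ys.extend([float(center[1]), float(py_), None])
    return xs, ys


# Toy data: one center and two member points
center = np.array([0.0, 0.0])
points = np.array([[1.0, 2.0], [3.0, 4.0]])

xs, ys = star_segments(center, points)
print(xs)
print(ys)
```

The resulting lists could then be passed to a single go.Scatter(x=xs, y=ys, mode='lines', line=dict(color=col, width=2)) per cluster, since Plotly leaves a gap in a line wherever a coordinate is None.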


### Reference

Brendan J. Frey and Delbert Dueck, “Clustering by Passing Messages Between Data Points”, Science, Feb. 2007
