
# Decision Tree Regression in Scikit-learn

A 1D regression with a decision tree.

A decision tree is used to fit a sine curve with additional noisy observations. As a result, it learns a piecewise-constant approximation of the sine curve.
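To see why a regression tree produces steps instead of a smooth curve, it can help to look at a single split. The following is a minimal sketch in plain Python, not scikit-learn's implementation: it assumes a squared-error criterion, and the names best_split and stump_predict are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) for the split that
    minimises the summed squared error of the two sides."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        # A threshold can only sit between two distinct x values
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


# Noise-free sine samples on [0, 5], just for illustration
random.seed(0)
xs = sorted(random.uniform(0, 5) for _ in range(80))
ys = [math.sin(x) for x in xs]

threshold, left_mean, right_mean = best_split(xs, ys)


def stump_predict(x):
    # A depth-1 tree predicts the mean target of the side x falls on
    return left_mean if x <= threshold else right_mean


# Evaluating on a dense grid yields only two distinct values
predictions = {stump_predict(i / 100) for i in range(500)}
print(threshold, sorted(predictions))
```

A deeper tree repeats this step inside each side, which adds more steps but never a slope.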

We can see that if the maximum depth of the tree (controlled by the max_depth parameter) is set too high, the decision tree learns overly fine details of the training data, including the noise; that is, it overfits.
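One way to check this numerically is to compare the error on the noisy training points with the error against the noise-free sine curve. The sketch below assumes the same synthetic dataset that is built in the Calculations section; the names X_grid, y_true and errors, and the choice of mean squared error as the measure, are assumptions made for this illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Same synthetic data as in the Calculations section
rng = np.random.RandomState(1)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(16))

# Noise-free reference values on a dense grid
X_grid = np.arange(0.0, 5.0, 0.01)[:, np.newaxis]
y_true = np.sin(X_grid).ravel()

errors = {}
for depth in (2, 5):
    model = DecisionTreeRegressor(max_depth=depth, random_state=0).fit(X, y)
    # Error on the (noisy) points the tree was fitted to
    train_mse = np.mean((model.predict(X) - y) ** 2)
    # Error against the underlying sine curve
    grid_mse = np.mean((model.predict(X_grid) - y_true) ** 2)
    errors[depth] = (train_mse, grid_mse)
    print("max_depth=%d  train MSE=%.4f  MSE vs. clean sine=%.4f"
          % (depth, train_mse, grid_mse))
```

The training error cannot increase when the depth grows from 2 to 5, because every extra split is fitted to the training data. The error against the clean curve is the figure to watch for overfitting, and its value depends on the particular noise draw.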

### Version

In [1]:
import sklearn
sklearn.__version__

Out[1]:
'0.18.1'

### Imports

In [2]:
print(__doc__)
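# In Plotly 4 and later, this module lives in the separate chart-studio package (chart_studio.plotly)
import plotly.plotly as py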
import plotly.graph_objs as go

import numpy as np
from sklearn.tree import DecisionTreeRegressor

Automatically created module for IPython interactive environment


### Calculations

In [3]:
# Create a random dataset
rng = np.random.RandomState(1)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(16))

# Fit regression model
regr_1 = DecisionTreeRegressor(max_depth=2)
regr_2 = DecisionTreeRegressor(max_depth=5)
regr_1.fit(X, y)
regr_2.fit(X, y)

# Predict
X_test = np.arange(0.0, 5.0, 0.01)[:, np.newaxis]
y_1 = regr_1.predict(X_test)
y_2 = regr_2.predict(X_test)
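A fitted tree predicts one constant per leaf, so the number of distinct predicted values is bounded by the number of leaves, which in turn is at most 2 to the power of max_depth. As a rough check, the sketch below counts both for the two depths used above. It reads the fitted estimator's tree_ attribute, where an entry of -1 in children_left marks a leaf; the summary dictionary is a name introduced here for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Same synthetic data and test grid as above
rng = np.random.RandomState(1)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(16))
X_test = np.arange(0.0, 5.0, 0.01)[:, np.newaxis]

summary = {}
for depth in (2, 5):
    model = DecisionTreeRegressor(max_depth=depth).fit(X, y)
    # Leaves are the nodes without a left child (marked with -1)
    n_leaves = int(np.sum(model.tree_.children_left == -1))
    # Each leaf contributes at most one distinct predicted value
    n_values = len(np.unique(model.predict(X_test)))
    summary[depth] = (n_leaves, n_values)
    print("max_depth=%d  leaves=%d  distinct predictions=%d"
          % (depth, n_leaves, n_values))
```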


### Plot Results

In [4]:
def data_to_plotly(x):
    # Flatten an (n, 1) array into a plain list of its first-column values
    k = []

    for i in range(0, len(x)):
        k.append(x[i][0])

    return k
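Assuming x is a 2D NumPy array with a single column, as X and X_test are here, the same flattening can also be done with NumPy directly. The sketch below compares the two on a small made-up array; the variable flat is a name chosen for this illustration.

```python
import numpy as np


def data_to_plotly(x):
    # Helper from the cell above, repeated so this snippet runs on its own
    k = []
    for i in range(0, len(x)):
        k.append(x[i][0])
    return k


# Small (6, 1) array standing in for X or X_test
X = np.arange(6, dtype=float).reshape(6, 1)

# ravel() flattens the single column; tolist() converts to Python floats
flat = X.ravel().tolist()
print(flat == data_to_plotly(X))
```

The helper keeps NumPy scalar types in the list, while tolist() yields built-in floats. The values are the same either way.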

In [5]:
p1 = go.Scatter(x=data_to_plotly(X), y=y,
                mode='markers',
                marker=dict(color="darkorange"),
                name="data")

p2 = go.Scatter(x=data_to_plotly(X_test), y=y_1,
                mode='lines',
                line=dict(color="cornflowerblue"),
                name="max_depth=2")

p3 = go.Scatter(x=data_to_plotly(X_test), y=y_2,
                mode='lines',
                line=dict(color="yellowgreen"),
                name="max_depth=5")

layout = go.Layout(xaxis=dict(title="data"),
                   yaxis=dict(title="target"),
                   title="Decision Tree Regression")
fig = go.Figure(data=[p1, p2, p3], layout=layout)

In [6]:
py.iplot(fig)

Out[6]: