
# Prediction Intervals for Gradient Boosting Regression in Scikit-learn

This example shows how quantile regression can be used to create prediction intervals.
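
Quantile regression works by minimising the pinball (quantile) loss rather than squared error. The following is a minimal NumPy sketch of that loss, not scikit-learn's internal implementation; the names `pinball_loss`, `sample`, and `candidates` are illustrative assumptions. It checks on synthetic data that the constant minimising the loss at level 0.95 lands near the empirical 95th percentile.

```python
import numpy as np


def pinball_loss(y_true, y_pred, alpha):
    """Mean pinball (quantile) loss for a quantile level alpha in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-prediction (diff > 0) costs alpha per unit,
    # over-prediction (diff < 0) costs (1 - alpha) per unit.
    return float(np.mean(np.maximum(alpha * diff, (alpha - 1.0) * diff)))


# Synthetic data for illustration only (not the notebook's data).
rng = np.random.RandomState(0)
sample = rng.normal(size=2000)

# Scan constant predictions and keep the one with the smallest loss.
candidates = np.linspace(-3, 3, 601)
losses = [pinball_loss(sample, c, 0.95) for c in candidates]
best = candidates[int(np.argmin(losses))]

print(best, np.percentile(sample, 95))
```

Because over- and under-predictions are penalised asymmetrically, fitting at 0.95 and at 0.05 yields an upper and a lower bound, which together form a nominal 90% interval.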


### Version¶

In [1]:
import sklearn
sklearn.__version__

Out[1]:
'0.18.1'

### Imports¶

In [2]:
print(__doc__)

import plotly.plotly as py
import plotly.graph_objs as go

import numpy as np
import matplotlib.pyplot as plt


Automatically created module for IPython interactive environment


### Calculation¶

In [3]:
np.random.seed(1)
def f(x):
    """The function to predict."""
    return x * np.sin(x)


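Draw 100 training inputs uniformly on [0, 10] and generate noisy observations of f. The noise is Gaussian with a per-sample standard deviation between 1.5 and 2.5.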

In [4]:
X = np.atleast_2d(np.random.uniform(0, 10.0, size=100)).T
X = X.astype(np.float32)

# Observations
y = f(X).ravel()

dy = 1.5 + 1.0 * np.random.random(y.shape)
noise = np.random.normal(0, dy)
y += noise
y = y.astype(np.float32)


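Mesh the input space to evaluate the true function and the predictions. Then fit three models in turn: the upper quantile (alpha = 0.95), the lower quantile (1 - alpha = 0.05), and a least-squares fit for the point prediction.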

In [5]:
xx = np.atleast_2d(np.linspace(0, 10, 1000)).T
xx = xx.astype(np.float32)

alpha = 0.95

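from sklearn.ensemble import GradientBoostingRegressor

# Upper bound: fit the alpha = 0.95 quantile
clf = GradientBoostingRegressor(loss='quantile', alpha=alpha,
                                n_estimators=250, max_depth=3,
                                learning_rate=.1, min_samples_leaf=9,
                                min_samples_split=9)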

clf.fit(X, y)

# Make the prediction on the meshed x-axis
y_upper = clf.predict(xx)

clf.set_params(alpha=1.0 - alpha)
clf.fit(X, y)

# Make the prediction on the meshed x-axis
y_lower = clf.predict(xx)

clf.set_params(loss='ls')
clf.fit(X, y)

# Make the prediction on the meshed x-axis
y_pred = clf.predict(xx)
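
One way to sanity-check an interval like this is to measure how often observations fall between the two bounds, and whether the bounds ever cross (the two quantile models are fitted independently, so nothing forces lower <= upper). The sketch below uses small made-up arrays; `empirical_coverage` and `crossing_fraction` are hypothetical helper names, not scikit-learn functions.

```python
import numpy as np


def empirical_coverage(y_obs, lower, upper):
    """Fraction of observations lying inside [lower, upper] (inclusive)."""
    y_obs, lower, upper = (np.asarray(a, dtype=float)
                           for a in (y_obs, lower, upper))
    inside = (y_obs >= lower) & (y_obs <= upper)
    return float(inside.mean())


def crossing_fraction(lower, upper):
    """Fraction of points where the lower bound exceeds the upper bound."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return float(np.mean(lower > upper))


# Toy values for illustration only.
y_obs = np.array([0.0, 1.0, 2.0, 3.0])
lower = np.array([-1.0, 1.5, 1.0, 2.0])
upper = np.array([1.0, 2.5, 3.0, 2.5])

print(empirical_coverage(y_obs, lower, upper))  # 2 of 4 points are inside
print(crossing_fraction(lower, upper))
```

To apply this to the notebook's data, the bounds would have to be predicted at `X` rather than on the mesh `xx`, and before `clf` is refit with the next set of parameters. Coverage measured on the training points tends to be optimistic, so held-out data gives a fairer estimate.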


### Plot Results¶

Plot the true function, the observations, the least-squares prediction, and the 90% prediction interval bounded by the 0.05 and 0.95 quantile fits.

In [6]:
def data_to_plotly(k):
    data = []

    for i in range(0, len(k)):
        data.append(k[i][0])

    return data
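
The helper above takes the first element of each row, turning an (n, 1) column into a flat list. As a sketch of an equivalent formulation, assuming the input is always a 2-D array with at least one column, the same result can be obtained with NumPy slicing; `column_to_list` is a hypothetical name.

```python
import numpy as np


def column_to_list(k):
    """Flatten the first column of a 2-D array into a list of Python scalars."""
    return np.asarray(k)[:, 0].tolist()


col = np.atleast_2d(np.linspace(0, 1, 5)).T  # shape (5, 1)
print(column_to_list(col))
```

Unlike the loop version, `tolist()` returns built-in Python floats rather than NumPy scalars, which is usually what a plotting or JSON layer expects.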


In [7]:
sinx = go.Scatter(x=data_to_plotly(xx), y=data_to_plotly(f(xx)),
                  mode='lines',
                  line=dict(color='black', dash='dash'),
                  name='f(x) = x sin(x)'
                  )

observations = go.Scatter(x=data_to_plotly(X), y=y,
                          mode='markers',
                          marker=dict(color='blue', size=10),
                          name='Observations'
                          )

prediction = go.Scatter(x=data_to_plotly(xx), y=y_pred,
                        name='Prediction',
                        mode='lines',
                        line=dict(color='red')
                        )

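# Closed outline: upper bound left to right, then lower bound right to left.
# 'toself' fills the enclosed band; 'tozeroy' would fill down to y = 0 instead.
trace = go.Scatter(x=data_to_plotly(np.concatenate([xx, xx[::-1]])),
                   y=np.concatenate([y_upper, y_lower[::-1]]),
                   showlegend=False,
                   line=dict(color='blue'),
                   name='90% prediction interval',
                   fill='toself'
                   )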

layout = go.Layout(xaxis=dict(title='x'),
                   yaxis=dict(title='f(x)')
                   )
fig = go.Figure(data=[trace, sinx, observations, prediction], layout=layout)
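
The interval trace above is drawn as one closed outline: the x values run forward along the upper bound and then backward along the lower bound. The sketch below isolates that construction with plain NumPy on toy values; `band_polygon` is a hypothetical helper, not part of Plotly.

```python
import numpy as np


def band_polygon(x, lower, upper):
    """Return (poly_x, poly_y) tracing upper left-to-right, then lower right-to-left."""
    x = np.asarray(x, dtype=float).ravel()
    lower = np.asarray(lower, dtype=float).ravel()
    upper = np.asarray(upper, dtype=float).ravel()
    poly_x = np.concatenate([x, x[::-1]])
    poly_y = np.concatenate([upper, lower[::-1]])
    return poly_x, poly_y


# Toy values for illustration only.
poly_x, poly_y = band_polygon([0, 1, 2], [0, 0, 0], [1, 2, 1])
print(poly_x.tolist())
print(poly_y.tolist())
```

An outline built this way encloses exactly the region between the two bounds, so a fill mode that shades the trace's own enclosed area (Plotly's `fill='toself'`) shades only the interval.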

In [8]:
py.iplot(fig)

Out[8]: