# Polynomial Interpolation in Scikit-learn

This example demonstrates how to approximate a function with a polynomial of degree n_degree using ridge regression. Concretely, given n_samples 1-D points, it suffices to build the Vandermonde matrix, which has shape n_samples x (n_degree + 1) and the following form:

```
[[1, x_1, x_1 ** 2, x_1 ** 3, ...],
 [1, x_2, x_2 ** 2, x_2 ** 3, ...],
 ...]
```

Intuitively, this matrix can be interpreted as a matrix of pseudo features (the points raised to some power). The matrix is akin to (but different from) the matrix induced by a polynomial kernel. This example shows that you can do non-linear regression with a linear model, using a pipeline to add non-linear features. Kernel methods extend this idea and can induce very high (even infinite) dimensional feature spaces.
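As a minimal sketch of the matrix described above, the snippet below builds it in two ways: with NumPy's `np.vander` and with scikit-learn's `PolynomialFeatures`. The sample points and the value of `n_degree` are illustrative assumptions, not values used elsewhere in this example.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Illustrative 1-D sample points (assumed for this sketch only)
x = np.array([0.5, 1.0, 2.0, 3.0])
n_degree = 3

# Vandermonde matrix with columns 1, x, x**2, x**3
V = np.vander(x, N=n_degree + 1, increasing=True)

# PolynomialFeatures expects a 2-D array of shape (n_samples, n_features)
P = PolynomialFeatures(degree=n_degree).fit_transform(x[:, np.newaxis])

print(V.shape)            # n_samples x (n_degree + 1)
print(np.allclose(V, P))  # both constructions give the same pseudo features
```

For a single input feature, the two constructions should coincide, which is why the pipeline used later can rely on `PolynomialFeatures` alone.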

### Version

```
import sklearn
sklearn.__version__
```

### Imports

This tutorial imports Ridge, PolynomialFeatures and make_pipeline.

```
import plotly.plotly as py  # plotly < 4 only; later releases moved this module to chart_studio.plotly
import plotly.graph_objs as go
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
```
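To make the role of `make_pipeline` concrete, here is a hedged sketch that compares a pipeline against the same two steps run by hand. The data mirrors the setup used in this example, while `X_new`, `manual_pred`, and `pipeline_pred` are names introduced only for this illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def f(x):
    """Function to approximate."""
    return x * np.sin(x)


# 20 training points drawn from a shuffled grid on [0, 10]
x_all = np.linspace(0, 10, 100)
rng = np.random.RandomState(0)
rng.shuffle(x_all)
x = np.sort(x_all[:20])
X = x[:, np.newaxis]
y = f(x)

# Points at which to evaluate the fitted models (hypothetical name)
X_new = np.linspace(0, 10, 100)[:, np.newaxis]
degree = 3

# Pipeline: expand features, then fit ridge regression
pipeline = make_pipeline(PolynomialFeatures(degree), Ridge())
pipeline_pred = pipeline.fit(X, y).predict(X_new)

# The same two steps written out by hand
poly = PolynomialFeatures(degree)
ridge = Ridge()
ridge.fit(poly.fit_transform(X), y)
manual_pred = ridge.predict(poly.transform(X_new))

print(np.allclose(pipeline_pred, manual_pred))
```

The pipeline mainly saves bookkeeping: the feature expansion fitted on the training data is reapplied automatically at prediction time.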

### Calculations

```
def f(x):
    """Function to approximate by polynomial interpolation."""
    return x * np.sin(x)


# generate points used to plot
x_plot = np.linspace(0, 10, 100)

# generate points and keep a subset of them
x = np.linspace(0, 10, 100)
rng = np.random.RandomState(0)
rng.shuffle(x)
x = np.sort(x[:20])
y = f(x)

# create matrix versions of these arrays
X = x[:, np.newaxis]
X_plot = x_plot[:, np.newaxis]

colors = ['teal', 'yellowgreen', 'gold']
lw = 2
```

### Plot Results

```
data = []

# ground-truth curve
p1 = go.Scatter(x=x_plot, y=f(x_plot),
                mode='lines',
                line=dict(color='cornflowerblue', width=lw),
                name="ground truth")

# training points
p2 = go.Scatter(x=x, y=y,
                mode='markers',
                marker=dict(color='navy',
                            line=dict(color='black', width=1)),
                name="training points")
data.append(p1)
data.append(p2)

# fit one ridge model per polynomial degree and add its prediction curve
for count, degree in enumerate([3, 4, 5]):
    model = make_pipeline(PolynomialFeatures(degree), Ridge())
    model.fit(X, y)
    y_plot = model.predict(X_plot)
    p3 = go.Scatter(x=x_plot, y=y_plot,
                    mode='lines',
                    line=dict(color=colors[count], width=lw),
                    name="degree %d" % degree)
    data.append(p3)

layout = go.Layout(xaxis=dict(zeroline=False),
                   yaxis=dict(zeroline=False))
fig = go.Figure(data=data, layout=layout)
```

```
py.iplot(fig)
```
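Beyond reading the plot, the fits can also be compared numerically. The sketch below is one possible check, not part of the original example: it measures the mean squared error of each degree against the true function on the dense grid. The `errors` dictionary is a name introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures


def f(x):
    """Function to approximate."""
    return x * np.sin(x)


# Same data construction as in the Calculations section
x_plot = np.linspace(0, 10, 100)
x = np.linspace(0, 10, 100)
rng = np.random.RandomState(0)
rng.shuffle(x)
x = np.sort(x[:20])
y = f(x)
X = x[:, np.newaxis]
X_plot = x_plot[:, np.newaxis]

# Mean squared error of each fitted polynomial against the ground truth
errors = {}
for degree in [3, 4, 5]:
    model = make_pipeline(PolynomialFeatures(degree), Ridge())
    model.fit(X, y)
    errors[degree] = mean_squared_error(f(x_plot), model.predict(X_plot))

for degree, err in errors.items():
    print("degree %d: MSE = %.3f" % (degree, err))
```

Because the error is measured on the full grid rather than only on the 20 training points, it also reflects how each polynomial behaves between samples.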

### License

Author:

```
Mathieu Blondel
Jake Vanderplas
```

License:

`BSD 3 clause`