# Peak Finding in Python

Learn how to find peaks and valleys in datasets with Python.

#### New to Plotly?

You can set up Plotly to work in online or offline mode, or in Jupyter notebooks.
A quick-reference cheatsheet is also available to help you get started.

#### Imports

The tutorial below imports NumPy, Pandas, SciPy and PeakUtils.

In [1]:
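# Note: in plotly 4 and later, plotly.plotly moved to the chart_studio package (chart_studio.plotly)
import plotly.plotly as py
import plotly.graph_objs as go
import plotly.figure_factory as FF  # replaces the deprecated plotly.tools.FigureFactory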

import numpy as np
import pandas as pd
import scipy
import peakutils


#### Import Data

To start detecting peaks, we will import some data on milk production by month:

In [2]:
milk_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/monthly-milk-production-pounds.csv')
time_series = milk_data['Monthly milk production (pounds per cow)']
time_series = time_series.tolist()

df = milk_data[0:15]

table = FF.create_table(df)
py.iplot(table, filename='milk-production-dataframe')

Out[2]:

#### Original Plot

In [3]:
trace = go.Scatter(
    x=list(range(len(time_series))),
    y=time_series,
    mode='lines'
)

data = [trace]
py.iplot(data, filename='milk-production-plot')

Out[3]:

#### With Peak Detection

To mark the peaks on the plot, we first need their positions. `peakutils.indexes` returns the indices of the peaks in the array, which we then use as x-axis positions.

In [4]:
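cb = np.array(time_series)
# thres is a normalized threshold between 0 and 1 (per the PeakUtils docs), so this
# very small value keeps essentially every local maximum.
# min_dist is documented as an integer number of samples; 0.1 is kept from the
# original and should impose no minimum spacing between peaks.
indices = peakutils.indexes(cb, thres=0.02/max(cb), min_dist=0.1)

trace = go.Scatter(
    x=list(range(len(time_series))),
    y=time_series,
    mode='lines',
    name='Original Plot'
)

# Overlay a red cross marker at each detected peak
trace2 = go.Scatter(
    x=indices,
    y=[time_series[j] for j in indices],
    mode='markers',
    marker=dict(
        size=8,
        color='rgb(255,0,0)',
        symbol='cross'
    ),
    name='Detected Peaks'
)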

data = [trace, trace2]
py.iplot(data, filename='milk-production-plot-with-peaks')

Out[4]:
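
To make the idea behind threshold-based peak detection more concrete, here is a minimal NumPy-only sketch. It is not the PeakUtils implementation: the helper name `find_local_maxima` and the toy signal are hypothetical, and the sketch assumes the threshold is normalized to the data range (0 at the minimum, 1 at the maximum), in the spirit of how the PeakUtils documentation describes `thres`. Plateaus and minimum-distance filtering are deliberately left out.

```python
import numpy as np


def find_local_maxima(y, thres=0.3):
    """Return indices of samples higher than both neighbours and above
    a normalized threshold (0 = data minimum, 1 = data maximum).

    Hypothetical helper for illustration only.
    """
    y = np.asarray(y, dtype=float)
    # Convert the normalized threshold into the units of the data.
    cutoff = thres * (y.max() - y.min()) + y.min()
    # First-order difference: positive where the signal rises, negative where it falls.
    dy = np.diff(y)
    # A peak at index i has a rise just before it and a fall just after it.
    rising_before = np.hstack([0.0, dy]) > 0
    falling_after = np.hstack([dy, 0.0]) < 0
    return np.where(rising_before & falling_after & (y > cutoff))[0]


signal = [0, 2, 1, 5, 3, 4, 0]  # toy data, not the milk dataset
all_peaks = find_local_maxima(signal, thres=0.0)   # every local maximum
high_peaks = find_local_maxima(signal, thres=0.7)  # only the tallest ones
print(all_peaks, high_peaks)
```

With a threshold of 0 every local maximum is returned, while a higher threshold keeps only the peaks in the upper part of the data range. This is the same trade-off explored with `peakutils.indexes` in this tutorial.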

#### Only Highest Peaks

We can raise the threshold to try to capture as many of the highest peaks as possible while excluding the lower ones.
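
Because `thres` is documented in PeakUtils as a normalized value between 0 and 1 rather than a value in pounds, it can help to translate between the two. The sketch below shows one way to do that, assuming the threshold is scaled over the range from the data minimum to the data maximum. The helper names and the sample values are illustrative and are not taken from the milk dataset.

```python
import numpy as np


def normalized_to_absolute(y, thres):
    """Map a normalized threshold (0..1) to a level in the units of y.

    Assumes scaling over the full range of the data (min to max).
    """
    y = np.asarray(y, dtype=float)
    return thres * (y.max() - y.min()) + y.min()


def absolute_to_normalized(y, level):
    """Inverse mapping: express a level in data units as a fraction of the range."""
    y = np.asarray(y, dtype=float)
    return (level - y.min()) / (y.max() - y.min())


sample = [500.0, 600.0, 900.0, 700.0, 1000.0]  # made-up values
cutoff = normalized_to_absolute(sample, 0.678)  # roughly 839 for this sample
fraction = absolute_to_normalized(sample, 900.0)
print(cutoff, fraction)
```

Working backwards like this is a practical way to pick a threshold: choose the level on the plot below which peaks should be ignored, then convert it to a fraction of the data range.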

In [5]:
cb = np.array(time_series)
# A higher normalized threshold (0.678 on the 0-to-1 scale) drops the lower peaks
indices = peakutils.indexes(cb, thres=0.678, min_dist=0.1)

trace = go.Scatter(
    x=list(range(len(time_series))),
    y=time_series,
    mode='lines',
    name='Original Plot'
)

# Overlay a red cross marker at each remaining peak
trace2 = go.Scatter(
    x=indices,
    y=[time_series[j] for j in indices],
    mode='markers',
    marker=dict(
        size=8,
        color='rgb(255,0,0)',
        symbol='cross'
    ),
    name='Detected Peaks'
)

data = [trace, trace2]
py.iplot(data, filename='milk-production-plot-with-higher-peaks')

Out[5]:
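
The same approach extends to valleys. One common trick is to negate the signal so that its minima become maxima, then run the peak finder on the negated data. The sketch below illustrates this with small hypothetical NumPy helpers and toy data. With PeakUtils, the analogous call would presumably be `peakutils.indexes(-cb, ...)`; that is an assumption here, not something this tutorial demonstrates.

```python
import numpy as np


def local_maxima(y):
    """Indices of samples strictly higher than both neighbours (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    rising_before = np.hstack([0.0, dy]) > 0
    falling_after = np.hstack([dy, 0.0]) < 0
    return np.where(rising_before & falling_after)[0]


def local_minima(y):
    """Valleys of y are the peaks of -y."""
    return local_maxima(-np.asarray(y, dtype=float))


signal = [3, 1, 4, 2, 5, 0, 6]  # toy data, not the milk dataset
peaks = local_maxima(signal)
valleys = local_minima(signal)
print(peaks, valleys)
```

The valley indices can be plotted just like the peak indices, as a second marker trace layered over the original line.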