Peak Integration in Python/v3

Learn how to integrate the area between peaks and the baseline in Python.


Note: this page is part of the documentation for version 3 of Plotly.py, which is not the most recent version.
See our Version 4 Migration Guide for information about how to upgrade.

New to Plotly?¶

Plotly's Python library is free and open source! Get started by downloading the client and reading the primer.
You can set up Plotly to work in online or offline mode, or in Jupyter notebooks.
We also have a quick-reference cheatsheet to help you get started!

Imports¶

The tutorial below imports NumPy, Pandas, SciPy and PeakUtils.

In [3]:
import plotly.plotly as py
import plotly.graph_objs as go
import plotly.figure_factory as ff

import numpy as np
import pandas as pd
import scipy
import peakutils

Tips¶

To find the area under a peak, we compute the area between the data values and the x-axis, compute the area between the baseline and the x-axis, and take the difference. Both areas are taken over the same x-axis interval $I$ that lies under the peak.

Let $T(x)$ be the function of the data, $B(x)$ the function of the baseline, and $A$ the peak integration area between the baseline and the data over $I$. Since $T(x) \geq B(x)$ for all $x$ in $I$, the area is the difference of the two integrals:

$$ \begin{align} A = \int_{I} T(x)dx - \int_{I} B(x)dx \end{align} $$
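
As a quick check of the formula above, here is a minimal sketch that applies it to synthetic curves rather than the milk data. The helper `trapezoid_area` and the two straight lines standing in for $T(x)$ and $B(x)$ are assumptions made for illustration; the helper applies the trapezoidal rule by hand so it does not depend on a particular NumPy version.

```python
import numpy as np

def trapezoid_area(y, x):
    """Trapezoidal-rule approximation of the integral of y over x.

    Hypothetical helper written for this sketch, not part of Plotly or PeakUtils.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Average of neighbouring heights times the width of each strip.
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

# Synthetic stand-ins for T(x) and B(x) on the interval I = [0, 4].
x = np.linspace(0.0, 4.0, 9)
T = 2.0 * x + 3.0  # "data" curve
B = x              # "baseline" curve, never above T on I

# A = integral of T over I minus integral of B over I.
A = trapezoid_area(T, x) - trapezoid_area(B, x)
print(A)
```

Because both curves are straight lines, the trapezoidal rule is exact here and the result matches the analytic value of $\int_0^4 (x + 3)\,dx = 20$.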

Import Data¶

For our example below we will import some data on milk production by month:

In [4]:
milk_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/monthly-milk-production-pounds.csv')
time_series = milk_data['Monthly milk production (pounds per cow)']
time_series = np.asarray(time_series)

df = milk_data[0:15]

table = ff.create_table(df)
py.iplot(table, filename='milk-production-dataframe')
Out[4]:

Area Under One Peak¶

In [5]:
baseline_values = peakutils.baseline(time_series)

x = list(range(len(time_series)))
time_series = time_series.tolist()
baseline_values = baseline_values.tolist()

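# Closed outline of the first peak: data values left to right (x = 0..10),
# then baseline values right to left (x = 10..0).
rev_baseline_values = baseline_values[:11]
rev_baseline_values.reverse()
area_x = list(range(11)) + list(range(10, -1, -1))
area_y = time_series[:11] + rev_baseline_values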

trace = go.Scatter(
    x=x,
    y=time_series,
    mode='lines',
    marker=dict(
        color='#B292EA',
    ),
    name='Original Plot'
)

trace2 = go.Scatter(
    x=x,
    y=baseline_values,
    mode='markers',
    marker=dict(
        size=3,
        color='#EB55BF',
    ),
    name='Baseline'
)

trace3 = go.Scatter(
    x=area_x,
    y=area_y,
    mode='lines+markers',
    marker=dict(
        size=4,
        color='rgb(255,0,0)',
    ),
    name='1st Peak Outline'
)

# Trapezoidal-rule area under the data minus the area under the baseline,
# both taken over the first 11 samples (x = 0..10).
first_peak_x = list(range(11))
area_under_first_peak = np.trapz(time_series[:11], first_peak_x) - np.trapz(baseline_values[:11], first_peak_x)

# go.Annotation is deprecated; use the more specific go.layout.Annotation.
annotation = go.layout.Annotation(
    x=80,
    y=1000,
    text='The peak integration for the first peak is approximately %s' % (area_under_first_peak),
    showarrow=False
)

layout = go.Layout(
    annotations=[annotation]
)

trace_data = [trace, trace2, trace3]
fig = go.Figure(data=trace_data, layout=layout)
py.iplot(fig, filename='milk-production-peak-integration')

Out[5]:
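
The example above hard-codes the first 11 samples. As a rough sketch of how the same steps could be applied to any window of samples, the two helpers below build the closed outline and compute the area for indices `start` to `stop - 1`. The names `peak_outline` and `peak_area` are hypothetical, the toy `values` and `baseline` lists are made up for illustration, and unit spacing between samples is assumed, as in the tutorial.

```python
import numpy as np

def peak_outline(values, baseline, start, stop):
    """Trace the data left to right, then the baseline right to left.

    Returns x and y lists of equal length, suitable for a closed outline.
    Hypothetical helper for illustration only.
    """
    forward_x = list(range(start, stop))
    xs = forward_x + forward_x[::-1]
    ys = list(values[start:stop]) + list(baseline[start:stop])[::-1]
    return xs, ys

def peak_area(values, baseline, start, stop):
    """Trapezoidal area between data and baseline, assuming unit x spacing."""
    diff = (np.asarray(values[start:stop], dtype=float)
            - np.asarray(baseline[start:stop], dtype=float))
    return float(np.sum((diff[1:] + diff[:-1]) / 2.0))

# Toy data: a single triangular peak sitting on a flat baseline.
values = [1.0, 3.0, 5.0, 3.0, 1.0, 2.0]
baseline = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

xs, ys = peak_outline(values, baseline, 0, 5)
area = peak_area(values, baseline, 0, 5)
print(xs, ys, area)
```

Since the trapezoidal rule is linear, integrating the difference of the two curves gives the same result as subtracting the two separate integrals, which is the form used in the tutorial.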