
Peak Integration in Python

Learn how to integrate the area between peaks and the baseline in Python.

New to Plotly?

Plotly's Python library is free and open source! Get started by downloading the client and reading the primer.
You can set up Plotly to work in online or offline mode, or in Jupyter notebooks.
We also have a quick-reference cheatsheet (new!) to help you get started!

Imports

The tutorial below imports NumPy, Pandas, SciPy and PeakUtils.

In [1]:
import plotly.plotly as py
import plotly.graph_objs as go
from plotly.tools import FigureFactory as FF

import numpy as np
import pandas as pd
import scipy
import peakutils

Tips

To find the area under a peak, we compute the area between the data values and the x-axis, then the area between the baseline and the x-axis, and take the difference between the two. Both areas are computed over the same x-axis interval $I$ spanned by the peak.

Let $T(x)$ be the function of the data, $B(x)$ the function of the baseline, and $A$ the peak integration area between the baseline and the data under the first peak. Since $T(x) \geq B(x)$ for all $x$ in $I$, the area is nonnegative and is given by

$$ \begin{align} A = \int_{I} T(x)dx - \int_{I} B(x)dx \end{align} $$
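The formula can be checked numerically on synthetic data before applying it to a real series. The sketch below is only an illustration under stated assumptions: the linear baseline, the Gaussian peak, and the helper name `trapezoid` are invented for this example and are not part of the milk production data or of PeakUtils. The helper applies the same trapezoidal rule that `np.trapz` uses.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal-rule integral of y over x (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x) / 2.0))

# Assumed synthetic signal: a Gaussian peak sitting on a linear baseline.
x = np.linspace(0.0, 10.0, 1001)
baseline = 2.0 + 0.5 * x                                  # B(x)
peak = 3.0 * np.exp(-((x - 5.0) ** 2) / (2 * 0.5 ** 2))   # height 3, sigma 0.5
data = baseline + peak                                    # T(x)

# A = integral of T over I minus integral of B over I
area = trapezoid(data, x) - trapezoid(baseline, x)

# Closed-form area of the Gaussian: height * sigma * sqrt(2 * pi), about 3.76
exact = 3.0 * 0.5 * np.sqrt(2.0 * np.pi)
print(area, exact)
```

Because integration is linear, subtracting the two integrals gives the same result as integrating the difference $T(x) - B(x)$ directly.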

Import Data

For our example below we will import some data on milk production by month:

In [2]:
milk_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/monthly-milk-production-pounds.csv')
time_series = milk_data['Monthly milk production (pounds per cow)']
time_series = np.asarray(time_series)

df = milk_data[0:15]

table = FF.create_table(df)
py.iplot(table, filename='milk-production-dataframe')
Out[2]:

Area Under One Peak

In [3]:
baseline_values = peakutils.baseline(time_series)

x = list(range(len(time_series)))
time_series = time_series.tolist()
baseline_values = baseline_values.tolist()

# Outline of the first peak: follow the data for x = 0..10, then return
# along the baseline for x = 10..0 so that each y value keeps its own x.
rev_baseline_values = baseline_values[:11]
rev_baseline_values.reverse()
area_x = list(range(11)) + list(range(10, -1, -1))
area_y = time_series[:11] + rev_baseline_values

trace = go.Scatter(
    x=x,
    y=time_series,
    mode='lines',
    marker=dict(
        color='#B292EA',
    ),
    name='Original Plot'
)

trace2 = go.Scatter(
    x=x,
    y=baseline_values,
    mode='markers',
    marker=dict(
        size=3,
        color='#EB55BF',
        symbol='open-circle'
    ),
    name='Baseline'
)

trace3 = go.Scatter(
    x=area_x,
    y=area_y,
    mode='lines+markers',
    marker=dict(
        size=4,
        color='rgb(255,0,0)',
    ),
    name='1st Peak Outline'
)

first_peak_x = [j for j in range(11)]
area_under_first_peak = np.trapz(time_series[:11], first_peak_x) - np.trapz(baseline_values[:11], first_peak_x)
area_under_first_peak

annotation = go.Annotation(
    x=80,
    y=1000,
    text='The peak integration for the first peak is approximately %s' % (area_under_first_peak),
    showarrow=False
)

layout = go.Layout(
    annotations=[annotation]
)
 
trace_data = [trace, trace2, trace3]
fig = go.Figure(data=trace_data, layout=layout)
py.iplot(fig, filename='milk-production-peak-integration')
Out[3]:
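The cell above hard-codes the first 11 samples as the extent of the peak. As a rough sketch of how the same outline and area could be computed for any sample range, the helper below takes a start and stop index. The name `peak_outline_and_area` and its signature are hypothetical and are not part of PeakUtils or Plotly; it assumes unit spacing between samples, as in the example above.

```python
import numpy as np

def peak_outline_and_area(data, baseline, start, stop):
    """Return (outline_x, outline_y, area) for samples start..stop-1.

    Hypothetical helper; assumes samples are spaced one unit apart.
    """
    data = np.asarray(data, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    xs = np.arange(start, stop)

    # Walk forward along the data, then back along the baseline,
    # so the outline closes on itself.
    outline_x = np.concatenate([xs, xs[::-1]])
    outline_y = np.concatenate([data[start:stop], baseline[start:stop][::-1]])

    # Trapezoidal rule applied to the gap between data and baseline.
    gap = data[start:stop] - baseline[start:stop]
    area = float(np.sum((gap[1:] + gap[:-1]) / 2.0))
    return outline_x.tolist(), outline_y.tolist(), area

# Tiny made-up example: a triangular peak of height 2 on a flat baseline.
outline_x, outline_y, area = peak_outline_and_area([1, 3, 1], [1, 1, 1], 0, 3)
print(outline_x, outline_y, area)
```

With the tutorial's data, calling the helper with `time_series`, `baseline_values`, `0`, and `11` would cover the same range as the first peak outlined above.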