Empirical Cumulative Distribution Plots in Python
How to add empirical cumulative distribution function (ECDF) plots.
New to Plotly?
Plotly is a free and open-source graphing library for Python. We recommend you read our Getting Started guide for the latest installation or upgrade instructions, then move on to our Plotly Fundamentals tutorials or dive straight in to some Basic Charts tutorials.
Overview¶
Empirical cumulative distribution function plots are a way to visualize the distribution of a variable, and Plotly Express has a built-in function, px.ecdf(), to generate such plots. Plotly Express is the easy-to-use, high-level interface to Plotly, which operates on a variety of types of data and produces easy-to-style figures.
Alternatives to ECDF plots for visualizing distributions include histograms, violin plots, box plots and strip charts.
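To make the idea of an ECDF concrete, here is a minimal pure-Python sketch of the quantity being plotted: for each sorted observation, the fraction of observations at or below it. The function name ecdf_points is hypothetical and this is not Plotly's implementation, only an illustration of the arithmetic.

```python
def ecdf_points(values):
    """Return (sorted_values, fractions), where fractions[i] is the share of
    observations less than or equal to sorted_values[i]."""
    ordered = sorted(values)
    n = len(ordered)
    # After sorting, the i-th point (0-based) has i + 1 observations at or below it.
    fractions = [(i + 1) / n for i in range(n)]
    return ordered, fractions


xs, ys = ecdf_points([3, 1, 2, 4])
print(xs)  # [1, 2, 3, 4]
print(ys)  # [0.25, 0.5, 0.75, 1.0]
```

Plotting ys against xs as a step-like line gives the shape that px.ecdf() draws for a single variable.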
Simple ECDF Plots¶
Providing a single column to the x variable yields a basic ECDF plot.
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill")
fig.show()
Providing multiple columns leverages Plotly Express' wide-form data support to show multiple variables on the same plot.
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x=["total_bill", "tip"])
fig.show()
It is also possible to map another variable to the color dimension of a plot.
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex")
fig.show()
Configuring the Y axis¶
By default, the Y axis shows probability, but it is also possible to show raw counts by setting the ecdfnorm argument to None, or to show percentages by setting it to "percent".
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", ecdfnorm=None)
fig.show()
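The three ecdfnorm settings described above differ only in how the cumulative counts are scaled. The sketch below illustrates that scaling in plain Python; the helper name normalize is hypothetical, and the code is an illustration under that assumption rather than Plotly's own code.

```python
def normalize(cumulative, ecdfnorm="probability"):
    """Scale a list of cumulative counts the way each ecdfnorm setting is described.

    cumulative: running counts in ascending order of the variable,
                so the last entry is the total.
    """
    total = cumulative[-1]
    if ecdfnorm is None:
        return list(cumulative)  # raw counts, unchanged
    if ecdfnorm == "probability":
        return [c / total for c in cumulative]  # fractions between 0 and 1
    if ecdfnorm == "percent":
        return [100.0 * c / total for c in cumulative]  # values between 0 and 100
    raise ValueError(f"unknown ecdfnorm: {ecdfnorm!r}")


counts = [1, 2, 3, 4]
print(normalize(counts, None))           # [1, 2, 3, 4]
print(normalize(counts, "probability"))  # [0.25, 0.5, 0.75, 1.0]
print(normalize(counts, "percent"))      # [25.0, 50.0, 75.0, 100.0]
```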
If a y value is provided, the Y axis shows the cumulative sum of y rather than counts.
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", y="tip", color="sex", ecdfnorm=None)
fig.show()
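When y is supplied with ecdfnorm=None, the height at each x is a running total of y taken in ascending order of x. A small hand computation, using a hypothetical helper cumulative_sum_by_x and made-up numbers, might look like this; it sketches the idea and is not the library's implementation.

```python
def cumulative_sum_by_x(xs, ys):
    """Sort (x, y) pairs by x and return (sorted_xs, running_totals_of_y)."""
    pairs = sorted(zip(xs, ys), key=lambda pair: pair[0])
    sorted_xs = []
    running_totals = []
    total = 0
    for x, y in pairs:
        total += y  # accumulate y instead of counting rows
        sorted_xs.append(x)
        running_totals.append(total)
    return sorted_xs, running_totals


# Illustrative values only: three bills with their tips.
bills = [30, 10, 20]
tips = [5, 2, 3]
print(cumulative_sum_by_x(bills, tips))  # ([10, 20, 30], [2, 5, 10])
```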
Reversed and Complementary CDF plots¶
By default, the Y value represents the fraction of the data that is at or below the value on the X axis. Setting ecdfmode to "reversed" reverses this, with the Y axis representing the fraction of the data at or above the X value. Setting ecdfmode to "complementary" plots 1-ECDF, meaning that the Y values represent the fraction of the data above the X value.
In standard mode (the default), the right-most point is at 1 (or the total count/sum, depending on ecdfnorm) and the left-most point is above 0.
import plotly.express as px
# Pass the values directly as x; no data frame is needed for this small example
fig = px.ecdf(x=[1, 2, 3, 4], markers=True, ecdfmode="standard",
              title="ecdfmode='standard' (Y=fraction at or below X value, this is the default)")
fig.show()
In reversed mode, the left-most point is at 1 (or the total count/sum, depending on ecdfnorm) and the right-most point is above 0.
import plotly.express as px
# Pass the values directly as x; no data frame is needed for this small example
fig = px.ecdf(x=[1, 2, 3, 4], markers=True, ecdfmode="reversed",
              title="ecdfmode='reversed' (Y=fraction at or above X value)")
fig.show()
In complementary mode, the right-most point is at 0 and no points are at 1 (or the total count/sum), because the CCDF is defined as 1-ECDF and the ECDF has no point at 0.
import plotly.express as px
# Pass the values directly as x; no data frame is needed for this small example
fig = px.ecdf(x=[1, 2, 3, 4], markers=True, ecdfmode="complementary",
              title="ecdfmode='complementary' (Y=fraction above X value)")
fig.show()
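The end-point behavior of the three modes can be checked by hand on the same four values. The following sketch assumes distinct values and the default probability normalization; ecdf_by_mode is a hypothetical helper written for illustration, not part of Plotly.

```python
def ecdf_by_mode(values, ecdfmode="standard"):
    """Return (sorted_values, heights) for distinct values under each mode.

    standard:      fraction of data at or below each value
    reversed:      fraction of data at or above each value
    complementary: fraction of data strictly above each value (1 - ECDF)
    """
    ordered = sorted(values)
    n = len(ordered)
    if ecdfmode == "standard":
        heights = [(i + 1) / n for i in range(n)]
    elif ecdfmode == "reversed":
        heights = [(n - i) / n for i in range(n)]
    elif ecdfmode == "complementary":
        heights = [(n - i - 1) / n for i in range(n)]
    else:
        raise ValueError(f"unknown ecdfmode: {ecdfmode!r}")
    return ordered, heights


data = [1, 2, 3, 4]
print(ecdf_by_mode(data, "standard")[1])       # [0.25, 0.5, 0.75, 1.0]
print(ecdf_by_mode(data, "reversed")[1])       # [1.0, 0.75, 0.5, 0.25]
print(ecdf_by_mode(data, "complementary")[1])  # [0.75, 0.5, 0.25, 0.0]
```

The standard heights end at 1 on the right, the reversed heights start at 1 on the left, and the complementary heights reach 0 on the right without ever touching 1.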
Orientation¶
By default, plots are oriented vertically (i.e. the variable is on the X axis and counted/summed upwards), but this can be overridden with the orientation argument.
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", y="tip", color="sex", ecdfnorm=None, orientation="h")
fig.show()
Markers and/or Lines¶
ECDF plots can be configured to show lines and/or markers.
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", markers=True)
fig.show()
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", markers=True, lines=False)
fig.show()
Marginal Plots¶
ECDF plots also support marginal plots.
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", markers=True, lines=False, marginal="histogram")
fig.show()
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", marginal="rug")
fig.show()
ECDF plots can also be split into facets with the facet_row and facet_col arguments.
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", facet_row="time", facet_col="day")
fig.show()
What About Dash?¶
Dash is an open-source framework for building analytical applications, with no JavaScript required, and it is tightly integrated with the Plotly graphing library.
Learn about how to install Dash at https://dash.plot.ly/installation.
Everywhere on this page that you see fig.show(), you can display the same figure in a Dash application by passing it to the figure argument of the Graph component from the built-in dash.dcc module like this:
import plotly.graph_objects as go # or plotly.express as px
fig = go.Figure() # or any Plotly Express function e.g. px.bar(...)
# fig.add_trace( ... )
# fig.update_layout( ... )
from dash import Dash, dcc, html
app = Dash()
app.layout = html.Div([
dcc.Graph(figure=fig)
])
app.run(debug=True, use_reloader=False) # Turn off reloader if inside Jupyter; older Dash versions use app.run_server