plotly.express.scatter

plotly.express.scatter(data_frame=None, x=None, y=None, color=None, symbol=None, size=None, hover_name=None, hover_data=None, custom_data=None, text=None, facet_row=None, facet_col=None, facet_col_wrap=0, facet_row_spacing=None, facet_col_spacing=None, error_x=None, error_x_minus=None, error_y=None, error_y_minus=None, animation_frame=None, animation_group=None, category_orders=None, labels=None, orientation=None, color_discrete_sequence=None, color_discrete_map=None, color_continuous_scale=None, range_color=None, color_continuous_midpoint=None, symbol_sequence=None, symbol_map=None, opacity=None, size_max=None, marginal_x=None, marginal_y=None, trendline=None, trendline_options=None, trendline_color_override=None, trendline_scope='trace', log_x=False, log_y=False, range_x=None, range_y=None, render_mode='auto', title=None, template=None, width=None, height=None) → plotly.graph_objects._figure.Figure

In a scatter plot, each row of data_frame is represented by a symbol mark in 2D space.

- Parameters
data_frame (DataFrame or array-like or dict) – This argument needs to be passed for column names (and not keyword names) to be used. Array-like and dict are transformed internally to a pandas DataFrame. Optional: if missing, a DataFrame gets constructed under the hood using the other arguments.
x (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to position marks along the x axis in cartesian coordinates. Either x or y can optionally be a list of column references or array_likes, in which case the data will be treated as if it were ‘wide’ rather than ‘long’.
y (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to position marks along the y axis in cartesian coordinates. Either x or y can optionally be a list of column references or array_likes, in which case the data will be treated as if it were ‘wide’ rather than ‘long’.
color (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign color to marks.
symbol (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign symbols to marks.
size (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign mark sizes.
hover_name (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like appear in bold in the hover tooltip.
hover_data (str, or list of str or int, or Series or array-like, or dict) – Either a name or list of names of columns in data_frame, or pandas Series, or array_like objects, or a dict with column names as keys. The dict values may be True (for default formatting), False (to remove this column from hover information), a formatting string (for example ‘:.3f’ or ‘|%a’), list-like data to appear in the hover tooltip, or tuples with a bool or formatting string as first element and list-like data to appear in hover as second element. Values from these columns appear as extra data in the hover tooltip.
custom_data (str, or list of str or int, or Series or array-like) – Either a name or list of names of columns in data_frame, or pandas Series, or array_like objects. Values from these columns are extra data, to be used in widgets or Dash callbacks for example. This data is not user-visible but is included in events emitted by the figure (lasso selection etc.).
text (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like appear in the figure as text labels.
facet_row (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign marks to faceted subplots in the vertical direction.
facet_col (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign marks to faceted subplots in the horizontal direction.
facet_col_wrap (int) – Maximum number of facet columns. Wraps the column variable at this width, so that the column facets span multiple rows. Ignored if 0, and forced to 0 if facet_row or a marginal is set.
facet_row_spacing (float between 0 and 1) – Spacing between facet rows, in paper units. Default is 0.03 or 0.07 when facet_col_wrap is used.
facet_col_spacing (float between 0 and 1) – Spacing between facet columns, in paper units. Default is 0.02.
error_x (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to size x-axis error bars. If error_x_minus is None, error bars will be symmetrical, otherwise error_x is used for the positive direction only.
error_x_minus (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to size x-axis error bars in the negative direction. Ignored if error_x is None.
error_y (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to size y-axis error bars. If error_y_minus is None, error bars will be symmetrical, otherwise error_y is used for the positive direction only.
error_y_minus (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to size y-axis error bars in the negative direction. Ignored if error_y is None.
animation_frame (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign marks to animation frames.
animation_group (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to provide object-constancy across animation frames: rows with matching animation_group values will be treated as if they describe the same object in each frame.
category_orders (dict with str keys and list of str values (default {})) – By default, in Python 3.6+, the order of categorical values in axes, legends and facets depends on the order in which these values are first encountered in data_frame (and no order is guaranteed by default in Python below 3.6). This parameter is used to force a specific ordering of values per column. The keys of this dict should correspond to column names, and the values should be lists of strings corresponding to the specific display order desired.
labels (dict with str keys and str values (default {})) – By default, column names are used in the figure for axis titles, legend entries and hovers. This parameter allows this to be overridden. The keys of this dict should correspond to column names, and the values should correspond to the desired label to be displayed.
orientation (str, one of 'h' for horizontal or 'v' for vertical) – Default is 'v' if x and y are provided and both continuous or both categorical; otherwise 'v' ('h') if x (y) is categorical and y (x) is continuous; otherwise 'v' ('h') if only x (y) is provided.
color_discrete_sequence (list of str) – Strings should define valid CSS-colors. When color is set and the values in the corresponding column are not numeric, values in that column are assigned colors by cycling through color_discrete_sequence in the order described in category_orders, unless the value of color is a key in color_discrete_map. Various useful color sequences are available in the plotly.express.colors submodules, specifically plotly.express.colors.qualitative.
color_discrete_map (dict with str keys and str values (default {})) – String values should define valid CSS-colors. Used to override color_discrete_sequence to assign specific colors to marks corresponding with specific values. Keys in color_discrete_map should be values in the column denoted by color. Alternatively, if the values of color are valid colors, the string 'identity' may be passed to cause them to be used directly.
color_continuous_scale (list of str) – Strings should define valid CSS-colors. This list is used to build a continuous color scale when the column denoted by color contains numeric data. Various useful color scales are available in the plotly.express.colors submodules, specifically plotly.express.colors.sequential, plotly.express.colors.diverging and plotly.express.colors.cyclical.
range_color (list of two numbers) – If provided, overrides auto-scaling on the continuous color scale.
color_continuous_midpoint (number (default None)) – If set, computes the bounds of the continuous color scale to have the desired midpoint. Setting this value is recommended when using plotly.express.colors.diverging color scales as the inputs to color_continuous_scale.
symbol_sequence (list of str) – Strings should define valid plotly.js symbols. When symbol is set, values in that column are assigned symbols by cycling through symbol_sequence in the order described in category_orders, unless the value of symbol is a key in symbol_map.
symbol_map (dict with str keys and str values (default {})) – String values should define plotly.js symbols. Used to override symbol_sequence to assign specific symbols to marks corresponding with specific values. Keys in symbol_map should be values in the column denoted by symbol. Alternatively, if the values of symbol are valid symbol names, the string 'identity' may be passed to cause them to be used directly.
opacity (float) – Value between 0 and 1. Sets the opacity for markers.
size_max (int (default 20)) – Set the maximum mark size when using size.
marginal_x (str) – One of 'rug', 'box', 'violin', or 'histogram'. If set, a horizontal subplot is drawn above the main plot, visualizing the x-distribution.
marginal_y (str) – One of 'rug', 'box', 'violin', or 'histogram'. If set, a vertical subplot is drawn to the right of the main plot, visualizing the y-distribution.
trendline (str) – One of 'ols', 'lowess', 'rolling', 'expanding' or 'ewm'. If 'ols', an Ordinary Least Squares regression line will be drawn for each discrete-color/symbol group. If 'lowess', a Locally Weighted Scatterplot Smoothing line will be drawn for each discrete-color/symbol group. If 'rolling', a Rolling (e.g. rolling average, rolling median) line will be drawn for each discrete-color/symbol group. If 'expanding', an Expanding (e.g. expanding average, expanding sum) line will be drawn for each discrete-color/symbol group. If 'ewm', an Exponentially Weighted Moment (e.g. exponentially-weighted moving average) line will be drawn for each discrete-color/symbol group. See the docstrings for the functions in plotly.express.trendline_functions for more details on these functions and how to configure them with the trendline_options argument.
trendline_options (dict) – Options passed as the first argument to the function from plotly.express.trendline_functions named in the trendline argument.
trendline_color_override (str) – Valid CSS color. If provided, and if trendline is set, all trendlines will be drawn in this color rather than in the same color as the traces from which they draw their inputs.
trendline_scope (str (one of 'trace' or 'overall', default 'trace')) – If 'trace', then one trendline is drawn per trace (i.e. per color, symbol, facet, animation frame etc) and if 'overall' then one trendline is computed for the entire dataset, and replicated across all facets.
log_x (boolean (default False)) – If True, the x-axis is log-scaled in cartesian coordinates.
log_y (boolean (default False)) – If True, the y-axis is log-scaled in cartesian coordinates.
range_x (list of two numbers) – If provided, overrides auto-scaling on the x-axis in cartesian coordinates.
range_y (list of two numbers) – If provided, overrides auto-scaling on the y-axis in cartesian coordinates.
render_mode (str) – One of 'auto', 'svg' or 'webgl', default 'auto'. Controls the browser API used to draw marks. 'svg' is appropriate for figures of fewer than 1000 data points, and will allow for fully-vectorized output. 'webgl' is likely necessary for acceptable performance above 1000 points but rasterizes part of the output. 'auto' uses heuristics to choose the mode.
title (str) – The figure title.
template (str or dict or plotly.graph_objects.layout.Template instance) – The figure template name (must be a key in plotly.io.templates) or definition.
width (int (default None)) – The figure width in pixels.
height (int (default None)) – The figure height in pixels.
- Return type
plotly.graph_objects._figure.Figure