plotly.express.box

plotly.express.box(data_frame=None, x=None, y=None, color=None, facet_row=None, facet_col=None, facet_col_wrap=0, hover_name=None, hover_data=None, custom_data=None, animation_frame=None, animation_group=None, category_orders={}, labels={}, color_discrete_sequence=None, color_discrete_map={}, orientation='v', boxmode='group', log_x=False, log_y=False, range_x=None, range_y=None, points=None, notched=False, title=None, template=None, width=None, height=None)

In a box plot, rows of data_frame are grouped together into a box-and-whisker mark to visualize their distribution.

Each box spans from quartile 1 (Q1) to quartile 3 (Q3). The second quartile (Q2) is marked by a line inside the box. By default, the whiskers correspond to the box’ edges +/- 1.5 times the interquartile range (IQR: Q3-Q1), see “points” for other options.

Parameters
  • data_frame (DataFrame or array-like or dict) – This argument needs to be passed for column names (and not keyword names) to be used. Array-like and dict are tranformed internally to a pandas DataFrame.

  • x (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to position marks along the x axis in cartesian coordinates. For horizontal histograms, these values are used as inputs to histfunc.

  • y (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to position marks along the y axis in cartesian coordinates. For vertical histograms, these values are used as inputs to histfunc.

  • color (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign color to marks.

  • facet_row (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign marks to facetted subplots in the vertical direction.

  • facet_col (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign marks to facetted subplots in the horizontal direction.

  • facet_col_wrap (int) – Maximum number of facet columns. Wraps the column variable at this width, so that the column facets span multiple rows. Ignored if 0, and forced to 0 if facet_row or a marginal is set.

  • hover_name (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like appear in bold in the hover tooltip.

  • hover_data (list of str or int, or Series or array-like) – Either names of columns in data_frame, or pandas Series, or array_like objects Values from these columns appear as extra data in the hover tooltip.

  • custom_data (list of str or int, or Series or array-like) – Either names of columns in data_frame, or pandas Series, or array_like objects Values from these columns are extra data, to be used in widgets or Dash callbacks for example. This data is not user-visible but is included in events emitted by the figure (lasso selection etc.)

  • animation_frame (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to assign marks to animation frames.

  • animation_group (str or int or Series or array-like) – Either a name of a column in data_frame, or a pandas Series or array_like object. Values from this column or array_like are used to provide object-constancy across animation frames: rows with matching `animation_group`s will be treated as if they describe the same object in each frame.

  • category_orders (dict with str keys and list of str values (default {})) – By default, in Python 3.6+, the order of categorical values in axes, legends and facets depends on the order in which these values are first encountered in data_frame (and no order is guaranteed by default in Python below 3.6). This parameter is used to force a specific ordering of values per column. The keys of this dict should correspond to column names, and the values should be lists of strings corresponding to the specific display order desired.

  • labels (dict with str keys and str values (default {})) – By default, column names are used in the figure for axis titles, legend entries and hovers. This parameter allows this to be overridden. The keys of this dict should correspond to column names, and the values should correspond to the desired label to be displayed.

  • color_discrete_sequence (list of str) – Strings should define valid CSS-colors. When color is set and the values in the corresponding column are not numeric, values in that column are assigned colors by cycling through color_discrete_sequence in the order described in category_orders, unless the value of color is a key in color_discrete_map. Various useful color sequences are available in the plotly_express.colors submodules, specifically plotly_express.colors.qualitative.

  • color_discrete_map (dict with str keys and str values (default {})) – String values should define valid CSS-colors Used to override color_discrete_sequence to assign a specific colors to marks corresponding with specific values. Keys in color_discrete_map should be values in the column denoted by color.

  • orientation (str (default 'v')) – One of 'h' for horizontal or 'v' for vertical)

  • boxmode (str (default 'group')) – One of 'group' or 'overlay' In 'overlay' mode, boxes are on drawn top of one another. In 'group' mode, baxes are placed beside each other.

  • log_x (boolean (default False)) – If True, the x-axis is log-scaled in cartesian coordinates.

  • log_y (boolean (default False)) – If True, the y-axis is log-scaled in cartesian coordinates.

  • range_x (list of two numbers) – If provided, overrides auto-scaling on the x-axis in cartesian coordinates.

  • range_y (list of two numbers) – If provided, overrides auto-scaling on the y-axis in cartesian coordinates.

  • points (str or boolean (default 'outliers')) – One of 'outliers', 'suspectedoutliers', 'all', or False. If 'outliers', only the sample points lying outside the whiskers are shown. If 'suspectedoutliers', all outlier points are shown and those less than 4*Q1-3*Q3 or greater than 4*Q3-3*Q1 are highlighted with the marker’s 'outliercolor'. If 'outliers', only the sample points lying outside the whiskers are shown. If 'all', all sample points are shown. If False, no sample points are shown and the whiskers extend to the full range of the sample.

  • notched (boolean (default False)) – If True, boxes are drawn with notches.

  • title (str) – The figure title.

  • template (or dict or plotly.graph_objects.layout.Template instance) – The figure template name or definition.

  • width (int (default None)) – The figure width in pixels.

  • height (int (default 600)) – The figure height in pixels.

Returns

Return type

A Figure object.