plotly.express.data package

Built-in datasets for demonstration, educational and test purposes.

plotly.express.data.carshare(return_type='pandas')

Each row represents the availability of car-sharing services near the centroid of a zone in Montreal over a month-long period.

Parameters

return_type ({'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}) – Type of the resulting dataframe

Returns

Dataframe with 249 rows and the following columns: ['centroid_lat', 'centroid_lon', 'car_hours', 'peak_hour'].

Return type

Dataframe of return_type type

plotly.express.data.election(return_type='pandas')

Each row represents voting results for an electoral district in the 2013 Montreal mayoral election.

Parameters

return_type ({'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}) – Type of the resulting dataframe

Returns

Dataframe with 58 rows and the following columns: ['district', 'Coderre', 'Bergeron', 'Joly', 'total', 'winner', 'result', 'district_id'].

Return type

Dataframe of return_type type

plotly.express.data.election_geojson()

Each feature represents an electoral district in the 2013 Montreal mayoral election.

Returns

A GeoJSON-formatted dict with 58 polygon or multi-polygon features whose id is an electoral district numerical ID and whose district property is the ID and district name.

plotly.express.data.experiment(indexed=False, return_type='pandas')

Each row in this wide dataset represents the results of one of 100 simulated participants on three hypothetical experiments, along with their gender and control/treatment group.

Parameters
  • indexed (bool) – If True, then the index is named “participant”. Applicable only if return_type='pandas'

  • return_type ({'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}) – Type of the resulting dataframe

Returns

Dataframe with 100 rows and the following columns: ['experiment_1', 'experiment_2', 'experiment_3', 'gender', 'group']. If indexed is True, the dataframe index is named “participant”.

Return type

Dataframe of return_type type

plotly.express.data.gapminder(datetimes=False, centroids=False, year=None, pretty_names=False, return_type='pandas')

Each row represents a country in a given year.

https://www.gapminder.org/data/

Parameters
  • datetimes (bool) – Whether or not the ‘year’ column will be converted to datetime type

  • centroids (bool) – If True, [‘centroid_lat’, ‘centroid_lon’] columns are added

  • year (int | None) – If provided, the dataset will be filtered for that year

  • pretty_names (bool) – If True, prettifies the column names

  • return_type ({'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}) – Type of the resulting dataframe

Returns

Dataframe with 1704 rows and the following columns: ['country', 'continent', 'year', 'lifeExp', 'pop', 'gdpPercap', 'iso_alpha', 'iso_num'].

If datetimes is True, the ‘year’ column will be a datetime column. If centroids is True, two new columns are added: [‘centroid_lat’, ‘centroid_lon’]. If year is an integer, the dataset will be filtered for that year.

Return type

Dataframe of return_type type

plotly.express.data.iris(return_type='pandas')

Each row represents a flower.

https://en.wikipedia.org/wiki/Iris_flower_data_set

Parameters

return_type ({'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}) – Type of the resulting dataframe

Returns

Dataframe with 150 rows and the following columns: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species', 'species_id'].

Return type

Dataframe of return_type type

plotly.express.data.medals_long(indexed=False, return_type='pandas')

This dataset represents the medal table for Olympic Short Track Speed Skating for the top three nations as of 2020.

Parameters
  • indexed (bool) – Whether or not the ‘nation’ column is used as the index. Applicable only if return_type='pandas'

  • return_type ({'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}) – Type of the resulting dataframe

Returns

Dataframe with 9 rows and the following columns: ['nation', 'medal', 'count']. If indexed is True, the ‘nation’ column is used as the index.

Return type

Dataframe of return_type type

plotly.express.data.medals_wide(indexed=False, return_type='pandas')

This dataset represents the medal table for Olympic Short Track Speed Skating for the top three nations as of 2020.

Parameters
  • indexed (bool) – Whether or not the ‘nation’ column is used as the index and the column index is named ‘medal’. Applicable only if return_type='pandas'

  • return_type ({'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}) – Type of the resulting dataframe

Returns

Dataframe with 3 rows and the following columns: ['nation', 'gold', 'silver', 'bronze']. If indexed is True, the ‘nation’ column is used as the index and the column index is named ‘medal’.

Return type

Dataframe of return_type type

plotly.express.data.stocks(indexed=False, datetimes=False, return_type='pandas')

Each row in this wide dataset represents closing prices from 6 tech stocks in 2018/2019.

Parameters
  • indexed (bool) – Whether or not the ‘date’ column is used as the index and the column index is named ‘company’. Applicable only if return_type='pandas'

  • datetimes (bool) – Whether or not the ‘date’ column will be of datetime type

  • return_type ({'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}) – Type of the resulting dataframe

Returns

Dataframe with 100 rows and the following columns: ['date', 'GOOG', 'AAPL', 'AMZN', 'FB', 'NFLX', 'MSFT']. If indexed is True, the ‘date’ column is used as the index and the column index is named ‘company’. If datetimes is True, the ‘date’ column will be a datetime column.

Return type

Dataframe of return_type type

plotly.express.data.tips(pretty_names=False, return_type='pandas')

Each row represents a restaurant bill.

https://vincentarelbundock.github.io/Rdatasets/doc/reshape2/tips.html

Parameters
  • pretty_names (bool) – If True, prettifies the column names

  • return_type ({'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}) – Type of the resulting dataframe

Returns

Dataframe with 244 rows and the following columns: ['total_bill', 'tip', 'sex', 'smoker', 'day', 'time', 'size'].

Return type

Dataframe of return_type type

plotly.express.data.wind(return_type='pandas')

Each row represents a level of wind intensity in a cardinal direction, and its frequency.

Parameters

return_type ({'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}) – Type of the resulting dataframe

Returns

Dataframe with 128 rows and the following columns: ['direction', 'strength', 'frequency'].

Return type

Dataframe of return_type type