dsci_524_ezplot.plot_scatterplot

Functions

plot_scatterplot(df, x, y[, color, title, xlabel, ylabel])

Create a scatter plot from a pandas DataFrame or NumPy array.

Module Contents

dsci_524_ezplot.plot_scatterplot.plot_scatterplot(df, x, y, color=None, title=None, xlabel=None, ylabel=None)

Create a scatter plot from a pandas DataFrame or NumPy array.

Parameters:
  • df (pandas.DataFrame or numpy.ndarray) – The dataset containing the variables to plot. Must be a pandas DataFrame or a NumPy array.

  • x (str) – The name of the column to use for the x-axis values.

  • y (str) – The name of the column to use for the y-axis values.

  • color (str, optional) – The name of the column to use for color-coding the points. If the column is categorical, colors will be mapped to unique categories (default is None).

  • title (str, optional) – The title of the scatter plot (default is None).

  • xlabel (str, optional) – The label for the x-axis (default is None).

  • ylabel (str, optional) – The label for the y-axis (default is None).

Returns:

The Matplotlib figure and axes objects containing the scatter plot.

Return type:

matplotlib.figure.Figure, matplotlib.axes.Axes

Raises:
  • TypeError – If the input data is not a pandas DataFrame or NumPy array, or if the x or y column contains non-numeric or mixed data types.

  • ValueError – If the DataFrame or NumPy array is empty.
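The documented error conditions can be illustrated with a minimal sketch. The helper below, check_scatter_inputs, is hypothetical and is not part of dsci_524_ezplot; it only mirrors the checks listed above, and it assumes the numeric-column check applies to DataFrame input. The package's actual implementation and error messages may differ.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype


def check_scatter_inputs(df, x, y):
    """Hypothetical helper mirroring the documented Raises conditions."""
    # TypeError: the input must be a pandas DataFrame or a NumPy array.
    if not isinstance(df, (pd.DataFrame, np.ndarray)):
        raise TypeError("df must be a pandas DataFrame or a NumPy array")

    # ValueError: the input must not be empty.
    is_empty = df.empty if isinstance(df, pd.DataFrame) else df.size == 0
    if is_empty:
        raise ValueError("df must not be empty")

    # TypeError: the x and y columns must be numeric.
    # Assumption: this check is only sketched for DataFrame input.
    if isinstance(df, pd.DataFrame):
        for col in (x, y):
            if not is_numeric_dtype(df[col]):
                raise TypeError(f"column {col!r} must be numeric")


# A mixed-type column is stored with object dtype, so it fails the numeric check.
mixed = pd.DataFrame({"height": [150, 160], "weight": [50, "heavy"]})
try:
    check_scatter_inputs(mixed, "height", "weight")
except TypeError as err:
    print(f"TypeError: {err}")
```

Note that the sketch performs the type check before the emptiness check, so an unsupported input type is reported as a TypeError even when it is also empty. Whether the package uses the same order is not stated in this reference.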

Example

>>> import pandas as pd
>>> from dsci_524_ezplot.plot_scatterplot import plot_scatterplot
>>> df = pd.DataFrame({
...     'height': [150, 160, 165, 170],
...     'weight': [50, 60, 65, 70],
...     'category': ['small', 'medium', 'medium', 'large']
... })
>>> fig, ax = plot_scatterplot(df, x='height', y='weight', color='category',
...                            title='Height vs. Weight',
...                            xlabel='Height (cm)', ylabel='Weight (kg)')
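
Because the function returns standard Matplotlib objects, they can be adjusted or saved with the usual Matplotlib methods. The sketch below is illustrative only: it builds a stand-in figure with plt.subplots instead of calling plot_scatterplot, so that it runs without the package installed, and the output path and styling choices are assumptions.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Stand-in for: fig, ax = plot_scatterplot(df, x='height', y='weight')
fig, ax = plt.subplots()
ax.scatter([150, 160, 165, 170], [50, 60, 65, 70])

# Adjust the axes after the plot has been created.
ax.set_xlim(140, 180)
ax.grid(True)

# Save the figure to a temporary location (the path is an arbitrary choice).
out_path = os.path.join(tempfile.gettempdir(), "scatter.png")
fig.savefig(out_path, dpi=150)
plt.close(fig)
```

With the real function, the same calls would apply to the fig and ax returned in the example above.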