dsci_524_ezplot.plot_histogram

Functions

plot_histogram(df[, column, bins, title, xlabel, ...])

Create a histogram for numeric data or a bar plot for categorical data

Module Contents

dsci_524_ezplot.plot_histogram.plot_histogram(df, column=None, bins=10, title=None, xlabel=None, ylabel=None, color=None)

Create a histogram for numeric data or a bar plot for categorical data from a pandas DataFrame or a NumPy array.

Parameters:
  • df (pandas.DataFrame or numpy.ndarray) – Input data containing the values to plot.

  • column (str or None, optional) – Column name for the values to plot in the histogram/bar plot (only applicable for DataFrame). If None, all columns will be used (only for DataFrame input with numeric data).

  • bins (int, optional) – Number of bins for the histogram. Default is 10 (ignored for categorical data).

  • title (str, optional) – Title of the plot. If None, no title is added.

  • xlabel (str, optional) – Label for the x-axis. If None, no label is added.

  • ylabel (str, optional) – Label for the y-axis. If None, no label is added.

  • color (str or list, optional) – Color for the bars in the plot. Default is None (Matplotlib default colors are used).

Returns:

  • matplotlib.figure.Figure

    The figure object containing the plot.

  • matplotlib.axes.Axes

    The axes object containing the plot elements.

Return type:

tuple
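The numeric-versus-categorical behaviour and the (figure, axes) return value described above could be sketched roughly as follows. This is not the package's implementation: `sketch_histogram` is a hypothetical stand-in that assumes DataFrame input and a single named column, and it uses the non-interactive Agg backend only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype


def sketch_histogram(df, column, bins=10, color=None):
    """Hypothetical stand-in, not the package's actual code."""
    values = df[column].dropna()
    fig, ax = plt.subplots()
    if is_numeric_dtype(values):
        # Numeric data: bin the values into a histogram.
        ax.hist(values, bins=bins, color=color)
    else:
        # Categorical data: one bar per category; bins is ignored.
        counts = values.value_counts()
        ax.bar(counts.index.astype(str), counts.values, color=color)
    return fig, ax


df = pd.DataFrame({
    "values": np.arange(100, dtype=float),
    "group": ["a", "b"] * 50,
})
fig_num, ax_num = sketch_histogram(df, "values", bins=20)
fig_cat, ax_cat = sketch_histogram(df, "group")

# The returned axes can be customised further with standard Matplotlib calls.
ax_num.set_title("Histogram of Values")
```

Returning both objects lets the caller adjust labels through the axes and save or close the plot through the figure.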

Raises:
  • TypeError – If input data is not a DataFrame or a NumPy array.

  • ValueError – If the data is empty or contains only NaN values, or if bins is not a positive integer.
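The error conditions listed above could be checked along these lines. This is a minimal sketch only: `check_inputs` and `raised` are hypothetical helpers, and the order of the checks and the error messages are assumptions, not taken from the package.

```python
import numpy as np
import pandas as pd


def check_inputs(data, bins=10):
    """Hypothetical validation helper mirroring the documented errors."""
    if not isinstance(data, (pd.DataFrame, np.ndarray)):
        raise TypeError("data must be a pandas DataFrame or a NumPy array")
    # bool is a subclass of int, so it is rejected explicitly
    # (an assumption of this sketch, not documented behaviour).
    if isinstance(bins, bool) or not isinstance(bins, (int, np.integer)) or bins <= 0:
        raise ValueError("bins must be a positive integer")
    values = data.to_numpy() if isinstance(data, pd.DataFrame) else data
    if values.size == 0:
        raise ValueError("data is empty")
    if pd.isna(values).all():
        raise ValueError("data contains only NaN values")


def raised(exc_type, *args):
    """Return True if check_inputs(*args) raises exc_type."""
    try:
        check_inputs(*args)
    except exc_type:
        return True
    return False


print(raised(TypeError, [1, 2, 3]))                    # a plain list is rejected
print(raised(ValueError, np.array([np.nan, np.nan])))  # all-NaN data is rejected
print(raised(ValueError, np.array([1.0, 2.0]), 0))     # bins must be positive
```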

Examples

Using a pandas DataFrame:

>>> import pandas as pd
>>> import numpy as np
>>> from dsci_524_ezplot.plot_histogram import plot_histogram
>>> df = pd.DataFrame({'values': np.random.randn(100)})
>>> fig, ax = plot_histogram(df, column='values', bins=20, title="Histogram of Values", xlabel="Values", ylabel="Frequency")

Using a NumPy array:

>>> arr = np.random.randn(100)
>>> fig, ax = plot_histogram(arr, bins=15, title="Histogram from NumPy Array", xlabel="Values", ylabel="Frequency")