
Histogram Concepts and Components

Jul 12, 2025

Overview

This lecture introduces histograms as a preferred method for displaying large sets of numerical data, explains their construction, and covers concepts such as bin width and relative frequency.

Introduction to Histograms

  • Histograms are used to display large datasets of numerical variables.
  • Unlike dot plots, which mark each observation individually, histograms group data into bins and draw one bar (rectangle) per bin, so they stay readable for large datasets.

Components of a Histogram

  • The x-axis (horizontal) represents the variable of interest (e.g., number of home runs).
  • The y-axis (vertical) represents the frequency (number of occurrences in each bin).
  • Bins are equally sized intervals on the x-axis where data points are grouped.

Creating and Interpreting Histograms

  • Data is sorted into bins by value; the height of each bar shows the frequency for that range.
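  • To avoid overlap, each bin includes its left endpoint but excludes its right endpoint (a half-open interval, written [a, b) in interval notation).
  • Example: For home runs with a bin width of 50, the first bin might be [80, 130), which includes 80 but excludes 130; for whole-number counts, that covers 80 through 129.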
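
The binning rule above can be sketched in a few lines of Python. This is a minimal illustration, not a standard routine: the function name bin_counts and the home run values are made up for the example, and the starting value 80 and bin width 50 follow the example in the notes.

```python
def bin_counts(values, start, width):
    """Count values per half-open bin [left, left + width).

    Hypothetical helper for illustration; assumes every value >= start.
    """
    counts = {}
    for v in values:
        # Integer division finds which bin the value falls into.
        left = start + ((v - start) // width) * width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))

# Made-up home run totals, chosen to sit on and around bin edges.
home_runs = [80, 95, 129, 130, 131, 179, 180, 240]
counts = bin_counts(home_runs, start=80, width=50)

for left, count in counts.items():
    print(f"[{left}, {left + 50}): {count}")
```

Note that 129 lands in the first bin while 130 starts the second, which is exactly the "left included, right excluded" rule. Bins with no observations are simply absent from the result in this sketch.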

Bin Width and its Effects

  • Bin width is the range each bin covers (e.g., 50 or 10).
  • Changing bin width alters the histogram's shape and interpretation.
  • Choosing a bin width matters: bins that are too wide hide detail, while bins that are too narrow produce a jagged shape that obscures the overall trend.
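
To see how bin width changes the picture, the sketch below bins the same data twice, once with width 50 and once with width 10, and prints a text-only histogram for each. The data values and the helper name histogram are assumptions made for this example.

```python
def histogram(values, start, width):
    """Return {left_edge: count} for half-open bins [left, left + width).

    Hypothetical helper for illustration; assumes every value >= start.
    """
    counts = {}
    for v in values:
        left = start + ((v - start) // width) * width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))

# Made-up data, used only to compare the two bin widths.
data = [82, 85, 91, 104, 118, 121, 127, 133, 146, 172]

wide = histogram(data, start=80, width=50)
narrow = histogram(data, start=80, width=10)

for label, width, result in [("width 50", 50, wide), ("width 10", 10, narrow)]:
    print(label)
    for left, count in result.items():
        # One '#' per observation stands in for the bar height.
        print(f"  [{left}, {left + width}): {'#' * count}")
```

With width 50 the ten values collapse into two bars; with width 10 the same values spread across eight short bars, so the apparent shape depends heavily on the choice.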

Relative Frequency Histograms

  • Relative frequency is the fraction of observations in a bin (the bin's frequency divided by the total number of observations, n).
  • Relative frequency histograms have decimals or percentages on the y-axis instead of counts.
  • The x-axis and bin width remain unchanged when switching between frequency and relative frequency histograms.
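
As a small worked example of the calculation, the sketch below converts bin frequencies to relative frequencies. The counts are made up for illustration; the keys are the left edges of bins with width 50.

```python
# Made-up frequencies: left edge of each bin -> count in that bin.
frequencies = {80: 7, 130: 3}

# n is the total number of observations across all bins.
n = sum(frequencies.values())

# Relative frequency = frequency / n; the bins themselves do not change.
relative = {left: count / n for left, count in frequencies.items()}

for left, share in relative.items():
    print(f"[{left}, {left + 50}): {share:.2f} ({share:.0%})")
```

Only the y-axis values change (counts become proportions); the bin edges are identical, and the relative frequencies add up to 1.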

Key Terms & Definitions

  • Histogram — A bar graph for numerical data, grouping data into intervals (bins) on a number line.
  • Bin — An interval grouping data points on the x-axis.
  • Bin Width — The size of each bin; determines how data is grouped.
  • Frequency — The count of data points within a bin.
  • Relative Frequency — The proportion of data points in a bin, often shown as a decimal or percentage.
  • n — The total number of observations in the dataset.

Action Items / Next Steps

  • Practice creating histograms with sample data, adjusting bin widths.
  • Calculate frequency and relative frequency for given datasets.
  • Review the Chapter 1 reading as a foundation for these concepts.