Matplotlib Basics and Plot Customization

Aug 24, 2024

Lecture Notes on Matplotlib in Python

Overview

  • Continuing with Matplotlib basics.
  • Recap of last session: Hierarchical data structures in Matplotlib.
    • Figure: The top-level container (blank canvas).
    • Axes: An individual plotting area within a figure; one figure can hold several Axes.
    • X-axis and Y-axis: The axis objects of each Axes, used for labels, ticks, and limits.
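
The hierarchy above can be sketched in a few lines. This is a minimal illustration, not the code from the session; the data values and label text are made up.

```python
import matplotlib.pyplot as plt

# Figure: the top-level container (the blank canvas).
fig = plt.figure()

# Axes: one plotting area inside the figure (a 1x1 grid, first cell).
ax = fig.add_subplot(1, 1, 1)
ax.plot([1, 2, 3, 4], [10, 20, 15, 30])  # made-up data

# X-axis and Y-axis: each Axes owns an xaxis and a yaxis object,
# which carry the labels, ticks, and limits.
ax.set_xlabel("x values")
ax.set_ylabel("y values")
ax.set_title("One Axes inside one Figure")

# The hierarchy can be inspected directly.
print(len(fig.axes))
print(type(ax.xaxis).__name__, type(ax.yaxis).__name__)

# plt.show()  # uncomment when running interactively
```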

Customizing Plots

  • Customization Features:
    • Assign colors (e.g., 'r' for red, 'g' for green, 'b' for blue).
    • Choose line styles: solid ('-'), dashed ('--'), dotted (':').
    • Add markers (e.g., 'o' for circles, '*' for stars, '^' for triangles).
  • Examples:
    • Red solid line: plt.plot(x, y, 'r-')
    • Green dashed line: plt.plot(x, y, 'g--')
    • Blue dotted line: plt.plot(x, y, 'b:')
    • Red solid line with circle markers: plt.plot(x, y, 'ro-')
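
A short sketch of the format strings listed above, drawn on one Axes so the styles can be compared. The sine and cosine data and the legend labels are assumptions chosen only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 20)  # illustrative data

fig, ax = plt.subplots()

# A format string combines a color, a line style, and optionally a marker.
red_line, = ax.plot(x, np.sin(x), "r-", label="red solid")
green_line, = ax.plot(x, np.cos(x), "g--", label="green dashed")
blue_line, = ax.plot(x, np.sin(x) + 2, "b:", label="blue dotted")

# Markers: 'o' circle, '*' star, '^' triangle.
star_line, = ax.plot(x, np.cos(x) + 2, "k*-", label="black solid with stars")

ax.legend()

# plt.show()  # uncomment when running interactively
```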

Multiple Plots in a Single Figure

  • Challenge with a Shared Y-axis: When two series have very different scales, the smaller one is flattened and its variation becomes hard to see.
  • Creating Multiple Axes:
    • Use plt.subplot() to create a 1x2 grid for two plots.
    • Example: plt.subplot(1, 2, 1) for the first plot, plt.subplot(1, 2, 2) for the second.
  • Independent Axes: Each subplot can have its own labels and legends.
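
A minimal sketch of the 1x2 layout described above. The two series (revenue and conversion_rate) are hypothetical numbers picked to have very different scales; they are not data from the session.

```python
import matplotlib.pyplot as plt

# Hypothetical values: two series on very different scales.
months = [1, 2, 3, 4, 5, 6]
revenue = [12000, 15000, 14000, 18000, 21000, 25000]
conversion_rate = [0.021, 0.025, 0.019, 0.030, 0.028, 0.035]

plt.figure(figsize=(10, 4))

# First cell of the 1x2 grid: revenue with its own y-scale.
ax_left = plt.subplot(1, 2, 1)
plt.plot(months, revenue, "b-", label="revenue")
plt.xlabel("month")
plt.ylabel("revenue")
plt.legend()

# Second cell: conversion rate with an independent y-scale,
# its own labels, and its own legend.
ax_right = plt.subplot(1, 2, 2)
plt.plot(months, conversion_rate, "g--", label="conversion rate")
plt.xlabel("month")
plt.ylabel("conversion rate")
plt.legend()

plt.tight_layout()  # keep the two sets of labels from overlapping
fig = plt.gcf()

# plt.show()  # uncomment when running interactively
```

Plotted on a single shared y-axis, the conversion rate would sit as a flat line near zero next to the revenue values; with separate Axes, each series gets a scale that fits it.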

Advanced Charting Techniques

  • Creating Larger Figures: Use the figsize parameter, given as (width, height) in inches, e.g. plt.figure(figsize=(10, 6)).
  • Spanning Axes: To make one plot span several grid cells, specify the range of cell indices it should cover.
  • Common Types of Plots:
    • Line Charts: For time series data.
    • Bar Charts: For categorical comparisons.
    • Box Plots: For visualizing distributions and outliers.
    • Histograms: For visualizing frequency distributions.
    • Scatter Plots: For observing relationships between two numerical columns.
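
One way to combine figsize with a spanning plot is sketched below. It assumes the tuple form of the index argument to plt.subplot(), where (1, 2) means "from cell 1 through cell 2"; the session may have used a different approach. The random-walk data is generated only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
days = np.arange(30)
values = rng.normal(size=30).cumsum()  # illustrative random walk

# A larger canvas: 10 inches wide, 6 inches tall.
fig = plt.figure(figsize=(10, 6))

# Top row: one Axes spanning cells 1 and 2 of a 2x2 grid.
ax_top = plt.subplot(2, 2, (1, 2))
ax_top.plot(days, values)
ax_top.set_title("Line chart spanning the top row")

# Bottom row: two ordinary single-cell Axes.
ax_hist = plt.subplot(2, 2, 3)
ax_hist.hist(values, bins=8)
ax_hist.set_title("Histogram")

ax_scatter = plt.subplot(2, 2, 4)
ax_scatter.scatter(days, values)
ax_scatter.set_title("Scatter plot")

fig.tight_layout()

# plt.show()  # uncomment when running interactively
```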

Using Pandas with Matplotlib

  • Pandas Integration: Simplifies plotting with DataFrames.
    • Example: dataframe.plot(x='date', y='count') for a time series.
  • Creating Bar Charts: Group by the categorical column, aggregate the numerical column, and plot the result.
    • Example: Average premium by age, computed with groupby and plotted as a bar chart.
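
Both examples above can be sketched with small hypothetical DataFrames. The column names date, count, age, and premium follow the notes; the values, the variable names, and the side-by-side layout are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Hypothetical time series with 'date' and 'count' columns.
daily = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=7, freq="D"),
    "count": [5, 8, 6, 10, 12, 9, 14],
})
daily.plot(x="date", y="count", ax=ax_line, title="Daily count")

# Hypothetical insurance data: several policies per age.
policies = pd.DataFrame({
    "age": [25, 25, 35, 35, 45, 45],
    "premium": [200.0, 240.0, 300.0, 320.0, 410.0, 390.0],
})

# Categorical vs numerical: group by age, then average the premium.
avg_premium = policies.groupby("age")["premium"].mean()
avg_premium.plot(kind="bar", ax=ax_bar, title="Average premium by age")
ax_bar.set_ylabel("average premium")

fig.tight_layout()

# plt.show()  # uncomment when running interactively
```

Passing ax explicitly keeps each plot on its own Axes; without it, a Series plot may be drawn onto whichever Axes is currently active.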

Summary of Plot Types

  • Line Charts: Good for trends over time.
  • Bar Charts: Useful for category comparisons.
  • Box Plots: Summarize a distribution and flag outliers.
  • Histograms: Visualize frequency distributions.
  • Scatter Plots: Analyze relationships between two variables.
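
The five plot types can be placed side by side for comparison. This is only a sketch: all data is randomly generated or made up, and the 2x3 grid with one hidden cell is an arbitrary layout choice.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
samples = rng.normal(loc=50, scale=10, size=200)  # for box plot and histogram
x = rng.uniform(0, 10, size=50)                   # for scatter plot
y = 2 * x + rng.normal(scale=2, size=50)
trend = rng.normal(size=12).cumsum()              # for line chart

fig, axes = plt.subplots(2, 3, figsize=(12, 6))

axes[0, 0].plot(range(12), trend)
axes[0, 0].set_title("Line chart: trend over time")

axes[0, 1].bar(["A", "B", "C"], [4, 7, 3])
axes[0, 1].set_title("Bar chart: category comparison")

axes[0, 2].boxplot(samples)
axes[0, 2].set_title("Box plot: distribution and outliers")

axes[1, 0].hist(samples, bins=15)
axes[1, 0].set_title("Histogram: frequency distribution")

axes[1, 1].scatter(x, y)
axes[1, 1].set_title("Scatter plot: two variables")

axes[1, 2].set_visible(False)  # sixth cell is unused

fig.tight_layout()

# plt.show()  # uncomment when running interactively
```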

Next Session Preview

  • Introduction to Seaborn: A visualization library built on top of Matplotlib that produces more polished plots by default.
  • Comparison with Matplotlib regarding ease of use and aesthetics.