Data Visualization with Matplotlib

Jul 26, 2024

Notes on Data Visualization with Matplotlib

Purpose of Matplotlib Library

  • Matplotlib: A Python library for creating 2D graphics, primarily used to generate graphs and charts.
  • Modules: Matplotlib is made up of several modules; the most commonly used is pyplot, conventionally imported as plt.

Common Chart Functions

Chart Types and Corresponding Functions

  • Line Chart: plt.plot()
  • Bar Chart: plt.bar()
  • Horizontal Bar Chart: plt.barh()

Example: Bar Chart of Students

  • Sample Data Structures:
    • Create lists for classes and corresponding number of students.
    • Example:
      import matplotlib.pyplot as plt
      classes = [6, 7, 8, 9, 10]  # X-axis
      strengths = [30, 25, 20, 35, 40]  # Y-axis
      plt.bar(classes, strengths)
      plt.show()
      

Adding Titles

  • To add a title to a chart: plt.title('Your Title Here')

Histogram and Data Counting

  • Histogram: Created with plt.hist(). Counts how many values fall within each interval (bin). Best suited to continuous data.
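The counting that a histogram performs can be sketched as follows. The height values and the bin edges are hypothetical, chosen only for illustration, and the Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Hypothetical student heights in cm (continuous data)
heights = [142, 145, 148, 150, 151, 153, 155, 158, 160, 164]

# Three intervals: 140-150, 150-160, 160-170
counts, edges, patches = plt.hist(heights, bins=[140, 150, 160, 170])
plt.xlabel('Height (cm)')
plt.ylabel('Number of students')

# counts holds how many values fell into each interval
print(list(counts))
```

plt.hist() returns the counts and the bin edges along with the drawn bars, so the same call both draws the chart and reports the tally.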

Customization of Charts

  • Common customization methods: plt.xlabel(), plt.ylabel(), plt.title(), plt.legend().
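  • Misconception: Pyplot has no plt.color() function. Colors are set with the color keyword argument of the plotting function, e.g. plt.plot(x, y, color='red').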
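A minimal sketch that puts the customization functions together. The data reuses the temperature values from these notes; the label texts and the color are illustrative choices, and the Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

weeks = [1, 2, 3, 4]
avg_temp = [30, 35, 33, 31]

# color is a keyword argument of plot(), not a separate function
plt.plot(weeks, avg_temp, color='red', label='Average temperature')
plt.xlabel('Week')
plt.ylabel('Temperature')
plt.title('Weekly Average Temperature')
plt.legend()  # uses the label given to plot()

ax = plt.gca()  # current Axes, kept only to inspect the result
```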

Example: Line Chart for Temperature

  • For depicting changing temperatures over weeks:
    import matplotlib.pyplot as plt
    weeks = [1, 2, 3, 4]
    avg_temp = [30, 35, 33, 31]
    plt.plot(weeks, avg_temp)
    plt.show()
    

Legend Placement

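  • In current Matplotlib versions the default legend location is 'best', which places the legend where it overlaps the data least (often the upper right corner). Specify a location with a string, e.g. plt.legend(loc='upper left'), or with the equivalent integer code, e.g. plt.legend(loc=2).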

General Overview of Python’s Data Visualization

  • Matplotlib is one of Python's most widely used libraries for data visualization.
  • Plotting libraries turn raw lists of numbers into charts that make patterns and comparisons easier to see.

Example: Code Representation of a Graph

  • When constructing a graph, make sure each x value lines up with the y value at the same position in its list.

Common Issues in Code Implementation

  • Error reason: The x and y lists must have the same length. If they differ, the plotting call raises a ValueError (for plt.plot(), "x and y must have same first dimension").
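The mismatch error can be reproduced with a short sketch. One temperature value is left out on purpose; the variable name message is only for illustration, and the Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

weeks = [1, 2, 3, 4]
avg_temp = [30, 35, 33]  # one value missing on purpose

message = None
try:
    plt.plot(weeks, avg_temp)
except ValueError as err:
    # Matplotlib refuses to pair 4 x values with 3 y values
    message = str(err)
    print('Plotting failed:', message)

# A quick check before plotting avoids the error
lengths_match = len(weeks) == len(avg_temp)
```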

Using Legends in Charts

  • Legends help identify different data series in a graph.
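A sketch of a legend distinguishing two data series. The first series is the temperature data from these notes; the second series and both city names are hypothetical. The Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

weeks = [1, 2, 3, 4]
city_a = [30, 35, 33, 31]  # temperature data from the notes
city_b = [28, 31, 34, 29]  # hypothetical second series

# Each label becomes one entry in the legend
plt.plot(weeks, city_a, label='City A')
plt.plot(weeks, city_b, label='City B')
legend = plt.legend(loc='upper left')

labels = [text.get_text() for text in legend.get_texts()]
print(labels)
```

Without the label arguments, plt.legend() has nothing to list, so labels and the legend call go together.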

Example of Data Comparison with Column Chart

  • E.g., comparing students' scores in a class can be effectively shown using a column chart.
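One way to sketch such a comparison is a grouped column chart, with two columns per student. The student names, the scores, and the column width are all hypothetical values chosen for illustration, and the Agg backend line is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

students = ['Asha', 'Ravi', 'Meera', 'John']  # hypothetical names
test1 = [78, 85, 62, 90]                      # hypothetical scores
test2 = [82, 80, 70, 88]

x = [0, 1, 2, 3]  # one position per student
width = 0.4       # width of each column

# Shift the two series left and right so the columns sit side by side
bars1 = plt.bar([i - width / 2 for i in x], test1, width=width, label='Test 1')
bars2 = plt.bar([i + width / 2 for i in x], test2, width=width, label='Test 2')
plt.xticks(x, students)  # show names instead of positions
plt.ylabel('Score')
plt.legend()

ax = plt.gca()  # current Axes, kept only to inspect the result
```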

Final Programming Task Example (Cricket Team Scores)

  • Create a horizontal bar chart using scores from matches.
    import matplotlib.pyplot as plt
    matches = ['Match 1', 'Match 2', 'Match 3', 'Match 4']
    scores = [270, 230, 150, 190]
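    plt.barh(matches, scores, label='Runs scored')  # label gives the legend an entry
    plt.xlabel('Runs')
    plt.legend(loc='lower right')  # same as the integer code loc=4; the string '4' is not valid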
    plt.show()
    

Conclusion and Tips

  • Best Practices: Always include plt.show() after plotting to display the chart.
  • The command to install Matplotlib: pip install matplotlib.
  • If no filename extension is given when saving a figure with plt.savefig(), the file is saved in the default format, .png, and that extension is appended to the name.
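The default save format can be checked with a short sketch. The file name chart and the temporary folder are illustrative; the result assumes the savefig.format setting has not been changed from its default, and the Agg backend line is an assumption so the snippet runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

classes = [6, 7, 8, 9, 10]
strengths = [30, 25, 20, 35, 40]
plt.bar(classes, strengths)

out_dir = tempfile.mkdtemp()             # empty temporary folder
base = os.path.join(out_dir, 'chart')    # note: no extension given
plt.savefig(base)

saved = os.listdir(out_dir)
print(saved)  # the saved file carries the default format's extension
```

To save in another format, give the extension explicitly, for example plt.savefig('chart.pdf').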

Keep practicing and exploring data visualization concepts to build fluency.