
T20 Cricket Data Analytics Project Overview

Sep 20, 2024

Lecture Notes: T20 Cricket Data Analytics Series

Overview of the Project

  • Review of T20 Cricket World Cup (England vs Pakistan)
  • Project focus on cricket data analytics using T20 World Cup data
  • Steps involved: scraping data, data cleaning and transformation, building dashboards in Power BI

Project Steps

1. Data Scraping

  • Source: ESPNcricinfo website
  • Tools: Web scraping techniques to extract relevant data
    • Use of Bright Data for scraping with proxy networks
    • Data types to capture:
      • Match results table
      • Detailed scorecards for batting and bowling
      • Player-specific information
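
As a rough illustration of what extracting a match results table involves, the sketch below parses a small, made-up HTML table using only Python's standard library. The HTML snippet, the team names, and the `TableParser` class are all hypothetical. The lecture itself used Bright Data collectors written in JavaScript, and the real ESPNcricinfo markup will differ.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a match results page.
SAMPLE_HTML = """
<table>
  <tr><th>Team 1</th><th>Team 2</th><th>Winner</th></tr>
  <tr><td>Team A</td><td>Team B</td><td>Team A</td></tr>
  <tr><td>Team C</td><td>Team D</td><td>Team D</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collects the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # completed rows
        self._row = None        # row currently being built
        self._in_cell = False   # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self._row[-1] += data.strip()


def parse_results(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


results = parse_results(SAMPLE_HTML)
```

Each entry in `results` is one match as a dictionary, which is close to the JSON records the cleaning step starts from.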

2. Data Cleaning & Transformation

  • Python and Pandas: Tools used for data cleaning and transformation
  • Objective: Transform JSON data into a flat CSV format for easier analysis in Power BI
  • Steps include:
    • Renaming columns
    • Creating new columns (e.g., match ID, out/not out indicator)
    • Handling missing values and data types
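
The cleaning steps above can be sketched in Pandas roughly as follows. The record layout, the field names (`batsmanName`, `dismissal`, and so on), and the `match_ids` lookup are assumptions made for illustration; the actual scraped JSON may be shaped differently.

```python
import pandas as pd

# Hypothetical scraped records; the real JSON layout may differ.
raw = [
    {
        "match": "Team A Vs Team B",
        "batsmanName": "Player 1",
        "runs": "45",
        "balls": "30",
        "dismissal": "c Player 9 b Player 8",
    },
    {
        "match": "Team A Vs Team B",
        "batsmanName": "Player 2",
        "runs": "12",
        "balls": "10",
        "dismissal": "",
    },
]

# Hypothetical lookup from match name to match ID.
match_ids = {"Team A Vs Team B": "M001"}

df = pd.DataFrame(raw)

# Rename columns to consistent names.
df = df.rename(columns={"batsmanName": "batsman"})

# New column: match ID looked up from the match name.
df["match_id"] = df["match"].map(match_ids)

# New column: out/not out indicator (empty dismissal text means not out).
df["out"] = df["dismissal"].fillna("").str.strip().ne("")

# Fix data types: scraped numbers arrive as strings.
# Values that cannot be parsed become NaN instead of raising an error.
df[["runs", "balls"]] = df[["runs", "balls"]].apply(pd.to_numeric, errors="coerce")

# Produce flat CSV text; pass a file path to to_csv to write a file for Power BI.
csv_text = df.to_csv(index=False)
```

Treating an empty dismissal string as "not out" is one possible convention; the right rule depends on how the scorecard text was captured.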

3. Building Dashboards in Power BI

  • Dashboard Features:
    • Categorized player selection (openers, anchors, fast bowlers)
    • Display player statistics: runs, strike rate, batting average
    • Filtering capabilities for selection criteria
    • Visualizations like scatter plots for performance comparisons

Problem Statement

  • Assembling a cricket team to defeat Planet Sporta
  • Required team performance metrics:
    • Average runs scored: 180
    • Runs to defend: 150

Parameters for Player Selection

Openers

  • Criteria:
    • Batting average
    • Strike rate
    • Boundary percentage
  • Target: 50 runs in the first 5 overs
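
For reference, the three opener criteria can be computed as below. This is a minimal sketch using the usual cricket definitions; the function names and the sample totals are made up, and boundary percentage is taken here as the share of runs scored in fours and sixes, which is an assumption about how the lecture defines it.

```python
def batting_average(runs, dismissals):
    """Runs per dismissal; None if the batter was never out."""
    return runs / dismissals if dismissals else None


def strike_rate(runs, balls):
    """Runs scored per 100 balls faced."""
    return 100 * runs / balls if balls else None


def boundary_percentage(fours, sixes, runs):
    """Share of runs that came from fours and sixes, as a percentage."""
    return 100 * (4 * fours + 6 * sixes) / runs if runs else None


# Made-up tournament totals for one opener.
runs, balls, dismissals, fours, sixes = 200, 125, 5, 20, 10

avg = batting_average(runs, dismissals)        # 40.0
sr = strike_rate(runs, balls)                  # 160.0
bp = boundary_percentage(fours, sixes, runs)   # 70.0
```

The target of 50 runs in 5 overs works out to 10 runs per over, or a combined strike rate of about 167 across the opening pair if extras are ignored.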

Middle Order (Anchors)

  • Criteria:
    • Ability to shift gears and bat for a longer duration
    • Higher batting average
    • Average balls faced

Finishers

  • Criteria:
    • Ability to chase down targets and stabilize the innings
  • Preferably all-rounders with batting focus

All-rounders

  • Criteria:
    • Should be capable of hitting hard and also bowling effectively

Fast Bowlers

  • Criteria:
    • Bowling economy below 7
    • Bowling strike rate: a wicket every 16 balls
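
A small sketch of how these two thresholds could be applied to a list of bowlers. The figures and names are invented, and reading the second criterion as "16 balls per wicket or fewer" is an interpretation, not something the lecture states explicitly.

```python
def economy(runs_conceded, balls_bowled):
    """Runs conceded per over (6 legal balls)."""
    return runs_conceded / (balls_bowled / 6)


def bowling_strike_rate(balls_bowled, wickets):
    """Balls bowled per wicket; None if no wickets were taken."""
    return balls_bowled / wickets if wickets else None


# Made-up figures: (name, runs conceded, balls bowled, wickets).
bowlers = [
    ("Bowler A", 130, 120, 9),
    ("Bowler B", 160, 120, 10),
    ("Bowler C", 120, 120, 5),
]

shortlist = []
for name, conceded, balls, wickets in bowlers:
    eco = economy(conceded, balls)
    sr = bowling_strike_rate(balls, wickets)
    # Assumed thresholds: economy below 7 and at most 16 balls per wicket.
    if eco < 7 and sr is not None and sr <= 16:
        shortlist.append(name)
```

With these numbers only "Bowler A" passes: "Bowler B" concedes 8 runs per over, and "Bowler C" needs 24 balls per wicket.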

Data Collection Process

  • Bright Data: Utilized for efficient data collection
  • Created multiple collectors for different data types
  • Example of data extraction process shown using JavaScript code for web scraping

Data Modeling in Power BI

  • Establishing relationships between tables based on match IDs and player names
  • Creating DAX measures for calculations (e.g., total runs, innings batted)
  • Calculated columns for derived metrics (e.g., boundary runs)
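
The same modeling ideas can be mirrored in Pandas to sanity-check the numbers before building the Power BI model. This is an analogue, not DAX: the merge stands in for a table relationship, the new column for a calculated column, and the grouped aggregation for measures. Table and column names are assumptions.

```python
import pandas as pd

# Made-up tables; column names are assumptions.
batting = pd.DataFrame({
    "match_id": ["M001", "M001", "M002", "M002"],
    "batsman": ["Player 1", "Player 2", "Player 1", "Player 2"],
    "runs": [45, 12, 30, 8],
    "fours": [4, 1, 3, 0],
    "sixes": [2, 0, 1, 0],
})
players = pd.DataFrame({
    "name": ["Player 1", "Player 2"],
    "role": ["Opener", "Anchor"],
})

# Calculated column: runs scored from boundaries.
batting["boundary_runs"] = 4 * batting["fours"] + 6 * batting["sixes"]

# Relationship: link batting rows to player details by player name.
joined = batting.merge(players, left_on="batsman", right_on="name", how="left")

# Measures: total runs, innings batted, and boundary runs per player.
summary = joined.groupby(["batsman", "role"], as_index=False).agg(
    total_runs=("runs", "sum"),
    innings_batted=("match_id", "count"),
    boundary_runs=("boundary_runs", "sum"),
)
```

Counting rows per player as innings batted assumes one batting row per player per match, which holds for this sample but should be checked against the real scorecards.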

Dashboard Creation

  • Mock-up provided for initial design
  • Creating visuals based on player categories and performance metrics
  • Emphasized the importance of visualization aesthetics and layout

Final Team Selection

  • Analysis of player performance using the created dashboard
    • Selected players based on statistical performance and roles needed
    • Emphasized the importance of pairing players effectively for optimal results

Challenge and Further Learning

  • Participants given an exercise to improve the dashboard and add new insights
  • Encouraged to share progress on platforms like LinkedIn for networking and visibility
  • Resources and code provided for further exploration and practice

Conclusion

  • The project aims to provide hands-on experience in cricket data analytics through practical application of data scraping, cleaning, and visualization techniques.
  • Encouraged students to leverage the knowledge gained for real-world applications and potential career opportunities.