💻

MySQL Database Tutorial for Data Analysis

Jul 6, 2024

MySQL Database Tutorial for Data Analysis

Introduction

  • Focus on using MySQL for data analysis
  • Relevant for aspiring data analysts
  • SQL knowledge essential for data analysts
  • No prerequisite SQL knowledge needed for this tutorial
  • Covering basics: tables, relationships, simple queries (select, where, inner join)
  • Demonstrating data analysis using SQL without tools like Tableau or Power BI
  • Subsequent videos will cover using Tableau

Initial Setup

  • Video link with installation instructions for MySQL
  • Installation includes MySQL server and MySQL Workbench
  • Launching MySQL Workbench and creating a new connection
  • Example connection: 'dev sales insights'
  • Testing the connection and checking server status
  • Downloading database dump file from video description
  • Importing database dump into MySQL
  • Exploring tables in the newly created database (sales)

Exploring Tables

  • Customers Table
    • Columns: customer code, name, customer type
    • Contains clients of AtliQ Hardware
  • Transactions Table
    • Columns: product code, customer code, market code, order date, sales quantity, sales amount, currency
  • Products Table
    • Simple product names like product1, product2
  • Markets Table
    • Columns: market code, name of city, zone
    • Mostly business in Indian cities
    • Non-relevant records from New York and Paris

Data Cleaning Challenges

  • Presence of garbage data due to long-term database usage
  • Identifying negative sales amounts and USD transactions
  • Importance of data cleaning in data analysis

Basic SQL Queries for Data Analysis

  • Counting records:
    • SELECT COUNT(*) FROM sales.transactions;
    • SELECT COUNT(*) FROM sales.customers;
  • Filtering transactions from a specific market (e.g., Chennai):
    • SELECT * FROM sales.transactions WHERE market_code = 'MARK001';
    • Adding LIMIT 5 to view first five records
  • Exporting data to CSV from MySQL Workbench
  • Counting transactions by applying filters
  • Identifying transactions with USD currency:
    • SELECT * FROM sales.transactions WHERE currency = 'USD';

Data Analysis with Joins

  • Understanding date table for filtering by year
  • Performing an inner join to filter transactions by year:
    • SELECT * FROM sales.transactions INNER JOIN sales.date ON sales.transactions.order_date = sales.date.date WHERE sales.date.year = 2020;
  • Aggregating data (e.g., total sales revenue for specific year):
    • SELECT SUM(sales.transactions.sales_amount) FROM sales.transactions INNER JOIN sales.date ON sales.transactions.order_date = sales.date.date WHERE sales.date.year = 2020;
  • Filtering aggregated data by market (e.g., Chennai):
    • SELECT SUM(sales.transactions.sales_amount) FROM sales.transactions INNER JOIN sales.date ON sales.transactions.order_date = sales.date.date WHERE sales.date.year = 2020 AND sales.transactions.market_code = 'MARK001';
  • Finding distinct products sold in a specific market:
    • SELECT DISTINCT product_code FROM sales.transactions WHERE market_code = 'MARK001';*

Next Steps

  • Future videos will use Power BI for more advanced data analysis and cleaning
  • Power BI will be connected to SQL database
  • Using Power BI to handle issues like USD currency and negative sales amounts
  • Expect detailed instructions in the readme file on GitHub for database setup and SQL queries

Conclusion

  • SQL as a powerful tool for data analysis
  • Encouraged to practice and generate insights using SQL
  • Next video in series will focus on data cleaning and advanced analysis using Power BI

Helpful Resources:

  • Course on Khan Academy for SQL fundamentals
  • Online SQL tutorials and examples for deeper understanding