SQL for Data Analytics: Full Course Tutorial

Jul 17, 2024

SQL for Data Analytics: Full Course Tutorial

Introduction

  • Course aims to teach SQL for data analytics efficiently.
  • Key tool in data science, with high demand in various roles: data analyst, data engineer, data scientist.
  • SQL = Structured Query Language, used to query databases.

Course Structure

  • Three chapters: Basics, Advanced Techniques, Capstone project.
  • Basics: SQL database concepts, common keywords, simple analysis, joins.
  • Advanced: Creating databases, complex analysis (CTE, subqueries).
  • Capstone: Real-world project analyzing data science job postings.

Chapter 1: Basics

SQL Introduction

  • SQL (Structured Query Language): Pronounced 'SQL' or 'sequel'. Used to query databases.
  • CRUD operations: Create, Read, Update, Delete.

Databases Overview

  • Relational vs Non-Relational: (SQL vs NoSQL).
    • Relational: Structured data in tables (rows/columns). E.g., SQL databases.
    • Non-Relational (NoSQL): Unstructured data. E.g., Document-based, Key-value pairs.
  • Popular SQL Databases: Postgres, MySQL, SQLite.

SQL Tools

  • Running SQL Queries: via database provided apps (e.g., Postgres, MySQL), cloud providers (e.g., Google Cloud, Azure), or code editors (e.g., VS Code).
  • Editors: sqlite-viz (beginner, browser-based), VS Code (advanced, local setup).
  • Installation Steps: Download Postgres, setup VS Code, SQL tools extension installation.
  • Data sets: Example: Data science job postings 2023 data.

Chapter 2: Advanced Techniques

Advanced SQL

  • Creating/Modifying/Deleting tables.
  • Data types: INT, VARCHAR, DATE, etc.
  • Arithmetical Operations: Addition, subtraction, multiplication, modulus.

Useful Functions in SQL

  • Aggregation Functions: SUM, COUNT, AVG, MIN, MAX.
  • Group By & Having Clauses.
  • Working with NULL values.
  • Joins: LEFT, RIGHT, INNER, FULL OUTER JOIN.
  • Order of Execution: SQL commands must follow a specific order for logical processing.

Advanced Practice Problems

  • Combining SQL Operations: Example of combining arithmetical operations with aggregation functions.
  • Working with Dates: Date functions, converting time zones, extracting dates.
  • Case Expressions: Similar to 'if' statements in programming.
  • Subqueries & CTE (Common Table Expressions): Creating temporary result sets/tables.

Loading and Connecting to Datasets

  • Setting up Data: Downloading data files, creating tables, loading data into tables.

Capstone Project

Setting Up the Project

  • Git & GitHub: Version control, pushing & pulling changes.
  • Creating Repository: Initialize repository, push project to GitHub.

Analysis Steps

  • Top Paying Jobs Analysis: Query to find highest paying jobs for a specific role/location.
  • Skills Associated with Top Paying Jobs: Exploring the skills required for top roles.
  • Most In-Demand Skills: Based on job postings data.
  • Top Paying Skills: Analyzing salary data associated with skills.
  • Most Optimal Skill: Combining demand and salary data.

Final Deliverables

  • Readme File: Consolidating analysis and results for display on GitHub.
  • Sharing on LinkedIn: Showcasing skills and project results.

Conclusion

  • Reflecting on Learnings: SQL fundamentals, advanced techniques, analysis methods.
  • Possible next steps: Enhancing skills with Python and other data analytics tools.