đź’ľ

Comprehensive Guide to Database Concepts

May 3, 2025

Relational Databases and Normalization

Definition and History of Databases

  • Database: An organized collection of data, stored electronically.
  • History:
    • 1940s: Development of early computers.
    • 1950s-1960s: Use of flat files and hierarchical databases.
    • 1970s: Emergence of relational databases, E.F Codd at IBM.
    • Relational databases allow data storage in tables, enhancing data retrieval and management.

Types of Databases

  • Files: No structure or relationships.
  • Hierarchical Databases: Tree-like structure, single parent per record.
  • Network Databases: Complex relationships, multiple parent-child relationships.
  • Relational Databases: Tables represent data and relationships, enabling SQL queries.

Normalization and Importance

  • Normalization: Organizing data to reduce redundancy and improve integrity.
  • Divides large tables into smaller, related tables.
  • Eliminates insert, update, and delete anomalies.
  • Normal Forms:
    • 1NF: Atomic values in columns.
    • 2NF: Columns dependent on candidate keys.
    • 3NF: Columns dependent on primary key only.

Keys in Relational Databases

  • Candidate Key: Uniquely identifies a record.
  • Primary Key: Chosen candidate key, cannot be NULL.
  • Foreign Key: Links two tables, referencing a primary key.
  • Alternate Key: Candidate key not chosen as primary.

Structured Query Language (SQL)

Overview of SQL

  • Standardized language for managing relational databases.
  • Developed in the 1970s by IBM.

SQL Syntax and Clauses

  • DDL (Data Definition Language): Manages database objects (e.g., CREATE, ALTER).
  • DML (Data Manipulation Language): Manages data (e.g., SELECT, INSERT).
  • SELECT Statement: Queries data from tables.

SQL Functions and Clauses

  • Aggregate Functions: COUNT, SUM, AVG.
  • GROUP BY: Arranges identical data into groups.
  • ORDER BY: Specifies result order.
  • WHERE: Filters records with conditions.

SQL Joins

  • Inner Join: Matches in both tables.
  • Left Join: All left table rows, matched right table rows.
  • Right Join: All right table rows, matched left table rows.
  • Full/Outer Join: All rows with matches in either table.

Advanced SQL Concepts

SQL Statements and Queries

  • Use of INSERT, UPDATE, DELETE for data manipulation.
  • Three-Way Joins: Joining three tables.
  • Subqueries: Queries within queries.
  • UNION Clause: Combines results from multiple queries.

Database Concurrency Management

  • Concurrency Issues: Lost updates, dirty reads, and deadlocks.
  • ACID Principles: Atomicity, Consistency, Isolation, Durability.
  • Transaction Isolation Levels: Managing concurrency (e.g., Serializable).

Database Administration

Database Tables and Relationships

  • Parent Tables: Contain primary keys.
  • Child Tables: Reference foreign keys.

Security

  • Managing user permissions.

Database Integrity Constraints

  • Constraints: NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY, CHECK, DEFAULT.

Database Locking Mechanisms

  • Locks: Shared and exclusive locks.
  • Deadlocks: Strategies for resolution.
  • Lock Granularity: Table, column, row.

NoSQL Databases

Characteristics

  • Schema: No fixed schema.
  • Scalability: Horizontal.
  • Transaction Properties: BASE instead of ACID.
  • Types: Document, key/value, column, graph.

Big Data and Data Warehousing

Data Warehouse

  • Central location for large data volumes, designed for analysis.

Big Data

  • 3 V’s: Volume, Variety, Velocity.

SQL Syntax Overview

  • Select Statement: Basic query structure.
  • Joins: Inner, left, right joins.
  • Subqueries: Nested queries for complex operations.

Views and Indexes

Views

  • Saved queries for simplifying complex queries and enhancing security.

Index

  • Enhances search speed; avoids full table scans.

Test Questions

  • Characteristics modeled: Attribute
  • SQL Execution: Cartesian product without ON clause.
  • Maximum cardinality: 1:1 for student-dorm relationship.
  • Grouping query returns sum by district.
  • Correct syntax for constraints and defaults in SQL.