Delta Tables and Versioning

Jul 10, 2025

Overview

This lecture explains how Delta tables in Databricks maintain version history through transaction logs, enabling time travel, auditing, and restores.

Delta Table Structure & Transaction Log

  • Delta tables store data in Parquet files and track all changes in the _delta_log directory.
  • The _delta_log contains JSON transaction files and checkpoint files recording all table operations.
  • Each transaction file logs operation details such as user, operation type, timestamp, and parameters.
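
The log layout described above can be explored with plain Python. The sketch below is illustrative only: it builds a toy _delta_log in a temporary directory rather than reading a real table, and the user name, file names, and sizes are made-up values. It assumes the usual Delta layout in which each commit is a JSON file named with a 20-digit zero-padded version number and holds one action per line (commitInfo, add, remove, and so on).

```python
import json
import re
import tempfile
from pathlib import Path

COMMIT_NAME = re.compile(r"\d{20}\.json")  # e.g. 00000000000000000001.json


def write_commit(log_dir: Path, version: int, actions: list) -> Path:
    """Write one toy commit file: one JSON action per line."""
    path = log_dir / f"{version:020d}.json"
    path.write_text("\n".join(json.dumps(a) for a in actions) + "\n")
    return path


def read_commit_info(log_dir: Path) -> list:
    """Return the commitInfo action of every JSON commit, ordered by version."""
    commits = []
    for path in sorted(log_dir.iterdir()):
        if not COMMIT_NAME.fullmatch(path.name):
            continue  # skip checkpoints and any other non-commit files
        for line in path.read_text().splitlines():
            action = json.loads(line)
            if "commitInfo" in action:
                commits.append({"version": int(path.stem), **action["commitInfo"]})
    return commits


# Build a toy log with two commits: an initial write, then an overwrite.
log_dir = Path(tempfile.mkdtemp()) / "demo_table" / "_delta_log"
log_dir.mkdir(parents=True)
write_commit(log_dir, 0, [
    {"commitInfo": {"timestamp": 1752105600000, "operation": "WRITE",
                    "operationParameters": {"mode": "ErrorIfExists"},
                    "userName": "demo_user"}},
    {"add": {"path": "part-00000.parquet", "size": 1024, "dataChange": True}},
])
write_commit(log_dir, 1, [
    {"commitInfo": {"timestamp": 1752109200000, "operation": "WRITE",
                    "operationParameters": {"mode": "Overwrite"},
                    "userName": "demo_user"}},
    {"remove": {"path": "part-00000.parquet", "dataChange": True}},
    {"add": {"path": "part-00001.parquet", "size": 2048, "dataChange": True}},
])

history = read_commit_info(log_dir)
for entry in history:
    print(entry["version"], entry["operation"], entry["operationParameters"])
```

On a real table the same reader could be pointed at the table's _delta_log directory, although DESCRIBE HISTORY is the supported way to get this information.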

Time Travel & Versioning

  • Delta tables record every committed change as a new version, numbered sequentially starting from 0.
  • Use DESCRIBE HISTORY in SQL, or DeltaTable.history() in Python/Scala, to view the table’s modification history, including version numbers, timestamps, and operation metadata.
  • Overwriting a Delta table creates a new version and logs the change in the transaction log.
  • You can read a past version of the table by setting the versionAsOf or timestampAsOf option on the DataFrame reader.
  • In SQL, the equivalent clauses are VERSION AS OF and TIMESTAMP AS OF; the same time travel is available from the Python and Scala APIs.
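
The history and time-travel calls above can be sketched as a few small helpers. This is a minimal sketch, not the lecture's own code: the helper names are hypothetical, and `spark` is assumed to be an existing SparkSession with Delta support (as in a Databricks notebook), passed in by the caller. The table name and path used in the usage comments are placeholders.

```python
def show_history(spark, table_name):
    """Return the table's commit history as a DataFrame, one row per version."""
    return spark.sql(f"DESCRIBE HISTORY {table_name}")


def read_version(spark, table_path, version):
    """Load the table at `table_path` as it was at a given version number."""
    return (spark.read.format("delta")
            .option("versionAsOf", version)
            .load(table_path))


def read_as_of(spark, table_path, timestamp):
    """Load the table as it was at a timestamp string, e.g. '2025-07-10'."""
    return (spark.read.format("delta")
            .option("timestampAsOf", timestamp)
            .load(table_path))


def time_travel_sql(table_name, version=None, timestamp=None):
    """Build the SQL form of a time-travel query (exactly one selector)."""
    if (version is None) == (timestamp is None):
        raise ValueError("pass exactly one of version or timestamp")
    if version is not None:
        return f"SELECT * FROM {table_name} VERSION AS OF {int(version)}"
    return f"SELECT * FROM {table_name} TIMESTAMP AS OF '{timestamp}'"


# Usage in a notebook (placeholder table name and path):
#   show_history(spark, "sales").show()
#   read_version(spark, "/path/to/sales", 0).show()
#   spark.sql(time_travel_sql("sales", version=0)).show()
```

Passing `spark` in explicitly keeps the helpers importable outside a notebook and easy to exercise with a stand-in object.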

Restoring Table to Previous Versions

  • The RESTORE TABLE command reverts the Delta table to an earlier version or timestamp, provided the data files for that version have not been removed (for example, by VACUUM).
  • A restore is itself recorded as a new version. The command returns metrics such as the number of files restored and removed and the table size after the restore.
  • By default, a read returns the latest version; an earlier version is returned only when one is requested explicitly.
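
A restore can be sketched the same way. The helper names below are hypothetical, `spark` is again assumed to be an existing Delta-enabled SparkSession supplied by the caller, and the table name in the usage comment is a placeholder.

```python
def restore_sql(table_name, version=None, timestamp=None):
    """Build a RESTORE TABLE statement targeting a version or a timestamp."""
    if (version is None) == (timestamp is None):
        raise ValueError("pass exactly one of version or timestamp")
    if version is not None:
        return f"RESTORE TABLE {table_name} TO VERSION AS OF {int(version)}"
    return f"RESTORE TABLE {table_name} TO TIMESTAMP AS OF '{timestamp}'"


def restore_table(spark, table_name, version=None, timestamp=None):
    """Run the restore and return the result, which carries the restore metrics
    (for example num_restored_files and table_size_after_restore)."""
    return spark.sql(restore_sql(table_name, version=version, timestamp=timestamp))


# Usage in a notebook (placeholder table name):
#   restore_table(spark, "sales", version=0).show()
#   spark.sql("DESCRIBE HISTORY sales").show()  # the restore appears as a new version
```

Checking the history after a restore is a quick way to confirm that earlier versions remain available: the restore adds a commit rather than rewriting the log.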

Key Terms & Definitions

  • Delta Table: A storage format in Databricks that supports ACID transactions, versioning, and schema enforcement.
  • Transaction Log: JSON files in _delta_log that record every change or operation on the Delta table.
  • Version: Numeric identifier incremented with each Delta table operation, allowing access to previous states.
  • Time Travel: The ability to query, audit, or restore data as it existed at a specific point in time or version.

Action Items / Next Steps

  • Practice using DESCRIBE HISTORY (or .history()), the versionAsOf option, and the RESTORE TABLE command in Databricks.
  • Review previous lecture or video for basics on Delta tables and table creation.
  • Try overwriting data and restoring older versions to understand time travel and versioning hands-on.