Snowflake: Managing Tables and Data Retention

Jul 8, 2024

Lecture on Snowflake: Managing Tables and Data Retention

Displaying List of Tables

  • Show tables in a database: Use the SHOW TABLES command.
  • Find main table & clone tables: Understand schema and table relationships.
  • Counting tables: Use COUNT(*) or similar aggregate functions.

Identifying Main and Clone Tables

  • Image concept: Identify main tables and clone tables using IDs.
  • Filter tables by schema: Use SQL queries to filter by database, schema, etc.
  • Sum function: Useful for aggregating values and identifying table types.

Retrieve Database and Schema Names

  • Current Database: Use `SELECT CURRENT_DATABASE()
  • Current Schema: Use `SELECT CURRENT_SCHEMA()
  • Combine results: Use concatenation in SQL to combine results.

Data Manipulation

  • Inserting records: standard SQL INSERT INTO table_name VALUES...
  • Updating records: UPDATE table_name SET column=value WHERE condition
  • Incorrect update fix: Using rollback commands or similar techniques.

Snowflake Time Travel

  • Time travel: Allows access to historical data within a specified retention period.
  • Data retention period: Default is 1 day, but can be set between 0-90 days.
  • Access historical data: Use SELECT * FROM table_name BEFORE (statement or timestamp)
  • Recover data: Copy data into new tables or historical versions of tables.

Managing Tables

  • Creating tables: CREATE TABLE table_name (...columns definition...)
  • Temporary and Transient tables: Session-specific, will be dropped after session ends; default settings allow for 0-1 day retention time.

Undrop and Restore Tables

  • Undrop table command: Use to restore tables that were accidentally dropped.
  • Failsafe: Snowflake-managed feature that can retain data for up to 7 days.
  • Disabling time travel: Set retention time to 0 days using ALTER TABLE table_name SET DATA_RETENTION_TIME = 0.

Important SQL Commands

  • Retrieve database name: SELECT CURRENT_DATABASE()
  • Retrieve schema name: SELECT CURRENT_SCHEMA()
  • Counting records: SELECT COUNT(*) FROM table_name
  • Show tables: SHOW TABLES
  • Concatenate strings: Use concatenation operators or functions in SQL.

Key Concepts

  • Current Session: SELECT CURRENT_SESSION() to get session info.
  • Managing data and tables: Using Snowflake-specific commands for efficient data retrieval and manipulation.
  • Historical Data Access: Allows comparison and restoration of data over different timespans.

Special Table Types and Settings

  • Permanent Tables: Default retention of up to 90 days.
  • Transient Tables: Typically used for temporary data, 0-1 day retention by default.
  • Temporary Tables: Session-specific, discarded after the session ends.

Query Execution Flow & Error Handling

  • Executing queries: Importance of understanding query flow and impact.
  • Handling Mistakes: Use of rollback, undrop, and time-travel features to correct errors.

Practical Examples

  • Example queries: Simple insert, update, and table creation examples.
  • Steps for Time-travel: How to use before queries to fetch historical data.
  • Managing Failsafe: Interaction with Snowflake customer support for recovering data.

Summary

  • Effective use of Snowflake tools: Best practices in managing data retention, tables, and historical data.
  • Importance of Sessions and Connection Management: Understanding how Snowflake manages data sessions and implications for data retrieval and manipulation.