💾

Loading Data into Snowflake Table from Amazon S3 Bucket

Jul 4, 2024

How to Load Data into Snowflake Table from Amazon S3 Bucket

Introduction

  • Learn to load data into Snowflake from AWS S3
  • Requirements: Snowflake account and AWS console access

Steps to Follow

1. Sign In to AWS Console

  • Open AWS console via Google search
  • Select root user and provide email and password

2. Create S3 Bucket

  • Navigate to S3 service
  • Create a new S3 bucket (e.g., Bucky20102010)
  • Upload a CSV file (e.g., video games.csv) to the bucket

3. Create IAM Policy

  • Go to IAM service → Policies
  • Create a new policy using a JSON script
  • Include the bucket name in the policy
  • Name the policy (e.g., policy2010) and create it

4. Create IAM Role

  • Go to Roles in IAM service
  • Create a new role and select AWS account
  • Attach the created policy (policy2010)
  • Name the role (e.g., 2010) and create it
  • Copy the ARN of the role

5. Setup Snowflake

  • Ensure role is set to account admin
  • Create a new worksheet in Snowflake
  • Select Snowflake database (default databases available)

6. Create Warehouse and Database in Snowflake

  • Create warehouse (dwh_wh)
  • Create database (DBT_DB) and schema (prod)

7. Create Storage Integration in Snowflake

  • Create integration object
  • Provide the ARN and bucket name (2010)
  • Describe command to get external ID and user ARN
  • Use external ID in trust relationship section of AWS role

8. Create CSV File Format in Snowflake

  • Define CSV file format (CSV_format)
  • Specify parameters like file type, delimiter, and header row

9. Create Stage Area in Snowflake

  • Define the stage area with bucket name and file format

10. Load Data into Snowflake Table

  • Create target table if not exists
  • Use COPY INTO command to load data from S3 to Snowflake
  • Example: Loaded 998 rows successfully

11. Validate Data Load

  • Query top 10 rows from the target table
  • Data loaded successfully (e.g., 998 rows)

Conclusion

  • Successfully learned data loading from S3 to Snowflake
  • Validation of loaded data in Snowflake