My Snowflake Set up with Terraform

28. January 2023 23:21 by Jay Grossman in   //  Tags: , , , , , , ,   //   Comments (0)

I have been working with Snowflake since 2016 when I proposed and chose to bring it into Rent the Runway to replace our very painful on premise Vertica implementation (yes we had it in a data center in NJ). Since then, Snowflake has grown considerably and is now one of the leading Data Warehouse offerings.

After implementing data warehouses at several companies and having lots of conversations with some very smart folks, I've learned some things along the way. I've found it is a really good idea to think about up front how I want to segment responsibilities and permissions for databases. I want to decide what types of data to store in different places and to build roles + permission grants to enforce those decisions. 

My Best Practices Assumptions for segmenting data:

Best practice for running a cloud data warehouse is to build separate repositories for:

  1. raw data from source systems and providers
  2. schemas where data is regularly transformed/generated and documented to be consumed for reporting and analysis
  3. area to data exploration, modeling, and experimentation

To ensure that these logical areas are used for their defined purposes, we can create specific roles and permissions for each. 

Logical set up of Databases with Role Permissions

Diagram 1: Logical diagram of users, roles, privilege, and databases.


database functional description privileges
  • Data is loaded from source systems and providers in its original (non transformed) format
  • No users can directly query this data
  • load_role - full access 
  • transform_role - read
  • Data is transformed or generated (on a regular cadence) and documented to support reporting and analysis needs.
  • Users can directly query this data, but can not write/update
  • Metabase (and other reporting tools) would read from this database
  • transform_role - full access
  • report_role - read
  • Serves as an area to data exploration, modeling, and experimentation
  • Users can access, create/load, and change data in this area
  • analyst_role - full access 


Other key assumptions with this set up:

  1. Each Snowflake Role has its own dedicated compute (Snowflake Warehouse).
    For instance the LOAD_ROLE can only run on a warehouse named LOAD_WH.

  2. The Raw database should contain a schema for each source system. 
    For example, if we are syncing data from our ecommerce site with a Mongo backend, we would name the schema as MONGO_ECOMMERCE. 

  3. In the Analytics database, we define 5 schemas:
    - src:  holding dbt models of data that has been lightly transformed from the original source data. Here we may flatten nested variant data into a relational model.
    - trans: schema for intermediate dbt models
    - rpt: holding dbt dimensional models that drive our BI reporting
    - export: holding tables/views that are for data to be exported to systems outside of Snowflake
    db_stats: metadata about our dbt models like total runtime, last runtime, number of rows added incrementally, etc
  4. The specific tools in the diagram above are meant to be examples for certain roles:
    - Fivetran to sync raw data from source systems into RAW (could be meltano, airbyte)
    - dbt to create models in ANALYTICS (could be ML scripts via dagster/airflow)
    - Metabase to read from ANALYTICS for BI (could be Looker, Tableau, Superset)

  5. You need to consider the processes for Development and Testing of Pipelines + Models. This diagram does not describe how I would approach these topics.

Snowflake configuration management with Terraform

Terraform (created by Hashicorp) is an open-source infrastructure as code software tool that enables you to safely and predictably create, change, and improve infrastructure.

Chan Zuckerberg terraform provider for Snowflake

This is a terraform provider plugin for managing Snowflake accounts:
GitHub -

Documentation available here:
Terraform Registry

We can use this framework to manage objects in snowflake such as:

  • databases
  • roles
  • schemas
  • user accounts
  • permission grants

Github repo with my terraform configuration for this set up  

PLEASE NOTE: This scope for this repo does not include the terrafom configuration to set up snowflake stages, functions, and file formatters. These objects often have more dependencies and require more advanced configuration, so I may plan to dedicate future blog posts to explaining the details. 

Folks who this would not have been possible without:

Collin Meyers Rob Sokolowski Tim Ricablanca

About the author

Jay Grossman

techie / entrepreneur that enjoys:
 1) my kids + awesome wife
 2) building software projects/products
 3) digging for gold in data sets
 4) my various day jobs
 5) rooting for my Boston sports teams:
    New England PatriotsBoston Red SoxBoston CelticsBoston Bruins

Month List