Costs Control Series Part 1: How To Save Money With Snowflake

Explore Snowflake Cost Drivers, how to Control, Alert, and Monitor

Eylon Steiner
Infostrux Engineering Blog

In this post, we explore how Snowflake costs are calculated, how to monitor them, and how to set up alerts or suspend resources when usage exceeds specified limits.
In the second part of this post, we will discuss best practices for minimizing these costs.

If you know Snowflake's cost model details, skip to the Part 2 post.

Introduction

Snowflake operates on a usage-based pricing model, where costs are determined by data storage, data transfer, and compute utilization. When moving from an on-premises data warehouse, Snowflake’s pricing model can look expensive at first. Understanding the cost structure and the efficiency gains of Snowflake’s architecture helps put that impression in perspective and can lead to cost savings and improved operational efficiency.

However, administrators must have a solid grasp of the cost model, provision appropriate resources, implement necessary guardrails, and ensure that users are familiar with best practices for optimizing costs effectively. Failing to do so may result in unoptimized expenses and potentially higher overall costs.

In my experience, starting with a Snowflake sandbox or development account is the best approach to alleviate initial concerns and gain hands-on experience with the platform’s cost optimization strategies.

Snowflake’s cost model

To gain a clear understanding of Snowflake’s costs, it’s important to consider several factors. These include storage usage, compute resources, and data transfer, all of which contribute to the overall cost of using Snowflake.

Snowflake’s Credits

Snowflake credits are the payment unit for resource consumption within the Snowflake platform. Credits are consumed when customers use resources such as running virtual warehouses, performing tasks in the cloud services layer, or using serverless features. As of June 2023, one credit on AWS Canada Central costs $2.25 USD in the Standard Edition, $3.50 USD in the Enterprise Edition, and $4.50 USD in the Business Critical Edition. Choose an edition based on the features you need and the cost implications: a Business Critical credit costs twice as much as a Standard credit. You can find detailed specifications for each edition in the following link: Edition Specifications
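To make the edition comparison concrete, here is a minimal Python sketch that converts credits into dollars. It only uses the June 2023 AWS Canada Central prices quoted above; the function and dictionary names are illustrative, and prices for your own region and contract will differ.

```python
# Prices per credit quoted in this post (AWS Canada Central, June 2023).
# Treat them as example values, not as current pricing.
PRICE_PER_CREDIT_USD = {
    "standard": 2.25,
    "enterprise": 3.50,
    "business_critical": 4.50,
}


def credits_to_usd(credits: float, edition: str) -> float:
    """Convert a number of consumed credits to US dollars for one edition."""
    return credits * PRICE_PER_CREDIT_USD[edition]


if __name__ == "__main__":
    # Compare what the same 100 credits cost in each edition.
    for edition in PRICE_PER_CREDIT_USD:
        print(f"{edition}: ${credits_to_usd(100, edition):,.2f} for 100 credits")
```

The same workload consumes the same number of credits in every edition, so the edition choice scales the whole compute bill by the price ratio.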

To find the cost per credit, go to: https://www.snowflake.com/pricing/pricing-guide/
Choose either ‘On-Demand’ or ‘Pre-Purchased Capacity’, then choose your cloud provider (AWS/Azure/GCP) and the region. I found that I sometimes need to fill in my details to get to the pricing page.

Screenshot of a page showing the costs that were detailed above

Here’s a breakdown of Snowflake’s cost:

1. Storage: Snowflake charges for the data stored on its platform, measured in terabytes (TB) per month. This includes the actual data, metadata, indexes, temporary query storage, and historical data (Time Travel and Fail-safe). The monthly storage fee is based on the average compressed storage used during the month for ingested data. Compression can significantly reduce storage requirements, and table cloning and Time Travel store only the differences, so you are not charged twice for all the data. As of June 2023, one terabyte of storage in AWS Canada Central costs $46 USD per month. Keep in mind that the effective price per terabyte of raw data depends on how well the data compresses: with a compression ratio of around 10:1, which is possible for CSV data, $46 could cover 10 TB of uncompressed data per month.
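The storage arithmetic above can be sketched in a few lines of Python. This is an estimate under stated assumptions: the $46 per TB figure is the one quoted in this post, and the compression ratio is something you have to measure for your own data. The function name is illustrative.

```python
# Storage price quoted in this post (AWS Canada Central, June 2023).
STORAGE_USD_PER_TB_MONTH = 46.0


def monthly_storage_cost(raw_tb: float, compression_ratio: float) -> float:
    """Estimate the monthly storage cost from the uncompressed data size.

    compression_ratio is raw size divided by compressed size,
    so 10.0 means the data compresses 10:1.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    compressed_tb = raw_tb / compression_ratio
    return compressed_tb * STORAGE_USD_PER_TB_MONTH
```

For example, `monthly_storage_cost(10, 10)` gives the $46 case from the paragraph above, while the same 10 TB with no compression at all would cost ten times as much.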

2. Compute: Snowflake adopts a usage-based pricing model for compute resources, where you are billed according to the time and resources consumed during query processing. This includes virtual warehouses, which are dedicated compute clusters for executing queries. The different sizes of virtual warehouses incur costs at specific rates, billed by the second with a one-minute minimum. When a warehouse is suspended, no credits are utilized. Whenever a warehouse is started or resumed, it incurs a 1-minute charge based on the hourly rate. If a warehouse is resized to a larger size, it incurs a 1-minute charge, but the billed credits are only for the additional compute resources provisioned.

A screenshot of a page showing the credits per warehouse type
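The per-second billing rule with a one-minute minimum can be sketched as follows. The credits-per-hour values are an assumption based on Snowflake’s published table for standard warehouses (the one shown in the screenshot), and the sketch models a single run from start or resume to suspend; it does not model resizing or multi-cluster warehouses.

```python
# Credits per hour by warehouse size. Assumed from Snowflake's published
# table for standard warehouses; check the current documentation.
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16}

# Each start or resume is billed for at least this many seconds.
MINIMUM_BILLED_SECONDS = 60


def billed_credits(size: str, running_seconds: float) -> float:
    """Credits billed for one start-to-suspend run of a warehouse."""
    billable_seconds = max(running_seconds, MINIMUM_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billable_seconds / 3600
```

A 30-second run is billed the same as a 60-second run, which is why frequent very short resume and suspend cycles can cost more than expected.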

3. Data Transfer: Snowflake does not impose fees for data ingress, allowing you to bring data into your Snowflake account without additional charges. However, data egress from Snowflake is subject to fees.

When a Snowflake client or driver retrieves query results across regions within Snowflake, no data egress charges apply.

Data transfer charges occur when you move data from a Snowflake account to a different region within the same cloud platform or to an entirely different cloud platform. Snowflake applies a per-byte fee for such transfers. Pricing details specific to each cloud provider can be found in the Snowflake Pricing Guide: https://www.snowflake.com/pricing/pricing-guide/

A screenshot of a page showing data transfer pricing

4. Serverless Features: Snowflake offers serverless features that use Snowflake-managed compute resources. These resources are paid for with Snowflake credits and are billed per second. Some features add a fixed charge on top of compute; Snowpipe, for example, also charges a fixed number of credits per file.

Examples of serverless features include Snowpipe for real-time data ingestion, database replication, materialized views, automatic clustering, search optimization service, query acceleration service, tasks, and data sharing for sharing data between accounts. It’s important to note that these services may have associated costs depending on your usage.

5. Cloud Services Compute: Within Snowflake’s architecture, the cloud services layer handles essential tasks such as authentication, metadata management, and access control. This layer consumes credits from your account. However, credits are only charged if the daily consumption of cloud services resources exceeds 10% of the daily warehouse usage. In other words, if the cloud services layer’s usage remains below this threshold, no additional credits are incurred.
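The 10% rule can be expressed as a small daily calculation. This is a sketch of the rule as described above, with illustrative names; the actual adjustment is computed by Snowflake and appears in your usage data.

```python
def billed_cloud_services(warehouse_credits: float,
                          cloud_services_credits: float) -> float:
    """Cloud services credits billed for one day after the 10% allowance."""
    # The allowance is 10% of that day's warehouse (compute) credits.
    allowance = 0.10 * warehouse_credits
    # Only the part above the allowance is billed, never a negative amount.
    return max(0.0, cloud_services_credits - allowance)
```

With 100 warehouse credits in a day, 8 cloud services credits cost nothing extra, while 15 cloud services credits result in 5 billed credits.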

Payment options

1 — On demand

Customers are billed at a fixed rate for their services and are invoiced retrospectively every month. To obtain the most up-to-date pricing information for your cloud provider and region, please visit Snowflake’s Pricing Guide and select your desired region.

A screenshot of a page showing price per credit per Snowflake’s edition

2 — Pre-Purchased Capacity

A capacity purchase refers to a predefined financial commitment made to Snowflake. This committed capacity is then utilized every month. By opting for a capacity purchase, customers can access additional service options, enjoy lower pricing, and benefit from a long-term price guarantee. To obtain specific pricing details for a capacity purchase, it is recommended to contact Snowflake sales. They will provide you with a quote tailored to your specific requirements and discuss the available options for your organization.

Pricing Examples

A few examples of pricing can be found at https://www.snowflake.com/pricing/pricing-guide/ or at https://docs.snowflake.com/en/user-guide/cost-understanding-overall

Exploring the costs

Accessing Cost and Usage Data

By default, only the account administrator (user with the ACCOUNTADMIN role) can view cost and usage data in Snowsight, the ACCOUNT_USAGE schema, and the ORGANIZATION_USAGE schema.

An administrator with the USERADMIN role or higher can use SNOWFLAKE database roles to grant access to other users. The following SNOWFLAKE database roles provide access to cost and usage data:

  • USAGE_VIEWER — Provides access to a single account in Snowsight and to related views in the ACCOUNT_USAGE schema.
  • ORGANIZATION_USAGE_VIEWER — Assuming the current account is the ORGADMIN account, provides access to all accounts in Snowsight and to views in the ORGANIZATION_USAGE schema that are related to cost and usage, but not billing.

Admin only option — Viewing Overall Cost

An account administrator (i.e. a user with the ACCOUNTADMIN role) can use Snowsight to view the overall cost of using Snowflake for any given day, week, or month.

To use Snowsight to explore overall cost:

  1. Navigate to Admin » Usage.
  2. Select a warehouse to use to view the usage data. Snowflake recommends using an XS warehouse for this purpose.
  3. Choose Compute, Storage, or Data Transfer from the drop-down list.

Because you might not have the ACCOUNTADMIN role, here are screenshots of the usage pages:

A screenshot of a page showing all usages credits
A screenshot of a page showing only compute usage credits
A screenshot of a page showing only storage usage
A screenshot of a page showing data transfer usage

Querying Data for Overall Cost

One of the great features of Snowflake is that all usage information is stored in dedicated tables within the platform. If you have the necessary permissions, you can access these tables to obtain detailed cost insights. Snowflake offers various useful usage queries that can be copied from the following page: Cost Explore Queries

Snowflake provides two schemas, namely ORGANIZATION_USAGE and ACCOUNT_USAGE, which contain valuable data related to usage and cost. The ORGANIZATION_USAGE schema offers cost information for all accounts within the organization, while the ACCOUNT_USAGE schema provides similar data specific to a single account. These schemas include views that provide fine-grained, analytics-ready usage data, enabling you to create custom reports or dashboards based on your specific needs.

-- Credits used per warehouse (all time = past year)
SELECT warehouse_name,
       SUM(credits_used_compute) AS credits_used_compute_sum
FROM snowflake.account_usage.warehouse_metering_history
GROUP BY 1
ORDER BY 2 DESC;

-- Credits used per warehouse over the past N days (N = 30 here; adjust it,
-- or change the date part to week or month)
SELECT warehouse_name,
       SUM(credits_used_compute) AS credits_used_compute_sum
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1
ORDER BY 2 DESC;
Credits used per warehouse

You can view table sizes by running the SHOW TABLES command.

Tables information including the size

Monitoring Snowflake’s costs

In addition to querying tables and utilizing the cost UI, a resource monitor provides the capability to track credit usage associated with user-managed virtual warehouses, along with the required cloud services that support these warehouses. This allows for comprehensive monitoring of resource consumption and related costs.

Resource Monitor

The resource monitor is a valuable tool for tracking credit usage. It lets you set credit limits for specific intervals or date ranges. When usage nears or reaches these limits, the resource monitor can trigger actions such as sending alert notifications and/or suspending user-managed warehouses. Limits can be set at different levels: for example, you can limit the dev warehouse to the credit equivalent of $100 per day and the prod warehouse to the equivalent of $200 per day. A monitor can also be configured at the account level to send alert notifications when a monthly limit is approached or exceeded. Note that quotas are expressed in credits, not dollars.
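Resource monitor quotas are set in credits (the CREDIT_QUOTA parameter), so a dollar limit has to be converted first. Here is a minimal sketch of that conversion, assuming you know your price per credit; the function name is illustrative, and rounding down to whole credits is a conservative choice made here, not a Snowflake requirement.

```python
import math


def usd_to_credit_quota(budget_usd: float, price_per_credit_usd: float) -> int:
    """Largest whole number of credits that fits inside a dollar budget."""
    if price_per_credit_usd <= 0:
        raise ValueError("price_per_credit_usd must be positive")
    # Round down so the quota never exceeds the dollar budget.
    return math.floor(budget_usd / price_per_credit_usd)
```

At the Enterprise price of $3.50 per credit quoted earlier, a $100 daily limit corresponds to a quota of 28 credits and a $200 daily limit to 57 credits.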

Please note that only ACCOUNTADMIN users can create resource monitors. However, account administrators can grant users with other roles the ability to view and modify resource monitors if necessary.

For example:

USE ROLE ACCOUNTADMIN;
CREATE OR REPLACE RESOURCE MONITOR limit1 WITH CREDIT_QUOTA=1000
FREQUENCY = MONTHLY
START_TIMESTAMP = IMMEDIATELY
TRIGGERS ON 80 PERCENT DO NOTIFY
ON 90 PERCENT DO NOTIFY
ON 100 PERCENT DO SUSPEND
ON 120 PERCENT DO SUSPEND_IMMEDIATE;
ALTER WAREHOUSE wh1 SET RESOURCE_MONITOR = limit1;

  • When 80% and 90% usage is reached in a month, an alert notification is sent to all account administrators who have enabled notifications, but no other actions are performed.
  • When 100% usage is reached, the assigned warehouse is suspended, which means that it will finish executing the running queries but will not accept new ones.
  • If the warehouse is still running when 120% usage is reached, it is suspended immediately.
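To see how the thresholds of the limit1 monitor relate to actual credit usage, here is a small Python simulation. It is only a model of the trigger list from the example, not how Snowflake evaluates monitors internally, and the names are illustrative.

```python
from __future__ import annotations

# Thresholds (percent of quota) and actions from the limit1 example.
TRIGGERS = [
    (80, "NOTIFY"),
    (90, "NOTIFY"),
    (100, "SUSPEND"),
    (120, "SUSPEND_IMMEDIATE"),
]


def fired_triggers(credits_used: float,
                   credit_quota: float) -> list[tuple[int, str]]:
    """Return every (threshold, action) pair whose threshold has been reached."""
    percent_used = 100 * credits_used / credit_quota
    return [(pct, action) for pct, action in TRIGGERS if percent_used >= pct]
```

With the quota of 1000 credits from the example, 850 credits used reaches only the 80% notification, while 1200 credits reaches all four triggers, including the immediate suspension.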

Important note: We recommend having resource monitors from day one in order not to have billing ‘surprises’.

You can choose which users are notified when a NOTIFY trigger fires. Each of those users must also enable notifications in the UI and choose how to receive them:

Agree to get notification screenshot

A resource monitor can also be set at the account level:

CREATE OR REPLACE RESOURCE MONITOR ACCOUNT_RESOURCE_MONITOR WITH
CREDIT_QUOTA = 1000
FREQUENCY = MONTHLY
START_TIMESTAMP = IMMEDIATELY
NOTIFY_USERS = ()
TRIGGERS ON 80 PERCENT DO NOTIFY
ON 85 PERCENT DO SUSPEND
ON 95 PERCENT DO SUSPEND_IMMEDIATE;
ALTER ACCOUNT SET RESOURCE_MONITOR = ACCOUNT_RESOURCE_MONITOR;

More details about resource monitor configuration can be found in this link: https://docs.snowflake.com/en/user-guide/resource-monitors

Attributing costs

Snowflake offers a comprehensive range of cost attribution features that allow organizations to attribute costs at various levels of the Snowflake object hierarchy. These features enable credit consumption to be attributed to specific groupings such as cost centers, environments, and projects. Snowflake provides the following cost attribution strategies:

1. Object tagging: Object tagging enables granular cost attribution, allowing you to assign the cost of using individual resources like warehouses or databases to specific units within the organization. This feature provides flexibility in tracking and allocating costs accurately.

2. Query execution attribution: Snowflake allows for attributing warehouse usage based on roles, users, or queries. This feature is particularly useful when multiple cost centers share the same warehouse. By attributing costs to specific roles, users, or queries, organizations can gain insights into the consumption patterns and allocate costs accordingly.

For example, you can assign tags to warehouses:

-- The tag must exist before it can be assigned
CREATE TAG IF NOT EXISTS cost_center;
ALTER WAREHOUSE warehouse1 SET TAG cost_center='BC';
ALTER WAREHOUSE warehouse2 SET TAG cost_center='ON';

When using the Usage page in Snowsight or querying the data with SQL, you can filter the results by specific tags, which makes cost analysis and reporting easier.
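For the shared-warehouse case mentioned under query execution attribution, one simple approach is to split a warehouse’s credits in proportion to each cost center’s query execution time. The sketch below is an assumption about how such a split could be done, not a Snowflake feature; it ignores idle time and concurrency, and the names and sample values are hypothetical.

```python
from __future__ import annotations


def split_credits(warehouse_credits: float,
                  seconds_by_owner: dict[str, float]) -> dict[str, float]:
    """Split warehouse credits proportionally to execution time per owner."""
    total_seconds = sum(seconds_by_owner.values())
    if total_seconds == 0:
        # Nothing ran, so nothing can be attributed.
        return {owner: 0.0 for owner in seconds_by_owner}
    return {
        owner: warehouse_credits * seconds / total_seconds
        for owner, seconds in seconds_by_owner.items()
    }


if __name__ == "__main__":
    # Hypothetical example: two cost centers sharing one warehouse.
    print(split_credits(10, {"BC": 30, "ON": 10}))
```

An owner can be a role, a user, or a tag value such as the cost_center tags above, depending on how you want to report costs.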

More information can be found in this link: https://docs.snowflake.com/en/user-guide/cost-attributing

Snowflake Budgets

As of October 2023, Snowflake has released a new feature called Budgets for monitoring overall cost. If the account is projected to exceed its budget, you receive an alert email.

To read more:
https://docs.snowflake.com/en/user-guide/budgets

Conclusion

In this post, we covered how Snowflake calculates costs, how to monitor them, and how to query the usage data for deeper insights. We also looked at setting up alerts so that unexpected usage is caught promptly.

Now that you grasp how Snowflake calculates costs, the next step is delving into the second part of this post, where you’ll uncover tips and best practices for effectively reducing expenses and saving money within your Snowflake environment.

Part 2

Tips and best practices for saving Snowflake’s costs

I’m Eylon Steiner, Engineering Manager for Infostrux Solutions. You can follow me on LinkedIn.

Subscribe to Infostrux Medium Blog at https://blog.infostrux.com for the most interesting Data Engineering and Snowflake news. Follow Infostrux’s open-source efforts through GitHub.
