Data Encryption with Customer-Managed Keys

The security of customer data is Snowflake’s first priority. All customer data is encrypted using industry-standard techniques such as AES-256. Encryption keys are organized hierarchically, rooted in a hardware security module (HSM). This allows complete isolation of customer data and greatly reduces the attack vectors.

For customers with the highest security requirements, we are adding another security component: customer-managed keys. With customer-managed keys, the customer manages the encryption key and makes it available to Snowflake. The customer has full control over this key. If the customer disables access to the encryption key, Snowflake can no longer access the customer’s data. Your data. Your encryption keys.

In this blog post, we will explain the benefits of customer-managed keys and their implementation in the Snowflake cloud data warehouse.

Benefits

Customer-managed keys provide the following benefits:

More Control over Data Access: Customer-managed keys make it impossible for Snowflake to comply with requests to access customer data. If data is encrypted using customer-managed keys and the customer disables access to the encryption key, it is technically impossible for Snowflake to decrypt the data. It is therefore the customer’s responsibility to comply with such requests directly.

Stop Data Breaches: If a customer experiences a data breach, they may disable access of customer-managed keys to Snowflake. This will halt all running queries in Snowflake, including queries that may inspect data or unload data. Disabling customer-managed keys allows customers to stop ongoing exfiltration of their data.

More Control over Data Lifecycle: The last reason why customers require this feature is lack of trust with any cloud provider. Customers may have sensitive data that they do not trust Snowflake to manage on their own. Using customer-managed keys, such sensitive data is ultimately encrypted with the customer’s key. It is impossible for Snowflake to decrypt this data without the customer’s consent. The customer has full control over the data’s lifecycle.

Implementation

Before we explain the implementation of customer-managed keys, we should first give a background of Snowflake’s key hierarchy and Amazon’s key management service.

Background 1: Snowflake’s Key Hierarchy

Snowflake manages encryption keys hierarchically. Within this key hierarchy, a parent key encrypts all of its child keys. When a key encrypts another key, it is called “wrapping”. When the key is decrypted again, it is called “unwrapping”.

Encryption key hierarchy - Snowflake

Figure 1: Encryption key hierarchy in Snowflake.

Figure 1 shows Snowflake’s hierarchy of encryption keys. The top-most root keys are stored in a hardware security module (or CloudHSM). A root key wraps account master keys. Each account master key corresponds to one customer account in Snowflake. Account master keys, in turn, wrap all data-level keys, including table master keys, stage master keys, and result master keys. In addition, every single data file is encrypted with a separate key. A detailed overview of Snowflake’s encryption key management is provided in this Blog post.

Background 2: AWS Key Management Service

Amazon’s AWS Key Management Service (KMS) is a service to store encryption keys and tightly control access to them. Amazon provides an audit log of all operations and interactions with KMS by using CloudTrail. This allows customers to manage their own encryption keys and validate their usage via the audit log. KMS also allows customers to disable access to any keys at any time. Combining KMS with Snowflake’s encryption key hierarchy allows us to implement customer-managed keys. More details about AWS KMS can be found on the Amazon website.

Implementation of Customer-Managed Keys

The implementation of customer-managed keys changes the way account master keys (AMKs) are stored within Snowflake’s encryption key hierarchy. Normally, as shown in Figure 1 above, an AMK is wrapped by the root key stored in CloudHSM. For customer-managed keys, this is only partly true. There are two AMKs involved: a first key is wrapped by the root key stored in the CloudHSM and a second key is wrapped by the customer key in KMS. Unwrapping and combining these two keys leads to the composed account master key, which then wraps and unwraps all underlying keys in the hierarchy (table master keys, result master keys, etc.).

Account master key - Customer-Managed Keys

Figure 2: Account master key composed of AMK-S and AMK-C. AMK-C is wrapped by KMS.

Figure 2 shows this concept in detail. With customer-managed keys, the AMK is composed of two keys: AMK-S and AMK-C. AMK-S is a random 256-bit key that is wrapped with the root key stored in HSM. AMK-C is a second random 256-bit key that is wrapped with the customer key stored in KMS. AMK-S and AMK-C are completely random and unrelated. Both wrapped keys are stored in Snowflake’s encryption key hierarchy.

Figure 3: Unwrapping and composing of AMK.

When the customer runs a query in Snowflake that requires access to customer data, the composed AMK is produced as follows (see Figure 3). Both wrapped keys, AMK-S and AMK-C, are retrieved from the encryption key hierarchy. AMK-S is unwrapped using the root key in HSM. AMK-C is unwrapped using the customer key in KMS. The KMS audit log logs an access event to the customer key. Both unwrapped 256-bit keys are combined using XOR to form the composed AMK. The composed AMK is then used to unwrap the underlying table master keys to access the customer data.

The composed AMK is cached within the Snowflake data warehouse for performance reasons. This cache has a timeout period after which the cached AMK is not accessible anymore. The cache is refreshed in the background such that continuous queries are not impacted by any latency to KMS. If access to KMS is revoked, refreshing the cache fails and the AMK is removed from the cache immediately. Any running queries are aborted. New queries fail to start because no AMK can be composed. The customer’s data can no longer be decrypted by the Snowflake service.

Summary

Customer-managed keys provide an extra level of security for customers with sensitive data. With this feature, the customer manages the encryption key themselves and makes it accessible to Snowflake. If the customer decides to disable access, data can no longer be decrypted. In addition, all running queries are aborted. This has the following benefits for customers: (a) it makes it technically impossible for Snowflake to comply with requests for access to customer data, (b) the customer can actively mitigate data breaches and limit data exfiltration, and (c) it gives the customer full control over data lifecycle.

Availability

Customer-managed keys are a primary component of Tri-Secret Secure, a Snowflake Enterprise Edition for Sensitive Data (ESD) feature. To enable Tri-Secret Secure for your ESD account, you need to first create a key in AWS KMS (in your AWS account) and then contact Snowflake Support.

Acknowledgements

We want to thank Difei Zhang for his contributions to this project.

For more information, please feel free to reach out to us at info@snowflake.net. We would love to help you on your journey to the cloud. And keep an eye on this blog or follow us on Twitter (@snowflakedb) to keep up with all the news and developments here at Snowflake Computing.

Snowflake: Seriously Serious About Security

How Serious Are We About Security? Extremely.

No self-respecting security team is ever satisfied with the existing security controls it has in place. Some mistake this dissatisfaction as a personality disorder, referring to their security team members as “control-freaks” or “over-achievers”. Let’s face it: security professionals tend to be an eccentric group. However, for a truly committed and competent security team, this eccentricity is simply the symptom of the healthy paranoia that comes with being responsible for the protection of vital infrastructure and sensitive data.

Snowflake’s security team has channeled this paranoia into a program we call Seriously Serious Security Testing. There are several components of this program, including the audit of all the usual administrative and technical controls you would expect from a cloud company. However, where Snowflake’s eccentricity truly surfaces is in our embrace of the dreaded Penetration Test. Here are the highlights of Snowflake’s security testing program and the key role that penetration testing plays.

First: What is a Penetration Test?

A penetration test is a controlled attempt to exploit vulnerabilities to determine whether unauthorized access or other malicious activity is possible within the target environment. This is a requirement for PCI-DSS compliance, and is also considered best practice for any organization that takes security seriously. Snowflake engages globally-recognized experts to perform this activity within specific constraints and guidelines as described in the Methodology section below.

Frequency

Most companies avoid penetration tests altogether. Others perform them annually at best, which is the minimum frequency required to meet the standards and certifications their auditors tell them they need. What many auditors don’t challenge is whether or not adequate penetration testing has been performed after every “material change” to the company’s product or infrastructure. It’s unlikely that performing penetration tests annually would be sufficient in a cloud environment where most vendors take pride in the frequent deployment of new features and functionality (Snowflake is no different in this regard with releases several times a month, at least). Because of these frequent changes, it’s important to ensure your cloud vendors are performing frequent penetration testing to ensure no new vulnerabilities have inadvertently been introduced.

Security Penetration Test - Frequency

source: https://www.rapid7.com/globalassets/_pdfs/whitepaperguide/rapid7-research-report-under-the-hoodie.pdf

Much to the irritation of our Operations and Engineering teams, Snowflake has performed more than 5 penetration tests in the past 6 months.

Why would we do this to ourselves? Because we want to know what our weaknesses are! The frequency with which we perform these tests provides Snowflake with the assurance that changes to the Snowflake service, as well as newly discovered vulnerabilities within other components of our environment, are not putting Snowflake or (more importantly) Snowflake’s customers and their data at risk.

Methodology

Another example of Snowflake Security’s paranoia is the approach we take with our penetration testers. Typical penetration testing engagements at Snowflake are designed to simulate the compromise of an employee’s or customer’s credentials by providing the tester with limited access to a non-production environment. Engagements run a minimum of two weeks and begin with providing the testers not only with the aforementioned credentials, but also with substantial information about the architecture, network design, and, when applicable, our application code. (This method is sometimes referred to as White Box Testing.) If, after a specific period of time, the testers have not been able to find an entry point, Snowflake gradually provides the testers with slightly more access until they are able to uncover vulnerabilities, or until the time is up.

Why would we divulge so much information? We want to know what ALL our weaknesses are! This provides us with visibility into what would happen if, for example, we had an insider attempting to gain unauthorized access to data. How far would they get? How quickly could we detect them? How would we contain them? And so on. The information is invaluable.

Security Penetration Test - Vulnerabilities

Most common vulnerabilities found by penetration testers

source: https://www.rapid7.com/globalassets/_pdfs/whitepaperguide/rapid7-research-report-under-the-hoodie.pdf

Transparency

The final example of Snowflake’s Seriously Serious Security Testing program is the highly unusual practice of sharing penetration test reports and remediation documentation with qualified prospects and customers (under NDA, transmitted securely, and with the promise of their first born if there is a compromise). By sharing our reports we are able to solicit additional feedback on ways to improve our testing.

I’ve been on both sides of the audit fence for years, and I’ve yet to find an organization as willing to share as much information about its penetration testing frequency and methodology as Snowflake. However, it comes as no surprise to anyone who has worked with Snowflake. Snowflake’s corporate culture is based on teamwork and collaboration, which spills over into Snowflake’s relationships with customers and vendors. We believe that transparency is the cornerstone to trust, and trust is the cornerstone to a healthy partnership between Snowflake and our customers. Providing the penetration test report and remediation evidence allows customers to see for themselves how seriously we take security, as well as how effective we are at achieving it. This allows our customers and prospects to make an informed decision about the risks they’re taking.

Conclusion

Security is a constantly moving target. Our team will never stop this extreme security testing of our infrastructure because threats are constantly evolving.

So…
Call us control freaks.
Call us over-achievers.
Call us paranoid.

One thing you’ll never call us is complacent…seriously.

For more information, please feel free to reach out to us at info@snowflake.net. We would love to help you on your journey to the cloud, securely. And keep an eye on this blog or follow us on Twitter (@snowflakedb) to keep up with all the news and happenings here at Snowflake Computing.