New Snowflake Features Released in H2’17

Looking back at the first half of this year, the main theme at Snowflake has been scaling – the team, the service and our customer base. We introduced and improved the self-service onboarding experience, grew our ecosystem by collaborating with major technology partners and SIs, and invested in making our service faster, more secure and easier to use.

In the summer, we announced the general availability of Snowflake’s Instant Data Sharing. As we continue to grow, this will provide the foundation for data-driven organizations of all sizes to instantly exchange data, create new insights and discover new revenue streams.

In the second half of this year, we started a journey to evolve the service in substantial ways and this will continue throughout 2018. 

Instant elasticity, per-second billing and 4XL virtual warehouses
Imagine you have instant access to performance and scalability – and truly pay only for what you use.

Many cloud DW solutions make the claim in the headline above. A closer look, however, reveals fine differences in the degree of scalability and concurrency supported, the overall user experience and the types of workloads a user can run at scale. One of our key investments in this area resulted in our instant elasticity feature:

  • The introduction of per-second pricing represents one crucial building block. Our customers are now billed by the second, instead of by the full hour, when running workloads in Snowflake.
  • We also improved our provisioning algorithms, using past usage data to better serve compute capacity for future use.
  • We added support for our new 4XL virtual warehouses, doubling the maximum compute configuration and allowing our customers to tackle their most challenging workloads and performance SLAs.

Based on customer feedback, the decrease in wait time from minutes to seconds is a huge benefit. You can now ensure the right performance for your executive dashboards and not worry anymore about a full-hour charge. You can also scale your load instantly and have fast access to the most current data.
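
To make the difference concrete, here is a minimal sketch of the arithmetic behind hourly versus per-second billing. The credit rate and the five-minute workload are assumptions chosen for illustration, and the sketch deliberately ignores any minimum charge that may apply when a warehouse starts.

```python
import math


def billed_seconds_hourly(runtime_seconds: int) -> int:
    """Seconds charged when every started hour is billed in full."""
    return math.ceil(runtime_seconds / 3600) * 3600


def billed_seconds_per_second(runtime_seconds: int) -> int:
    """Seconds charged when billing follows the actual runtime.

    Any minimum charge per start is ignored in this sketch.
    """
    return runtime_seconds


def credits(billed_seconds: int, credits_per_hour: float) -> float:
    """Convert billed seconds into credits at a given hourly rate."""
    return billed_seconds / 3600 * credits_per_hour


# Hypothetical workload: a dashboard refresh keeps a warehouse busy for 5 minutes.
runtime = 5 * 60
rate = 8.0  # assumed credits per hour, illustrative only

hourly = credits(billed_seconds_hourly(runtime), rate)
per_second = credits(billed_seconds_per_second(runtime), rate)
print(f"hourly billing: {hourly:.2f} credits")
print(f"per-second billing: {per_second:.2f} credits")
```

Under these assumptions, the short workload is charged for five minutes rather than a full hour, which is why brief, bursty workloads benefit the most.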

Improving performance and SQL programmability
Our ongoing mission is to build the fastest data warehouse with the SQL you love.

Throughout the last quarter, we released a wide range of SQL capabilities that address database migration challenges and increase both ANSI SQL compatibility and overall SQL programmability:

  • Support for JavaScript Table UDFs (in public preview).
  • Support for double-dot notation for specifying database objects, to ease migration from on-premises database systems.
  • Support for Table Literals.
  • INSERT OVERWRITE: Allows tables to be rebuilt, easing migration from Hadoop systems.
  • LEAD/LAG function enhancement: Ignoring NULL values.
  • SAMPLE/TABLESAMPLE – announced general availability.
  • We introduced functions to estimate frequent values in near-linear time.
  • More flexibility in configuring the start of the week and the week-of-year policy.
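
As an illustration of the LEAD/LAG enhancement in the list above, the following sketch models what ignoring NULL values means for a LAG with an offset of one. It is a plain Python model of the semantics over an already ordered list, not Snowflake's implementation, and the function name and sample data are made up for this example.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def lag(values: List[Optional[T]], ignore_nulls: bool = False) -> List[Optional[T]]:
    """Return, for each position, the preceding value (offset 1).

    With ignore_nulls=True, None entries are skipped when looking back,
    so each row sees the most recent preceding non-None value.
    """
    result: List[Optional[T]] = []
    last_seen: Optional[T] = None      # value of the immediately preceding row
    last_non_null: Optional[T] = None  # most recent preceding non-None value
    for value in values:
        result.append(last_non_null if ignore_nulls else last_seen)
        last_seen = value
        if value is not None:
            last_non_null = value
    return result


# Hypothetical sensor readings with gaps.
readings = [10, None, None, 13, None, 15]
print(lag(readings))                     # [None, 10, None, None, 13, None]
print(lag(readings, ignore_nulls=True))  # [None, 10, 10, 10, 13, 13]
```

The second call shows the practical benefit: gaps in the data no longer propagate into the lagged column.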

In addition, we have been working on substantial performance improvements that address ease-of-use and automation. Please stay tuned for major announcements in early 2018.

Staying ahead with enterprise-ready security and compliance
From day one, security has always been core to Snowflake’s design.

  • Programmatic SSO support – A Snowflake user can now use browser-assisted single sign-on programmatically through our Snowflake ODBC, JDBC and Python drivers and our command-line tool, SnowSQL. In contrast to some of our competitors, you don’t need to write code to connect to your SAML 2.0-compliant IdP. (Programmatic SSO support is available in our Enterprise Edition.)
  • PrivateLink support for Snowflake – We worked with AWS to integrate PrivateLink with Snowflake. With PrivateLink, Snowflake users can now connect to Snowflake without traversing the public Internet. No cumbersome proxies need to be set up between Snowflake and their network, and users have full control over egress traffic.
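
For the programmatic SSO item above, a minimal sketch with the Python driver might look like the following. It assumes the connector's `authenticator` option with the value `externalbrowser`; the account and user values are placeholders, and the helper function names are ours, so please check the driver documentation for your version before relying on it.

```python
def sso_connect_args(account: str, user: str) -> dict:
    """Build connection arguments for browser-assisted SSO.

    No SAML handling appears here: the driver is expected to open a
    browser window and complete the sign-on with the identity provider.
    """
    return {
        "account": account,
        "user": user,
        "authenticator": "externalbrowser",  # assumed driver option
    }


def connect_with_sso(account: str, user: str):
    """Open a connection using the arguments above."""
    # Imported lazily so the sketch can be read and tested without the
    # driver installed (pip install snowflake-connector-python).
    import snowflake.connector

    return snowflake.connector.connect(**sso_connect_args(account, user))


# Placeholder values, for illustration only.
args = sso_connect_args("my_account", "jane.doe@example.com")
print(args)
```

Calling `connect_with_sso` requires an interactive session, because the sign-on happens in the user's browser.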

Improving our ecosystem and data loading
Enabling developers and builders to onboard new workloads and create applications with their favorite tools, drivers and languages remains a top priority.

Snowpipe
We recently announced Snowpipe, which represents an important milestone in many ways. First, users can now efficiently address continuous loading use cases for streaming data. Second, with Snowpipe, we introduced our first service beyond the core data warehousing service. Snowpipe is serverless, so there’s no need to manage a virtual warehouse for data loading into Snowflake. Finally, we continue to integrate deeply with the entire AWS stack: Snowpipe allows customers to take advantage of AWS S3 event notifications to automatically trigger Snowflake data loads into target tables. You can read more about Snowpipe here: Overview and First Steps.
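
To give a feel for the trigger side of this flow, the sketch below extracts the newly created files from an S3 event notification payload. It covers only what the notification carries; what Snowpipe does after receiving it is not shown, and the function name, bucket and keys are hypothetical.

```python
from urllib.parse import unquote_plus


def new_object_paths(event: dict) -> list:
    """Return 's3://bucket/key' paths for objects created in an S3 event."""
    paths = []
    for record in event.get("Records", []):
        # Only creation events indicate new data to load.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification payload.
        key = unquote_plus(record["s3"]["object"]["key"])
        paths.append(f"s3://{bucket}/{key}")
    return paths


# Hypothetical notification with one new file and one deletion.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "events/2017/12/part+01.json"},
            },
        },
        {
            "eventName": "ObjectRemoved:Delete",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "events/2017/11/old.json"},
            },
        },
    ]
}

print(new_object_paths(sample_event))
```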

Broader Ecosystem
While we continue adding functionality and enriching Snowflake with additional services, we strongly believe in freedom of choice. Instead of preaching a unified platform and a one-solution-fits-all philosophy, our key objectives are to make integrations easier and better performing.

  • For enterprise-class ETL, data integration and replication:
    • Added PowerCenter 10.2 support for Snowflake.
    • Added support for the ETL use case with AWS Glue – build modern, high-performing ETL jobs via our Spark connector, pushing compute into Snowflake.
  • For data analytics and data science:
    • Added a native Snowflake connector for Alteryx supporting in-DB components.
    • Upgraded our Snowflake dplyr library.
  • For business intelligence (BI):
    • Added a native Snowflake connector for ChartIO (in preview).
  • For parallel data loading & unloading via the COPY command, developers can now:
    • Load non-UTF-8 character-encoded data.
    • Unload to the Parquet file format.

Increasing transparency and usability
These features are designed to strike the right balance between offering a service that is easy to operate and exposing actionable insights into the service itself.

  • Resource monitors can now be created through the Snowflake UI. As a user creates resource monitors, they can also set up notifications. Notifications for “suspend” and “suspend immediately” actions are sent out when quota thresholds are reached for a particular resource monitor. In addition, we added support for up to five warnings that can be defined for each resource monitor.
  • New MONITOR USAGE privilege to grant access to review billing and usage information via SQL or through the Snowflake UI. This provides non-account administrators with the ability to review information about data stored in databases and stages, as well as provide insight into warehouse usage.
  • For improved usability and convenience, we added a new option (COPY GRANTS) which allows users to preserve or copy grants as part of executing the CREATE OR REPLACE TABLE, CREATE OR REPLACE VIEW, CREATE TABLE LIKE, and CREATE TABLE CLONE variations.  
  • We also completely overhauled our worksheet and improved the overall experience. We are going to roll these changes out in stages in the coming weeks, with much more to come in 2018.
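
The resource monitor behavior described in the list above can be pictured with a small model. This is a sketch under stated assumptions, not the service's implementation: the class, its field names and the default threshold percentages are hypothetical, and only the two suspend actions and the limit of five warnings come from the description above.

```python
from dataclasses import dataclass, field
from typing import List

MAX_WARNINGS = 5  # up to five warnings can be defined per monitor


@dataclass
class ResourceMonitor:
    quota: float                       # credit quota for the monitored period
    suspend_at: int = 100              # percent of quota (assumed default)
    suspend_immediately_at: int = 110  # percent of quota (assumed default)
    warn_at: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        if len(self.warn_at) > MAX_WARNINGS:
            raise ValueError(f"at most {MAX_WARNINGS} warnings per monitor")

    def triggered(self, used: float) -> List[str]:
        """List the notifications that fire at the given usage."""
        pct = used / self.quota * 100
        actions = [f"warn@{t}%" for t in sorted(self.warn_at) if pct >= t]
        if pct >= self.suspend_immediately_at:
            actions.append("suspend immediately")
        elif pct >= self.suspend_at:
            actions.append("suspend")
        return actions


monitor = ResourceMonitor(quota=1000, warn_at=[50, 75, 90])
print(monitor.triggered(800))   # ['warn@50%', 'warn@75%']
print(monitor.triggered(1020))  # ['warn@50%', 'warn@75%', 'warn@90%', 'suspend']
```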

Scaling and investing in service robustness
These service enhancements aren’t customer visible, but are crucial for scaling to meet the demands of our rapidly growing base of customers.

  • Our global expansion continues with a new Snowflake region in Sydney, Australia.
  • We introduced a new product edition: Virtual Private Snowflake (VPS). With VPS, customers get a dedicated and managed instance of Snowflake within a separate dedicated AWS VPC, with all the performance, simplicity and concurrency inherent to Snowflake. This also includes completely dedicated metadata services isolated from the metadata activity of other Snowflake customers.
  • We continued our investments in system stability, improving and strengthening various components of our cloud services layer.

Conclusion

We are well on our way to reaching 1,000 customers by the end of 2017, with tens of compressed PBs stored in Snowflake and millions of data processing jobs successfully executed daily. We grew to over 300 Snowflakes and expanded our global presence. 2017 has truly been a defining year in Snowflake’s young history: self-service, Instant Data Sharing, a broader ecosystem, instant elasticity, Snowpipe, VPS and PrivateLink for Snowflake. We look forward to an exciting 2018 as we continue to help our customers solve their data challenges at scale.

For more information, please feel free to reach out to us at info@snowflake.net. We would love to help you on your journey to the cloud. And keep an eye on this blog or follow us on Twitter (@snowflakedb) to keep up with all the news and happenings here at Snowflake Computing.

PrivateLink for Snowflake: No Internet Required

Improve Security and Simplify Connectivity with PrivateLink for Snowflake

AWS recently announced PrivateLink, the newest generation of VPC Endpoints that allows direct and secure connectivity between AWS VPCs, without traversing the public Internet. We’ve been working closely with the AWS product team to integrate PrivateLink with Snowflake and we’re excited to be among the first launch partners. By integrating with PrivateLink, we allow customers with strict security policies to connect to Snowflake without exposing their data to the Internet. In this blog post, we’ll highlight how PrivateLink enhances our existing security capabilities, and how customers can easily set up PrivateLink with Snowflake.

Snowflake is an enterprise-grade, cloud data warehouse with a unique, multi-cluster, shared data architecture purpose-built for the cloud. From day one, security has been a central pillar of Snowflake’s architecture, with advanced security features baked into the solution. Customers get varying levels of security from Snowflake’s five different product editions: Standard, Premier, Enterprise, Enterprise for Sensitive Data (ESD) and Virtual Private Snowflake (VPS).

Across all editions, Snowflake provides a secure environment for customer data, protecting it in-transit and at rest. All customer data is encrypted by default using the latest security standards and best practices, and validated by compliance with industry-standard security protocols. In addition, customers have access to a host of security features and data protection enhancements such as IP whitelisting, role-based access control, and multi-factor authentication.

As shown in figure 1 below, Snowflake’s multi-tenant service runs inside a Virtual Private Cloud (VPC), isolating and limiting access to its internal components. Incoming traffic from customer VPCs is routed through an Elastic Load Balancer (ELB) to the Snowflake VPC.

For customers working with highly sensitive data or with specific compliance requirements, such as HIPAA and PCI, Snowflake offers Enterprise for Sensitive Data (ESD). With ESD edition, customer data is encrypted in transit across all networks including within Snowflake’s own VPC. ESD customers also benefit from additional security features such as Tri-Secret Secure, giving them full control over access to their data. See figure 2 below.

Earlier this year, we also introduced a private, single-tenant version of the Snowflake service – Virtual Private Snowflake. VPS, which is the most advanced and secure edition of Snowflake, includes all features of ESD and addresses the specific needs of regulated companies such as those in the financial industries. With VPS, customers get a dedicated and managed instance of Snowflake within a separate, dedicated VPC. Additionally, VPS customers can use secure proxies for egress traffic control to minimize risks associated with their internal users and systems communicating with unauthorized external hosts, as shown in figure 3 below:

But we recognize that a key area of concern for some customers is how data is sent from their private subnet to Snowflake. These customers need to enforce restrictive firewall rules on egress traffic. Others have restrictive policies about their resources accessing the Internet at all. So, how do you send data without allowing unrestricted outbound access to the public Internet and without violating existing security compliance requirements?

Enter AWS PrivateLink: a purpose-built technology that enables direct, secure connectivity among VPCs while keeping network traffic within the AWS network. Using PrivateLink, customers can connect to Snowflake without going over the public Internet, and without requiring proxies to be set up between Snowflake and their network as a stand-in solution for egress traffic control. Instead, all communication between the customer VPC and Snowflake is performed within the AWS private network backbone.

Snowflake leverages PrivateLink by running its service behind a Network Load Balancer (NLB) and sharing the endpoint with customers’ VPCs. The Snowflake endpoint appears in the customer VPC, enabling direct connectivity to Snowflake via private IP addresses. Customers can then accept the endpoint and choose which of their VPCs and subnets should have access to Snowflake. This effectively allows Snowflake to function like a service that is hosted directly on the customer’s private network. Figures 4 and 5 show PrivateLink connectivity from customer VPCs to Snowflake in both multi-tenant (ESD) and single-tenant (VPS) scenarios.
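
Because the endpoint is reached via private IP addresses, one simple sanity check from inside a customer VPC is to confirm that the Snowflake hostname resolves only to private addresses. The sketch below uses Python's standard library; the helper names are ours, and the sample addresses are made up rather than real Snowflake endpoints.

```python
import ipaddress
import socket
from typing import Iterable, List


def resolve(hostname: str, port: int = 443) -> List[str]:
    """Resolve a hostname to the distinct IP addresses it maps to."""
    infos = socket.getaddrinfo(hostname, port)
    return sorted({info[4][0] for info in infos})


def all_private(addresses: Iterable[str]) -> bool:
    """True when there is at least one address and every one is private."""
    addresses = list(addresses)
    return bool(addresses) and all(
        ipaddress.ip_address(a).is_private for a in addresses
    )


# Made-up addresses: one inside a private VPC range, one public.
print(all_private(["10.0.12.34"]))              # True
print(all_private(["10.0.12.34", "52.1.2.3"]))  # False
```

In practice you would call `all_private(resolve(your_hostname))` from a host inside the VPC, where the hostname is whatever your account uses to reach Snowflake.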

Additionally, customers can access PrivateLink endpoints from their on-premises network via AWS Direct Connect, allowing them to connect all their virtual and physical environments in a single, private network. As such, Direct Connect can be used in conjunction with PrivateLink to connect a customer’s data center to Snowflake. See figure 6 below.

Snowflake already delivers the world’s most secure data warehouse built for the cloud. Our ESD and VPS product editions are designed to address the highest security needs and compliance requirements of organizations large and small. With PrivateLink, we’re taking that a step further by allowing our customers to establish direct and private connectivity to Snowflake, without ever exposing their data to the public Internet.

PrivateLink is available to all Snowflake customers with ESD and VPS product editions. You can visit our user guide for instructions on how to get started with PrivateLink.

You can also try Snowflake for free. Sign up and receive US$400 worth of free usage. You can create a sandbox or launch a production implementation from the same Snowflake environment.

Financial Services: Welcome to Virtual Private Snowflake

Correct, consistent data is the lifeblood of the financial services industry. If your data is correct and consistent, it’s valuable. If it’s wrong or inconsistent, it’s useless and may be dangerous to your organization.

I saw this firsthand during the financial meltdown of 2007/08. At that time, I had been working in the industry for nearly 20 years as a platform architect. Financial services companies needed that “single source of truth” more than ever. To remain viable, we needed to consolidate siloed data sets before we could calculate risk exposure. Most financial services companies were on the brink of collapse. Those that survived did so because they had access to the right data.

At my employer, we looked for a way to achieve this single source with in-house resources, but my team and I quickly realized it would be an extraordinary challenge. Multiple data marts were sprawled across the entire enterprise, and multiple sets of the same data existed in different places, so the numbers didn’t add up. In a global financial services company, even a one percent difference can represent billions of dollars and major risk.

We ultimately built an analytics platform powered by a data warehouse. It was a huge success. In fact, it was so successful that everybody wanted to use it for wide-ranging production use cases. However, it couldn’t keep up with demand, and no amount of additional investment would solve that problem.

That’s when I began my quest to find a platform that could provide universal access, true data consistency and unlimited concurrency. And for financial services, it had to be more secure than anything enterprises were already using. I knew the cloud could address most of these needs. However, even with the right leap forward in technical innovation, would the industry accept it as secure? Then I found Snowflake. But my story doesn’t end there.

I knew Snowflake, the company, was onto something. So, I left financial services to join Snowflake and lead its product team. Snowflake represents a cloud-first approach to data warehousing, with a level of security and unlimited concurrency that financial services companies demand.

We’ve since taken that a step further with Virtual Private Snowflake (VPS) – our most secure version of Snowflake. VPS gives each customer a dedicated and managed instance of Snowflake within a separate Amazon Web Services (AWS) Virtual Private Cloud (VPC). In addition, customers get our existing, best-in-class Snowflake security features, including end-to-end encryption, at rest and in transit. VPS also includes Tri-Secret Secure, which combines a customer-provided encryption key and a Snowflake-provided encryption key with user credentials that authenticate approved users. Together, these features thwart an attempted data decryption attack by instantly rendering data unreadable.
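
The idea of combining two keys can be illustrated with a short sketch. This is emphatically not Snowflake's actual construction: it is an assumption-laden toy that uses HMAC-SHA256 from Python's standard library to derive one key from two inputs, simply to show why data becomes unreadable when either key is withdrawn. User credentials are a separate authentication step and are not modeled here.

```python
import hashlib
import hmac


def composite_key(customer_key: bytes, provider_key: bytes) -> bytes:
    """Derive a 32-byte key that depends on both input keys.

    Illustrative only: a real system would use a vetted key hierarchy
    and a managed key service rather than this single derivation.
    """
    return hmac.new(customer_key, provider_key, hashlib.sha256).digest()


# Hypothetical key material, for illustration only.
customer = b"customer-managed-key"
provider = b"provider-managed-key"

key = composite_key(customer, provider)

# The same two inputs always yield the same composite key ...
print(key == composite_key(customer, provider))         # True
# ... but without the customer's key, the result no longer matches.
print(key == composite_key(b"wrong-or-revoked", provider))  # False
```

The design point is that neither party alone holds enough to decrypt the data, so revoking the customer-provided key is sufficient to cut off access.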

VPS is more secure than any on-premises solution and provides unlimited access to a single source of data without degrading performance. This means financial services companies don’t have to look at the cloud as a compromise between security and performance.

To find out more, read our VPS white paper and solution brief: Snowflake for Sensitive Data.