New Snowflake Features Released in H2’17

Looking back at the first half of this year, the main theme at Snowflake has been scaling – the team, the service and our customer base. We introduced and improved the self-service onboarding experience, grew our ecosystem by collaborating with major technology partners and system integrators (SIs), and invested in making our service faster, more secure and easier to use.

In the summer, we announced the general availability of Snowflake’s Instant Data Sharing. As we continue to grow, this will provide the foundation for data-driven organizations of all sizes to instantly exchange data, create new insights and discover new revenue streams.

In the second half of this year, we started a journey to evolve the service in substantial ways and this will continue throughout 2018. 

Instant elasticity, per-second billing and 4XL virtual warehouses
Imagine you have instant access to performance and scalability – and truly pay only for what you use.

Many cloud data warehouse solutions make the claim in the headline above. A closer look, however, exposes subtle differences in the degree of scalability and concurrency supported, the overall user experience and the types of workloads a user can run at scale. One of our key investments in this area resulted in our instant elasticity feature:

  • The introduction of per-second pricing represents one crucial building block. Now, our customers are billed to the second instead of a full hour when running workloads in Snowflake.
  • We also improved our provisioning algorithms, using past usage data to better serve compute capacity for future use.
  • We added support for our new 4XL virtual warehouses, doubling the maximum compute configuration and allowing our customers to tackle their most challenging workloads and performance SLAs.

Based on customer feedback, the decrease in wait time from minutes to seconds is a huge benefit. You can now ensure the right performance for your executive dashboards without worrying about a full-hour charge. You can also scale your load instantly and have fast access to the most current data.
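
To make the billing change concrete, here is a rough sketch in Python comparing full-hour and per-second charging for a short workload. The credits-per-hour rate and the 60-second minimum charge are assumptions chosen for illustration and are not stated in this post; the function names are hypothetical.

```python
import math

def hourly_billed_seconds(runtime_seconds: int) -> int:
    """Round usage up to whole hours (the older full-hour model)."""
    return math.ceil(runtime_seconds / 3600) * 3600

def per_second_billed_seconds(runtime_seconds: int, minimum_seconds: int = 60) -> int:
    """Bill the seconds actually used; the minimum charge is an assumed parameter."""
    return max(runtime_seconds, minimum_seconds)

def credits(billed_seconds: int, credits_per_hour: float) -> float:
    """Convert billed seconds into credits at a given hourly rate."""
    return billed_seconds * credits_per_hour / 3600

# A 5-minute dashboard refresh on a hypothetical warehouse rated at 8 credits/hour.
runtime = 5 * 60
old_cost = credits(hourly_billed_seconds(runtime), 8)
new_cost = credits(per_second_billed_seconds(runtime), 8)
print(f"full-hour billing: {old_cost:.2f} credits, per-second billing: {new_cost:.2f} credits")
```

Under these assumptions, the short job is charged for five minutes of compute rather than a full hour, which is where the savings for bursty workloads come from.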

Improving performance and SQL programmability
Our ongoing mission is to build the fastest data warehouse with the SQL you love.

Throughout the last quarter, we released a wide range of SQL capabilities that address database migration challenges and increase both compatibility with ANSI SQL and overall SQL programmability:

  • Support for JavaScript Table UDFs (in public preview).
  • Support for double-dot notation for specifying database objects to ease migration from on-premises database systems.
  • Support for Table Literals.
  • INSERT OVERWRITE: Lets you rebuild tables, easing migration from Hadoop systems.
  • LEAD/LAG function enhancement: Ignoring NULL values.
  • SAMPLE/TABLESAMPLE – announced general availability.
  • We introduced functions to estimate frequent values in near-linear time.
  • More flexibility in configuring the start of the week and the week-of-the-year policy.
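
As an illustration of the LEAD/LAG enhancement listed above, the following Python sketch models what a one-row LAG returns with and without ignoring NULL values. It reflects the semantics as commonly defined for this window-function option, not Snowflake's implementation, and the function names and sample data are hypothetical.

```python
from typing import List, Optional

def lag(values: List[Optional[int]]) -> List[Optional[int]]:
    """Plain one-row lag: each row sees the previous row's value, NULL or not."""
    return [None] + values[:-1]

def lag_ignore_nulls(values: List[Optional[int]]) -> List[Optional[int]]:
    """One-row lag that skips NULLs: each row sees the latest earlier non-NULL value."""
    result = []
    last_seen = None
    for value in values:
        result.append(last_seen)
        if value is not None:
            last_seen = value
    return result

# Sensor readings with gaps, ordered by time (None stands in for SQL NULL).
readings = [10, None, None, 13, None, 15]
print(lag(readings))
print(lag_ignore_nulls(readings))
```

The second form is handy for gap-filling, since a row following a run of missing values still gets the last known reading.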

In addition, we have been working on substantial performance improvements that address ease-of-use and automation. Please stay tuned for major announcements in early 2018.

Staying ahead with enterprise-ready security and compliance
From day one, security has always been core to Snowflake’s design.

  • Programmatic SSO support – Now, a Snowflake user can leverage browser-assisted single sign-on programmatically through our Snowflake ODBC, JDBC, and Python drivers and our command line tool, SnowSQL. In contrast to some of our competitors, you don’t need to write code to connect to your SAML 2.0-compliant IdP. (Programmatic SSO support is available in our Enterprise Edition.)
  • PrivateLink support for Snowflake – We worked with AWS to integrate PrivateLink with Snowflake. With PrivateLink, Snowflake users can now connect to Snowflake by bypassing the public internet. No cumbersome proxies need to be set up between Snowflake and their network. Users have full control over egress traffic.

Improving our ecosystem and data loading
Enabling developers and builders to onboard new workloads and create applications with their favorite tools, drivers and languages remains a top priority.

Snowpipe
We recently announced Snowpipe, which represents an important milestone in many ways. First, users can now efficiently address continuous loading use cases for streaming data. Second, with Snowpipe, we introduced the first service in addition to the core data warehousing service. Snowpipe is serverless, so there’s no need to manage a virtual warehouse for data loading into Snowflake. Finally, we continue to deeply integrate with the entire AWS stack: Snowpipe allows customers to take advantage of AWS S3 event notifications to automatically trigger Snowflake data loads into target tables. You can read more about Snowpipe here: Overview and First Steps.
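
To give a feel for event-driven loading, here is a minimal Python sketch that reads an S3 event notification and picks out the newly created files a loader would need to ingest. It is not Snowpipe's implementation, which this post does not describe. The field names follow the S3 event message layout as we understand it, and the bucket, keys, and function name are hypothetical.

```python
import json
from urllib.parse import unquote_plus

def files_to_load(notification_json: str):
    """Return (bucket, key) pairs for objects created according to the notification."""
    event = json.loads(notification_json)
    files = []
    for record in event.get("Records", []):
        # Only object-creation events should trigger a load.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification payload.
        key = unquote_plus(record["s3"]["object"]["key"])
        files.append((bucket, key))
    return files

sample = json.dumps({
    "Records": [
        {"eventName": "ObjectCreated:Put",
         "s3": {"bucket": {"name": "example-landing-bucket"},
                "object": {"key": "landing/orders%3D2017/part+1.csv"}}},
        {"eventName": "ObjectRemoved:Delete",
         "s3": {"bucket": {"name": "example-landing-bucket"},
                "object": {"key": "landing/old.csv"}}},
    ]
})
print(files_to_load(sample))
```

The point of the pattern is that file arrival itself drives the load, so no scheduler or always-on warehouse has to poll the bucket.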

Broader Ecosystem
While we continue adding functionality and enriching Snowflake with additional services, we strongly believe in freedom of choice. Instead of preaching a unified platform and a one-solution-fits-all philosophy, our key objectives are to make integrations easier and better performing.

  • For enterprise-class ETL, data integration and replication:
    • Added PowerCenter 10.2 support for Snowflake.
    • Added support for the ETL use case with AWS Glue – build modern, high-performance ETL jobs via our Spark connector, pushing compute into Snowflake.
  • For data analytics and data science:
    • Added a native Snowflake connector for Alteryx supporting in-DB components.
    • Upgraded our Snowflake dplyR library.  
  • For business intelligence (BI):
    • Added a native Snowflake connector for ChartIO (in preview).
  • For parallel data loading & unloading via the COPY command, developers can now:
    • Load non-UTF-8 character-encoded data.
    • Unload to the Parquet file format.

Increasing transparency and usability
These features are designed to strike the right balance between offering a service that is easy to operate and exposing actionable insights into the service itself.

  • Resource monitors can now be created through the Snowflake UI. As a user creates resource monitors, they can also set up notifications. Notifications for “suspend” and “suspend immediately” actions are sent out when quota thresholds are reached for a particular resource monitor. In addition, we added support for up to 5 warnings that can be defined for each resource monitor.
  • New MONITOR USAGE privilege to grant access to review billing and usage information via SQL or through the Snowflake UI. This provides non-account administrators with the ability to review information about data stored in databases and stages, as well as provide insight into warehouse usage.
  • For improved usability and convenience, we added a new option (COPY GRANTS) that lets users preserve or copy grants when executing the CREATE OR REPLACE TABLE, CREATE OR REPLACE VIEW, CREATE TABLE LIKE, and CREATE TABLE CLONE statements.
  • We also completely overhauled our worksheet and improved the overall experience. We are going to roll these changes out in stages in the coming weeks, with much more to come in 2018.
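
The resource monitor behavior described above can be modeled roughly as follows. This Python sketch is a simplified illustration only: the five-warning limit comes from this post, while the percentage-based thresholds, their default values, and all names are assumptions.

```python
MAX_WARNINGS = 5  # limit on warning thresholds mentioned in this post

def triggered_actions(quota_credits, used_credits, warn_at=(),
                      suspend_at=100, suspend_immediately_at=110):
    """Return the (threshold_percent, action) pairs reached at the current usage.

    Thresholds are percentages of the quota; the defaults are illustrative assumptions.
    """
    if len(warn_at) > MAX_WARNINGS:
        raise ValueError(f"at most {MAX_WARNINGS} warning thresholds are allowed")
    percent_used = 100 * used_credits / quota_credits
    actions = [(p, "notify") for p in sorted(warn_at) if percent_used >= p]
    if percent_used >= suspend_at:
        actions.append((suspend_at, "suspend"))
    if percent_used >= suspend_immediately_at:
        actions.append((suspend_immediately_at, "suspend immediately"))
    return actions

# A monitor with a 1000-credit quota and three warning levels, at 80% usage.
print(triggered_actions(1000, 800, warn_at=(50, 75, 90)))
```

In this model, warnings only notify, while the two suspend thresholds are the ones that would actually stop further consumption.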

Scaling and investing in service robustness
These service enhancements aren’t always customer-visible, but they are crucial for scaling to meet the demands of our rapidly growing base of customers.

  • Our global expansion continues with a new Snowflake region in Sydney, Australia.
  • We introduced a new product edition: Virtual Private Snowflake (VPS). With VPS, customers get a dedicated and managed instance of Snowflake within a separate dedicated AWS VPC, with all the performance, simplicity and concurrency inherent to Snowflake. This also includes completely dedicated metadata services isolated from the metadata activity of other Snowflake customers.
  • We continued our investments in system stability, improving and strengthening various components of our cloud services layer.

Conclusion

We are well on our way to reaching 1,000 customers by the end of 2017, with tens of compressed PBs stored in Snowflake and millions of data processing jobs successfully executed daily. We grew to over 300 Snowflakes and expanded our global presence. 2017 has truly been a defining year in Snowflake’s young history: self-service, instant Data Sharing, broadening our ecosystem, instant elasticity, Snowpipe, VPS and PrivateLink for Snowflake. We look forward to an exciting 2018 as we continue to help our customers solve their data challenges at scale.

For more information, please feel free to reach out to us at info@snowflake.net. We would love to help you on your journey to the cloud. And keep an eye on this blog or follow us on Twitter (@snowflakedb) to keep up with all the news and happenings here at Snowflake Computing.

PrivateLink for Snowflake: No Internet Required

Improve Security and Simplify Connectivity with PrivateLink for Snowflake

AWS recently announced PrivateLink, the newest generation of VPC Endpoints that allows direct and secure connectivity between AWS VPCs, without traversing the public Internet. We’ve been working closely with the AWS product team to integrate PrivateLink with Snowflake and we’re excited to be among the first launch partners. By integrating with PrivateLink, we allow customers with strict security policies to connect to Snowflake without exposing their data to the Internet. In this blog post, we’ll highlight how PrivateLink enhances our existing security capabilities, and how customers can easily set up PrivateLink with Snowflake.

Snowflake is an enterprise-grade, cloud data warehouse with a unique, multi-cluster, shared data architecture purpose-built for the cloud. From day one, security has been a central pillar of Snowflake’s architecture, with advanced security features baked into the solution. Customers get varying levels of security from Snowflake’s five different product editions: Standard, Premier, Enterprise, Enterprise for Sensitive Data (ESD) and Virtual Private Snowflake (VPS).

Across all editions, Snowflake provides a secure environment for customer data, protecting it in-transit and at rest. All customer data is encrypted by default using the latest security standards and best practices, and validated by compliance with industry-standard security protocols. In addition, customers have access to a host of security features and data protection enhancements such as IP whitelisting, role-based access control, and multi-factor authentication.

As shown in figure 1 below, Snowflake’s multi-tenant service runs inside a Virtual Private Cloud (VPC), isolating and limiting access to its internal components. Incoming traffic from customer VPCs is routed through an Elastic Load Balancer (ELB) to the Snowflake VPC.

For customers working with highly sensitive data or with specific compliance requirements, such as HIPAA and PCI, Snowflake offers Enterprise for Sensitive Data (ESD). With the ESD edition, customer data is encrypted in transit across all networks, including within Snowflake’s own VPC. ESD customers also benefit from additional security features such as Tri-Secret Secure, giving them full control over access to their data. See figure 2 below.

Earlier this year, we also introduced a private, single-tenant version of the Snowflake service – Virtual Private Snowflake. VPS, which is the most advanced and secure edition of Snowflake, includes all features of ESD and addresses the specific needs of regulated companies such as those in the financial industries. With VPS, customers get a dedicated and managed instance of Snowflake within a separate, dedicated VPC. Additionally, VPS customers can use secure proxies for egress traffic control to minimize risks associated with their internal users and systems communicating with unauthorized external hosts, as shown in figure 3 below:

But we recognize that a key concern for some customers has been how data is sent from their private subnet to Snowflake. Some of these customers need to enforce restrictive firewall rules on egress traffic. Others have restrictive policies about their resources accessing the Internet at all. So, how do you send data without allowing unrestricted outbound access to the public Internet and without violating existing security compliance requirements?

Enter AWS PrivateLink: a purpose-built technology that enables direct, secure connectivity among VPCs while keeping network traffic within the AWS network. Using PrivateLink, customers can connect to Snowflake without going over the public Internet, and without requiring proxies to be set up between Snowflake and their network as a stand-in solution for egress traffic control. Instead, all communication between the customer VPC and Snowflake is performed within the AWS private network backbone.

Snowflake leverages PrivateLink by running its service behind a Network Load Balancer (NLB) and sharing the endpoint with customers’ VPCs. The Snowflake endpoint appears in the customer VPC, enabling direct connectivity to Snowflake via private IP addresses. Customers can then accept the endpoint and choose which of their VPCs and subnets have access to Snowflake. This effectively allows Snowflake to function like a service that is hosted directly on the customer’s private network. Figures 4 and 5 show PrivateLink connectivity from customer VPCs to Snowflake in both multi-tenant (ESD) and single-tenant (VPS) scenarios.
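
One simple way to sanity-check such a setup from inside a VPC is to confirm that the service hostname resolves only to private IP addresses. The Python sketch below shows the idea using only the standard library. It is an illustrative check, not an official verification procedure, and the function names and sample addresses are hypothetical.

```python
import ipaddress
import socket

def resolve_ipv4(hostname: str, port: int = 443):
    """Resolve a hostname to its IPv4 addresses (requires DNS access when called)."""
    infos = socket.getaddrinfo(hostname, port, family=socket.AF_INET,
                               proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

def all_private(addresses) -> bool:
    """True only if the list is non-empty and every address is in a private range."""
    addresses = list(addresses)
    return bool(addresses) and all(
        ipaddress.ip_address(address).is_private for address in addresses
    )

# Example with literal addresses; in practice you would pass resolve_ipv4(<your hostname>).
print(all_private(["10.0.1.25", "172.31.4.9"]))
print(all_private(["10.0.1.25", "8.8.8.8"]))
```

If any resolved address is public, traffic to that address would leave the private network path, so the check fails for the whole list.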

Additionally, customers can access PrivateLink endpoints from their on-premises network via AWS Direct Connect, allowing them to connect all their virtual and physical environments in a single, private network. As such, Direct Connect can be used in conjunction with PrivateLink to connect a customer’s data center to Snowflake. See figure 6 below.

Snowflake already delivers the world’s most secure data warehouse built for the cloud. Our ESD and VPS product editions are designed to address the highest security needs and compliance requirements of organizations large and small. With PrivateLink, we’re taking that a step further by allowing our customers to establish direct and private connectivity to Snowflake, without ever exposing their data to the public Internet.

PrivateLink is available to all Snowflake customers with ESD and VPS product editions. You can visit our user guide for instructions on how to get started with PrivateLink.

You can also try Snowflake for free. Sign up and receive US$400 worth of free usage. You can create a sandbox or launch a production implementation from the same Snowflake environment.