Nov 29, 2017
For anyone who harbors a love-hate relationship with data loading, it’s time to tip the scales.
We all know data can be difficult to work with. The challenges start with the varying formats and complexity of the data itself. This is especially the case with semi-structured data such as JSON, Avro and XML, and it continues with the significant programming skills needed to extract and process data from multiple sources. Making matters worse, traditional on-premise and cloud data warehouses require batch loading of data (with limitations on the size of data files ingested) and huge manual efforts to run and manage servers.
The results? Poor, slow performance and the inability to extract immediate insights from all your data. Data scientists and analysts are forced to wait days or even weeks before they can use the data to develop accurate models, spot trends and identify opportunities. Consequently, executives don’t get the necessary up-to-minute insights to make real-time decisions with confidence and speed.
Common problems that affect data loading include:
- Legacy architecture – Tightly coupled storage and compute necessitate contention with queries as data is loading.
- Stale data – Batch loading prevents organizations from acquiring instant, data-driven insight.
- Limited data – Lack of support for semi-structured data requires transforming newer data types and defining a schema before loading, which introduces delays.
- Manageability – Dedicated clusters or warehouses are required to handle the loading of data.
- High-maintenance – Traditional data warehouse tools result in unnecessary overhead in the form of constant indexing, tuning, sorting and vacuuming.
These obstacles all point to the need for a solution that allows continuous data loading without impacting other workloads, without requiring the management of servers and without crippling the performance of your data warehouse.
Introducing Snowpipe, our continuous, automated and cost-effective service that loads all of your data quickly and efficiently without any manual effort. How does Snowpipe work?
Snowpipe automatically listens for new data as it arrives in your cloud storage environment and continuously loads it into Snowflake. With Snowpipe’s unlimited concurrency, other workloads are never impacted , and you benefit from serverless, continuous loading without ever worrying about provisioning. That’s right. There are no servers to manage and no manual effort is required. Snowpipe makes all this happen automatically.
The direct benefits of Snowpipe’s continuous data loading include:
- Instant insights – Immediately provide fresh data to all your business users without contention.
- Cost-effectiveness – Pay only for the per-second compute utilized to load data rather than running a warehouse continuously or by the hour.
- Ease-of-use – Point Snowpipe at an S3 bucket from within the Snowflake UI and data will automatically load asynchronously as it arrives.
- Flexibility – Technical resources can interface directly with the programmatic REST API, using Java and Python SDKs to enable highly customized loading use cases.
- Zero management – Snowpipe automatically provisions the correct capacity for the data being loaded. No servers or management to worry about.
Snowpipe frees up resources across your organization so you can focus on analyzing your data, not managing it. Snowpipe puts your data on pace with near real-time analytics. At Snowflake, we tip the scales on your love-hate relationship with data so you can cherish your data without reservation.
Read more about the technical aspects of Snowpipe on our engineering blog. For an in-depth look at Snowpipe in action, you can also join us for a live webinar on December 14th.
Try Snowflake for free. Sign up and receive $400 US dollars worth of free usage. You can create a sandbox or launch a production implementation from the same Snowflake environment.