What makes Snowflake a data warehouse?

One of the most common questions I get when speaking to people about Snowflake is: “Why do you call it a data warehouse and not a database?” This is a very reasonable question given some of the characteristics of Snowflake.

At Snowflake, in part, we say we are a full relational database management system (RDBMS) built for the cloud. We are ACID compliant and we support standard SQL. Sounds like a database to me, too. Let’s take a closer look just to be sure.

What is a database?

A database is a collection of information organized to be easily accessed, managed and updated. While there are many types of databases available today, the most common is an RDBMS. But when most folks say “database”, they usually mean a traditional RDBMS that handles Online Transaction Processing (OLTP).

So, what are some of the defining characteristics of an OLTP database?

  • Designed for rapid storage and retrieval of small sets of current data records in support of transactions and interactions within an enterprise.
  • Data is organized in tables and columns, allowing users access via structured query language (SQL).
  • Handles quick, real-time activity such as entering a customer name, recording a sale and recording all accounting activity of that sale.
  • Works well for basic operational reporting of a limited number of records. Analytic reporting is relegated to simple, static reports often driven by IT.

What is a data warehouse?

Some of the defining characteristics of a data warehouse are:

  • A database designed to store and process large volumes of current and historical data collected from multiple sources inside and outside the enterprise for deep analysis.
  • Organizes data into tables and columns, and allows users access via SQL.
  • Optimized for loading, integrating and analyzing very large amounts of data.
  • Designed to support descriptive, diagnostic, predictive and prescriptive analytic workloads.

Snowflake definitely includes the overlapping characteristics of both a database and a data warehouse-ACID compliant, support for standard SQL, etc. But Snowflake also embodies all of the defining characteristics of a data warehouse.

One of the key differentiators of Snowflake, from other solutions, is that it’s specifically designed for data warehousing and high speed analytic processing. Rather than a generalized SQL database that has been “tuned” or even adapted to handle these type of workloads, Snowflake was built from the ground up for the cloud to optimize loading, processing and query performance for very large volumes of data. Therefore, hands down, Snowflake is a data warehouse.

So, why do we still need a specialized data warehouse engine?

As OLTP databases have been able to scale higher and innovations like in-memory databases have emerged, some organizations have questioned whether they still need a separate technology or specialized system for reporting and analytics. The answer, again, requires us to look at the basics: What benefits emerge from storing and analyzing data in a separate system?

  1. It eases the burden of reporting from transactional systems by removing the contention for limited and expensive resources.
  2. It produces more business-friendly data results by allowing the data to be restructured to a more suitable format.
  3. It provides access to a wider array of reports more quickly because all the resources in the data warehouse are dedicated to reporting and analysis.
  4. It integrates valuable data from across the enterprise for richer insight. Something that can’t (and shouldn’t) be done in an OLTP system.

For more information on how you can up your data warehousing game with a modern, built-for-the-cloud approach, check out some of our free resources such as our ebook The Data Warehouse: The Engine That Drives Analytics. We would love to help you on your journey to the cloud so keep an eye on this blog or follow us on Twitter (@snowflakedb and @kentgraziano) to keep up with all the news and happenings here at Snowflake.

Snowflake Vision Emerges as Industry Benchmark

Technology research and analysis firm Gigaom has ranked Snowflake as the #1 cloud data warehouse in a recent study. We surpassed enterprise data warehouse products including, Google BigQuery, Teradata, IBM dashDB, HPE Vertica, Microsoft Azure SQL, SAP HANA and Oracle Exadata. Snowflake emerged with a top score of 4.85 out of a possible 5.0. The competition averaged a score of 3.5. The six “disruption vectors” Gigaom used as its key scoring criteria are congruent with what we wanted to achieve back in the summer of 2012, when we started Snowflake.

But long before we wrote the first line of Snowflake code, we asked one another: “What should a data warehouse deliver that no other product has before? How can we enable organizations to make the best, data-driven decisions? And how will the world’s most powerful data warehouse help organizations achieve their existing goals and help reveal their future goals?” We then set out to answer those questions.

We wanted to enable organizations to easily and affordably store all of their data in one location, and make that data accessible to all concurrent users without degrading performance. We also wanted Snowflake to scale infinitely, with ease, and cost effectively so organizations would only pay for the compute and storage they used. And the product had to work with the tools that users already knew and loved. Finally, we wanted a data warehouse that required zero management by our customers – nothing to tweak, no tuning required. These defining qualities aligned with the new world of cloud services, and they are what formed the foundation of Snowflake.

What’s happened since the early days of Snowflake? We got to work, and we stuck to hiring the best engineers the world has to offer. We built Snowflake from the ground up, for the cloud, and incorporated all of these elements as the core of the product. In early 2015, we offered the first commercial version of Snowflake – the one and only data warehouse built for the cloud. Since then, our engineering team has added more and more industry-leading capabilities to Snowflake, leapfrogging the traditional data warehouse vendors.

Along the way, we’ve hired high-calibre teams to execute the sales, marketing and finance functions of the company so our customers and partners get the highest value from working with Snowflake. We also built a great customer support organization, providing the level of service our users love. In more recent times, we’ve expanded operations outside of North America to Europe, with Asia-Pacific and other regions coming online soon. We’ve also added Snowflake On Demand™ – the easiest way to get started with Snowflake by simply signing up on our website with just a credit card. All of these efforts over the past four years have led to Snowflake’s most recent inflection point – being chosen as the number one cloud data warehouse.

What does all this mean? Snowflake’s current and future customers have every opportunity to explore all of their data in ways they never thought possible. They can gain the insight, solve the problems and create the opportunities they simply couldn’t with their previous data platforms. We committed to building the world’s best data warehouse – the only data warehouse built for the cloud. Our customers, our partners and now the industry have indicated we’ve likely achieved what we set out to do back in the summer of 2012. Going forward, we’ll continue to serve our customers and partners with the best technology, the best solutions and the best services available.

Read the full report >

Migrating to the Cloud? Why you should start with your EDW

Many organizations we engage with are seriously considering transforming their business and moving some (or all) of their IT operations into the cloud. A lot of executives I have encountered are struggling with the same question: “How do I get started?” There is a strong case to be made that starting with your Enterprise Data Warehouse (EDW), or at least a data mart, is the fastest, and most risk-free path, with added upside potential to increase revenue and set you up for future growth. As operational data volumes continue to grow at exponential rates, it’s not a matter of if you go to the cloud to manage your enterprise data, but when.

Before going too far on your cloud journey, I would recommend an exercise in segmenting your business from an IT perspective in a very simple way. To get you started, let me suggest five possible categories, along with some risks to consider for each:

  • Customer-facing Applications – This is the heart and soul of your business. If something goes wrong, you lose business and revenue, and people potentially get fired. Risk: HIGH
  • Internal Applications – Mail, Payroll, General Ledger, AP, AR, things like that. Every person inside the organization relies on at least one of these services, and a lot of analysis needs to take place to figure out all the integration points to ensure nothing gets missed during a migration to the cloud. Risk: HIGH
  • Desktop/Laptop OS and Applications – There are whole books and schools of thought about how to migrate these, which means it’s a big decision and a big deal. Impacting everyone in the company on your first cloud initiative? Risk: HIGH
  • Operations Monitoring and Alerting – Got a Network Operation Center (NOC)? These guys are integrated with every system that is important, so moving them to the cloud could be a large undertaking. Risk: HIGH
  • Reporting and Analytics – Hmmm….if my constituents don’t get their weekly or monthly reports on time, is that a disaster? Can they get by with a small outage during the migration? Risk: LOW

Starting with the Data

Let’s take a closer look at why starting your cloud journey with your EDW could be a viable option, and even have some benefits that could help sell the idea (of the cloud) internally. In no particular order, I would highlight these points:

  • Doesn’t disrupt the business – Many EDW implementations are not mission critical today (as compared to enterprise applications). As more data becomes available through social media or Internet of Things (IOT) applications, businesses need access to much larger volumes of data and they will want access to it earlier in the data pipeline. Traditional DWs contain aggregations and are used for doing trend analysis, analyzing data over a period of time to make strategic, rather than tactical decisions. They are not architected to handle this new influx of raw data in a cost-effective manner. By starting your cloud journey with the EDW, you reduce risk (by going to a more flexible architecture) while getting your team early exposure to working with cloud services.
  • Doesn’t disrupt internal users – When moving to the cloud, you want to show incremental success and don’t want to add a lot of unnecessary risk. It’s simple to keep running your existing EDW in parallel with your new cloud DW, giving you a built-in fall-back plan for the early stages. Or you may decide to start with a small data mart as a pilot project.
  • Start-up costs are a fraction of on-premises, appliance solutions – Some of our customers invested as much as $10 million (or more) years ago on a data warehouse appliance that is now outdated technologically. And the renewal costs to keep that tech going are coming due. If they re-invest another huge sum of money, this will delay them getting to the cloud by another 4-5 years, putting them behind their competition. Rather than outlaying a large capital expenditure to extend the life of the older technology, it may make better sense to move to the cloud. The cloud offers a utility-based model, allowing you to pay for what you use and when you use it, as opposed to what you think you are going to need 2-3 years in the future. As a result, not only is the cost of entry lower, but you are not risking a huge sum of money to make the move.
  • Data is growing at an exponential rate – Will you ever have less data to worry about in your business? If you plan on being successful, I don’t think so. Many organizations are looking at new and different ways to manage and analyze ever-increasing volumes of data coming in various formats from multiple sources (such as semi-structured web logs). Your current on-premises EDW was not designed for this kind of workload or data.  If you are considering changing infrastructure platforms to accommodate it, why not select tools that were built for today’s modern data challenges instead of legacy-based architectures? Moving to the cloud also gives you the opportunity to consolidate operations and streamline business processes.
  • Enable new capability – There are some new analytic paradigms happening in the cloud (such as machine learning). Cloud-based platforms allow you to work with both detailed and aggregated data at scales never imaged (see the case study about DoubleDown as an example). Need to run a complex analytic job with a 256-node Massively Parallel Processing (MPP) cluster for an hour, and then shut it down? No problem. Can your platform support a thousand users without concurrency issues?  How would that change your business if it could dynamically adjust to handle those new demands?

As with any infrastructure move, the benefits have to be clear enough that the status quo mentality can be overcome and analysis paralysis doesn’t push out your journey to the cloud for months or even years. The beauty of the cloud model is that it is easy to start small and scale without risking a huge investment up front. Every business needs some proof before committing time and resources to move anything to the cloud and your EDW is a perfect candidate. Snowflake is the first and only EDW built for the cloud to be truly elastic for all of your analytic and big data needs.

Please feel free to reach out to us at info@snowflake.net. We would love to help you on your journey to the cloud. And keep an eye on this blog or follow us on Twitter (@snowflakedb) to keep up with all the news and happenings here at Snowflake Computing.