Data is Only Transformative with Transformative Technology

At the recent AWS re:Invent show in Las Vegas, The Cube host Lisa Martin sat down with Bob Muglia, CEO and President of Snowflake. Bob shared his thoughts on Snowpipe, the latest addition to Snowflake's cloud-built data warehouse, while looking back at the company's origins and ahead to its future enabling the data-driven enterprise.

What is Snowpipe, and how do customers get started with it?

Muglia: Snowpipe is a way of ingesting data into Snowflake in a streaming, continuous way. You simply drop new data that’s coming into S3 and we ingest it for you automatically. Snowpipe makes it simple to bring the data into your data warehouse on a continuous basis, ensuring that you’re always up-to-date and that your analysts are getting the latest insights and the latest data.
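As a minimal sketch of what that continuous ingestion setup can look like in SQL (the bucket URL, stage, table, and pipe names here are hypothetical), you point an external stage at the S3 location where files land and define a pipe that auto-ingests them:

```sql
-- External stage pointing at the S3 bucket where new data files arrive
-- (credentials/integration omitted for brevity):
CREATE STAGE my_s3_stage URL = 's3://my-bucket/events/';

-- A pipe that automatically copies each new file into the target table:
CREATE PIPE events_pipe AUTO_INGEST = TRUE AS
  COPY INTO events
  FROM @my_s3_stage
  FILE_FORMAT = (TYPE = 'JSON');
```

Once the pipe exists, dropping a new file into the bucket is all it takes; Snowpipe picks it up and loads it without a scheduled batch job.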

In the five years since you launched, how has the opportunity around cloud data warehousing changed? How has Snowflake evolved to become a leader in this space?

Muglia: If you go back five years, this was a timeframe where NoSQL was all the rage. Everybody was talking about how SQL was passé and something you’re not going to see in the future. Our founders had a different view. They had been working on true relational databases for almost 20 years, and they recognized the power of SQL and relational database technology. But they also saw that customers were experiencing significant limitations with existing technology. They saw in the cloud, and in what Amazon had done, the ability to build an all new database that takes advantage of the full elasticity and power of the cloud to deliver whatever analytics the business requires. However much data you want, however many queries you want to run simultaneously, Snowflake takes what you love about a relational database and allows you to operate in a very different way. Our founders had that vision five years ago and successfully executed on it. The product has worked beyond the dreams of our customers, and that response from our customers is what we get so excited about.

How did you identify what data should even be streamed to Snowpipe?

Muglia: As an example, in entertainment we’re experiencing a data explosion. You have streaming video data, subscription data, billing data, social media data and on and on. None of this is arriving in any sort of regular format. It’s coming as semi-structured data, like JSON or XML. Up until Snowflake came onto the scene with a truly cloud-based solution for data warehousing, everyone was struggling to wrangle all these data sets. Snowpipe lets you bring in multiple data sets, merge them in real-time and get the analytics back to your business in an agile way that’s never been seen before.
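For example, Snowflake can load that semi-structured data into a VARIANT column and query it directly with SQL, with no up-front schema definition (the table and JSON field names here are hypothetical):

```sql
-- A single VARIANT column holds the raw JSON events as they arrive:
CREATE TABLE media_events (raw VARIANT);

-- Path notation reaches into the JSON, with casts to typed columns:
SELECT
  raw:subscriber.id::STRING AS subscriber_id,
  raw:video.title::STRING   AS title,
  raw:event_ts::TIMESTAMP   AS event_ts
FROM media_events
WHERE raw:event_type::STRING = 'play';
```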

How does your partnership with AWS extend Snowflake’s capabilities?

Muglia: People don’t want their data scattered all over the place. With the cloud, with what Amazon’s done and with a product like Snowflake, you can bring all of your data together. That can change the culture of a company and the way people work. All of a sudden, data is not power. Data is available to everyone, and it’s democratized so every person can work with that data and help to bring the business forward. It can really change the dynamics around the way people work.

Tell us a little bit about Snowflake’s collaboration with its customers. How are they helping to influence your future?

Muglia: As a company, we run Snowflake on Snowflake. All of our data is in Snowflake, all of our sales data, our financial data, our marketing data, our product support data and our engineering data. Every time a user runs a query, that query is logged in Snowflake and the intrinsics about it are logged. When you have a tool with the power of Snowflake, you can effectively answer any business question in just a matter of minutes. And that’s transformative to the way people work. And to me, that’s what it means to build a data-driven culture: The answers to business questions are inside what customers are doing and are encapsulated in the data.

Try Snowflake for free. Sign up and receive $400 worth of free usage. You can create a sandbox or launch a production implementation from the same Snowflake environment.

Deliveroo Delivers with Real-time Data

In a field of struggling food delivery startups, one notable success story has emerged from the fray. Termed “the European unicorn” by TechCrunch, Deliveroo is a British startup that offers fast and reliable food delivery service from a premium network of restaurants.

Deliveroo recently raised a $385 million funding round, boasts an estimated $2 billion valuation and is credited with transforming the way people think about food delivery. What is this unicorn doing differently? How has it found success where so many others have failed?

“Data is baked into every aspect of the organization,” said Deliveroo’s head of business intelligence, Henry Crawford. “Having instant access to data reveals which geographic areas are experiencing a shortage of restaurants and a shortage of particular cuisines so we can create these hubs right at the consumer’s doorstep.”

Deliveroo uses data-driven insights to analyze customer behavior, spot market trends and respond with swift decisions and rapid execution. Snowflake makes all of this possible.

“With data coming from a variety of sources, including web traffic, transactions and customer behavior, having a data warehouse built for the cloud provides one repository for a single source of truth,” Henry explains. “The shift to Snowflake’s cloud data warehouse has enabled us to make good on the promise that got Deliveroo started: To connect consumers with great food from great restaurants, wherever you are, and whatever it takes.”

Snowflake also accommodates Deliveroo’s 650% growth in 2016. That rapid momentum prompted Deliveroo to expand its business intelligence team from two employees to 14, and the larger team needed broader access to the same data without impacting performance.

Since Snowflake is built for the cloud, an unlimited number of users can access all of an organization’s data from a single repository, which is critical to Deliveroo’s success. There’s no replicating data, shifting queries and other workloads to non-business hours, or queueing users to preserve performance. Instead, Snowflake’s true cloud elasticity means Deliveroo can automatically scale up, down and out (concurrency) to load and analyze data without disruption.
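For instance, scaling up and scaling out in Snowflake are simple SQL statements (the warehouse names and sizes here are illustrative):

```sql
-- "Scale up/down": resize a warehouse; takes effect on the next query.
ALTER WAREHOUSE analytics_wh SET WAREHOUSE_SIZE = 'X-LARGE';

-- "Scale out" for concurrency: a multi-cluster warehouse adds clusters
-- automatically as concurrent users queue up, and removes them as load drops.
CREATE WAREHOUSE reporting_wh
  WAREHOUSE_SIZE    = 'LARGE'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 10
  SCALING_POLICY    = 'STANDARD';
```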

“None of these future plans would be possible without real-time, concurrent access to massive volumes of data,” Henry said.

What’s next for Deliveroo? Using real-time logistics algorithms to increase the number and the speed of deliveries. Deliveroo’s expansion plans also include an “Editions” program—delivery-only kitchens so partner restaurants can expand their footprint without opening brick-and-mortar locations.

Learn more about how Snowflake can accelerate your data storage and analytics initiatives.

The Data Sharehouse brings forth a new market

When I mention data sharing to customers, they often say, “Really?” From that moment forward, the discussion is no longer about replacing their data analytics platform. It’s about growth. Growth of their business, growth of their ecosystem, and growth from the limitless possibilities of sharing live data in a matter of minutes.

Today, we announced the most significant breakthrough of Snowflake’s data warehouse built for the cloud – Snowflake Data Sharing. It extends our data warehouse to what we call the data sharehouse.

The revolutionary architecture of Snowflake paves the way for the data sharehouse. All of the unique benefits Snowflake provides inside the enterprise now extend the data warehouse outside the enterprise. No other technology offers such a quick, powerful and inexpensive way to share live data between organizations: a data sharing model that provides read-only access to an enterprise’s entire data warehouse, or just a secure slice of it, with no copying and no moving of data required.
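As a sketch of how that model works in SQL (the database, table, share, and account names here are hypothetical), the provider grants read-only access to a slice of data and the consumer queries it live:

```sql
-- Provider side: create a share and grant access to a slice of the warehouse.
CREATE SHARE sales_share;
GRANT USAGE ON DATABASE sales_db TO SHARE sales_share;
GRANT USAGE ON SCHEMA sales_db.public TO SHARE sales_share;
GRANT SELECT ON TABLE sales_db.public.orders TO SHARE sales_share;

-- Make the share visible to a specific consumer account:
ALTER SHARE sales_share ADD ACCOUNTS = partner_account;

-- Consumer side: mount the share as a database and query the live data.
CREATE DATABASE shared_sales FROM SHARE provider_account.sales_share;
```

No data is copied or moved at any point; the consumer queries the provider's live tables directly.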

The commercial use of sharing data emerged from Nielsen Corporation in 1923. Over the next century, enterprises adopted different data sharing models but with minimal success. Even the digital data sharing methods they’re forced to use today haven’t changed much. But the data sharehouse enables one-to-one, one-to-many and many-to-many data sharing models. And since Snowflake was built for the cloud, the opportunities to share data are endless.

Only when a truly unique product or service emerges, one that is innovative and appeals to nearly every enterprise, is a new market born. And with the data sharehouse, a true market for data sharing has begun. Unlike other markets, Snowflake has focused on building the infrastructure, the platform, for enterprises to do business with each other.

And unlike data consortiums, which define the interactions between companies and impose fees on transactions, Snowflake data sharing is open to all organizations, removing another barrier that inhibits enterprises from accessing limitless data. The business opportunity that data sharing enables is fully owned by the data providers and consumers. Snowflake is focused on providing the data sharehouse that enables data sharing, without any entanglement in the business of the data providers and consumers. At its core, Snowflake is a database, not a marketplace.

With Snowflake Data Sharing, organizations will share more data with the partners in their ecosystem to improve business efficiencies. They’ll determine that some of their data is just as valuable to other, non-competing companies. And they’ll ask other organizations about data they don’t have, and negotiate access to it. Live data sharing also means that enterprises with vast landscapes populated with dozens or even hundreds of disparate data silos, accumulated through years of growth and acquisition, will be able to unite nearly all of their data.

As exciting as this is, what will truly astonish is what’s possible with the data sharehouse that we have yet to imagine. Modern data sharing will enable organizations across industry and across the globe to imagine new ways of doing business, new ways to solve longstanding problems, and provide new insights into manufacturing, healthcare, science and humanitarian issues, to name a few. Until now, there was no easy way to connect enterprises with one another through data. Well, those days are over. The data sharehouse has arrived.

RI (Referential Integrity) Constraints: 3 Reasons to Include Them in Your Data Warehouse

Over the years, I have had numerous conversations about the value of having referential integrity (RI) constraints, such as primary and foreign keys, in a relational data warehouse or data mart.

Many DBAs object that RI constraints slow the load process. This is a valid point if you are talking about enforced constraints that are checked in real time during the load. But this is not an issue if you define the constraints as disabled.

Which then leads to this common question:

Is there any reason to maintain a permanently disabled FK in the data model? If it is not going to be enabled, then from my perspective, it doesn’t make any sense to define the FK. Instead, the relationship can be described in the comment of the child column.

So, why would I want RI constraints in my data warehouse?

Mostly it has to do with good design and development best practices. Here is my rationale for why you should consider including RI constraints in your data warehouse design.

#1 – Design Metadata

RI constraints are valuable metadata/documentation. If somebody reverse engineers the database (say with ERWin or Oracle Data Modeler), the PKs and FKs show up in the diagram (much better than having to read a column comment to discover a relationship). This is quite valuable for new people on your project to orient themselves to the existing schema design and understand how the various tables in your data warehouse are related to each other.

RI in a diagram
A picture is worth a thousand words

#2 – BI Metadata

If you want to use any sort of reporting or BI tool against the database (it is a data warehouse, after all), you’ll find that most modern business intelligence and visualization tools import the foreign key definitions with the tables and build the proper join conditions. This is much better than having someone guess what the join will be and then manually adding it to the metadata layer in the reporting tool. It also ensures that different developers don’t interpret the joins differently.

Examples of tools that can read the Snowflake data dictionary include Looker, Tableau, Cognos, MicroStrategy, and many others. Some of these tools actually use the FK definitions for join culling to provide better query performance.

#3 – QA your ETL/ELT code

I know you think your ETL code is perfect.

But does every developer test to the same standards? Do you maybe have a QA team who separately validates that the ETL is doing what you expect?

If so, having declared primary, unique, and foreign key constraints in your data warehouse gives the team more information they can use to ensure the quality of the data. In fact, using the Snowflake Information Schema, a QA engineer can potentially generate SQL to test that the loaded data conforms to the defined constraints.
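For example, even though the constraints are not enforced, a QA engineer could run checks like these against the example tables defined below (a sketch; adjust the names to your schema) to verify that loaded data honors the declared keys:

```sql
-- Orphan check: child rows whose FK value has no matching parent row.
SELECT c.HUB_REGION_KEY
FROM SAT_REGIONS c
LEFT JOIN HUB_REGION p
  ON c.HUB_REGION_KEY = p.HUB_REGION_KEY
WHERE p.HUB_REGION_KEY IS NULL;

-- Duplicate check: the declared PK columns should identify each row uniquely.
SELECT HUB_REGION_KEY, SAT_LOAD_DTS, COUNT(*) AS row_count
FROM SAT_REGIONS
GROUP BY HUB_REGION_KEY, SAT_LOAD_DTS
HAVING COUNT(*) > 1;
```

Both queries should return zero rows if the ETL did its job.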

Defining RI Constraints in Snowflake

You, of course, can (and IMHO should) define RI constraints in Snowflake. You can define primary keys, unique keys, foreign keys, and NOT NULL constraints. Because Snowflake is specifically engineered for data warehousing, only the NOT NULL constraints are enforced. The rest are always created as disabled.

The syntax is standard SQL. You can define the constraints both inline and out-of-line.

Here is a simple example of inline constraints:

CREATE OR REPLACE TABLE SAT_REGIONS (
  HUB_REGION_KEY NUMBER(38,0) NOT NULL,
  SAT_LOAD_DTS DATE NOT NULL,
  REGION_COMMENT VARCHAR(152),
  HASH_DIFF VARCHAR(32) NOT NULL,
  SAT_REC_SRC VARCHAR(50) NOT NULL,
  CONSTRAINT SAT_REGIONS_PK PRIMARY KEY (HUB_REGION_KEY, SAT_LOAD_DTS),
  CONSTRAINT SAT_REGIONS_FK1 FOREIGN KEY (HUB_REGION_KEY)
    REFERENCES KENT_DB.DATAVAULT_MAIN.HUB_REGION (HUB_REGION_KEY)
);

Example of an out-of-line constraint:

ALTER TABLE SAT_REGIONS ADD CONSTRAINT
  SAT_REGIONS_PK PRIMARY KEY (HUB_REGION_KEY, SAT_LOAD_DTS);

It’s that easy. So why don’t you have constraints in your data warehouse?


Did you hear? Snowflake was declared the #1 Cloud Data Warehouse in a recent GigaOM analyst report. Go here to get your copy of the report.

As always, keep an eye on this blog site, our Snowflake Twitter feeds (@SnowflakeDB), (@kentgraziano), and (@cloudsommelier) for updates on all the action and activities here at Snowflake Computing.

Snowflake Vision Emerges as Industry Benchmark

Technology research and analysis firm Gigaom has ranked Snowflake as the #1 cloud data warehouse in a recent study. We surpassed enterprise data warehouse products including Google BigQuery, Teradata, IBM dashDB, HPE Vertica, Microsoft Azure SQL, SAP HANA and Oracle Exadata. Snowflake emerged with a top score of 4.85 out of a possible 5.0. The competition averaged a score of 3.5. The six “disruption vectors” Gigaom used as its key scoring criteria are congruent with what we wanted to achieve back in the summer of 2012, when we started Snowflake.

But long before we wrote the first line of Snowflake code, we asked one another: “What should a data warehouse deliver that no other product has before? How can we enable organizations to make the best, data-driven decisions? And how will the world’s most powerful data warehouse help organizations achieve their existing goals and help reveal their future goals?” We then set out to answer those questions.

We wanted to enable organizations to easily and affordably store all of their data in one location, and make that data accessible to all concurrent users without degrading performance. We also wanted Snowflake to scale infinitely, with ease, and cost effectively so organizations would only pay for the compute and storage they used. And the product had to work with the tools that users already knew and loved. Finally, we wanted a data warehouse that required zero management by our customers – nothing to tweak, no tuning required. These defining qualities aligned with the new world of cloud services, and they are what formed the foundation of Snowflake.

What’s happened since the early days of Snowflake? We got to work, and we stuck to hiring the best engineers the world has to offer. We built Snowflake from the ground up, for the cloud, and incorporated all of these elements as the core of the product. In early 2015, we offered the first commercial version of Snowflake – the one and only data warehouse built for the cloud. Since then, our engineering team has added more and more industry-leading capabilities to Snowflake, leapfrogging the traditional data warehouse vendors.

Along the way, we’ve hired high-caliber teams to execute the sales, marketing and finance functions of the company so our customers and partners get the highest value from working with Snowflake. We also built a great customer support organization, providing the level of service our users love. In more recent times, we’ve expanded operations outside of North America to Europe, with Asia-Pacific and other regions coming online soon. We’ve also added Snowflake On Demand™ – the easiest way to get started with Snowflake by simply signing up on our website with just a credit card. All of these efforts over the past four years have led to Snowflake’s most recent inflection point – being chosen as the number one cloud data warehouse.

What does all this mean? Snowflake’s current and future customers have every opportunity to explore all of their data in ways they never thought possible. They can gain the insight, solve the problems and create the opportunities they simply couldn’t with their previous data platforms. We committed to building the world’s best data warehouse – the only data warehouse built for the cloud. Our customers, our partners and now the industry have indicated we’ve likely achieved what we set out to do back in the summer of 2012. Going forward, we’ll continue to serve our customers and partners with the best technology, the best solutions and the best services available.

Read the full report >

Migrating to the Cloud? Why you should start with your EDW

Many organizations we engage with are seriously considering transforming their business and moving some (or all) of their IT operations into the cloud. A lot of executives I have encountered are struggling with the same question: “How do I get started?” There is a strong case to be made that starting with your Enterprise Data Warehouse (EDW), or at least a data mart, is the fastest and least risky path, with added upside potential to increase revenue and set you up for future growth. As operational data volumes continue to grow at exponential rates, it’s not a matter of if you go to the cloud to manage your enterprise data, but when.

Before going too far on your cloud journey, I would recommend an exercise in segmenting your business from an IT perspective in a very simple way. To get you started, let me suggest five possible categories, along with some risks to consider for each:

  • Customer-facing Applications – This is the heart and soul of your business. If something goes wrong, you lose business and revenue, and people potentially get fired. Risk: HIGH
  • Internal Applications – Mail, Payroll, General Ledger, AP, AR, things like that. Every person inside the organization relies on at least one of these services, and a lot of analysis needs to take place to figure out all the integration points to ensure nothing gets missed during a migration to the cloud. Risk: HIGH
  • Desktop/Laptop OS and Applications – There are whole books and schools of thought about how to migrate these, which means it’s a big decision and a big deal. Impacting everyone in the company on your first cloud initiative? Risk: HIGH
  • Operations Monitoring and Alerting – Got a Network Operation Center (NOC)? These guys are integrated with every system that is important, so moving them to the cloud could be a large undertaking. Risk: HIGH
  • Reporting and Analytics – Hmm… if my constituents don’t get their weekly or monthly reports on time, is that a disaster? Can they get by with a small outage during the migration? Risk: LOW

Starting with the Data

Let’s take a closer look at why starting your cloud journey with your EDW could be a viable option, and even have some benefits that could help sell the idea (of the cloud) internally. In no particular order, I would highlight these points:

  • Doesn’t disrupt the business – Many EDW implementations are not mission critical today (as compared to enterprise applications). As more data becomes available through social media or Internet of Things (IOT) applications, businesses need access to much larger volumes of data and they will want access to it earlier in the data pipeline. Traditional DWs contain aggregations and are used for doing trend analysis, analyzing data over a period of time to make strategic, rather than tactical decisions. They are not architected to handle this new influx of raw data in a cost-effective manner. By starting your cloud journey with the EDW, you reduce risk (by going to a more flexible architecture) while getting your team early exposure to working with cloud services.
  • Doesn’t disrupt internal users – When moving to the cloud, you want to show incremental success and don’t want to add a lot of unnecessary risk. It’s simple to keep running your existing EDW in parallel with your new cloud DW, giving you a built-in fall-back plan for the early stages. Or you may decide to start with a small data mart as a pilot project.
  • Start-up costs are a fraction of on-premises, appliance solutions – Some of our customers invested as much as $10 million (or more) years ago on a data warehouse appliance that is now outdated technologically. And the renewal costs to keep that tech going are coming due. If they re-invest another huge sum of money, this will delay them getting to the cloud by another 4-5 years, putting them behind their competition. Rather than outlaying a large capital expenditure to extend the life of the older technology, it may make better sense to move to the cloud. The cloud offers a utility-based model, allowing you to pay for what you use and when you use it, as opposed to what you think you are going to need 2-3 years in the future. As a result, not only is the cost of entry lower, but you are not risking a huge sum of money to make the move.
  • Data is growing at an exponential rate – Will you ever have less data to worry about in your business? If you plan on being successful, I don’t think so. Many organizations are looking at new and different ways to manage and analyze ever-increasing volumes of data coming in various formats from multiple sources (such as semi-structured web logs). Your current on-premises EDW was not designed for this kind of workload or data. If you are considering changing infrastructure platforms to accommodate it, why not select tools that were built for today’s modern data challenges instead of legacy-based architectures? Moving to the cloud also gives you the opportunity to consolidate operations and streamline business processes.
  • Enable new capability – There are some new analytic paradigms happening in the cloud (such as machine learning). Cloud-based platforms allow you to work with both detailed and aggregated data at scales never imagined (see the case study about DoubleDown as an example). Need to run a complex analytic job with a 256-node Massively Parallel Processing (MPP) cluster for an hour, and then shut it down? No problem. Can your platform support a thousand users without concurrency issues? How would that change your business if it could dynamically adjust to handle those new demands?
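That run-it-then-shut-it-down elasticity can be sketched in a couple of statements (the warehouse name and settings here are illustrative):

```sql
-- Spin up a large warehouse for a heavy analytic job; it suspends itself
-- after 5 idle minutes, so you stop paying for compute automatically.
CREATE WAREHOUSE ml_wh
  WAREHOUSE_SIZE = '4X-LARGE'
  AUTO_SUSPEND   = 300
  AUTO_RESUME    = TRUE;

-- ...run the job, then suspend (or drop) the warehouse explicitly:
ALTER WAREHOUSE ml_wh SUSPEND;
```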

As with any infrastructure move, the benefits have to be clear enough that the status quo mentality can be overcome and analysis paralysis doesn’t push out your journey to the cloud for months or even years. The beauty of the cloud model is that it is easy to start small and scale without risking a huge investment up front. Every business needs some proof before committing time and resources to move anything to the cloud and your EDW is a perfect candidate. Snowflake is the first and only EDW built for the cloud to be truly elastic for all of your analytic and big data needs.

Please feel free to reach out to us at info@snowflake.net. We would love to help you on your journey to the cloud. And keep an eye on this blog or follow us on Twitter (@snowflakedb) to keep up with all the news and happenings here at Snowflake Computing.

Looking Back at 2016 Predictions

Last December, I made some predictions for 2016. As we approach the end of the year, I thought it only fair to look back and compare what I predicted to what has happened.

Do or die for big old tech

This was an easy one to get right. Big old enterprise tech companies are hunkering down and watching the world pass them by. HP and Dell are vying to be the king of legacy. There is money in this but who really wants to wear that crown?

IBM is trying to move on with Watson but can Ginni Rometty really pivot that aircraft carrier? And can Watson provide Jeopardy-winning answers for a variety of industries without an army of IBM consultants to spoon feed it? Only time will tell but there is reason to be skeptical.

At Oracle, Larry seems to have discovered the cloud (and will probably soon claim that he invented it). But he remains confused about what a cloud really is. When Oracle talks about Exadata Cloud Service, legacy hardware in a managed services datacenter, they demonstrate they’re still lost in the fog.

Overall, 2016 was not a good year for big old enterprise tech.

Public cloud wins, but who loses?

My prediction on the progress of private clouds was almost an understatement. This year, the move towards private clouds has been slower than molasses on a cold winter day. VMware continues to miss the mark, failing to deliver a cost-effective private cloud solution. And OpenStack is a confusing grab bag that requires a huge SI investment, which is beyond the reach of almost all customers.

Meanwhile, almost every company, including most financial services, is now committed to adopting the public cloud. Amazon of course is the big winner, but Microsoft has shown once again it will persevere and succeed. Last year, I picked Google as the wildcard. Diane Greene appears to have brought focus to Google, and it clearly gained ground in 2016. Google possesses the technical capability, but it still needs to get a lot more serious on the sales side, as it has no enterprise experience. A recent query on LinkedIn shows 465 sales openings at Microsoft, 604 at Amazon, and only 85 at Google Cloud. Google can’t compete against Amazon and Microsoft with just 85 open sales roles.

The other major public cloud player that has emerged strong in 2016 is Alibaba. China cloud is set to explode in 2017. While it will be tough for Alibaba to gain traction in the US, in China it will almost certainly be the winning player.

All of the other public cloud wannabes are in a world of hurt. It looks like we’ll have four public clouds – Amazon, Microsoft, Google and Alibaba.

Spark divorces Hadoop

As I predicted last year, 2016 was not a good year for Hadoop and specifically for Hadoop distribution vendors. Hortonworks is trading at one-third its IPO price and the open source projects are wandering off. IaaS cloud vendors are offering their own implementations of the open source compute engines – Hive, Presto, Impala and Spark. HDFS is legacy in the cloud and is rapidly being replaced by blob storage such as S3. Hadoop demonstrates the perils of being an open source vendor in a cloud-centric world. IaaS vendors incorporate the open source technology and leave the open source service vendor high and dry.

Open source data analysis remains a complicated and confusing world. Wouldn’t it be nice if there were one database that could do it all? Wait, there is one, it’s called Snowflake.

What do Donald Trump and EU bureaucrats have in common?

Looking back at 2016, I guess not much. 2016 is a year that EU bureaucrats would rather forget and The Donald will remember forever.

On the privacy side, we saw some encouraging news with the creation of Privacy Shield. That said, Privacy Shield is already being challenged and this space remains uncertain. On a purely positive note, Microsoft won the case in Ireland that prevents the US government from grabbing data stored in other countries. The ruling was critical for any U.S. cloud company that has a global footprint.

Perhaps the most encouraging thing from 2016 is that Europe has a full plate given the challenges of Brexit, a Donald Trump-led America, ongoing immigration issues and upcoming elections with strong populist candidates. Given these problems, concerns about privacy are likely to take a back seat so the bureaucrats may be content to stand behind Privacy Shield.

About that wall, Donald hasn’t said too much lately but I think we will see something go up on the border. He loves construction.

The True Value of Cloud Data Storage Continues to Emerge

We’re in interesting times. Like most significant trends, the data-driven economy revealed a powerful approach that was unique but always in plain sight. We listened and watched closely as experts across industries and different roles promulgated the benefits of capturing, storing and using data from every corner of cyberspace. And not far behind came a related and more interesting topic of connecting the offline world to capture previously unimagined amounts of data, ranging from kitchen appliances to jet engines. This we now know to be the Internet of Things (IoT).

We all acknowledged this data shift would change how companies do business and how we live our lives. As with all significant themes comes the question of “how.” Once we capture all of this data, how will we manage it? How will we effectively store and access petabytes of data, and more, so we can put that data to work?

These aren’t questions just for the governments of the largest countries or for global enterprises. All organizations, from garage start-ups to mid-size companies, are keen to harness the insight derived from more and more data. As wonderful as this seems, it all comes down to technology and cost: the cost of storing that data, and the technology to easily derive insight from it. But how does an organization accomplish this within its financial limits?

Our founders placed this at the heart of Snowflake. Before they typed the first line of code that ultimately brought the Snowflake cloud data warehouse to life, they wanted to enable data without limits. Snowflake’s built-for-the-cloud architecture truly separates compute from storage, allowing customers to easily scale either resource up and down. This also means Snowflake customers can focus their efforts on the highest value of data warehousing – compute. This is just one of many strategic advances, along with our unmatched technology, that make Snowflake the most powerful and affordable data warehouse for all of an organization’s data warehousing and analytics.

With that said, Snowflake lowered its storage pricing in October to match Amazon’s S3 storage price. Today, Snowflake again lowered its price to match Amazon’s latest S3 price reduction. This strategy is a crucial component to truly realizing a data-driven world for all – data without limits. The amount of data the world creates continues to increase at an exponential rate. And to harness the insight from that data, organizations need the best technology at the best price. Snowflake has always been there and always will be.

To read more about our latest pricing announcement, click here.

Challenges and New Opportunities in Data Analytics

Fall is conference season in the industry, and this fall there has been no shortage of discussions and insights about data analytics at events both big and small. The Cloud Analytics City Tour has been a highlight here at Snowflake, but we’ve also seen the analytics conversation front and center at big conferences like Dreamforce.

The Challenges of Data Analytics

Our Cloud Analytics City Tour, now entering its home stretch, has brought together a diverse set of attendees, with small entrepreneurs sharing the room with people from some of the most established companies around. That diverse audience and the thought leaders who participated as speakers have provided some great discussion and insights.

For one, it’s clear that data analytics in the cloud has quickly become a topic of mainstream interest to organizations of all stripes and sizes. In fact, the conversation has moved on from “should we consider data analytics in the cloud at all?” to “what should we do in the cloud, and how?”

That shift was reflected in some of the key themes and insights we’ve been hearing on the City Tour. Among those themes and insights:

  • The challenges are more than just technology. We heard repeatedly that one of the biggest challenges in cloud analytics is getting organizational buy-in. Even though acceptance of cloud has grown, getting people to do things differently still takes a lot of work.
  • Data integration and analytics now need to be a continuous process. The batch, scheduled approach to making updated data and analytics available no longer meets the needs people have today. Continuous data integration is becoming vital as organizations look to drive agile, data-driven decision-making throughout their organizations.
  • Finding great analytics people remains hard. The “people issue” – finding the right talent to analyze data – is now even more urgent, yet it remains hard to solve even as a greater number of people become data savvy.
  • Data quality still matters. While the technology to manage large and disparate sets of data is far more accessible in part because of the cloud, the quality of the data is still a challenge – how do you verify and normalize the data as quickly as your system can deliver and parse it?
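The continuous-integration theme above is exactly what Snowpipe addresses: rather than scheduled batch loads, new files landing in S3 are copied into the warehouse as they arrive. A brief sketch, with hypothetical stage, pipe, and table names (a real private bucket would also need credentials or a storage integration on the stage):

```sql
-- An external stage pointing at the S3 location where new data files land.
CREATE STAGE IF NOT EXISTS events_stage
  URL = 's3://my-bucket/events/';

-- A pipe that continuously copies newly arriving files into the target table.
CREATE PIPE IF NOT EXISTS events_pipe
  AUTO_INGEST = TRUE            -- ingest as S3 event notifications arrive
  AS COPY INTO raw_events
     FROM @events_stage
     FILE_FORMAT = (TYPE = 'JSON');
```

With this in place, analysts query `raw_events` and see data shortly after it lands, without anyone scheduling or babysitting batch load jobs.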

Bringing Data Analytics to All

The importance of data analytics was also front and center at other conferences. At Dreamforce, the former Salesforce CRM conference that has evolved into a much broader event encompassing wide-ranging business and technical topics, data-driven decision-making for competitive advantage was a key theme. The conversation, however, has moved on from last year’s spotlight on the importance of using “big data.” This year the focus was on how the nature of that data is changing, and on how to practically use more of these new types of data in everyday decision-making without being overwhelmed by their complexity.

What was most interesting about this discussion was that there were clearly two camps: increasingly sophisticated organizations with access to the skills and resources needed to apply the latest data analytics approaches, and organizations that have neither in place nor within reach the skills and resources to enable data-driven decision-making for greater insight.

The result is that well-funded start-ups that can attract highly skilled people (and can start from scratch), along with those deep-pocketed enterprises rebuilding their entire infrastructures with the help of consultants like Accenture, threaten to leapfrog the millions of organizations stuck in the middle – organizations that may know what they want to do with data and analytics but don’t know how to get there. To add to the complexity, not only the technical infrastructure but also the mindset within the organization and across departments needs to change.

For organizations across that spectrum, new solutions have emerged. Salesforce’s announcement of Einstein, a data analysis solution for data in Salesforce systems, is one example. But even more importantly, cloud analytics and systems designed to support it are making analytics accessible to more than just the well-resourced 1% of organizations.

As we have learned from the nimble companies that have gone from startup to billion-dollar unicorn in the last five years, thinking and operating in the cloud is the ultimate enabler. For more established companies hindered by legacy systems, changing the technology is now the easy part, with solutions such as Snowflake available; the harder work is cultural. But the rewards of overcoming those cultural and process barriers are invaluable to any organization that doesn’t want to be left behind in this next wave of the data revolution.

To connect with like-minded revolutionaries and learn how to move your organization’s data sophistication to the next level, join us at one of our next Data Analytics forums, including this week’s event in San Francisco and upcoming events in Chicago and Los Angeles. The best learning happens in person, and we hope you have taken, or will take, advantage of our Cloud Analytics City Tour as a forum for intelligent discussion and meaningful insight.