Financial Services: Welcome to Virtual Private Snowflake

Correct, consistent data is the lifeblood of the financial services industry. If your data is correct and consistent, it’s valuable. If it’s wrong or inconsistent, it’s useless and may be dangerous to your organization.

I saw this firsthand during the financial meltdown of 2007/08. At that time, I had been working in the industry for nearly 20 years as a platform architect. Financial services companies needed that “single source of truth” more than ever. To remain viable, we needed to consolidate siloed data sets before we could calculate risk exposure. Many financial services companies were on the brink of collapse. Those that survived did so because they had access to the right data.

At my employer, we looked for a way to achieve this single source with in-house resources, but my team and I quickly realized it would be an extraordinary challenge. Multiple data marts were sprawled across the entire enterprise, and multiple sets of the same data existed in different places, so the numbers didn’t add up. In a global financial services company, even a one percent difference can represent billions of dollars and major risk.

We ultimately built an analytics platform powered by a data warehouse. It was a huge success, so successful that everybody wanted to use it for wide-ranging production use cases. However, it couldn’t keep up with demand, and no amount of additional investment would solve that problem.

That’s when I began my quest to find a platform that could provide universal access, true data consistency and unlimited concurrency. And for financial services, it had to be more secure than anything enterprises were already using. I knew the cloud could address most of these needs. However, even with the right leap forward in technical innovation, would the industry accept it as secure? Then I found Snowflake. But my story doesn’t end there.

I knew Snowflake, the company, was onto something. So, I left financial services to join Snowflake and lead its product team. Snowflake represents a cloud-first approach to data warehousing, with a level of security and unlimited concurrency that financial services companies demand.

We’ve since taken that a step further with Virtual Private Snowflake (VPS), our most secure version of Snowflake. VPS gives each customer a dedicated, managed instance of Snowflake within a separate Amazon Web Services (AWS) Virtual Private Cloud (VPC). In addition, customers get our existing, best-in-class Snowflake security features, including end-to-end encryption of data at rest and in transit. VPS also includes Tri-Secret Secure, which combines a customer-provided encryption key, a Snowflake-provided encryption key and user credentials. Together, the two keys thwart an attempted data decryption attack by instantly rendering data unreadable, while the user credentials authenticate approved users.

VPS is more secure than any on-premises solution and provides unlimited access to a single source of data without degrading performance. This means financial services companies don’t have to look at the cloud as a compromise between security and performance.

To find out more, read our VPS white paper and solution brief: Snowflake for Sensitive Data.

Rethink What You Know About Creating a Data Lake for JSON

Over the last 10 years, the notion has been that to quickly and cost-effectively gain insights from a variety of data sources, you need a Hadoop platform. Those sources could be weblogs, clickstreams, events, IoT devices and other producers of machine-born JSON or semi-structured data. The promise of Hadoop-based data processing is a single repository (a data lake) with the flexibility, capacity and performance to store and analyze an array of data types.

It shouldn’t be complicated

In reality, analyzing data with a Hadoop-based platform is not simple. Hadoop platforms start you with an HDFS file system, or equivalent, and you must then piece together at least a half-dozen software packages just to provide basic enterprise-level functionality: provisioning, security, system management, data protection, database management and the necessary interfaces to explore and query data.

Despite the efforts of open-source communities to provide tools that bring Hadoop platforms up to enterprise-class capability, there is a constant need for highly skilled resources to keep Hadoop up and running while enabling users to do more than just explore data. This all adds up to unnecessary complexity.

A much simpler proposition

Snowflake, which is built for the cloud and delivered as a service, provides a different option for handling JSON and semi-structured data. Just point your data pipelines to Snowflake, land the data in our elastic storage repository and you have instant access to a bottomless data lake, as well as a full-fledged data warehouse. With Snowflake, you can easily load JSON and query the data with robust, relational SQL. You can mix JSON with traditional structured data and data from other sources, all within the same database. You can also run endless concurrent analytic workloads and work groups against the JSON data. Whether it’s one workload or 1,000, Snowflake handles it all with ease.
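
For illustration, here is a minimal sketch of what this can look like in Snowflake SQL. The table, stage and attribute names are hypothetical; the pattern is to land raw JSON in a VARIANT column and query it directly:

  -- Land raw JSON in a single VARIANT column (hypothetical table and stage)
  CREATE TABLE raw_events (v VARIANT);

  COPY INTO raw_events
    FROM @my_json_stage
    FILE_FORMAT = (TYPE = 'JSON');

  -- Query nested JSON attributes with ordinary SQL, casting as needed
  SELECT
    v:device.id::STRING  AS device_id,
    v:reading::FLOAT     AS reading,
    v:ts::TIMESTAMP_NTZ  AS event_time
  FROM raw_events
  WHERE v:reading::FLOAT > 100;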

As a combined data lake and data warehouse platform, Snowflake allows you to do much more. Read more about it in our new ebook, Beyond Hadoop: Modern Cloud Data Warehousing.

Try Snowflake for free. Sign up and receive $400 of free usage credits. You can create a sandbox or launch a production implementation from the same Snowflake account.

Rethink what you’ve been told

Hadoop is not a prerequisite for gaining insights from JSON or other machine data.

When you need to store, warehouse and analyze JSON and other machine data, rethink what you’ve been told. Snowflake makes it easy to develop insights and uncover relationships that can drive the business forward. You can support all of your structured and semi-structured data warehousing and analytic workload needs with a single tool, one that is built for the cloud and is ACID-compliant. Unlike a Hadoop platform, which often demands special skills to operate, Snowflake is a fully relational SQL environment that uses the familiar semantics and commands known to millions of SQL users and programmers, and thousands of SQL tools.
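
As a sketch of that familiar SQL at work, here is how a hypothetical JSON document containing an array of line items might be flattened and joined to a traditional structured table; the table and column names are illustrative, and only the FLATTEN function is Snowflake-specific:

  -- Explode a JSON array and join it to a structured dimension table
  SELECT
    o.v:order_id::STRING  AS order_id,
    li.value:sku::STRING  AS sku,
    li.value:qty::INT     AS quantity,
    p.product_name
  FROM raw_orders o,
    LATERAL FLATTEN(input => o.v:line_items) li
  JOIN products p
    ON p.sku = li.value:sku::STRING;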

Be sure to keep an eye on this blog or follow us on Twitter (@snowflakedb and @miclnixon1) for all the news and happenings here at Snowflake.

3 Ways a Data Sharehouse™ Can Catapult Your Data-Driven Initiatives

How would your organization benefit from effortlessly sharing limitless amounts of data with business partners and commercializing that data to share with other organizations?

With most data sharing methods today, the best you can do is imagine these benefits, because it’s cumbersome, time-consuming and costly to share even small slices of data. If you share data using an FTP approach, you will spend time deconstructing, scripting, securing and governing the data, and your data consumers will spend time reconstructing, scripting and rebuilding it. Sharing data via email is slow and insecure, impractical for large data sets, and harder still for a large database.

Sharing data using a basic cloud storage service is equally inefficient, not least because neither you nor your data consumers can query the data directly. And if you want to enable “direct” data queries without loading, you will likely evaluate a traditional cloud data warehouse or Hadoop platform that is “integrated” with a data lake storage platform. Do you think this will be simple? Not so fast! You’ll have to factor in managing a separate data catalogue and external tables to share data, or you’ll have to contend with data inconsistency and performance issues.

These limitations add up: traditional options make it hard to take data sharing to the next level, which goes beyond finding insights to commercializing your data. If any of this feels familiar, consider these three reasons why Snowflake Data Sharing, the technology that makes it possible to create a Data Sharehouse™, is so compelling.

1 – Drive simplicity: Allocate more of your time and resources to strategic data sharing projects

Traditional options to share data create unnecessary data processing and platform complexity, which burdens you and your data consumers with extra time, resources and infrastructure costs.

Snowflake Data Sharing, on the other hand, is an all-in-one data warehouse-as-a-service solution that makes sharing data simple. The vision for the Snowflake architecture, from day one, featured separation of compute, storage and services to allow unlimited scaling, concurrency and data sharing.

For example, with Snowflake, there is no external data catalogue to manage and no separate effort required to build data security. It’s all built in. In addition, Snowflake’s metadata management keeps track of all data warehouse activity. Thus, to share data, follow these steps:

  • Point your pipeline to land your data into Snowflake and set up your warehouse.
  • To share any portion of your data warehouse, use Snowflake’s included SQL client to CREATE a new database shell that can contain the entire database or any portion of it, and CREATE a reference list of the objects you wish to share.
  • Issue GRANT statements that enable access to the shared database and any objects referenced within the share.

These are all the steps you need to share data quickly and easily; a sketch of the corresponding SQL appears below. All of the related metadata activity, including cataloguing, happens automatically within the Snowflake data warehouse service. No additional infrastructure or separate cataloguing effort is required.
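
As a concrete sketch of the provider side of those steps in Snowflake SQL (the share, database and account names here are illustrative):

  -- Create a share and reference the objects to be shared
  CREATE SHARE sales_share;

  GRANT USAGE ON DATABASE sales_db TO SHARE sales_share;
  GRANT USAGE ON SCHEMA sales_db.public TO SHARE sales_share;
  GRANT SELECT ON TABLE sales_db.public.orders TO SHARE sales_share;

  -- Make the share visible to a consumer account
  ALTER SHARE sales_share ADD ACCOUNTS = partner_account;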

2 – Enable more valuable insights: Your data consumers will always operate with live and current data

Most options for sharing data require you to unload the data from your data warehouse in order to transmit or email it to your data consumers. A cloud data warehouse platform that avoids this instead relies on a physically separate storage pool to scale and share data. The downside of either option is that the shared data set is read-only and disconnected from any updates that occur within the data warehouse. If you modify the data set, or if it is regularly updated, you must reload (ETL) it into the data warehouse, operate on it, and then unload it again to retransmit it or place it in the shared storage pool.

For data consumers who received or were connected to the old data set, this means there is a period during which they are exposed to stale and inconsistent data. Fresh data won’t be available until a new transmission is made or a new connection is established. No one wants to run analytics on stale data.

Snowflake Data Sharing delivers a better choice. Because data sets shared within the Snowflake environment are live, served in real time from the data warehouse, data consumers immediately see fresh data as soon as an update or transaction is committed. Data consistency is maintained without any extra effort from you, and your data consumers never have to choose between running analytics now and waiting for a new update.
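
On the consumer side, a sketch of what that live access looks like (the account and share names are illustrative): the consumer creates a database directly from the share and queries it, with no copies, loads or refresh cycles involved:

  -- In the consumer account: attach the provider's share as a read-only database
  CREATE DATABASE shared_sales FROM SHARE provider_account.sales_share;

  -- Queries always see the provider's latest committed data
  SELECT COUNT(*) FROM shared_sales.public.orders;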

3 – Support any scale, with ease: Seamlessly and cost-effectively share data with any number of data consumers

At the end of the day, it’s all about growing and expanding your business and services while providing excellent experiences for the customers and consumers of your data. If you anticipate sharing data with tens, hundreds or thousands of data consumers, each with unique data sharing requirements, how can you easily support this, and support that growth without manually building more clusters, managing external tables or metadata stores, suffering performance penalties or creating data inconsistencies? With other architectural approaches, accomplishing these objectives would be very difficult, if not impossible.

Snowflake Data Sharing allows you to scale easily, add data consumers and specify granular secure views, all at the highest performance profile and with full data consistency. On the other end, your data consumers can immediately use and query the shared data, also at the highest performance profile.
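
For example, rather than sharing a whole table, a provider might expose a granular secure view per consumer. A sketch, with illustrative names, continuing the earlier example:

  -- Expose only one partner's rows through a secure view
  CREATE SECURE VIEW sales_db.public.partner_orders AS
    SELECT order_id, sku, qty
    FROM sales_db.public.orders
    WHERE partner_id = 'ACME';

  GRANT SELECT ON VIEW sales_db.public.partner_orders TO SHARE sales_share;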

Summary: Share and imagine more

These are just a few compelling examples of how Snowflake Data Sharing, and the Data Sharehouse approach, can transform data into high-value business assets. It’s a new, exciting, powerful and easy-to-use feature of all Snowflake data warehouse deployments.

Also check out data sharing blogs from Snowflake customers Snagajob and PlayFab. To learn more, you can easily sign up for Snowflake here. Or jump to our webpage to review our data sharing ebook, white paper, demo and other resources. In future posts, we’ll cover more details about the technical capabilities and business-value advantages of Snowflake Data Sharing, built for the cloud. Be sure to follow us on Twitter (@snowflakedb and @miclnixon1) for all the news and happenings here at Snowflake. Stay tuned!