Tech Tip: Quick Start for Getting Your First Data into Snowflake

If you are like me and fairly new to this whole cloud thing, then one of your main questions is likely:

“How do I get data from my desktop (or server) into Snowflake so I can query it?”

Which, in reality, translates to:

“How do I load data in the cloud?”

Well, there are several methods, including a variety of integration and ETL tools. In this tech tip I will show you the simplest way to accomplish this via the Snowflake Web UI.

Our brilliant engineers built a nice little wizard to make it simple for cloud novices (like me) to easily upload a data file from your desktop and load it into a table in Snowflake.

Getting Started

First off, you need a table in the database that matches the structure of the data file you are going to load (you modeled that table right?).  You can build that by running a create table DDL in a worksheet once you have logged in.

So something like this:

CREATE TABLE Region_t1


  (

    Name               VARCHAR2 (200) NOT NULL ,

    Code               VARCHAR2 (2) NOT NULL ,

    Abbreviation       VARCHAR2 (2) NOT NULL ,

    Sort_Order         NUMBER (2) NOT NULL ,

    External_Reference VARCHAR2 (1000) NOT NULL ,

    Record_Source      VARCHAR2 (100) NOT NULL ,

    Record_Owner       NUMBER (1) NOT NULL ,

    Comments           VARCHAR2 (2000)

  ) ;

ALTER TABLE Region_t1 ADD CONSTRAINT PK_Region_t1 PRIMARY KEY ( Code ) ;

ALTER TABLE Region_t1 ADD CONSTRAINT UK_Region_t1 UNIQUE ( Name ) ;

Invoke the Load Wizard

To begin the load process, navigate to the Database tab in the UI, drill into your database and then click on the table name. Once you do that, the Load Table widget is displayed (fourth from the left on the toolbar).  Click on the widget, and the Load Data wizard opens and asks for a warehouse. Select a virtual warehouse to use from the drop-down list. (Note: The list displays any warehouse you have the USAGE privilege on.)

select-wh

Select the File to Load

In the next step, you select the source file to load. You have options to get a file from your file system, or from an AWS S3 bucket if you have already moved some files to the cloud.

file-selected

Note that you can also load multiple files! The select dialog box supports multi-select. But keep in mind that the files must all be the same type with the same structure because they’re all going into the same table.

In the next step, you pick a file format from the drop-down list. In this case, I have selected a format that specifies a CSV type file with comma delimiters.

pick-file-format-for-load

Creating a File Format (if necessary)

If you do not have an appropriate file format already, simply click the plus sign (+) next to the drop-down arrow in the UI to define a new one. For a standard CSV file that uses commas as the delimiter, you really just need to give the file format a name and click Finish. However, if the first row of the file includes column headings, be sure to set Header lines to skip to 1.

create-file-format

Setting Load Options

The next step of the wizard allows you to choose how to handle load errors. There are four options. I chose to have the load continue by skipping any problem records (which I will deal with after the load).

load-options

At this point, you are set to go. If you want to see the code for the load (so you can learn something new, right?), just click the Show SQL link before you click Load.

load-sql

Running the Load

Once you click Load, the process starts and does the following:

  1. Uses PUT to copy the file to a Snowflake internal staging area, automatically encrypting the data during the process.
  2. Uses COPY to load the data into the target table and deletes the file from the staging area upon completion.
  3. Reports the results.

load-success

And voila, you have now loaded data from your desktop into the cloud-native Snowflake Elastic Data Warehouse. Congratulations.

For a much deeper dive into loading data into Snowflake check out the Data Loading topic in our user guide.

As always, keep an eye on this blog site, our Snowflake Twitter feeds (@SnowflakeDB), (@kentgraziano), and (@cloudsommelier) for more tips and tricks, along with updates on all the action and activities here at Snowflake Computing.

Top 10 Cool Things I Like About Snowflake

I have now been with Snowflake Computing for a little over two months (my how time flies). In that time, I have run the demo, spoken at several trade shows, and written a few blogs posts. I have learned a ton about the product and what it means to be an Elastic Data Warehouse in the Cloud.

So for this post I am going to do a quick rundown of some of the coolest features I have learned about so far. 

#10 Persistent results sets available via History

Once you execute a query, the result set will persist for 24 hours (so you can go back and check your work). It may seem minor to some, but it sure is convenient to be able to pull up the results from a previous query without having to execute the query a second time. Saves on time and processing. Read more

#9 Ability to connect with JDBC

Again seems like a no brainer but very important. I had no real clear concept of how I would connect to a data warehouse in the cloud so this was good news.  After getting my favorite data modeling tool, Oracle SQL Developer Data Modeler (SDDM),  installed on my new Mac, I was able to configure it to connect to my Snowflake demo schema using JDBC and reverse engineer the design. 

So why is this cool? It means that whatever BI or ETL tool you use today, if it can talk over JDBC, you can connect it to Snowflake. Read more

#8 UNDROP

With UNDROP in Snowflake you can recover a table instantaneously with a single command:

UNDROP TABLE <tablename>

No need to reload last night’s backup to do the restore. No need to wait while all that data is pulled back in. It just happens!

Now that is a huge time (and life) saver. Read more

#7 Fast Clone

Even cooler than UNDROP is the fast clone feature.

The Snowflake CLONE command can create a clone of a table, a schema, or an entire database almost instantly. It took me barely a minute to create a clone of a 2TB database without using additional storage! And I am not a DBA, let alone a “cloud” DBA.

This means you can create multiple copies of production data without incurring additional storage costs. No need to have separate test/dev data sets.

Hence why I think it is way cool! Read more

#6 JSON Support with SQL

During the first demo of Snowflake I attended (before I even applied for a job here), this one got my attention.

Using the knowledge and skills I already had with SQL, I could quickly learn to query JSON data, and join it to traditional tabular data in relational tables.

Wow – this looked like a great stepping stone into the world of “Big Data” without having to learn complex technologies like Hadoop, MapReduce, or Hive! Read more

Yes, I call that a very cool feature. And the fact that the JSON documents are stored in a table and optimized automatically in the background for MPP and columnar access. This gives you the ability to combine semi-structured and structured data, in one location. For further details check out my detailed 2 part blog here and here.

#5 ANSI compliant SQL with Analytic Functions

Another key feature in Snowflake, that is required to be called a relational data warehouse, is of course the ability to write standard SQL. More so, for data warehousing, is access to sophisticated analytic and windowing functions (e.g., lead, lag, rank, stddev, etc.).

Well Snowflake definitely has these.  In fact we support everything you would expect including aggregation functions, nested virtual tables, subqueries, order by, and group by. This means it is fairly simple for your team to migrate your existing data warehouse technologies to Snowflake. Read more

#4 Separation of Storage and Compute

The innovative, patent-pending, Multi-Cluster, Shared Data Architecture in Snowflake is beyond cool. The architecture consists of three layers; storage, compute, and cloud services. Each layer is decoupled from the other, each layer is independently scalable. This enables customers to scale resources as they are required, rather than pre-allocating resources for peak consumption. In my 30+ years working in IT, I have not seen anything like it.  It is truly one of the advantages that comes from engineering the product, from the ground up, to take full advantage of the elasticity of the cloud. Read more

#3 Support for Multiple Workloads

With this unique architecture, Snowflake can easily support multiple disparate workloads. Because of the separation of compute and storage, you can easily spin up separate Virtual Warehouses of different sizes to run your ELT processes, support BI report users, data scientists, and data miners. And it makes total sense to be able to keep disparate workloads separate, to avoid resource contention, rather than just saying we support “mixed” workloads.

And even better – no special skills or secret configuration settings are required to make this work. It is the way Snowflake is built by design. Nice! Read more

#2 Automatic Encryption of Data

Security is a major concern for moving to the cloud. With Snowflake, your data is automatically encrypted by default. No setup, no configuration, no add-on costs for high security features.

It is just part of the service! To me that is a huge win. Read more

#1 Automatic Query Optimization. No Tuning!

As a long time data architect, and not a DBA, this is my favorite part of Snowflake. I do not have to worry about my query performance at all. It is all handled “auto-magically” via meta data and an optimization engine in our cloud services layer. I just model, load, and query the data.

So, no indexes, no need to figure out partitions and partition keys, no need to pre-shard any data for distribution, and no need to remember to update statistics.

This feature, to me, is one of the most important when it comes to making Snowflake a zero management Data Warehouse as a Service offering. Read more

Well, that is the short list of my top 10 favorite features in Snowflake. Keep a look out for future posts in the coming weeks, to provide details on these and other key features of the Snowflake Elastic Data Warehouse.

Now check out this short intro video to Snowflake!

If you want to learn more about Snowflake, sign up for one of our frequent webinars, or just drop me a line at kent.graziano@snowflake.net and I will hook you up!

P.S. Keep an eye on my Twitter feed (@kentgraziano) and the Snowflake feed (@SnowflakeDB) for updates on all the action and activities here at Snowflake Computing. Watch for #BuiltForTheCloud and #DWaaS.