On a recent customer call, the data architects asked whether Snowflake provided a data model diagramming tool to design and generate data warehouse tables or to view a data model of an existing Snowflake data warehouse, or whether we knew of any tools that would work with Snowflake.
Well, we do not provide one of our own – our service is the Snowflake Elastic Data Warehouse (#ElasticDW).
The good news is that there are data modeling tools in the broader ecosystem that you can of course use (since we are ANSI SQL compliant).
You may have seen several posts (on my personal blog) where I used Oracle SQL Developer Data Modeler (aka SDDM) to reverse engineer and re-engineer a Snowflake database. (As a reminder, SDDM is a free-to-use, enterprise-class data modeling tool that you can download from Oracle here.)
If you have read my previous posts on using JSON within Snowflake, you also know that we have a new data type called VARIANT for storing semi-structured data like JSON, Avro, and XML.
In this post I will bring it all together and show you the steps to customize SDDM so that you can model and generate table DDL that contains columns using the VARIANT data type.
Creating a User Defined Data Type
One of the features of Snowflake is that you can run most Oracle DDL syntax (among others) without edits to create your tables, views, and constraints in a Snowflake database (so as to make it easier for you to migrate your current data warehouse to our cloud-based data warehouse service). So that means you can create your data warehouse model in SDDM, generate the DDL using Oracle syntax, then run it without any edits.
Of course Oracle does not have a VARIANT data type, so SDDM does not have it either. Fortunately, you can fairly easily define your own custom data type to use in models and to output in CREATE TABLE DDL statements.
I knew the feature was there but had a hard time getting it to do what I wanted, so I posted a question on the OTN Forum for Data Modeler and, as usual, Philip (the lead architect for Data Modeler) answered and laid out the steps for me. As he said, and as you will see, it is not totally straightforward and is a bit of a chicken-and-egg scenario.
Define a User Defined Native Type
First you have to go to Tools -> Types Administration and go to the tab for user defined native types. Then pick the RDBMS type you want to associate the new type with. I used Oracle Database 12c (remember there is no native Snowflake option). Press the green plus (+) to create a new entry. Under native type give it a name (in my case VARIANT), then under Logical type select one of the standard types. I used CLOB for now (this really is a temporary setting that I will change shortly).
Notice that there are other options for “Has size” and “Has precision and scale” that you might use if the type you want to define will need those properties, but for VARIANT you do not define size or precision so I left those unchecked.
Add a new Logical Type
This was the step that was not obvious at all to me. It turns out when you are defining the data type on a column you are usually selecting a Logical Data Type (unless you are picking a Domain). To make this work we need to add a new Logical Type to map our VARIANT data type to.
So now we need to switch to the first tab in the Types Administration dialog. It is the one labeled “Logical types to native types.”
On this tab click the Add button on the bottom left. Now you can type in a new logical type name. I named mine VARIANT to prevent any confusion. The key piece now is to map it to the Native Type that was defined in the previous step. To do that you pick from the drop down on the database you set up (in this case Oracle 12c). In that list is the VARIANT type. Once that is selected you need to be sure to hit APPLY and then SAVE.
Now change the original Native Type
Again, this is not an obvious step, but go back to the “User defined native types” tab (the 3rd tab) and click on the Logical type drop down. Scroll all the way to the bottom and you should see your newly defined logical type. Pick it, then press the SAVE button again. Now you can CLOSE the dialog (and save your design to be sure!).
Now you have the custom data type defined so it can be used!
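If the chicken-and-egg part is hard to picture, here is a minimal Python sketch of the three steps above. It is only a conceptual model of the two-way mapping, not how SDDM stores its types internally; the class, dictionary, and function names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class NativeType:
    """Hypothetical stand-in for one row on the user defined native types tab."""
    name: str
    rdbms: str
    logical_type: str  # the logical type this native type maps back to


# Step 1: define the native type. The logical type VARIANT does not exist yet,
# so a standard type (CLOB) is used as a temporary placeholder.
variant_native = NativeType(
    name="VARIANT", rdbms="Oracle Database 12c", logical_type="CLOB"
)

# Step 2: add a new logical type and map it to the native type for that RDBMS.
logical_to_native = {
    "VARIANT": {"Oracle Database 12c": variant_native.name},
}

# Step 3: go back and point the native type at the new logical type.
variant_native.logical_type = "VARIANT"


def resolve(logical_type: str, rdbms: str) -> str:
    """Return the native type name emitted in DDL for a column's logical type."""
    return logical_to_native[logical_type][rdbms]


print(resolve("VARIANT", "Oracle Database 12c"))
```

The placeholder in step 1 is the whole reason for step 3: the native type needs some logical type before the real one can be created.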
Apply the User Defined Type
At this point you can either define a new column using your custom data type or edit an existing one to change the data type.
In my case, I reverse engineered a table in Snowflake that had a VARIANT column in it (before I made these changes to the data types).
Since SDDM did not know what VARIANT was, it guessed and assigned it to the Source Type of VARCHAR.
Note that the Data Type property is set to the Logical radio button – hence the need to set up a custom Logical Type too.
With the new type defined, I can now simply edit the column and pick VARIANT from the drop down list. It does appear that custom types go to the bottom of the list.
Don’t forget to press APPLY or OK to save the reassignment.
Generate the DDL
With the assignment done, right-click the table in the diagram and pick DDL Preview. You should see that the DDL includes the columns using the newly defined data type.
In my case I just copied the DDL from the preview window and pasted it into the Snowflake Web UI, then executed it.
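To give a rough idea of the shape of statement you would be pasting, here is a small Python sketch that renders a CREATE TABLE statement with a VARIANT column. The table name, column names, and helper function are hypothetical, and the real DDL Preview output from SDDM will contain more than this (for example, constraints).

```python
def render_create_table(table, columns):
    """Render a simple CREATE TABLE statement from (name, data_type) pairs."""
    body = ",\n".join(f"    {name} {data_type}" for name, data_type in columns)
    return f"CREATE TABLE {table} (\n{body}\n);"


# Hypothetical table: two ordinary columns plus one using the custom type.
ddl = render_create_table(
    "json_demo",
    [
        ("id", "NUMBER"),
        ("loaded_at", "TIMESTAMP"),
        ("payload", "VARIANT"),
    ],
)
print(ddl)
```

The point is simply that VARIANT shows up in the column list like any other data type, so the statement can be run in Snowflake without hand editing.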
Success! It worked!
I can now use Oracle SQL Developer Data Modeler to design tables for a Snowflake database that include columns using our innovative VARIANT data type.
And now you know how to do it too!