Snowflake vs. Azure DWH: choose for your long-term needs
When choosing a data warehouse engine, an organization is also determining its technological landscape for decades ahead. Data warehouses are reminiscent of old-school mainframes: they tend to outlive the traditional 7-year software lifecycle by decades.
Therefore, a decision of this magnitude is never an easy one. Do you optimize for stability, performance, or robustness? Do you expect linear or exponential data growth over the next five years? How do you account for an upcoming innovation that requires quadrupling either the dataset or the processing power? Simply relying on the elasticity of the cloud is not enough.
My technology is superior… or is it?
Before we start, let me get some disclaimers out of the way: Code Runners is a Microsoft Gold Partner. As such, our opinion may be biased towards Microsoft (Azure) technologies. The goal of this article is not to push a decision one way or the other. We no longer live in a world where a technology can simply be labelled “superior”. Navigating the complexities of contemporary tech requires a profound understanding of its landscape, supporting technologies, and integrations. Hence, this article doesn’t aim at providing answers, but at raising questions. These questions should guide your choice and prepare your organization for the next decade of digital transformation.
With minor deviations, integration, deployment, and support are rated similarly for both Azure DWH and Snowflake. You can find out more on Gartner’s Peer Insights page.
Tools of the trade
One rather significant aspect is the larger family of tools that complement your data warehouse. Beware, though: APIs are abundant nowadays, which commoditizes interoperability. What you should be looking for are deep, efficient, cloud-friendly integrations that make your data easy to transport, store, load, and analyze. That means you shouldn’t:
- start a DWH project without a 3-to-5-year roadmap
- explore DWH options without deep understanding of what you’re going to use it for
- assess DWH options without an idea of other tools needed to complete your data lifecycle
You would never buy a toolbox without checking whether it fits your worktable, right? You shouldn’t do it with your data either. Otherwise you might find yourself in one of those large-scale, over-budget nightmare projects with limited impact that everyone dislikes.
Traditions are important; knowledge is expensive
With all of the above in mind, a major determining factor for any new DWH initiative is the organization’s overall experience with the technology in question. If your company has a long history of using Microsoft products, Azure DWH may feel natural and the learning curve may be much flatter than usual.
On the other hand, if your organization prefers to experiment and to stay away from vendor lock-in, an important unique selling point of Snowflake is its cloud independence. It is, however, a standalone technology without directly related tools, so keep in mind that significant knowledge transfer will be needed.
Flexibility is in the eye of the beholder
Microsoft DWH can be considered a family of 7 products: Microsoft SQL Server, Microsoft Azure SQL Database, Azure Data Lake Store, Azure Databricks, Microsoft Azure Synapse Analytics (formerly SQL Data Warehouse), Cosmos DB, and Microsoft Analytics Platform System, each of which may fit your specific use case better than the rest. Snowflake is a single product that proudly employs the “one size fits all” slogan. Again, depending on the structure of your company, you may prefer juggling a few solutions that each fit a business line, or going with a unified approach.
Price is difficult to forecast
If you’ve ever used a cloud solution, you may be aware of how difficult it is to predict its total cost of ownership (TCO) and, from there, the return on investment (ROI). This uncertainty is further amplified by the many dimensions of the data you’re looking to store: size, speed of change, frequency of ETL, number and size of data marts (think: number of business applications), frequency of analysis, and so on. This is why you should focus on the points above and ensure that your DWH program is adopted quickly and used widely.
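To see why the estimate is so fragile, consider a back-of-the-envelope cost model. This is only a toy sketch: the function, its parameters, and all unit prices are hypothetical and do not reflect actual Snowflake or Azure pricing. The point is that every dimension listed above feeds the total, so a modest error in each input shifts the whole estimate noticeably.

```python
# Toy monthly-cost sketch with entirely made-up unit prices (NOT real
# Snowflake or Azure rates). Each dimension of the workload contributes
# its own term, so every input carries its own forecasting error.

def monthly_cost(tb_stored, etl_runs_per_day, queries_per_day,
                 storage_price=23.0,   # hypothetical $ per TB per month
                 etl_price=1.5,        # hypothetical $ per ETL run
                 query_price=0.05):    # hypothetical $ per analytical query
    storage = tb_stored * storage_price
    compute = (etl_runs_per_day * 30 * etl_price
               + queries_per_day * 30 * query_price)
    return storage + compute

# A baseline estimate versus the same workload with every input
# underestimated by 25% -- the gap is what makes TCO hard to forecast.
base = monthly_cost(10, 24, 2000)
pessimistic = monthly_cost(12.5, 30, 2500)
print(base, pessimistic)
```

Real warehouses price compute in credits or DWU-hours rather than flat per-run fees, which makes the forecast even less linear than this sketch suggests.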
A few caveats to be aware of
Last but not least, you should consider some of the caveats that lie ahead of you.
I already covered the limited price predictability, so the next largest risk is the nightly updates of your product. We unfortunately no longer live in the age of heavily tested, built-to-last software, so bugs tend to happen relatively often. Make sure you’re prepared for the moments when they do, and rely on sanity checks to ensure data consistency. Remember, your data warehouse is going to be the single source of truth for your organization, and nothing speeds up the failure of a digital project faster than a lack of trust in the data therein. Make sure to validate each step and the dataset as often as possible, and diligently alert users of any findings.
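The sanity checks described above can be sketched as a small post-load reconciliation step. This is a minimal illustration in plain Python; the table shape and the column names (`order_id`, `amount`) are hypothetical, and a real pipeline would run equivalent checks as SQL against the actual warehouse tables.

```python
# Hedged sketch: reconcile a source extract against its warehouse copy
# after each load. Column names are hypothetical examples.

def sanity_check(source_rows, dwh_rows, key="order_id", measure="amount"):
    """Return a list of human-readable findings; empty means all checks pass."""
    findings = []

    # 1. Row counts must match: a mismatch usually means a failed or partial load.
    if len(source_rows) != len(dwh_rows):
        findings.append(f"row count mismatch: source={len(source_rows)}, "
                        f"dwh={len(dwh_rows)}")

    # 2. No NULL business keys: the warehouse must remain joinable.
    null_keys = sum(1 for r in dwh_rows if r.get(key) is None)
    if null_keys:
        findings.append(f"{null_keys} rows with NULL {key}")

    # 3. Measure totals must reconcile within a small tolerance.
    src_total = sum(r[measure] for r in source_rows if r.get(measure) is not None)
    dwh_total = sum(r[measure] for r in dwh_rows if r.get(measure) is not None)
    if abs(src_total - dwh_total) > 0.01:
        findings.append(f"{measure} totals differ: source={src_total}, "
                        f"dwh={dwh_total}")

    return findings

# Usage: alert users whenever the findings list is non-empty.
source = [{"order_id": 1, "amount": 100.0}, {"order_id": 2, "amount": 50.0}]
dwh = [{"order_id": 1, "amount": 100.0}, {"order_id": None, "amount": 50.0}]
print(sanity_check(source, dwh))  # flags the NULL order_id
```

Running such checks after every nightly load, and surfacing findings to users immediately, is what preserves the trust the paragraph warns about losing.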
Both Snowflake and Microsoft Azure have pros and cons, so let your 3-to-5-year roadmap lead when you make a decision for your DWH.