Time to shift gears away from the world of relational databases, whether that is in the cloud, on-prem, Linux-based, in containers or even sitting within Kubernetes. Everyone has heard of Synapse. It faces stiff competition from the likes of Snowflake. Snowflake does not really have the concept of control and compute nodes that the Microsoft world uses to build out its MPP-based architecture.
Instead, Snowflake uses a multi-cluster, shared data approach with total separation of the compute and storage layers, and it utilises three levels of cache. You can run Snowflake in Azure (which I am experienced in), where it has excellent synergy with Azure Storage, Azure Event Grid, Azure Private Link and Power BI. This is how I came into Snowflake, very much from the Azure angle and all the possible integrations. I have to say it is pretty good. You don’t have to run it on Azure; it probably has even more synergy within the AWS ecosystem. For example, its built-in connectors to Amazon SageMaker are very good.
I expect Snowflake to get even more popular in time. It has slowly been climbing the DB-Engines ranking, and it's hard not to think about it.
My favourite benefits of Snowflake
- Performance is very strong. The scale “across” approach means you can spin up many warehouses (compute wrappers) and query the same Snowflake database without any concurrency issues.
- Data sharing: probably the best feature of the product. Share your data with other teams in your business (or outside the business), fast and in real time.
- The Data Marketplace: official real-world datasets you can integrate into your business within minutes.
- Zero-copy cloning: clone a terabyte-sized database quickly and with ease, which is so useful for fast, agile development.
- Querying semi-structured data is fast.
- I like the HLL (HyperLogLog) functions within Snowflake for approximate distinct counts.
- Built for the cloud. Snowflake is not a VM with some software on top of it. The next section covers the tight integration it has with cloud providers; in Azure it’s called a Snowflake pod.
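On the HLL point in the list above: HyperLogLog estimates the number of distinct values using a small, fixed amount of memory, and two sketches can be merged without going back to the raw data. Here is a minimal Python sketch of the idea. To be clear, this is a toy model and not Snowflake's implementation; the class name `HyperLogLog`, the choice of SHA-1 as the hash and the default of 4096 registers are all my own assumptions for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog sketch (illustrative only, not Snowflake's code)."""

    def __init__(self, p: int = 12):
        # The bias constant used in estimate() assumes at least 128 registers.
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16")
        self.p = p
        self.m = 1 << p                  # number of registers
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # Assumption: take a 64-bit hash from the first 8 bytes of SHA-1.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        index = h >> width               # top p bits choose the register
        rest = h & ((1 << width) - 1)    # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other: "HyperLogLog") -> None:
        # Merging is a register-wise maximum, so sketches combine cheaply.
        if other.p != self.p:
            raise ValueError("sketches must use the same precision")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self) -> float:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


# Two overlapping sets of hypothetical user ids; 100,000 distinct in total.
first = HyperLogLog()
second = HyperLogLog()
for i in range(60_000):
    first.add(f"user-{i}")
for i in range(40_000, 100_000):
    second.add(f"user-{i}")

first.merge(second)
print(round(first.estimate()))  # an approximation of the distinct count
```

With 4096 registers the typical error is in the region of 1 to 2 percent, which is the trade you make for never having to hold all the distinct values in memory. The merge step is why this style of function suits big aggregations: you can sketch each slice of the data separately and combine the sketches afterwards.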
The Snowflake Pod
As I said, this isn’t just a VM with software installed. This is built for the cloud.
- Built on top of the cloud provider’s availability zones (three of them). You can extend this with features like replication to a different zone or region within the same cloud provider, or to a different provider altogether.
Hopefully this is enough detail to get you intrigued, so stay tuned. I am no Snowflake expert, but I have been very hands-on with it within the Azure ecosystem, and I will share some learnings in the near future.