Azure Databricks and Azure Key Vault

Posted on October 28, 2020 by blobeater

Key Vault should always be a core component of your Azure design because it can store keys, secrets and certificates, which lets you hide the true connection string instead of leaving it in files. When working with Databricks to mount storage, ingest your data and query it, you should ideally be leveraging Key Vault to create secrets and secret scopes.
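As a rough illustration of the idea, here is a minimal sketch of mounting Blob storage with the account key pulled from a secret scope rather than pasted into the notebook. The scope name, secret name, storage account, container and the `build_mount_args` helper are all hypothetical; only `dbutils.secrets.get` and `dbutils.fs.mount` are Databricks utilities, and they are only available inside a notebook, so they appear here in comments.

```python
from typing import Callable, Dict


def build_mount_args(
    storage_account: str,
    container: str,
    mount_point: str,
    get_secret: Callable[[str, str], str],
    scope: str = "demo-scope",              # hypothetical secret scope name
    key_name: str = "storage-account-key",  # hypothetical secret name
) -> Dict[str, object]:
    """Build keyword arguments for dbutils.fs.mount without hard-coding the key."""
    host = f"{storage_account}.blob.core.windows.net"
    return {
        "source": f"wasbs://{container}@{host}",
        "mount_point": mount_point,
        "extra_configs": {
            # The account key is looked up at call time, never written in the file.
            f"fs.azure.account.key.{host}": get_secret(scope, key_name)
        },
    }


# Inside a Databricks notebook (where dbutils is predefined) this might be used as:
#   args = build_mount_args("mystorageacct", "raw", "/mnt/raw",
#                           lambda s, k: dbutils.secrets.get(scope=s, key=k))
#   dbutils.fs.mount(**args)
```

Passing the secret lookup in as a function keeps the helper testable outside Databricks; in a notebook you would simply hand it a wrapper around `dbutils.secrets.get`.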

Continue reading →

Azure Databricks – Pin Cluster

Posted on October 22, 2020 by blobeater

Before discussing why you would want to pin a cluster, it is useful to understand the different states a cluster can be in. We can have:

Continue reading →

Azure Databricks to Power BI

Posted on October 20, 2020 by blobeater

A very common approach is to query data straight from Databricks via Power BI. For this you need a Databricks token and the JDBC URL. This is found within Account settings of the cluster.

Continue reading →

Azure Databricks – The Notebook

Posted on October 14, 2020 by blobeater

Data engineers, pipeline developers and general data enthusiasts will be spending most of their time within a notebook. Here you develop your code, and nice visualisations and commentary boxes are possible too. It is a very rich web-based interface and is best experienced with Google Chrome (in my opinion).

Continue reading →

Azure Databricks Overview

Posted on October 6, 2020 by blobeater

I have spent many long weekends getting stuck into Azure Databricks, which gave me plenty of time to understand the core functionality: mounting storage, streaming data, getting to know Delta Lake and how it fits into the bigger picture with tech like Event Hubs, Azure SQL DW, Power BI etc.

So, I am going to show you how easy it is to create a cluster (that’s the end goal). You will appreciate the ease of deployment for huge amounts of infrastructure.

Continue reading →