Azure Databricks – Adding Libraries

It is a really common requirement to add specific libraries to Databricks. Libraries can be written in Python, Java, Scala, and R. You can upload Java, Scala, and Python libraries and point to external packages in PyPI, Maven, and CRAN repositories.

Libraries can be added in three scopes: workspace, notebook-scoped and cluster. I want to show you how easy it is to search for and add a library to the cluster, so that all notebooks attached to the cluster can leverage it.

Within the Azure Databricks portal, go to your cluster.
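
If you would rather script this than click through the portal, the same cluster-scoped install can be requested through the Databricks Libraries REST API (`POST /api/2.0/libraries/install`). Below is a minimal sketch that only builds the request body. The helper name `build_install_request`, the cluster ID and the package choices are my own illustrative assumptions, not something from the portal walkthrough.

```python
import json


def build_install_request(cluster_id, pypi_packages=(), maven_coordinates=()):
    """Build the JSON body for a cluster-scoped library install request.

    Hypothetical helper: it only assembles the payload, it does not call the API.
    """
    libraries = [{"pypi": {"package": name}} for name in pypi_packages]
    libraries += [{"maven": {"coordinates": coords}} for coords in maven_coordinates]
    return {"cluster_id": cluster_id, "libraries": libraries}


body = build_install_request(
    cluster_id="0123-456789-abcde123",  # assumed, replace with your own cluster ID
    pypi_packages=["simplejson"],  # illustrative PyPI package
    maven_coordinates=["com.microsoft.azure:azure-eventhubs-spark_2.12:2.3.22"],  # illustrative
)

# This body would be sent with a bearer token to
# https://<your-workspace-url>/api/2.0/libraries/install
print(json.dumps(body, indent=2))
```

For a one-off experiment in a single notebook, a notebook-scoped `%pip install` is usually the lighter option.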

Azure Databricks and Azure Key Vault

Key Vault should always be a core component of your Azure design because we can store keys, secrets and certificates there and so abstract / hide the true connection string within files. When working with Databricks to mount storage, ingest your data and query it, ideally you should be leveraging this to create secrets and secret scopes.
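
To make the idea concrete, here is a rough sketch of how a secret from a Key Vault-backed scope could feed a Blob Storage mount. The helper `build_blob_mount_args`, the scope, key, account and container names are all assumptions for illustration. In a real notebook the secret would come from `dbutils.secrets.get`, so a stand-in function is used here to keep the example runnable anywhere.

```python
def build_blob_mount_args(storage_account, container, mount_name, get_secret, scope, key):
    """Assemble keyword arguments for dbutils.fs.mount (hypothetical helper).

    get_secret is any callable taking (scope, key), so the storage key
    never appears in the notebook itself.
    """
    host = f"{storage_account}.blob.core.windows.net"
    return {
        "source": f"wasbs://{container}@{host}",
        "mount_point": f"/mnt/{mount_name}",
        "extra_configs": {f"fs.azure.account.key.{host}": get_secret(scope, key)},
    }


def fake_get_secret(scope, key):
    # Stand-in for dbutils.secrets.get(scope=scope, key=key) outside Databricks.
    return f"<secret {scope}/{key}>"


mount_args = build_blob_mount_args(
    storage_account="mystorageacct",  # assumed storage account name
    container="raw",  # assumed container name
    mount_name="raw",
    get_secret=fake_get_secret,
    scope="kv-scope",  # assumed secret scope name
    key="storage-key",  # assumed secret name in Key Vault
)

# Inside a Databricks notebook this would become:
#   dbutils.fs.mount(**mount_args)
print(mount_args["source"], "->", mount_args["mount_point"])
```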

Azure Databricks – The Notebook

Data engineers, pipeline developers and general data enthusiasts will be spending most of their time within a notebook. Here you develop your code, and nice visualisations and commentary boxes are possible too. It is a very rich web-based interface and is best experienced with Google Chrome (in my opinion).

Azure Databricks Overview

I have spent many long weekends getting stuck into Azure Databricks, with plenty of time to understand the core functionality, from mounting storage and streaming data to knowing the Delta Lake and how it fits into the bigger picture with tech like Event Hubs, Azure SQL DW, Power BI etc.

So, I am going to show you how easy it is to create a cluster (that’s the end goal). You will appreciate the ease of deployment for huge amounts of infrastructure.

Tips and Tricks Azure Databricks

It has been a while since I posted an entry for T-SQL Tuesday, which today is hosted by Kenneth. The subject is a non-SQL Server tip. For a while now I have been using other technologies besides SQL Server, most recently Azure Databricks, and I have a handy tip for when starting this journey. It is not groundbreaking but useful!


Azure Databricks to Azure SQL DB

Recently I got to a stage where I leveraged Databricks to the best of my ability to join a couple of CSV files together, play around with some aggregations and then output the result back to a different mount point (based on Azure Storage) as a parquet file. I then decided that I actually wanted to move this data into Azure SQL DB, which you may want to do one day.
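
One way to do that move is a plain JDBC write from a Spark DataFrame. The sketch below only builds the connection URL and properties. The helper `build_sql_db_jdbc`, the server, database, table and path names are assumptions for illustration, and in practice the password should come from a secret scope rather than a literal.

```python
def build_sql_db_jdbc(server, database, user, password):
    """Return (url, properties) for a JDBC connection to Azure SQL DB (hypothetical helper)."""
    url = (
        f"jdbc:sqlserver://{server}.database.windows.net:1433;"
        f"database={database};encrypt=true;trustServerCertificate=false;"
        "hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
    )
    properties = {
        "user": user,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }
    return url, properties


url, properties = build_sql_db_jdbc(
    server="myserver",  # assumed logical server name
    database="mydb",  # assumed database name
    user="loader",  # assumed SQL login
    password="<from-secret-scope>",  # placeholder, never hard-code a real password
)

# Inside a Databricks notebook, with the parquet output already on a mount point:
#   df = spark.read.parquet("/mnt/output/aggregated")   # assumed path
#   df.write.jdbc(url=url, table="dbo.Aggregated", mode="append", properties=properties)
print(url)
```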
