Quite simply the objective as follows: Move data from Azure SQL Database to Azure SQL DW via Azure Data Factory v2 (ADF).
Category Archives: Azure DWH
Pausing / Resuming Azure SQL DW
Before writing about pausing (and resuming) Azure SQL Data Warehouse (DW) it makes sense to discuss the architecture of this product. At a high level it involves a control node, a MPP (Massively Parallel Processing) engine compromising of compute nodes and storage. Perfectly summarised by this image.
Azure SQL Data Warehouse Query Store
I personally think that query store has been a fantastic feature. I find myself using it for query performance troubleshooting (plan regressions is a big one). This has always been available since SQL Server 2016 and even Azure SQL Database but now Microsoft have made it available for Azure SQL Data Warehouse (DW).
If you connect to the Azure SQL DW via SQL Server Management Studio (SSMS) there is no Query Store Node (for the version, I was checking 17.9.1). I was even wondering if it was on by default?
Azure SQL DW – Query Labels
Using a query label in Azure SQL DW (Data Warehouse) can be a really handy technique to track queries via DMVs. You might want to do this to see what problematic queries are doing under the covers.
Let’s check out an example. First I will show you how things would look without using a query label. I connect to SQL DW and issue the following basic example query.
SELECT sum(p.StandardCost),ps.[EnglishProductSubcategoryName] FROM [dbo].[DimProduct] p JOIN [dbo].[DimProductSubcategory] ps on P.ProductSubcategoryKey = ps.ProductSubcategoryKey GROUP BY p.standardCost, ps.[EnglishProductSubcategoryName] ORDER BY ps.[EnglishProductSubcategoryName]
So what query would you use to see what is going on within this MPP environment? You would normally use sys.dm_pdw_exec_requests and/or sys.dm_pdw_request_steps. Something like:
SELECT r.session_id, r.start_time, r.end_time,rs.location_type, rs.status,rs.command,rs.step_index, rs.operation_type,rs.request_id FROM sys.dm_pdw_exec_requests r JOIN sys.dm_pdw_request_steps rs ON r.request_id = rs.request_id
Output:
However you probably would not want to manually go through the output to try and piece the output together, this is where labels are handy. The original code now becomes:
SELECT sum(p.StandardCost),ps.[EnglishProductSubcategoryName] FROM [dbo].[DimProduct] p JOIN [dbo].[DimProductSubcategory] ps on P.ProductSubcategoryKey = ps.ProductSubcategoryKey GROUP BY p.standardCost, ps.[EnglishProductSubcategoryName] ORDER BY ps.[EnglishProductSubcategoryName] OPTION (LABEL = 'QUERY: MySalesCodeSum');
We then use this LABEL within the DMV based query.
SELECT r.session_id, r.start_time, r.end_time,rs.location_type, rs.status,rs.command,rs.step_index, rs.operation_type,rs.request_id FROM sys.dm_pdw_exec_requests r JOIN sys.dm_pdw_request_steps rs ON r.request_id = rs.request_id WHERE r.[label] = 'QUERY: MySalesCodeSum' ORDER BY step_index
Now you will have the output relevant to the original query only.
High level SQL DW is creating temp tables such as:
CREATE TABLE [tempdb].[dbo].[TEMP_ID_16] ([ProductSubcategoryKey] INT, [StandardCost] MONEY ) WITH(DATA_COMPRESSION=PAGE);
DMS does the famous ShuffleMoveOperation then the compute node actually ends up running:
SELECT [T1_1].[col] AS [col], [T1_1].[EnglishProductSubcategoryName] AS [EnglishProductSubcategoryName] FROM (SELECT SUM([T2_2].[StandardCost]) AS [col], [T2_1].[EnglishProductSubcategoryName] AS [EnglishProductSubcategoryName] FROM [tempdb].[dbo].[TEMP_ID_15] AS T2_1 INNER JOIN [tempdb].[dbo].[TEMP_ID_16] AS T2_2 ON ([T2_1].[ProductSubcategoryKey] = [T2_2].[ProductSubcategoryKey]) GROUP BY [T2_2].[StandardCost], [T2_1].[EnglishProductSubcategoryName]) AS T1_1 ORDER BY [T1_1].[EnglishProductSubcategoryName] ASC
Quite cool right?
Azure SQL Data Warehouse and Backups
Just because the cloud movement is strong doesn’t mean the end of “DBA’s”, it does mean a change in skills and no doubt you will (one day) create Azure SQL Data Warehouse (DW) in Azure. If you are from an operational background like me then backups will be on your mind for this product. The question is how are backups done with Azure SQL DW?
Azure SQL DW – Let’s Shuffle?
Initially, the main focus of this post was going to be quick and about using the latest version of SSMS (SQL Server Management Studio) to check out execution plans for Azure SQL Data warehouse (DW). I got a little side tracked by a certain operation called – SHUFFLE, because, I like the name.
Azure SQL DW – Database Has Been Paused
I was looking forward to do some work with Azure SQL Data Warehouse (DWH) when I hit the new query button and had the following message pop up –
Cannot connect to database when it is paused. (.Net SqlClient Data Provider)
Database Has Been Paused
I was looking forward to do some work with Azure SQL Data Warehouse (DWH) when I hit the new query button and had the following message pop up –
Cannot connect to database when it is paused. (.Net SqlClient Data Provider)