My backup and SLEEP_BPOOL_FLUSH

I noticed that my backup was taking longer than usual.

I went to Perfmon to look at some counters, more specifically Physical disk:Avg sec/read(orange/yellow line) and Physical disk:Avg sec/write(blue line), yes it was very busy, much busier than normal.

bak2

Waits stats was painting the following picture: I was very intrigued with the sleep_bpool_flush so I headed straight over to the waits library.

waits1bak

You get this wait type when the checkpoint process realizes that it’s saturating the I/O subsystem. (https://www.sqlskills.com/help/waits/sleep_bpool_flush/), so it seemed that my backup could not even issue a checkpoint successfully. With that thinking I decided to do a manual checkpoint which just made the problem worse.

blockme

After some detective work I found the answer. Someone enabled TDE and it was going though the encryption process whilst I was trying to take a backup so my backup was never going to complete anytime soon.

The error message I was ended up with was  “Msg 5901, Level 16, State 1, Line 7 One or more recovery units belonging to database ‘NEWdb’ failed to generate a checkpoint”.

You can use the following query to understand the state of the encryption process for your databases.

SELECT
db.name,db.is_encrypted,dm.encryption_state,dm.percent_complete,dm.key_algorithm,dm.key_length
FROM
sys.databases db
LEFT OUTER JOIN sys.dm_database_encryption_keys dm
ON db.database_id = dm.database_id

bak3

When you have a state of 2 it means encryption is in progress, you may have heard of this as an encryption scan.

This behaviour is confirmed via Books on line and totally makes sense. (Last bullet point)

The following operations are not allowed during the CREATE DATABASE ENCRYPTION KEY, ALTER DATABASE ENCRYPTION KEY, DROP DATABASE ENCRYPTION KEY, or ALTER DATABASE…SET ENCRYPTION statements.

  • Dropping a file from a filegroup in the database.
  • Dropping the database.
  • Taking the database offline.
  • Detaching a database.
  • Transitioning a database or filegroup into a READ ONLY state.
  • Using an ALTER DATABASE command.
  • Starting a database or database file backup.

You live and learn.

Using Extended Events Telemetry Session

You are probably wondering what this is, quite honestly I was wondering too. Well it is an extended events session known as telemetry_xevents that is shipped with SQL Server 2016. This is not the same as the system_health session that we have come to know and love, it captures very different information.

There are a lot of queries within this blog post but I think you will find them really useful because getting practical with it is the best way to show everyone what it is all about.

You can see the telemetry_xevents session via SSMS (SQL Server Management Studio).

tele

The script definition of telemetry_xevents is massive but I think it is important to show what it captures.

CREATE EVENT SESSION [telemetry_xevents] ON SERVER
ADD EVENT qds.query_store_db_diagnostics,
ADD EVENT sqlserver.always_encrypted_query_count,
ADD EVENT sqlserver.auto_stats,
ADD EVENT sqlserver.column_store_index_build_low_memory,
ADD EVENT sqlserver.column_store_index_build_throttle,
ADD EVENT sqlserver.columnstore_delete_buffer_flush_failed,
ADD EVENT sqlserver.columnstore_delta_rowgroup_closed,
ADD EVENT sqlserver.columnstore_index_reorg_failed,
ADD EVENT sqlserver.columnstore_log_exception,
ADD EVENT sqlserver.columnstore_rowgroup_merge_failed,
ADD EVENT sqlserver.columnstore_tuple_mover_delete_buffer_truncate_timed_out,
ADD EVENT sqlserver.columnstore_tuple_mover_end_compress,
ADD EVENT sqlserver.data_masking_ddl_column_definition,
ADD EVENT sqlserver.data_masking_traffic,
ADD EVENT sqlserver.data_masking_traffic_masked_only,
ADD EVENT sqlserver.database_cmptlevel_change,
ADD EVENT sqlserver.database_created,
ADD EVENT sqlserver.database_dropped,
ADD EVENT sqlserver.error_reported(
WHERE ([severity]>=(16) OR ([error_number]=(18456) OR [error_number]=(17803) OR [error_number]=(701) OR [error_number]=(802) OR [error_number]=(8645) OR [error_number]=(8651) OR [error_number]=(8657) OR [error_number]=(8902) OR [error_number]=(41354) OR [error_number]=(41355) OR [error_number]=(41367) OR [error_number]=(41384) OR [error_number]=(41336) OR [error_number]=(41309) OR [error_number]=(41312) OR [error_number]=(41313)))),
ADD EVENT sqlserver.json_function_compiled(
ACTION(sqlserver.database_id)),
ADD EVENT sqlserver.missing_column_statistics,
ADD EVENT sqlserver.missing_join_predicate,
ADD EVENT sqlserver.natively_compiled_module_inefficiency_detected,
ADD EVENT sqlserver.natively_compiled_proc_slow_parameter_passing,
ADD EVENT sqlserver.query_memory_grant_blocking,
ADD EVENT sqlserver.reason_many_foreign_keys_operator_not_used,
ADD EVENT sqlserver.rls_query_count,
ADD EVENT sqlserver.sequence_function_used(
ACTION(sqlserver.database_id)),
ADD EVENT sqlserver.server_memory_change,
ADD EVENT sqlserver.server_start_stop,
ADD EVENT sqlserver.stretch_database_disable_completed,
ADD EVENT sqlserver.stretch_database_enable_completed,
ADD EVENT sqlserver.stretch_database_reauthorize_completed,
ADD EVENT sqlserver.stretch_index_reconciliation_codegen_completed,
ADD EVENT sqlserver.stretch_remote_column_execution_completed,
ADD EVENT sqlserver.stretch_remote_column_reconciliation_codegen_completed,
ADD EVENT sqlserver.stretch_remote_index_execution_completed,
ADD EVENT sqlserver.stretch_table_alter_ddl,
ADD EVENT sqlserver.stretch_table_codegen_completed,
ADD EVENT sqlserver.stretch_table_create_ddl,
ADD EVENT sqlserver.stretch_table_data_reconciliation_results_event,
ADD EVENT sqlserver.stretch_table_hinted_admin_delete_event,
ADD EVENT sqlserver.stretch_table_hinted_admin_update_event,
ADD EVENT sqlserver.stretch_table_predicate_not_specified,
ADD EVENT sqlserver.stretch_table_predicate_specified,
ADD EVENT sqlserver.stretch_table_query_error,
ADD EVENT sqlserver.stretch_table_remote_creation_completed,
ADD EVENT sqlserver.stretch_table_row_migration_results_event,
ADD EVENT sqlserver.stretch_table_row_unmigration_results_event,
ADD EVENT sqlserver.stretch_table_unprovision_completed,
ADD EVENT sqlserver.stretch_table_validation_error,
ADD EVENT sqlserver.string_escape_compiled(
ACTION(sqlserver.database_id)),
ADD EVENT sqlserver.temporal_ddl_period_add,
ADD EVENT sqlserver.temporal_ddl_period_drop,
ADD EVENT sqlserver.temporal_ddl_schema_check_fail,
ADD EVENT sqlserver.temporal_ddl_system_versioning,
ADD EVENT sqlserver.temporal_dml_transaction_fail,
ADD EVENT sqlserver.window_function_used(
ACTION(sqlserver.database_id)),
ADD EVENT sqlserver.xtp_alter_table,
ADD EVENT sqlserver.xtp_db_delete_only_mode_updatedhktrimlsn,
ADD EVENT sqlserver.xtp_stgif_container_added,
ADD EVENT sqlserver.xtp_stgif_container_deleted,
ADD EVENT XtpCompile.cl_duration,
ADD EVENT XtpEngine.parallel_alter_stats,
ADD EVENT XtpEngine.serial_alter_stats,
ADD EVENT XtpEngine.xtp_db_delete_only_mode_enter,
ADD EVENT XtpEngine.xtp_db_delete_only_mode_exit,
ADD EVENT XtpEngine.xtp_db_delete_only_mode_update,
ADD EVENT XtpEngine.xtp_physical_db_restarted
ADD TARGET package0.ring_buffer(SET occurrence_number=(100))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=120 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
GO

As you can see it is very different when comparing it to the system health session and more importantly there is no XEL file target, querying the ring buffer is quite different.

To see things at a high level you would run the following basic query.


DECLARE @ShredMe XML;
SELECT @ShredMe = CAST(target_data AS XML)
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON t.event_session_address = s.address
WHERE s.name = N'telemetry_xevents';

SELECT
qp.query('.') AS event_data
FROM @ShredMe.nodes('RingBufferTarget/event') AS q(qp);

tele1

It’s much better if we can shred the XML to be more specific. I gave it a go using XQuery because I had a lot of the auto_stats events.

DECLARE @ShredMe XML;
SELECT @ShredMe = CAST(target_data AS XML)
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON t.event_session_address = s.address
WHERE s.name = N'telemetry_xevents';

SELECT
QP.value('(@timestamp)[1]', 'datetime2') AS [timestamp logged],
QP.value('(data[@name="database_id"]/value)[1]', 'INT') as [DatabaseID],
QP.value('(data[@name="object_id"]/value)[1]', 'INT') as [ObjectID],
QP.value('(data[@name="index_id"]/value)[1]', 'INT') as [IndexID],
QP.value('(data[@name="job_type"]/text)[1]', 'VARCHAR(MAX)') as [Job Type],
QP.value('(data[@name="statistics_list"]/value)[1]', 'VARCHAR(MAX)') as [Stats List]
FROM @ShredMe.nodes('RingBufferTarget/event[@name=''auto_stats'']') AS q(QP);
GO

statsxe

Looking at the above you can see that you have the ability to get a time when a stats object has been created,loaded and/or updated.

I think there is great information available here, I started to dig in more.

Example 1

I am going to find database creation times. I created couple of databases and then issued the following query:

DECLARE @ShredMe XML;
SELECT @ShredMe = CAST(target_data AS XML)
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON t.event_session_address = s.address
WHERE s.name = N'telemetry_xevents';

SELECT
QP.value('(data[@name="database_name"]/value)[1]', 'varchar(20)') as [DatabaseName],
QP.value('(@timestamp)[1]', 'datetime2') AS [timestamp created]

FROM @ShredMe.nodes('RingBufferTarget/event[@name=''database_created'']') AS q(QP);

xedb

Example 2

How about finding when the compatibility level of a database changed?

DECLARE @ShredMe XML;
SELECT @ShredMe = CAST(target_data AS XML)
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON t.event_session_address = s.address
WHERE s.name = N'telemetry_xevents';

SELECT
QP.value('(data[@name="database_id"]/value)[1]', 'INT') as [DatabaseID],
QP.value('(@timestamp)[1]', 'datetime2') AS [timestamp of change],
QP.value('(data[@name="previous_value"]/value)[1]', 'INT') as [Previous cmptlevel],
QP.value('(data[@name="new_value"]/value)[1]', 'INT') as [Newcmptlevel]

FROM @ShredMe.nodes('RingBufferTarget/event[@name=''database_cmptlevel_change'']') AS q(QP);

teledb1

Example 3

Finding the times when a database was dropped.


DECLARE @ShredMe XML;
SELECT @ShredMe = CAST(target_data AS XML)
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON t.event_session_address = s.address
WHERE s.name = N'telemetry_xevents';

SELECT
QP.value('(data[@name="database_name"]/value)[1]', 'varchar(20)') as [DatabaseName],
QP.value('(@timestamp)[1]', 'datetime2') AS [timestamp when dropped]
FROM @ShredMe.nodes('RingBufferTarget/event[@name=''database_dropped'']') AS q(QP);

tele2

Example 4

Memory change tracking.

DECLARE @ShredMe XML;
SELECT @ShredMe = CAST(target_data AS XML)
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON t.event_session_address = s.address
WHERE s.name = N'telemetry_xevents';

SELECT
QP.value('(data[@name="memory_change"]/text)[1]', 'varchar(256)') as [Memory Change Desc],
QP.value('(@timestamp)[1]', 'datetime2') AS [timestamp changed],
QP.value('(data[@name="new_memory_size_mb"]/value)[1]', 'int') as [New Memory size(MB)]
FROM @ShredMe.nodes('RingBufferTarget/event[@name=''server_memory_change'']') AS q(QP);

2016ram

After playing with the memory settings I then received the below:

ram2.JPG

Example 5

Checking if someone created a temporal table, then you can use the ID (which I labelled it as TableId) and find it in sys.tables.

DECLARE @ShredMe XML;
SELECT @ShredMe = CAST(target_data AS XML)
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON t.event_session_address = s.address
WHERE s.name = N'telemetry_xevents';

SELECT
QP.value('(data[@name="database_id"]/value)[1]', 'INT') as [DatabaseID],
QP.value('(data[@name="table_id"]/value)[1]', 'BIGINT') as [TableID],
QP.value('(@timestamp)[1]', 'datetime2') AS [timestamp created temporal DDL],
QP.value('(data[@name="period_type"]/text)[1]', 'varchar(20)') as [Type]
FROM @ShredMe.nodes('RingBufferTarget/event[@name=''temporal_ddl_period_add'']') AS q(QP);

versions.JPG

As mentioned you can then confirm the name of the table via the below query:

select * from sys.tables
where object_id = 645577338

temptable

Example 6

Finally I check for error messages greater than level 16.

DECLARE @ShredMe XML;
SELECT @ShredMe = CAST(target_data AS XML)
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
ON t.event_session_address = s.address
WHERE s.name = N'telemetry_xevents';

SELECT
QP.value('(data[@name="severity"]/value)[1]', 'INT') as [severity level],
QP.value('(@timestamp)[1]', 'datetime2') AS [timestamp logged],
QP.value('(data[@name="message"]/value)[1]', 'VARCHAR(MAX)') as [Message]
FROM @ShredMe.nodes('RingBufferTarget/event[@name=''error_reported'']') AS q(QP);
GO

errors.JPG

The queries you can build to get information are endless.

As a side not I did get missing join predicate warnings but the XML didn’t really return anything for me to query.

Do not forget about the drawbacks. The ring buffer isn’t perfect, if anything I found it quite annoying.  It has a maximum amount of memory in kilobytes (KB) to use that being 4096 and 100 occurrences which means the preferred number of events of each type to keep in the buffer. After a restart the information will be lost.

Looking at wait stats differently with SQL 2016

Wait statistics is a great place to start looking into performance issues. Normally people would issue commands/queries based on  sys.dm_os_wait_stats and/or sys.dm_os_waiting_tasks. Did you know from SQL 2016 you have the ability to work with wait stats at a session level? This is done via sys.dm_exec_session_waits_stats. As stated by Microsoft documentation the “DMV resets the information for a session when the session is opened or when the session is reset” .

This is what is returned. The key bit here is the session_ID.

waitsdmv1

So let’s try and put this to use. Please note I use SQLskills queries written by Paul Randal for my wait stats analysis as shown here http://www.sqlskills.com/blogs/paul/wait-statistics-or-please-tell-me-where-it-hurts/, probably because that is where I learnt about the topic.

So I run a specific workload such as

DBCC CHECKDB ('AdventureWorks2016')
GO

Then I use the query from the above link which is based on  sys.dm_os_wait_stats  to check wait stats returning the following snippet (I did not clear anything out beforehand):

waitsmain

This tells me about the waits since my last reboot or since a manual reset of the stats. It’s probably why you should do at least time-based analysis or reset the wait stats before starting, that is if you are interested in something time specific or if you want to understand certain workloads at a given time.

So the other option is that you could go down the session level route. With the session based analysis I took the query and changed it slightly to query sys.dm_exec_session_wait_stats and also pull back the session_id that I am interested in.

First thing is that I used the following to get an ID of interest. This was actually the connection for when I ran the CHECKDB command – SPID 56.


SELECT @@SPID AS [Current SPID]

spid56

I then use this ID within the WHERE clause of the query so I could track it to get session level wait stats.

checkwait

As you would expect it is very specific and quite relevant for the workload (CHECKDB) and for that moment in time only. Its definitely not a replacement technique for the other traditional ways but it could be something that you could use if you want to track a session ID real time.

Here is another example, this time tracking session_id 53 which is a big SELECT statement.

select.JPG

Session level waits:

rbar

Using the DMV in its most simplest form would return the following:

SELECT * FROM sys.dm_exec_session_wait_stats
WHERE session_id = 54

simple

 

I guess you could even use sys.dm_exec_session_waits_stats to get some further details when/if you query sys.dm_exec_sessions or anywhere else where you can join onto the session_id, such as:

SELECT
s.program_name,s.login_time,s.database_id, cws.wait_type,cws.wait_time_ms
FROM sys.dm_exec_session_wait_stats AS cws
JOIN sys.dm_exec_sessions AS s
ON cws.session_id = s.session_id
WHERE cws.session_id = 54
ORDER BY wait_time_ms DESC

Snippet:

waits2016_2

Checking Time Zones in SQL Server

I was building some basic queries around time zone manipulation and I am happy to say that I enjoyed myself as I found a way to get a time based on a geographic region.

This new feature is available to you from SQL 2016 which obviously means SQL database too (Azure). It is called AT TIME ZONE.

AT TIME ZONE implementation relies on following registry hive where time zones are installed: KEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones. This is exposed via the following query


SELECT * FROM sys.time_zone_info

Looking at books on line via https://msdn.microsoft.com/en-us/library/mt612790.aspx it is wrong.

Look at the screen shot, it states it is supported from SQL 2008 – well it is not, it only works on 2016.

times

Running it on a non SQL 2016 server you will get the following message:

Msg 208, Level 16, State 1, Line 1 Invalid object name ‘sys.time_zone_info’.

On a correct SQL version it will return the following. (Snippet)

times2

This information is key as we will use it to build the time zone queries (which is very basic).

SELECT SYSDATETIMEOFFSET() AT TIME ZONE 'GMT Standard Time' AS [My Location]
GO
SELECT SYSDATETIMEOFFSET() AT TIME ZONE 'US Mountain Standard Time' AS [US Mountain Standard Time]
GO
SELECT SYSDATETIMEOFFSET() AT TIME ZONE 'Central Asia Standard Time' AS [Central Asia Standard Time]
GO
SELECT SYSDATETIMEOFFSET() AT TIME ZONE 'AUS Central Standard Time' AS [AUS Central Standard Time]

Based on the zone offset you will get the relevant correct time.

times3

 

I used to shrink my databases!

This is an entry level post and a response to SQLEspresso’s blog challenge(http://sqlespresso.com/2017/01/10/ooops-was-that-was-me-blog-challenge) where we share mistakes from our “younger” days. My post takes me back 11 years and while it is nothing ground breaking I still  want to convey what shrinking does to your database. Why? Well it was something that I USED to do. ( Again – I will reinforce the point it was a long time ago)

Well not only does it generate a lot of I/O, consumes CPU but it also affects your fragmentation levels in indexes which is what we will look at today.

These tables and indexes are based from Jonathan’s script found here: https://www.sqlskills.com/blogs/jonathan/enlarging-the-adventureworks-sample-databases/

So I issued the below TSQL where I would expect to see no fragmentation after the rebuild (Index level 0 being the leaf node). Yes the page count is low but it’s the concept I want to talk about.


ALTER INDEX PK_SalesOrderHeaderEnlarged_SalesOrderID ON Sales.SalesOrderHeaderEnlarged REBUILD

DECLARE @db_id SMALLINT;
DECLARE @object_id INT;  

SET @db_id = DB_ID(N'AdventureWorks1997');
SET @object_id = OBJECT_ID(N'Sales.SalesOrderHeaderEnlarged');  

IF @db_id IS NULL
BEGIN;
    PRINT N'Invalid database';
END;
ELSE IF @object_id IS NULL
BEGIN;
    PRINT N'Invalid object';
END;
ELSE
BEGIN;
    SELECT * FROM sys.dm_db_index_physical_stats(@db_id, @object_id, NULL, NULL , 'detailed');
END;
GO  

shrink

Now let’s SHRINK. You can actually do this via SQL Server Management Studio.


DBCC SHRINKDATABASE(N'AdventureWorks1997' )
GO

DECLARE @db_id SMALLINT;
DECLARE @object_id INT;  

SET @db_id = DB_ID(N'AdventureWorks1997');
SET @object_id = OBJECT_ID(N'Sales.SalesOrderHeaderEnlarged');  

IF @db_id IS NULL
BEGIN;
    PRINT N'Invalid database';
END;
ELSE IF @object_id IS NULL
BEGIN;
    PRINT N'Invalid object';
END;
ELSE
BEGIN;
    SELECT * FROM sys.dm_db_index_physical_stats(@db_id, @object_id, NULL, NULL , 'detailed');
END;
GO

99.85% avg_fragmentation, in essence the order has been reversed.

shrinkafterOk, for a one-of activity I don’t mind  you could just sort out the fragmentation afterwards, but to do it as part of a maintenance routine, not the best option out there. SHAME ON ME!