Differential Changed Map (DCM): How SQL Server Tracks Changed Pages

2014-05-07 - Backup, General, Series, SQL Server Internals, Storage Wednesday

Introduction

SQL Server has three backup modes:

  • Full
  • Differential
  • Log

The full backup writes a copy of all database pages (including empty ones) to the backup file. The full backup also captures enough of the database log to be able to recover the database to a transactionally consistent state at restore time.

The differential backup works in a similar way, but includes only those extents in the backup that were changed since the last full backup.

The log backup is independent of the other two backup types and backs up the piece of the log file that has changed since the last log backup was taken. If no log backup had been taken of the database since it was created (or switched out of simple recovery mode), the log backup backs up the piece of the log since the last full backup.

The Differential Change Map (DCM)

For SQL Server to be able to backup only those extents during a differential backup that changed since the last full backup, it needs a way to identify pages that changed. In the page header of each page, SQL Server stores the last LSN that changed that page (in the m_lsn field).
It also stores the last LSN number included in the last full backup. That means, SQL Server could read each page, compare the two LSN values and copy the page to the backup file, if its LSN is larger than the full-backup-LSN. However, that would require every page to be read into memory, quite an expensive operation for large databases.

Instead SQL Server uses a bitmap, similar to the one in GAM pages, to keep track of extents that changed since the last full backup. This bitmap is called Differential Changed Map or DCM. The DCM is spread over several DCM pages in the database. Each DCM page covers a single GAM interval. The DCM page for each GAM interval is the seventh page within that interval, and as such it is part of the GAM extent. If the bit for a particular extent is set in its DCM page, at least one page within that extent has been changed since the last full backup.

A DCM Example

To demonstrate the DCM in action, we first need a simple table:

[sql] IF OBJECT_ID('dbo.ATable') IS NOT NULL DROP TABLE dbo.ATable;
CREATE TABLE dbo.ATable(id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, cc CHAR(8000));
GO
INSERT INTO dbo.ATable(cc) SELECT CAST(NEWID() AS VARCHAR(MAX));
GO 7000
[/sql]

After creating that table, we can take a full backup:

[sql] BACKUP DATABASE [Image ©sqlity.net 2014] TO DISK = 'c:\temp\Backup.bak' WITH FORMAT,INIT;
[/sql]

After every backup, SQL Server stores information about that backup in the database. That causes some pages to change right away, so you will never see a completely empty DCM:

The DCM after the first full backup.

The complete output of the DBCC PAGE command for the first DCM page (page 6) is below. (Remember to set trace flag 3604 before using DBCC PAGE.)

[sourcecode] PAGE: (1:6)

BUFFER:

BUF @0x0000000004812C40

bpage = 0x00000001F3D1E000 bhash = 0x0000000000000000 bpageno = (1:6)
bdbid = 16 breferences = 15 bcputicks = 670
bsampleCount = 3 bUse1 = 47227 bstat = 0xb
blog = 0x7a7a7a7a bnext = 0x0000000000000000

PAGE HEADER:

Page @0x00000001F3D1E000

m_pageId = (1:6) m_headerVersion = 1 m_type = 16
m_typeFlagBits = 0x0 m_level = 0 m_flagBits = 0x0
m_objId (AllocUnitId.idObj) = 99 m_indexId (AllocUnitId.idInd) = 0 Metadata: AllocUnitId = 6488064
Metadata: PartitionId = 0 Metadata: IndexId = 0 Metadata: ObjectId = 99
m_prevPage = (0:0) m_nextPage = (0:0) pminlen = 90
m_slotCnt = 2 m_freeCnt = 6 m_freeData = 8182
m_reservedCnt = 0 m_lsn = (285:2064:1) m_xactReserved = 0
m_xdesId = (0:0) m_ghostRecCnt = 0 m_tornBits = -2051141691
DB Frag ID = 1

Allocation Status

GAM (1:2) = ALLOCATED SGAM (1:3) = NOT ALLOCATED PFS (1:1) = 0x44 ALLOCATED 100_PCT_FULL
DIFF (1:6) = CHANGED ML (1:7) = NOT MIN_LOGGED

DIFF_MAP: Header @0x000000006E28A064 Slot 0, Offset 96

status = 0x0

DIFF_MAP: Extent Alloc Status @0x000000006E28A0C2

(1:0) - (1:16) = CHANGED
(1:24) - (1:56) = NOT CHANGED
(1:64) - = CHANGED
(1:72) - (1:144) = NOT CHANGED
(1:152) - = CHANGED
(1:160) - (1:7680) = NOT CHANGED

DBCC execution completed. If DBCC printed error messages, contact your system administrator.
[/sourcecode]

In the header you can see that the page type of DCM pages is 16. At the end of the output is the list of the extents that contain changed pages. There are currently five changed extents listed (1:0, 1:8, 1:16, 1:64, 1:152).

Now let us apply a few changes to the table:

[sql] UPDATE dbo.ATable SET
cc = CAST(NEWID() AS VARCHAR(MAX))
WHERE id IN (100,800,1500,2200);
[/sql]

This will cause at least four pages to change. In the example, they happen to be located in four different extents too:

The DCM after the first UPDATE.

The full differential changed map looks like this:

[sourcecode] DIFF_MAP: Header @0x000000006E28A064 Slot 0, Offset 96

status = 0x0

DIFF_MAP: Extent Alloc Status @0x000000006E28A0C2

(1:0) - (1:16) = CHANGED
(1:24) - (1:56) = NOT CHANGED
(1:64) - = CHANGED
(1:72) - (1:144) = NOT CHANGED
(1:152) - = CHANGED
(1:160) - (1:432) = NOT CHANGED
(1:440) - = CHANGED
(1:448) - (1:1152) = NOT CHANGED
(1:1160) - = CHANGED
(1:1168) - (1:1856) = NOT CHANGED
(1:1864) - = CHANGED
(1:1872) - (1:2552) = NOT CHANGED
(1:2560) - = CHANGED
(1:2568) - (1:7680) = NOT CHANGED
[/sourcecode]

In addition to the extents that were marked as changed before (1:0, 1:8, 1:16,1:64, 1:152), now there are four new ones: 1:440, 1:1160, 1:1864 and 1:2560

That means that there are currently nine extents or 72 pages marked as changed since the last full backup was taken. Let us try to take a differential backup now:

[sql] BACKUP DATABASE [Image ©sqlity.net 2014] TO DISK = 'c:\temp\Backup.bak' WITH DIFFERENTIAL;
[/sql]

This backup writes 96 pages to disk, 24 more than expected:

Pages affected by differential backup after first update.

Those additional pages do not come from a second GAM interval as you might think. They are just additional internal pages that SQL Server requires to be part of the differential backup.

Let us change a few more records in that table:

[sql] UPDATE dbo.ATable SET
cc = CAST(NEWID() AS VARCHAR(MAX))
WHERE id IN (3100,3800,4500,5200);
[/sql]

This will again change four pages, each in a separate extent:

The DCM after the second backup

The complete DCM now looks like this:

[sourcecode] (1:0) - (1:16) = CHANGED
(1:24) - (1:56) = NOT CHANGED
(1:64) - = CHANGED
(1:72) - = NOT CHANGED
(1:80) - = CHANGED
(1:88) - (1:128) = NOT CHANGED
(1:136) - = CHANGED
(1:144) - = NOT CHANGED
(1:152) - = CHANGED
(1:160) - (1:296) = NOT CHANGED
(1:304) - = CHANGED
(1:312) - (1:432) = NOT CHANGED
(1:440) - = CHANGED
(1:448) - (1:1152) = NOT CHANGED
(1:1160) - = CHANGED
(1:1168) - (1:1856) = NOT CHANGED
(1:1864) - = CHANGED
(1:1872) - (1:2552) = NOT CHANGED
(1:2560) - = CHANGED
(1:2568) - (1:3456) = NOT CHANGED
(1:3464) - = CHANGED
(1:3472) - (1:4160) = NOT CHANGED
(1:4168) - = CHANGED
(1:4176) - (1:4864) = NOT CHANGED
(1:4872) - = CHANGED
(1:4880) - (1:5568) = NOT CHANGED
(1:5576) - = CHANGED
(1:5584) - (1:7680) = NOT CHANGED
[/sourcecode]

It lists 13 changed extends. Therefore, we would expect 104 pages to be included in the next differential backup. Let us try:

[sql] BACKUP DATABASE [Image ©sqlity.net 2014] TO DISK = 'c:\temp\Backup.bak' WITH DIFFERENTIAL;
[/sql]

Again, that actual number is a little higher:

Pages affected by differential backup after second update.

In fact, it is higher by 24.But that is the exact same number that we had before as overage. So it seems that SQL Server backs up three extends in addition to the extents listed in the DCM.

The DCM is reset, every time a full backup is taken (unless COPY_ONLY is specified):

The DCM after the second full backup.

These are the same five changed extents that we had after the first full backup.

Summary

The Differential Changed Map or DCM is a bitmap that SQL Server uses to track which pages changed since the last full backup. It is stored in DCM pages, one per GAM interval. A differential backup will backup the pages marked as changed in the DCM. There are usually three additional extents included in a differential backup.

Categories: Backup, General, Series, SQL Server Internals, Storage Wednesday
Tags: , , , , , ,

3 Responses to Differential Changed Map (DCM): How SQL Server Tracks Changed Pages

  1. Pingback: Bulk Changed Map (BCM): The Minimally Logged Advantage - sqlity.net

Leave a Reply