Thursday, 20 June 2013

Snapshot Conundrum
VMware have a KB article dealing with understanding Snapshots:
In it they provide a diagram that shows how a VM with 3 snapshots against it operates. I've copied the diagram below.

I just finished reading a VMware book and the authors made a statement which piqued my curiosity: "If you have three snapshots, any new data is written to all three". This conflicted with what I had learned and assumed from my VCP days. The diagram above appears to support this statement (apart from the VM write to the Parent Disk, for which the Authors have issued an erratum) but the KB article itself makes no similar assertion.

My interest was around performance. If I create multiple snapshots, how much does it degrade performance? I understand that Reads may have to traverse the whole string of snapshots back to the Parent disk to retrieve a file and ensure it's the most up-to-date version, but writes?

My own understanding was that Writes only happen to the active Snapshot disk; the others are essentially frozen. Confused by this contradiction, I wondered if it would be possible to test this in my lab. If I could use ESXTOP to view the disk statistics for the parent file and each snapshot disk, I could tell straightaway where the reads and writes were occurring. Easy? Goes to show I need to use ESXTOP more!

The disk statistics in VMware's ESXTOP aggregate all disks for the VM under one heading, even if they are stored on different Datastores. This means I can't see "inside" each disk file to tell where the writes are occurring based on performance counters in ESXi. I can see individual disk latency in the ViClient but nothing more granular than that.
So, maybe we can use the modified date & timestamp and view the disk size after a change to determine where the writes are happening. We can view the directory in Putty and use "ls -alh" to see what the directory looks like for a single VM with one virtual disk and 3 snapshots against it.
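The same before-and-after comparison can be scripted rather than eyeballed from two "ls -alh" listings. The following is only a minimal sketch: the helper names (scan, changed, demo) are my own, and the file names in the demo are stand-ins created in a temporary folder, not the real contents of the VM's directory. It records the size and modified time of every file, then reports which ones differ after a write.

```python
import os
import tempfile


def scan(folder):
    """Return {filename: (size_bytes, mtime)} for each regular file in folder."""
    state = {}
    for entry in os.scandir(folder):
        if entry.is_file():
            st = entry.stat()
            state[entry.name] = (st.st_size, st.st_mtime)
    return state


def changed(before, after):
    """Names of files that are new, or whose size or modified time differ."""
    return sorted(name for name, meta in after.items() if before.get(name) != meta)


def demo():
    # Stand-in file names (hypothetical): one parent disk plus three snapshot disks.
    names = [
        "Snapshot-flat.vmdk",
        "Snapshot-000001-delta.vmdk",
        "Snapshot-000002-delta.vmdk",
        "Snapshot-000003-delta.vmdk",
    ]
    with tempfile.TemporaryDirectory() as folder:
        for name in names:
            with open(os.path.join(folder, name), "wb") as f:
                f.write(b"\0" * 1024)
        before = scan(folder)
        # Simulate a guest write landing only in the most recent snapshot disk.
        with open(os.path.join(folder, names[-1]), "ab") as f:
            f.write(b"\0" * 4096)
        after = scan(folder)
    return changed(before, after)


if __name__ == "__main__":
    print(demo())
```

Pointed at a real VM folder, scan would be called once before the file copy and once after, with changed listing whichever disk files moved.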

This is the VM's folder view in Putty with no snapshots, our baseline:
This is the VM's folder view in Putty after 3 Snapshots have been created, a few minutes apart:
Now, we're interested in which disk gets modified and grows if we drop a file into the VM. I'm going to copy the vSphere Client, a 110MB file, onto the C: drive and run the ls -alh command again to see what happens. We will either see the 000003 disk's modified time change and its size grow accordingly, or we'll see all three snapshot disks change the same way. The results are as follows:
And the result? Only the most recent Snapshot disk in the chain is affected by the file copy. The Modified Time and Size adjust appropriately but there is no associated change to the previous snapshots in the chain. The VM Parent Disk "Snapshot-flat.vmdk" is also unaffected. 

[Edit 21/6/13: Note the Snapshot file grows in 16MB chunks, hence its size is a multiple of 16MB and may be larger than the file copied into the VM - Thanks to VMware's Rick Blyths Post on Snapshots for that nugget!]
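As a quick worked example of that rounding, here is a small sketch. It assumes the 16MB increment quoted above and ignores snapshot metadata and any other guest writes that happen during the copy, so real growth may be larger. The function name allocated_mb is my own.

```python
import math

CHUNK_MB = 16  # growth increment quoted in the edit note above


def allocated_mb(written_mb, chunk_mb=CHUNK_MB):
    """Round the amount written up to the next whole chunk."""
    return math.ceil(written_mb / chunk_mb) * chunk_mb


if __name__ == "__main__":
    # The 110MB vSphere Client copy needs 7 chunks of 16MB.
    print(allocated_mb(110))  # 112
```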

A ViClient Datastore Browser view of the same thing is here:

I must admit I love reading something which challenges my preconceptions and motivates me to investigate. I found the rest of the book a great read, almost too short, as the case studies and experiences relayed are rarely found outside Blogs and Live events. As writing a book and getting it past editors and published is a significant achievement, I've nothing but respect for the Authors. I'm going to deliberately hold off naming anyone as my interest is from a technology perspective, but if the Authors leave a comment on my Blog I'm more than happy to publish it.

So, that's it for my very first blog post! I'll be posting again as soon as I find interesting topics to share. If you have any suggestions on the article above, can suggest alternative tests or spot obvious faults in my methodology please let me know. I'm still learning!

Mike

[Update: 21/6/13] Well, I found a way to monitor which snapshot files are being touched when a write occurs. How did I do it? Windows Server 2012! I enabled NFS and was able to view the contents of the folder with Windows Explorer and, more importantly, view the individual file writes with Resource Monitor. I copied the ViClient as before and then copied the vCenter install folder (1.2GB) to get a longer reading and captured the results. This is the Windows Folder "E:\NFS\Snapshot" Explorer view after both file copies:
This is the output from Windows Resource Monitor, Disk View, scoped to the System process. The NFS Root Share was created in E:\NFS and the VM name was called "Snapshot" as shown in the path below:
 
So that's it, Myth-Busted as they say! There are NO writes to the Parent or first two snapshot files, only the most recent Snapshot file (000003). If I do a command line directory read of the Windows Folder files & subdirectories (dir /s) which should all come from the parent I get the following result:
The Read traverses through the third and first snapshots before hitting the Parent disk where the files are. So there is a read overhead but no write overhead. Again, the only writes occur on the 000003 Snapshot disk.
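To make the read-versus-write difference concrete, here is a toy model of a snapshot chain. It is a deliberate simplification for illustration and not how ESXi implements its disks: the class name DiskChain and its counters are my own invention, and it checks every layer on the way down, whereas the capture above only showed the third and first snapshots being touched. Writes always land in the newest layer, while a read walks from the newest layer back towards the parent until it finds the block.

```python
class DiskChain:
    """Toy model: layers[0] is the parent disk, layers[-1] the active snapshot."""

    def __init__(self, parent_blocks, snapshots):
        self.layers = [dict(parent_blocks)] + [{} for _ in range(snapshots)]
        self.writes = [0] * len(self.layers)  # writes received per layer
        self.reads = [0] * len(self.layers)   # lookups performed per layer

    def write(self, block, data):
        # New data only ever goes to the most recent snapshot layer.
        self.layers[-1][block] = data
        self.writes[-1] += 1

    def read(self, block):
        # Walk from the newest layer back to the parent until the block is found.
        for i in range(len(self.layers) - 1, -1, -1):
            self.reads[i] += 1
            if block in self.layers[i]:
                return self.layers[i][block]
        return None


if __name__ == "__main__":
    chain = DiskChain({0: "os files"}, snapshots=3)
    chain.write(7, "copied file")
    print(chain.writes)   # [0, 0, 0, 1]
    print(chain.read(0))  # os files
    print(chain.reads)    # [1, 1, 1, 1]
```

In this model, old data costs one lookup per layer in the chain, while a write costs the same no matter how many snapshots exist.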

Note: I initially tried Windows 2012 iSCSI, but it uses a VHD file which you can't get inside with Resource Monitor. It had been a while since I'd attempted to use UNIX services for Windows, and I'm glad it's come a long way since then! Now that's over I can get some sleep!