How to Work with VMware Snapshots
Wouldn't it be nice to go back in time when something goes wrong? IT staff in virtual environments have that option, thanks to the much-loved snapshot capability, which preserves the state of a virtual machine (VM) at a particular point in time.
Before IT managers make a major change to virtualized systems, they create a snapshot. After the change, they can either remove the snapshot and continue working, or revert to the previous state of the system and begin anew.
As useful as snapshots are, however, they also bring some potential dangers. Heed these tips to use snapshots safely.
1. Never use a defragmenter while a snapshot is active.
A defragmenter can potentially change every block in a VM and create a giant snapshot. A snapshot starts as a small file and grows in increments of 16 megabytes.
While a snapshot is active, every change that would normally be written to the virtual machine disk (VMDK) goes to the snapshot file instead, so a defragmenter makes that file grow rapidly. If a block changes multiple times, only the last version is stored in the snapshot. In theory, if every block in the VMDK changes, the snapshot grows to the same size as the original VMDK, but it can never become larger.
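The growth behavior described above can be sketched as a small model. This is a simplified illustration, not VMware's actual on-disk format: the 1 MB block size and the 1 GB disk are hypothetical values chosen for the example, and snapshot metadata is ignored.

```python
import math

MB = 1024 * 1024
GROWTH_STEP = 16 * MB  # growth increment stated in the text


def snapshot_size(changed_blocks, block_size, vmdk_size):
    """Estimate the snapshot file size for a sequence of block writes.

    Simplified model: each distinct block is stored once (only the latest
    version is kept), the file grows in 16 MB steps, and it is capped at
    the size of the original disk.
    """
    distinct = set(changed_blocks)  # repeated writes to a block collapse to one
    raw = len(distinct) * block_size
    stepped = math.ceil(raw / GROWTH_STEP) * GROWTH_STEP
    return min(stepped, vmdk_size)


vmdk_size = 1024 * MB   # hypothetical 1 GB disk
block_size = 1 * MB     # hypothetical block size
total_blocks = vmdk_size // block_size

# A small change: block 7 is written three times, block 8 once.
small = snapshot_size([7, 7, 7, 8], block_size, vmdk_size)

# A defragmenter pass that touches every block, twice.
defrag = snapshot_size(list(range(total_blocks)) * 2, block_size, vmdk_size)

print(small // MB, "MB")
print(defrag // MB, "MB")
```

In this model the small change costs a single 16 MB increment, while the defragmenter pass pushes the snapshot all the way to the size of the disk.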
2. Be aware of the performance impact.
When a VM has one or more active snapshots, every read operation has to consult both the original disk and the snapshot to make sure the latest version of the data is returned. If multiple snapshots are active, they all must be checked for the latest version. This degrades performance.
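A toy model makes the extra read cost visible. The lookup order here (newest snapshot first, then older ones, then the base disk) is an assumption made for illustration, as are the dictionary-based disks. The point is only that one read may have to touch several files.

```python
def read_block(block_no, base_disk, snapshots):
    """Return (data, lookups) for one block.

    `snapshots` is ordered oldest to newest; each one maps a block number
    to the data written while that snapshot was active.
    """
    lookups = 0
    for delta in reversed(snapshots):  # assumed order: newest snapshot first
        lookups += 1
        if block_no in delta:
            return delta[block_no], lookups
    lookups += 1  # fall back to the original disk
    return base_disk[block_no], lookups


base = {0: "a0", 1: "b0", 2: "c0"}
snap1 = {1: "b1"}  # block 1 changed after the first snapshot
snap2 = {2: "c2"}  # block 2 changed after the second snapshot

no_snapshots = read_block(0, base, [])
unchanged = read_block(0, base, [snap1, snap2])
changed_early = read_block(1, base, [snap1, snap2])

print(no_snapshots, unchanged, changed_early)
```

Without snapshots, block 0 is found in one lookup. With two active snapshots, the same unchanged block takes three.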
3. Check for active snapshots daily.
It's wise to keep a snapshot for no longer than a day. Perform a change, check to see if it works, and then remove the snapshot. If it's not immediately clear whether a change is successful, allow the snapshot to run a bit longer. But be careful: forgotten snapshots can easily fill a datastore, causing the VM to freeze.
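The daily check amounts to filtering a snapshot inventory by age. The sketch below assumes the inventory has already been exported into a list of records. The field names, VM names and dates are hypothetical, and nothing here talks to vCenter.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=1)  # the one-day rule of thumb from the text


def stale_snapshots(snapshots, now, max_age=MAX_AGE):
    """Return the snapshots that have existed longer than max_age."""
    return [s for s in snapshots if now - s["created"] > max_age]


now = datetime(2024, 5, 10, 9, 0)
inventory = [  # hypothetical export of active snapshots
    {"vm": "web01", "name": "pre-patch", "created": datetime(2024, 5, 10, 7, 30)},
    {"vm": "db01", "name": "pre-upgrade", "created": datetime(2024, 5, 6, 18, 0)},
]

stale = stale_snapshots(inventory, now)
for snap in stale:
    print(snap["vm"], snap["name"], "is older than one day")
```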
4. Set vCenter alarms.
In vCenter, go to the datastores section and create a new alarm at the VM level to monitor snapshot size. The smallest threshold is one gigabyte, which is still enough to remind IT managers about forgotten snapshots. They can also regularly run a PowerShell script that lists all active snapshots, for example by piping PowerCLI's Get-VM into Get-Snapshot.
Backup software increasingly offers VMDK-level backup and uses snapshots for it. Occasionally, a snapshot will remain active on a VM and cause issues because nobody is aware of it. This is yet another reason to closely monitor environments for running snapshots.
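The logic behind a size alarm can be sketched the same way: add up snapshot sizes per VM and flag any VM at or above the threshold. This is an illustration over a hypothetical, already exported inventory, not the vCenter alarm mechanism itself. The record fields and the "backup-temp" leftover are invented for the example.

```python
MB = 1024 ** 2
GB = 1024 ** 3
ALARM_THRESHOLD = 1 * GB  # smallest alarm value mentioned in the text


def vms_over_threshold(snapshots, threshold=ALARM_THRESHOLD):
    """Sum snapshot sizes per VM and return the VMs at or above threshold."""
    totals = {}
    for snap in snapshots:
        totals[snap["vm"]] = totals.get(snap["vm"], 0) + snap["size_bytes"]
    return {vm: size for vm, size in totals.items() if size >= threshold}


inventory = [  # hypothetical export of active snapshots
    {"vm": "web01", "name": "pre-patch", "size_bytes": 300 * MB},
    {"vm": "db01", "name": "backup-temp", "size_bytes": 700 * MB},
    {"vm": "db01", "name": "pre-upgrade", "size_bytes": 600 * MB},
]

alarms = vms_over_threshold(inventory)
for vm, size in alarms.items():
    print(vm, "has", size // MB, "MB of snapshots")
```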
5. Stop database servers or mail servers before making a snapshot.
If something goes wrong during a change, IT managers can always revert to an earlier state, but it's not always wise to do so. In some cases, data can be lost. When a snapshot is made of an Exchange mail server while Exchange services remain running, mail will still come in. If an organization then reverts to the snapshot of that VM, it loses every email received after the snapshot was taken. The same goes for SQL servers or other types of data collectors.
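The data-loss risk is easy to see in a toy timeline. The sketch below is only a model: the message list and the hour-of-day timestamps are hypothetical, and a real revert restores the whole disk state rather than filtering messages.

```python
def revert_to_snapshot(messages, snapshot_hour):
    """Split messages into those kept and those lost by reverting.

    Everything received after the snapshot was taken exists only in the
    snapshot's changes, so a revert discards it.
    """
    kept = [m for m in messages if m["received_hour"] <= snapshot_hour]
    lost = [m for m in messages if m["received_hour"] > snapshot_hour]
    return kept, lost


mailbox = [  # hypothetical mail received during one working day
    {"subject": "Weekly report", "received_hour": 8},
    {"subject": "Invoice", "received_hour": 9},
    {"subject": "Customer order", "received_hour": 11},
    {"subject": "Contract draft", "received_hour": 12},
]

# Snapshot taken at 10:00 with mail services still running, reverted at 13:00.
kept, lost = revert_to_snapshot(mailbox, snapshot_hour=10)
print("lost:", [m["subject"] for m in lost])
```

Stopping the mail or database service before taking the snapshot means no new data arrives in the window that a revert would throw away.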
6. Never snapshot a domain controller.
Special considerations must be made for domain controllers because they send updates to each other and record this using their update sequence number (USN). They also track the USNs of their replication partners. If someone reverts a domain controller, the USN of this domain controller will also revert. The next time this domain controller connects to its replication partners, they'll detect the "wrong" USN and have the forest quarantine this "rogue" domain controller to protect itself.
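The USN mismatch can be illustrated with a heavily simplified model. The class and method names below are hypothetical, and real Active Directory replication involves considerably more state. The sketch only shows why a partner that remembers a higher USN notices the revert.

```python
class DomainController:
    """Toy domain controller that tracks its own and its partners' USNs."""

    def __init__(self, name):
        self.name = name
        self.usn = 0
        self.partner_usn = {}  # highest USN seen from each partner

    def local_change(self):
        self.usn += 1

    def replicate_from(self, partner):
        """Pull updates from a partner, flagging a USN that went backwards."""
        known = self.partner_usn.get(partner.name, 0)
        if partner.usn < known:
            return "rollback detected"
        self.partner_usn[partner.name] = partner.usn
        return "ok"


dc1 = DomainController("dc1")
dc2 = DomainController("dc2")

for _ in range(5):
    dc1.local_change()
snapshot_usn = dc1.usn           # snapshot taken while dc1 is at USN 5

for _ in range(3):
    dc1.local_change()           # dc1 moves on to USN 8
first_sync = dc2.replicate_from(dc1)

dc1.usn = snapshot_usn           # revert: dc1 is back at USN 5
second_sync = dc2.replicate_from(dc1)

print(first_sync, "/", second_sync)
```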