Friday, March 7, 2008

Snapshotting in Hyper-V does not equal VMware checkpointing.

The snapshotting concept under Hyper-V

For all of you familiar with ESX server and Virtual Center (or using a 3rd party product to do backups using checkpoints) you will have to think about things a little differently.

First of all: is using Hyper-V snapshots a good method to back up your VMs?

Answer: No. It isn't a backup method at all. It provides the ability to return to a defined point in time in the life of your VM. It is a very useful tool for testing application upgrades and service packs.

What are snapshots then (if they aren’t the same as a checkpoint)?

The best way that I have been able to describe snapshots is by using a timeline or traveling in time.
A snapshot is a moment in time that you can return to. And going right along with that - you can freely move forward in time, but moving backward in time will alter your future.

"Don't alter the timeline" - we have heard it hundreds of times (especially if you watch Star Trek Enterprise). Well, then you also hear the Vulcans state that time travel is impossible..

Snapshots in Hyper-V are all linked together. A single snapshot cannot stand alone (at least not without doing some tricks) because it is linked to its parent, which is linked to its own parent, and so on.

The concept looks like this:

You begin with a base VHD. When a snapshot is taken, an AVHD disk is created and the configuration of the VM is updated to use the AVHD as the current virtual hard disk.
(An AVHD is a snapshot-specific differencing disk that is used as the running point of a VM.)

The AVHD disk is now the point where all system changes are written from this time forward - the base VHD is no longer being modified. The AVHD is also linked to (dependent upon) its parent disk. If you were to move either of these two files, the VM would break.
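To make the parent/child relationship concrete, here is a minimal toy model in Python. It is only a sketch of the idea, not how Hyper-V actually stores data: the `Disk` class, the block numbers, and the file names are all made up for illustration. Writes land in the newest disk only, and reads fall through to the parent when the newest disk does not hold the block.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Disk:
    """Toy virtual disk: a sparse map of block number -> data (hypothetical model)."""
    name: str
    parent: Optional["Disk"] = None
    blocks: dict = field(default_factory=dict)

    def write(self, block_no, data):
        # Writes always land in this disk; the parent is never modified.
        self.blocks[block_no] = data

    def read(self, block_no):
        # Walk up the chain until some disk holds the requested block.
        disk = self
        while disk is not None:
            if block_no in disk.blocks:
                return disk.blocks[block_no]
            disk = disk.parent
        return None  # block was never written anywhere in the chain


base = Disk("base.vhd")
base.write(0, "boot")
base.write(1, "app v1")

# Taking a snapshot: create a child disk and make it the running disk.
running = Disk("snapshot1.avhd", parent=base)
running.write(1, "app v2")
```

In this model, reading block 0 through `running` still needs `base`, which is why moving or losing the parent file breaks the VM.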

You can continue to spawn off additional snapshots. Each snapshot is linked to its parent in a linear (timeline) arrangement. They cannot link in a branched tree arrangement because that would create dead branches.

When you go back to a previous point in time (return to a snapshot) everything to the right of that point in time is destroyed (rendered unusable) because you altered the VM at a previous point.


Managing snapshots
Okay, what can I do with snapshots besides take them?

You can select a snapshot for a VM and “apply” it or “revert” to it.
This takes you to that moment in time and begins running the VM at that point in time. (You begin using the AVHD file that was created at that point in time).
“Revert” takes you back one step while “apply” can take you back many snapshots in one step.

Deleting a snapshot removes it from the tree.

You might wonder “if all of the snapshots are linked together how can I delete a snapshot out of the middle?”

This is because in the background Hyper-V is maintaining the integrity of your VM by combining the various states of your VM (or flattening the structure of your differencing disk timeline). If this didn’t happen your entire VM would be broken – that would be a serious bummer.

The merge process happens quietly in the background when a VM is powered off.
That is an important thing to remember – the VM must be powered off for all of the disks to be merged together.
This applies any time snapshots are modified or merged into the parent.

Delete a snapshot and its subtree.
This process deletes a snapshot and any other snapshots that are to the right of it in the timeline. The result is just that any snapshots taken to the right of the point in your timeline that you elected to perform this operation are all committed and merged into your current snapshot.

What if you want to “flatten” your tree and have everything written back to the base VHD?
Delete each of your snapshots one by one, even the current running one, and wait for the changes to be merged back into your base VHD. This can take a long time, and the VM needs to be powered down.

Deleting snapshots does not move you back in time, only applying a snapshot does that. Deleting snapshots merges your changes into another disk within the timeline.
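The difference between merging and going back in time can be sketched with the same kind of toy model. This is an illustration under simplifying assumptions, not Hyper-V's real merge code: each disk is a plain dict of block number to data, the chain is a list ordered oldest to newest, and the helper names `view` and `merge_adjacent` are invented here. The "newer block wins" rule is the one described in the comments below.

```python
def view(chain):
    """What the running VM sees: the newest copy of every block."""
    seen = {}
    for blocks in chain:      # oldest disk first
        seen.update(blocks)   # newer disks overwrite older blocks
    return seen


def merge_adjacent(chain, index):
    """Merge the disk at `index` with the next newer disk; the newer block wins."""
    if not 0 <= index < len(chain) - 1:
        raise IndexError("need a newer disk to merge with")
    merged = dict(chain[index])
    merged.update(chain[index + 1])
    return chain[:index] + [merged] + chain[index + 2:]


chain = [
    {0: "boot", 1: "app v1"},   # base VHD (toy data)
    {1: "app v2"},              # first AVHD
    {1: "app v3", 2: "data"},   # running AVHD
]

before = view(chain)
after_one = merge_adjacent(chain, 0)   # one merge: chain is shorter
flat = merge_adjacent(after_one, 0)    # second merge: a single disk remains
```

After each merge the chain gets shorter, but `view` returns the same blocks as before. That is the point of the paragraph above: a merge changes how "now" is stored, not what "now" is.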

Oh, and why don’t I have a bunch of screenshots of the GUI?
Because I figure that most Server Admins are smart enough to handle a GUI ;-)

18 comments:

Anonymous said...

Brian,

The way "snapshots" are implemented in Hyper-V makes no sense to me at all. Consider what you wrote:

"Delete a snapshot and its subtree. This process deletes a snapshot and any other snapshots that are to the right of it in the timeline. (This makes sense to me)

**** The result is just that any snapshots taken to the right of the point in your timeline that you elected to perform this operation are all committed and merged into your current snapshot."

The part after the "****" strikes me as madness. It seems to say that deleting a snapshot and its subtree causes all snapshots to the right (in time) of the deleted snapshot to be merged into the current snapshot. To me delete means NOT included -- nothing is merged back into anything.

BrianEh said...

A couple suggestions.
Take a look at the other snapshot posts that I have.
Play around.

Both will give me a chance to put something better together.

The keys are: Differencing Disks. The level within your tree that you perform a snapshot management action. And then what happens in the background.

This is one reason why SCVMM only presents take and revert - far less confusing.

Joel said...

I have to say I agree with the previous comment, it's sheer madness. But snapshot deleting also does just what Brian says: it merges the differencing data to the .vhd. Semantics seems to be the missing link for MS engineers. Does anyone explain things the way they do? I suppose the answer is only other engineers.
I guess on the way to work I will now delete into traffic on I-405, and merge files on my computer that I don't need anymore.

you know what I mean right? ;)

BrianEh said...

I know exactly what you mean. However, I-405 does not have a differencing disk.

If you considered I-405 the differencing disk to I-5 then the traffic (disk I/O in this case) gets spread between the two files.

Deleting I-405 would cause the data bits (the traffic) to be merged into I-5.

Yes, it would cause mass panic and gridlock (bad example from that point).

Let's just put it this way.

Differencing disks have been around in virtualization for years. ESX used to support it back in the stone ages when I first started with it, then VMware dropped it (except the case of the nonpersistent disk - the undo disk - this is still a differencing disk that is thrown away after each reboot).

Why? Most likely because of the confusion we have here.

Even so, VMware has differencing disks back under the covers of Workstation - more so under 6.5 - however the implementation is a bit different.

MSFT has had differencing disks as part of VHD from the beginning.

How this will smell and change as we move forward is hard to say.

Anonymous said...

Hi Brian

I'm new to this Hyper-V virtual server environment and would really appreciate if you could answer this for me.

I have a running virtual server and have taken 4 snapshots, mainly taken before applying MS patches.

I want to keep the server how it is now but also cleanup the snapshots.

So, do I delete the very first snapshot (that is the oldest) and then power the virtual off so that it then merges all of the snapshots back into the base VHD ?

Basically I want to keep the server and all current data as at NOW.

So from what I understand I don't apply the snapshot as it will take me back in time and all the NOW data will be lost. So my only option is the delete and merge and hope ?

Thanks for your time.

BrianEh said...

Revert will put you backward in time
Delete will cause a merge
Deleting all snapshots will cause a merge that leaves a single vhd
And, you already know that the VM must be powered down for any merge to happen.

Anonymous said...

Hi Brian

Under the Snapshots panel I have :

Snapshot 1
Snapshot 2
Snapshot 3
Now

Do I delete Snapshot 1 and then turn off and wait for the merge to complete and then do the same for the other snapshots ?

Or Do I delete Snapshot 1 then Snapshot 2, then Snapshot 3 then turn off the virtual and the merge will commence ?

What happens with the Now ?

thanks.

BrianEh said...

You can delete as few or as many as you wish at any one time.
Hyper-V will systematically merge them together one disk at a time, until all the deleted snapshots have been merged.
Also, you can power-down the VM prior to deleting snapshots as well. In this case the process generally begins sooner (less delay to detect that the VM has been shut down).

Steve Endow said...

Thanks for the explanations of the HyperV Snapshots.

Deleting a Snapshot in order to merge it back to my main VHD has to be the most unintuitive and scary operation I have done recently in a software app.

It seems that if only they would add a "Merge..." menu option when you right click on the snapshots that would be a vastly superior option than having us delete a snapshot and pray that we aren't making a mistake.

Thanks for explaining the process.

Unknown said...

I found that running a copy of disk2vhd on your live server and exporting the resulting VHD to an external host results in less downtime and gets you around the time-consuming and nerve-wracking wait for the merge process...

BrianEh said...

That would result in a new VM - which defeats using snapshots in the first place.

Unless you have simply gotten yourself into a situation where "now" is what you want and you don't care about snapshots.

Also, it leaves the snapshots in place which can greatly increase storage use.

Jordan said...

Hey Brian,

First of all, thank you for your exhaustive explanation about snapshots. It has helped me to understand that Hyper-V is tough to understand as a new brain in the I.T. world. I have been put over a server build that we originally hired someone else to build and that I was simply going to maintain, but he has since pretty much vanished after being paid, so the baton has moved to me.
I want to make sure I am understanding correctly. I have 3 virtual servers spread out on two 1TB HDs... I was looking at the AVHD files that are created and their file sizes... The VHD file size is the big one, but the AVHD files are definitely nothing to be ignored since we will need to take a snapshot of each server every night. Will these AVHD files just continue to build and take up HD space unless they are manually deleted and merged with their parents? This seems like a pretty ridiculous way of doing things, since that means the servers will have to be taken offline and users will not be able to access our web sites or anything. Is there any way to automate this process? We have thousands of employees all over the world, so downtime has a big impact. Thanks again so much for your insight!

BrianEh said...

First of all, you had commented "snapshotting every night". I have to ask: "Why?"
Snapshots in the Hyper-V world are not copies, they are links in a chain over time. Each one is dependent on the previous.
Unless your process is going to delete the snapshot, power off the VM, wait for the merge, and then power on again, I would seriously consider re-thinking your strategy and the reason why.

To merge, the manual process is only a recovery process. The system handles it all, but the VM must be powered off for the merge to happen.

I have written many posts about snapshots. About the mechanics and the problems that folks have gotten themselves in to.

James said...

Excellent post. Quick question - in a multi-snapshot scenario, when deleting a snapshot, does the data get merged to the left or the right in the chain?

BrianEh said...

This is a good question and I am attempting to remember.

If I recall correctly the merging happens to the 'future' or to the right in the tree. The future wins. Technically it isn't to the right or left, it is to a new disk that then replaces the one right in the tree.

The differencing disks that get merged into a new differencing disk follow the rule that the newest disk block overwrites the older disk block - thus keeping the "newest" version of the timeline.

It is only in the case of one VHD and one snapshot that you might think it merges left (back to the VHD) but it actually merges right (to a new VHD).

It is easier to think of left as back in time and right as forward in time - as differencing disks are all about moments in time. That is really how it works when you put all the moving parts together.

Anonymous said...

First off, thanks for all of your help with this stuff Brian. I've seen a couple of your other related articles. You've definitely provided some much needed clarification. For the sake of myself and others who will read this thread in the future, would you mind clarifying the following illustration?

So...let's say...3 snapshots (which are each represented by new .avhd (differencing) disks) are deleted and merged back onto the root VHD...

---------------------------------------------
Step 0
---------------------------------------------
root.vhd -> A.avhd -> B.avhd -> C.avhd

--------------------------------------------
Step 1
--------------------------------------------
- Delete C snapshot (NOW)
- C.avhd's differences are merged onto B.avhd
(files C1 => B1, C2 => B2, and C3 => B3)

RESULT:
root.vhd -> A.avhd -> Bc.avhd**
** B.avhd now contains C's changes, so we'll call it Bc.avhd


---------------------------------------------
Step 2
---------------------------------------------
- Delete B snapshot
* Bc.avhd's differences already contain the changes from the old C.avhd.
* Now, Bc.avhd's files overwrite the corresponding A.avhd files.
(files Bc1 => A1, Bc2 => A2, Bc3 => A3)

RESULT:
root.vhd -> Abc.avhd**
** Abc.avhd contains Bc's changes, so we'll call it Abc.avhd


---------------------------------------------
Step 3
---------------------------------------------
- Delete A snapshot
* The combined changes from C onto B onto A are now merged onto root.
(files Abc1 => root1, Abc2 => root2, Abc3 => root3)

RESULT:
[root]abc.vhd
- you now have 1 root VHD which has merged all of the changes.


...Right?

BrianEh said...

You are close.

If you delete C first (the last taken), you actually revert back to B. But I don't think that is your intent.

Most commonly the older ones are deleted. Thus merging together. And the newer overlays the older when a merge happens.

But this does not happen at a file level. It happens at a storage block level.
So, in reality the entire block on the older is removed and the block of the newer replaces it (if the block on the newer has changes).

Unknown said...

Gotcha. And dang, what a timely response! Pretty awesome after all these years. I also found this article:
http://blogs.msdn.com/b/virtual_pc_guy/archive/2009/04/15/what-happens-when-i-delete-a-snapshot-hyper-v.aspx

I think the most important thing it outlines is that if an older snapshot (let's say snapshot X) that has two or more dependent snapshots (Y and Z) is deleted, Hyper-V doesn't merge X.avhd into *both* of its child snapshots because, as the writer puts it, that "could result in the whacky scenario where deleting a snapshot would fail because there was not enough space available." Instead, snapshot X would simply be removed from the GUI. Its configuration files and, I think, any memory page files(?) are removed as well, but X.avhd wouldn't go anywhere, since the later snapshots (Y and Z) are dependent upon it. That writer then notes, "if Snapshot [X] were deleted, and later on both Snapshots [Y] and [Z] were deleted – we would detect this and merge the AVHD for Snapshot [X] away as soon as possible."

Phew. All in a day's work I guess. Thanks again Brian. I really appreciate your help and diligence. Cheers.