Wednesday, May 21, 2014

SCVMM is deploying using the wrong VHD or VHDX

I like this one.  It is interesting.
This is a classic case of 'as designed' being exposed through changes in behavior as an implementation evolves, combined with what you see in the UI not really being what happens in the system.

I like it because I consider it a bug: the behavior changed between the original release and applying the Update Rollups.

The symptom is this: you deploy a VM (or do a bare metal deployment) and you observe that SCVMM is using the incorrect virtual disk.
And you notice this happening consistently after applying UR2 (IMHO, that is the key differentiator in the behavior change).

In a nutshell: SCVMM is selecting the virtual disk from the Library, it is actually getting more than one back, and it is using the incorrect one.

Let's back up a bit.  How is SCVMM ending up with an incorrect virtual disk from the Library in the first place?

Objects in the SCVMM Library have this concept of equivalency: multiple objects can be treated as equivalent to each other, even though they are different objects.
And virtual disks are one of these object types.

Let's say that you create one VHD and this is Server 2012 R2, sysprep'd, all ready to go.  You assign that to a Host Profile, a Service Template, or a WAP Gallery Item.  You set a version, set the family to 'Server 2012 R2', and set the OS to 'Server 2012 R2 Datacenter'.
You perform some test deployments and all is good, you move ahead and use it.

After a bit, you get new hardware and you create a new VHD.  Identical to the first except you add some additional drivers for the new storage cards and NICs.
You set the version and family and OS the same.

You then open the Host Profile or Service Template and select this new disk.
You deploy, and ... your hardware does not work.  Your new device drivers are missing.  You jump up and down and bang your head on the wall.
You run around in circles.

What happened was this.  SCVMM didn't select the single object that you think you selected.  It selected by OS, family, and version, and it got two disks back from the Library.  It then used the "first" one, not the second one.  (This is the behavior change: prior to UR2 it was as if the one with the later date was used, but no longer.)
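
To make that concrete, here is a rough sketch of the selection as I understand it, written in plain Python rather than anything SCVMM-specific.  Everything in it is an assumption for illustration: the LibraryDisk class, the field names, the file names, the '1.0' version, and the dates are all made up, and this is not how SCVMM is actually implemented.  It only shows how a lookup by family, version, and OS can return two disks, and how "first match" and "newest match" then give different answers.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LibraryDisk:
    """Hypothetical stand-in for a virtual disk object in the Library."""
    name: str
    family: str
    release: str  # the "version" field; the value used below is invented
    os: str
    added: date


def equivalent_disks(library, family, release, os):
    """Return every disk whose (family, release, os) matches the request."""
    wanted = (family, release, os)
    return [d for d in library if (d.family, d.release, d.os) == wanted]


library = [
    LibraryDisk("ws2012r2.vhdx", "Server 2012 R2", "1.0",
                "Server 2012 R2 Datacenter", date(2014, 1, 10)),
    # Second disk: same family, version, and OS, but with the new drivers.
    LibraryDisk("ws2012r2-newdrivers.vhdx", "Server 2012 R2", "1.0",
                "Server 2012 R2 Datacenter", date(2014, 5, 1)),
]

matches = equivalent_disks(library, "Server 2012 R2", "1.0",
                           "Server 2012 R2 Datacenter")

# Roughly what I see after UR2: the first match wins.
first_match = matches[0]

# Roughly what it looked like before UR2: the later-dated disk wins.
newest_match = max(matches, key=lambda d: d.added)

print(first_match.name)
print(newest_match.name)
```

In this sketch the disk you picked in the UI never enters the lookup at all; only the three properties do.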

Make the version unique (or the family name) and you will fix the problem.
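
If you want to check an inventory of your disks for this kind of collision, the idea can be sketched like this.  Again this is plain Python over a hand-built list, not a call into SCVMM; the tuple layout, the file names, and the version values '1.0' and '1.1' are assumptions I made up for the example.

```python
from collections import defaultdict


def find_collisions(disks):
    """Group disks by (family, version, os) and keep groups with more than one disk.

    Each disk is a hypothetical (name, family, version, os) tuple.
    """
    groups = defaultdict(list)
    for name, family, version, os_name in disks:
        groups[(family, version, os_name)].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


disks = [
    ("ws2012r2.vhdx", "Server 2012 R2", "1.0", "Server 2012 R2 Datacenter"),
    ("ws2012r2-newdrivers.vhdx", "Server 2012 R2", "1.0", "Server 2012 R2 Datacenter"),
]

# Both disks share the same three properties, so they collide.
collisions = find_collisions(disks)

# Give the new disk its own version and the collision goes away.
fixed = [
    disks[0],
    ("ws2012r2-newdrivers.vhdx", "Server 2012 R2", "1.1", "Server 2012 R2 Datacenter"),
]
fixed_collisions = find_collisions(fixed)

print(collisions)
print(fixed_collisions)
```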

According to the VMM team, this is 'as designed'.  That is developer speak for: we built it to work that way.
I hope the documentation will soon describe this equivalency concept, since it currently mentions it in passing and never really describes anything about it.

Until then, I hope search brings you here.


Anonymous said...

Thank you!

Anonymous said...

Thanks a million! I'm having this running in circles currently. Since I deployed multiple images where I had only one before to the library it went havoc selecting SKUs arbitrarily. Same family (2016) but one time Datacenter Core or Standard Core when both should have been Standard with GUI :-) Very amusing ...