Windows Server 2003 won’t boot after a P2V
Posted: December 6, 2011
So far, our Physical-to-Virtual migrations of Exchange 2003 on x86 Server 2003 Enterprise boxes have gone mostly smoothly – until this evening, that is. In the past, a failure soon after the P2V process started was resolved with a reboot or by disabling the TCP Offload Engine on the Broadcom NICs (this was easily accomplished with the cmd.exe command netsh int ip set chimney DISABLED).
This evening’s P2Vs were a bit more challenging.
Our contract for this project stipulated that we simply P2V our customer’s Exchange servers. We were not to fix their broken or poorly configured servers. We were using VMware Converter 5. So when we saw they had partitioned their system disk to store their Exchange logs (instead of using a separate local RAID 5 group), we knew we’d have to work around it. Their mailstores were located on a direct-attached Dell PowerVault MD3000.
This hasn’t necessarily presented a huge problem in the past, but instead of migrating one physical volume to one virtual disk, we were forced to migrate two physical volumes to one virtual disk. The problem appeared when we tried booting the virtual machines after conversion. They booted to a black screen – not even a blinking cursor was shown. At first, I thought that perhaps the SCSI IDs automatically assigned to the virtual hard drives were somehow incorrect. I powered the physical servers back on to look at their setup and tried to match it on the VMs, to no avail.
The other engineer working with me on the project suggested we get out a Server 2003 ISO and boot to a recovery console. We did, and fortunately, the recovery console saw all the system files. We ran chkdsk to make sure there wasn’t corruption – there wasn’t. But neither VM would boot.
Now, coming up from the IT trenches as a helpdesk guy and then a server admin, I’ve seen those times when a system wouldn’t boot because the system disk or files were somehow corrupted. But many times, the system would tell you it couldn’t find certain files or that there was corruption. This wasn’t the case here, though. But I was encouraged, at least, by dir listing all the right contents.
To finally resolve it (read below for the addendum and real solution to this post!), I started replacing system files like NTLDR and NTDETECT.COM. I ran fixmbr and fixboot, then rebuilt the boot.ini file with bootcfg /rebuild. I specified Windows Server 2003 as the operating system identifier and then /fastdetect as the boot option. I ended with an exit to save my changes.
We were nearing the end of our outage window and this had to work – I really didn’t want to tell our customer we had failed this evening’s P2V. I crossed my fingers and booted the VM – success!
I’m not sure why both Exchange boxes wound up being corrupt – but I still have a feeling it has something to do with P2V’ing two volumes to a single VMDK. We have a couple more Exchange boxes to P2V this week so I may try something different. I want to P2V just the OS partition, then add VMDKs for each remaining volume. I’ll then simply try Robocopying data from the physical machine to the new VMDKs.
NOTE ADDED 7 DECEMBER
Although the procedure above resolved the problem, I didn’t point out what the problem was. Was it missing or corrupted system files? Was it a misconfigured boot.ini file? In fact, the credit for finding the specific reason behind the failure to boot after a P2V goes to another engineer working on this project, Dennis E. His team ran into the same issue, and after conferring with him, I agree with his determination.
The problem is specific to these boxes running Exchange 2003 and the best-practice configuration of the boot.ini file. The boot.ini switches included on the boxes that didn’t boot after a P2V were as follows:
/noexecute=optout /fastdetect /3GB /USERVA=3030 /BURNMEMORY=12288
These switches are useful for Exchange 2003 servers. But one in particular caused the server not to boot after the P2V. For a quick background, these servers had 16 GB of RAM when Exchange was installed. They were later upgraded to 96 GB with the intention of P2V’ing each Exchange server and rebuilding the hardware as ESXi hosts.
As we know, Exchange can essentially only use 4 GB RAM. Specifically, with the /3GB switch, the Exchange application can use up to 3 GB, leaving the operating system 1 GB. But with 16 GB RAM installed, the /BURNMEMORY=12288 switch essentially tells the OS to ignore 12 GB of memory and use only 4 GB, which is just right for a server running only Exchange 2003.
Let me interject here with a couple of links that were extremely useful on this subject. An excellent discussion of the effects of the /BURNMEMORY switch can be found in this Ars Technica forum post. A Microsoft TechNet article discusses the issue, as well.
When we converted the Exchange boxes with such a boot.ini configuration, we changed the allocated virtual RAM to 4 GB. Without knowing about the boot.ini config beforehand, this configuration seemed correct. When the VM tried booting, however, the /BURNMEMORY=12288 switch negated the installed 4 GB of virtual memory (it told the OS to ignore 12 GB of RAM when there was only 4 GB installed), leaving no RAM with which to boot. The fix, then, was to rebuild the boot.ini file without the /BURNMEMORY switch. When I rebuilt the boot.ini through the Recovery Console, though, with only the /fastdetect option, the server did, indeed, boot, but without boot.ini options that should be included on an Exchange 2003 server.
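To make the arithmetic concrete, here is a minimal Python sketch of the check we could have run before sizing the VM. It assumes /BURNMEMORY simply subtracts its value (in MB) from the installed RAM, which is a simplification of what the OS actually does. The function name usable_ram_mb is my own invention for illustration, not part of any tool.

```python
import re

def usable_ram_mb(installed_mb: int, boot_options: str) -> int:
    """Return the RAM (in MB) left after /BURNMEMORY is applied.

    Simplifying assumption: the switch subtracts its value from installed RAM.
    A result of zero or less means the OS has nothing left to boot with.
    """
    match = re.search(r"/BURNMEMORY=(\d+)", boot_options, re.IGNORECASE)
    burned_mb = int(match.group(1)) if match else 0
    return installed_mb - burned_mb

# The switches found on the servers that would not boot after the P2V.
OPTIONS = "/noexecute=optout /fastdetect /3GB /USERVA=3030 /BURNMEMORY=12288"

physical = usable_ram_mb(16 * 1024, OPTIONS)  # original 16 GB physical box
virtual = usable_ram_mb(4 * 1024, OPTIONS)    # VM sized at 4 GB

print(physical)  # 4096: the 4 GB the switch was meant to leave
print(virtual)   # -8192: nothing left to boot with
```

Had we read the boot.ini on the physical boxes first, a check like this would have flagged the 4 GB VM sizing before the first failed boot.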
The rebuilt boot.ini file booted without using the /BURNMEMORY option. This is a partial solution. The other boot.ini switches should be included for this solution to be complete.
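For the complete fix, the idea is to drop only /BURNMEMORY and keep everything else. Below is a rough Python sketch of that edit on a single boot.ini entry. The ARC path and OS description in the sample line are hypothetical placeholders, not copied from the customer's servers, and strip_burnmemory is just an illustrative helper name.

```python
import re

def strip_burnmemory(boot_ini_text: str) -> str:
    """Remove any /BURNMEMORY=<n> switch, leaving all other switches in place."""
    # Also consume the spaces or tabs before the switch so no gap is left behind.
    return re.sub(r"[ \t]*/BURNMEMORY=\d+", "", boot_ini_text, flags=re.IGNORECASE)

# Hypothetical boot.ini entry; only the switches come from the servers in this post.
entry = (
    r'multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003, Enterprise" '
    "/noexecute=optout /fastdetect /3GB /USERVA=3030 /BURNMEMORY=12288"
)

fixed_entry = strip_burnmemory(entry)
print(fixed_entry)
```

The same one-switch change can of course be made by hand in a text editor once the VM is up. The point is simply that /noexecute, /fastdetect, /3GB and /USERVA stay put.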