Prevent the incrementing of eth devices on Linux systems after guest customization of a cloned VM

I’ve run into this issue before and found the following article written by Chris Greene of Orchestration.io.

After the guest customization process runs on cloned VMs in some VMware products, you may notice that on your Linux systems the eth device number gets incremented. For example, when the system is first built, the first eth device will be eth0. If the system is cloned & customized, the clone’s first eth device will come up as eth1. This may not be a problem on some systems, but people often need or prefer the first eth device to be eth0, or at least not to change after the system is customized.

The issue arises because of old entries in the udev network rules file, /etc/udev/rules.d/70-persistent-net.rules. After an initial install of a Linux system that has a NIC with a MAC of “00:50:56:02:00:7c”, /etc/udev/rules.d/70-persistent-net.rules will look something like the reconstructed example below (RHEL-style; exact comments and attributes vary by distribution):
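# Rule generated for the NIC present at install time
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:50:56:02:00:7c", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"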

When you perform a clone & customization (as in creating a new vApp from a template in vCloud), the source VM is cloned and the clone gets a NIC with a new MAC address. When the cloned VM boots, udev notices the new NIC, adds it to /etc/udev/rules.d/70-persistent-net.rules, and gives it the name eth1.
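After the clone’s first boot, the file carries both the stale rule and the new one (the second MAC below is illustrative):

# Rule from the original template build; this MAC no longer exists on the clone
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:50:56:02:00:7c", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
# Rule added by udev for the clone's new NIC
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:50:56:02:00:9e", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"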

A new file named /etc/sysconfig/network-scripts/ifcfg-eth1 will also be created, pointing to the eth1 device.
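A minimal sketch of that generated file, assuming DHCP (a customized clone may get static settings instead; the MAC matches the illustrative one above):

DEVICE=eth1
HWADDR=00:50:56:02:00:9e
BOOTPROTO=dhcp
ONBOOT=yes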

Now when ifconfig is run, you will see eth1 instead of eth0.

To prevent the issue from occurring, delete the /etc/udev/rules.d/70-persistent-net.rules file before shutting down the VM and turning it into a template. A fresh /etc/udev/rules.d/70-persistent-net.rules will be created when the customized VM boots up. The new file will contain only the NICs actually present on the system, and they should be labelled eth0, eth1, etc.
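On the source VM, the cleanup amounts to a couple of commands before the final shutdown (the second rm only applies if a stale clone config was already generated):

rm -f /etc/udev/rules.d/70-persistent-net.rules
rm -f /etc/sysconfig/network-scripts/ifcfg-eth1   # only if present from an earlier clone
shutdown -h now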

Another thing you may want to do before shutting down the VM to be added as a template is to modify /etc/sysconfig/network-scripts/ifcfg-eth0 so that ONBOOT is set to no (ONBOOT=no). I’ve seen issues in vCloud where multiple vApp templates are deployed onto the same network and the VMs come up with the same IP (the one that was on the VM before it was turned into a template). When the systems boot, ifup is run, which runs arping. I’ve seen arping return an error in these situations, which prevents VMware Tools from starting. Guest customization then fails, since it relies on VMware Tools.
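A one-liner for that edit, assuming the file currently reads ONBOOT=yes:

sed -i 's/^ONBOOT=yes/ONBOOT=no/' /etc/sysconfig/network-scripts/ifcfg-eth0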

vusb seen as an available network adapter?

Recently, I had to install ESXi 4.1 on an IBM x3850 X5. Once the install was completed and I started configuring the host, I noticed something odd. I had a new vswitch and a network adapter that was defined as vusb0.

[Screenshot: vusb0 listed as a network adapter]

WTF? At first, I thought I was imagining things. I opened the network adapters view, and sure enough, there it was. How could this be? Of course, I’d never seen this before, so a little googling was in order. Lo and behold, I found this little jewel (VMware forum post: “Extra NICs showing up as vusb?”).

Missing NIC

Had an interesting event yesterday. A new application build was underway that wasn’t going very smoothly. Snapshots were being taken and reverted rather frequently. After one snapshot reversion, progress on the install was being made. Approximately three hours went by, and then a high-priority ticket was raised. Something had gone wrong, and no one could access the VM. It was unresponsive to pings, RDP, etc. Eventually, it was discovered that the NIC was missing. Once the NIC was re-added, access to the VM was restored and the installation group was off and running.

A forensic investigation was conducted into the root cause of the missing NIC. It was suggested that one of the snapshots was corrupted. The event logs within VirtualCenter were vague: they provided the timeline of what had been occurring, but failed to indicate what had transpired with the NIC. I downloaded the VM’s logs from the datastore and examined them. I wanted to see if there were problems during a snapshot capture or a snapshot reversion. There were no problems or failures with snapshot creation or any rollbacks of the snapshots. I did, however, come across an entry that had me puzzled.

E1000: Syncing with mode 6.

Of course, Professor Google was up for my challenge and pointed me to this bit of info within the VMware Communities:
“Network card is removed without any user intervention”. I struck gold. One of the commenters, “NMNeto”, had hit the nail on the head with this comment:
“The problem is not in VMWARE ESX, it is in the hot plug technology of the network adapter. This device (NIC) can be removed by the user like a usb drive. … When you click to remove the NIC, VMWare ESX removes this device from the virtual machine hardware.”

[Screenshot: the “Safely Remove Hardware” icon showing the removable NIC]

From here, I had a strong lead. Now that I knew what I was looking for, I opened the Virtual Machine’s Event Viewer and started itemizing each entry – looking for the device removal. After about five minutes, I found the who, when, where, and how. I felt like I was winning a game of “Clue”.

“NMNeto” had also posted a link to a related community thread with a resolution to prevent the issue going forward. That thread walks through step-by-step instructions on how to prevent this on other VMs.
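The setting generally cited for this (I’m going from the usual VMware guidance here, not quoting the thread directly) is a one-line addition to the VM’s .vmx file, made while the VM is powered off, which disables hot-plug and removes the NIC from the guest’s “Safely Remove Hardware” list:

devices.hotplug = "false"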

I will take this data and propose a change to add that setting to our VMs so that this event does not recur.

Script: Detailed NIC Information Report

Over the last few days, my coworker and I have been working on a script that will go out and collect detailed network information for each NIC on our ESX hosts. Not wanting to reinvent the wheel, I strolled over to Alan Renouf’s website (www.virtu-al.net). I remembered seeing a script (“More Network Info”) he created that I knew could collect the info we wanted. Thank you, Alan.

I copied Alan’s script into Notepad++ and started to work with it. One of the things we needed to know was duplex information. Since that wasn’t in the original script, I added it. I didn’t need to know the number of active virtual switch ports, so I removed that portion.

Here are the end results of this modified script.

[Embedded script: NETWORKINFO.PS1]

The above script may not actually display properly within this blog entry. If this is the case, you can download the script here.

Fixing Renumbered vmnics

I’ve been in a situation where I installed a new NIC and it renumbered the vmnics within an ESX host. This became an issue because the Service Console vmnic was one of the vmnics affected, so I lost my access to the ESX host. Thankfully, I could just drive over to the datacenter and fix it via the command line; it was just inconvenient to do so. A quick google for instructions and I was off.

So apparently there are multiple ways to fix this. For me, Method 1 got me up and running within a few minutes with little to no fuss.

Method 1 – Editing the esx.conf file
• Log in to the Service Console
• Check your existing NIC numbering by typing ‘esxcfg-nics -l’
• Type ‘cd /etc/vmware’ to change to the correct directory
• Type ‘cp esx.conf esx.conf.bak’ to make a backup of this file, as it is a critical configuration file for ESX
• Type ‘nano esx.conf’ to open the file for editing
• Type CTRL-W and then enter ‘vmnic2’ to search for the new first NIC
• Change ‘vmnic2’ to ‘vmnic0’
• Change the subsequent NICs from ‘vmnic3’ to ‘vmnic1’, ‘vmnic4’ to ‘vmnic2’, and ‘vmnic5’ to ‘vmnic3’
• Type CTRL-O to save the file
• Type CTRL-X to exit the nano editor
• Type ‘esxcfg-boot -b’ to rebuild the config files
• Shut down and restart the ESX server; when the server comes back up, the NICs will be numbered vmnic0 through vmnic3. Verify this by typing ‘esxcfg-nics -l’ (a scripted version of this edit is sketched below)
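If you’d rather script the edit than page through nano, a sed sketch like this should produce the same result (assuming the same vmnic2 through vmnic5 renumbering described above, and GNU sed in the Service Console):

cd /etc/vmware
cp esx.conf esx.conf.bak
# Apply the renames in ascending order so a vmnic that has already
# been renamed can never match a later expression
sed -i -e 's/vmnic2/vmnic0/g' \
       -e 's/vmnic3/vmnic1/g' \
       -e 's/vmnic4/vmnic2/g' \
       -e 's/vmnic5/vmnic3/g' esx.conf
esxcfg-boot -b   # rebuild the config files, then reboot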

Method 2 – Modify your vswitch configuration
• Log in to the Service Console
• Check your existing NIC numbering by typing ‘esxcfg-nics -l’
• Check your current vswitch configuration by typing ‘esxcfg-vswitch -l’, and note which NICs are assigned to which vswitches (uplink column)
• Remove the old NICs that have been renamed by typing ‘esxcfg-vswitch -U vmnic# vswitch_name’, e.g. esxcfg-vswitch -U vmnic0 vSwitch1
• Add the new NICs with the correct names by typing ‘esxcfg-vswitch -L vmnic# vswitch_name’, e.g. esxcfg-vswitch -L vmnic2 vSwitch1
• Repeat this process for any additional NICs. Once you have corrected the vswitch that contains the Service Console, you can also log in via the VI Client and correct the other vswitches that way
• Your newly renamed NICs should now be assigned to the original vswitches, and your networking should work again (a worked example follows below)
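As a worked example, here is what that sequence might look like for the Service Console vswitch (assuming vSwitch0 holds the Service Console and the NIC that was vmnic0 is now vmnic2):

esxcfg-nics -l                      # confirm the new physical NIC names
esxcfg-vswitch -l                   # note the stale uplinks per vswitch
esxcfg-vswitch -U vmnic0 vSwitch0   # unlink the old uplink name
esxcfg-vswitch -L vmnic2 vSwitch0   # link the renamed NIC as the uplink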

This information was found within the VMware Knowledge Base. (Awesome!)
How To Configure Networking from the Service Console Command Line
VI Client loses connectivity to the ESX Server Host after you add a new network adapter