Tuesday, June 11, 2013

CentOS Guest VM Hangs at eth0 on Every Alternate Boot on ESXi 5.0

Description
Symptom:
Every alternate reboot of the CentOS VM hangs while bringing up eth0.

Troubleshooting:
- Add set -x to /etc/sysconfig/network-scripts/ifup-eth to find exactly where it hangs.
- In this case it hung at arping, which checks for a duplicate IP address:
 if ! /sbin/arping -q -c 2 -w 3 -D -I ${REALDEVICE} ${ipaddr[$idx]}
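
The tracing step can be tried safely on a copy first. The following is a minimal sketch, assuming bash; the small script it generates is a hypothetical stand-in for ifup-eth, not the real file, and `true` stands in for the arping call. With tracing on, the last line printed before a stall is the command that is hanging.

```shell
#!/bin/bash
# Hypothetical stand-in for ifup-eth. The real script lives in
# /etc/sysconfig/network-scripts/ and should be backed up before editing.
orig=$(mktemp)
traced=$(mktemp)
cat > "$orig" <<'EOF'
#!/bin/bash
echo "configuring eth0"
true
echo "eth0 up"
EOF

# Build a copy with "set -x" inserted right after the shebang line.
{ head -n 1 "$orig"; echo 'set -x'; tail -n +2 "$orig"; } > "$traced"

# Run the copy; each command is echoed to stderr with a "+" prefix
# just before it executes.
trace=$(bash "$traced" 2>&1)
echo "$trace"

rm -f "$orig" "$traced"
```

On the real system the trace appears on the console during boot, so the stalled command is whatever was printed last.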


 
Solution
Root Cause:
arping uses real (wall-clock) time instead of relative time to wait for 3 seconds,
so if the real time goes back by an hour during those 3 seconds,
it waits for 1 hour and 3 seconds instead of 3 seconds. The
root cause was therefore the time difference between the CentOS VM and ESXi,
which lets the guest clock jump while arping is waiting.
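
The arithmetic behind the long wait can be sketched as follows. This is not arping's actual source, only an illustration of a deadline computed from the wall clock; the function name `remaining_wait` and the example epoch value are made up for the demonstration.

```shell
#!/bin/bash
# Seconds left until a deadline that was computed from the wall clock.
# Arguments (hypothetical): start time, timeout, current time, in epoch seconds.
remaining_wait() {
    local start=$1 timeout=$2 now=$3
    echo $(( start + timeout - now ))
}

start=1370937600                                  # arbitrary example value

# Clock unchanged since the wait began: 3 seconds left.
remaining_wait "$start" 3 "$start"

# Clock stepped back by one hour during the wait: 3603 seconds left.
remaining_wait "$start" 3 $(( start - 3600 ))
```

A wait measured against a monotonic clock would not be affected, because a monotonic clock does not move when the system time is stepped.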

Workaround:

- Add a 2-second delay so that there is no race between the arping wait and the time change.
or
- Make sure the ESXi host and the CentOS VM both have the correct time (most preferable). In one
customer case the CentOS VM had the wrong time set and was off by 2 hours; even though NTP was
configured in the CentOS VM, the time difference was too large for NTP to correct.
or
- If the CentOS VM has to keep a different time than ESXi, then disable time sync
via VMware Tools.
   VMware KB
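
For the last two options, a hedged sketch of the commands involved, assuming VMware Tools provides vmware-toolbox-cmd in the guest and that ntpd is the NTP daemon in use. The function name `timesync_state` is made up here. Only the read-only status check is executed; the commands that change state are left as comments and should be checked against the installed versions before use.

```shell
#!/bin/bash
# Report whether VMware Tools periodic time sync is enabled,
# if the tool is installed in this guest.
timesync_state() {
    if command -v vmware-toolbox-cmd >/dev/null 2>&1; then
        vmware-toolbox-cmd timesync status
    else
        echo "vmware-toolbox-cmd not found"
    fi
}
timesync_state

# To turn periodic sync off (as root, only if the guest must keep its own time):
#   vmware-toolbox-cmd timesync disable
#
# To step a clock that is too far off for ntpd to correct on its own
# (stop the ntpd service first, then start it again afterwards):
#   ntpd -gq
```

By default ntpd gives up when the offset exceeds its panic threshold of 1000 seconds, which is why a 2-hour error is not corrected; the -g option permits one large initial step and -q makes ntpd exit after setting the clock.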