Problem Description:
Host ABCD (x.y.z.150) is unable to start vSphere HA. The current state is "vSphere HA Agent Unreachable". I have tried to start HA twice, but this did not resolve the issue.
KBs to review:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2011974
Logs to check on the ESXi host: /var/log/vpxa.log and /var/log/fdm.log (also found under /var/run/log)
fdm.log snippet:
2013-08-04T21:01:18.968Z [FFDD3B90 error 'Cluster' opID=SWI-79b9207c] [ClusterDatastore::DoAcquireDatastoreWork] open(/vmfs/volumes/9e9989cf-f687e31c/.vSphere-HA/FDM-F78AC28A-8862-48C5-BC1C-F369CCABE58E-1480-9c9b8fc-ANTHMSASVC5/protectedlist) failed: Device or resource busy
2013-08-04T21:01:44.224Z [38498B90 error 'Default' opID=SWI-4593d696] SSLStreamImpl::BIOWrite (0d3fe098) Write failed: Broken pipe
2013-08-04T21:01:44.224Z [38498B90 error 'Default' opID=SWI-4593d696] SSLStreamImpl::DoClientHandshake (0d3fe098) SSL_connect failed with BIO Error
2013-08-04T21:01:44.224Z [38498B90 error 'Message' opID=SWI-4593d696] [MsgConnectionImpl::FinishSSLConnect] Error N7Vmacore3Ssl12SSLExceptionE(SSL Exception: BIO Error) on handshake
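When fdm.log is long, it can help to see which components are logging errors before reading individual entries. The following is a minimal Python sketch, not a VMware tool. The line layout it assumes (timestamp, then thread id, level, quoted component and optional opID in square brackets, then the message) is inferred only from the snippet above, and the names LINE_RE, parse_line and errors_by_component are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout, inferred from the fdm.log snippet in this post:
# <timestamp> [<thread> <level> '<component>' opID=<id>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+) \[(?P<thread>\w+) (?P<level>\w+) '(?P<component>[^']+)'"
    r"(?: opID=(?P<opid>[^\]\s]+))?\] (?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def errors_by_component(lines):
    """Count error-level entries per component, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["level"] == "error":
            counts[entry["component"]] += 1
    return counts


# Two of the entries quoted above, used as sample input.
sample = [
    "2013-08-04T21:01:44.224Z [38498B90 error 'Default' opID=SWI-4593d696] "
    "SSLStreamImpl::BIOWrite (0d3fe098) Write failed: Broken pipe",
    "2013-08-04T21:01:44.224Z [38498B90 error 'Message' opID=SWI-4593d696] "
    "[MsgConnectionImpl::FinishSSLConnect] Error "
    "N7Vmacore3Ssl12SSLExceptionE(SSL Exception: BIO Error) on handshake",
]

for component, count in errors_by_component(sample).items():
    print(component, count)
```

To run it against a real log, copy fdm.log off the host and pass the open file object to errors_by_component in place of sample.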
Workaround:
- Check for high latency on the storage.
- Run services.sh restart on the host.
- Enable or refresh HA again in vCenter.
Errors in fdm.log:
2013-08-04T23:23:34.578Z [FFF18B90 verbose 'Cluster' opID=SWI-5a8f10c4] [ClusterManagerImpl::IsBadIP] x.y.z.199 is bad ip
Workaround:
On x.y.z.199, review fdm.log and run services.sh restart. (You could instead disconnect and reconnect the host, which also restarts the services, but I find that services.sh restart fixes more issues.)
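If several hosts are flagged, the "is bad ip" entries can be collected in one pass so you know which hosts to visit. This is a small Python sketch under the assumption that every such entry looks like the single IsBadIP line quoted in this post. BAD_IP_RE and bad_ips are illustrative names, not part of any VMware tooling.

```python
import re

# Assumed pattern, taken from the one IsBadIP line quoted in this post.
BAD_IP_RE = re.compile(r"\[ClusterManagerImpl::IsBadIP\] (\S+) is bad ip")


def bad_ips(lines):
    """Return the addresses flagged by IsBadIP, in first-seen order, without duplicates."""
    seen = []
    for line in lines:
        match = BAD_IP_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# The entry quoted above, used as sample input.
sample = [
    "2013-08-04T23:23:34.578Z [FFF18B90 verbose 'Cluster' opID=SWI-5a8f10c4] "
    "[ClusterManagerImpl::IsBadIP] x.y.z.199 is bad ip",
]

print(bad_ips(sample))
```

Each address returned is a host on which to review fdm.log and run services.sh restart, as described above.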
Eight years later this post saved my day after spending useless hours browsing VMware KBs... Thanks a lot, really appreciate!