Tuesday, December 11, 2012

ESXtop:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1008205

SIOC:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1019687

Dynamic Add node and IPV6

Nutanix IPV6 requirements



Nutanix software version 2.6 requires IPv6 link-local addresses:
a. to discover Nutanix nodes when configuring them for the first time,
b. to discover nodes during a dynamic add-node, and
c. to reconfigure IP addresses.
Most switches support the IPv6 neighbor discovery protocol via IPv6 link-local addresses even if IPv6 is not enabled on the routers, so this blog explains link-local addressing, how to verify it is enabled on the switch, and how to verify it on the Controller VM. Please note that all the Controller VMs should be connected
to the same broadcast domain so that the IPv6 link-local addresses are reachable.
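
A quick way to confirm that all Controller VMs share the same broadcast domain is to ping the all-nodes link-local multicast group from any CVM; every IPv6 host on the segment should answer with its fe80:: address. This is a generic sketch (interface name eth0 assumed, as in the examples below):

ping6 -c 3 -I eth0 ff02::1        # all CVMs on the segment reply (duplicate replies are marked DUP!)
ip -6 neigh show dev eth0         # list the link-local neighbors learned on eth0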




Verify IPv6 connectivity from the Controller VM

nutanix@NTNX-Ctrl-VM-1-NTNX:172.16.8.84:~$ ifconfig eth0|grep inet6

inet6 addr: fe80::20c:29ff:fef2:cb25/64

nutanix@NTNX-Ctrl-VM-2-NTNX:172.16.8.85:~$ ifconfig eth0|grep inet6

inet6 addr: fe80::20c:29ff:feb0:3e61/64

nutanix@NTNX-Ctrl-VM-1-NTNX:172.16.8.84:~$ ping6 -I eth0 fe80::20c:29ff:feb0:3e61

PING fe80::20c:29ff:feb0:3e61(fe80::20c:29ff:feb0:3e61) from fe80::20c:29ff:fef2:cb25 eth0: 56 data bytes

64 bytes from fe80::20c:29ff:feb0:3e61: icmp_seq=1 ttl=64 time=18.0 ms


64 bytes from fe80::20c:29ff:feb0:3e61: icmp_seq=2 ttl=64 time=0.212 ms


64 bytes from fe80::20c:29ff:feb0:3e61: icmp_seq=3 ttl=64 time=0.180 ms



Dynamic ADD NODE:



Make sure the Rackable Unit Serial is different from the existing nodes. Newer factory-installed
systems do have different Rackable Unit Serial numbers.

If discovery shows fewer nodes than you expect to add to the existing cluster,
they are probably not connected to the same switch.
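
As a quick sanity check, you can count how many nodes discovery returns and list their Rackable Unit Serials straight from the discovery output shown below (a sketch based on the field names in that output):

ncli cluster discover-nodes | grep -c "Node Serial"          # number of nodes discovered
ncli cluster discover-nodes | grep "Rackable Unit Serial"    # must differ from the existing nodes' serial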


From an existing cluster's CVM:
ncli cluster discover-nodes
Cluster Id :
Hypervisor Address : 10.x.y.158
Ip : fe80::20c:29ff:feab:7822%eth0
Ipmi Address : 192.168.2.116
Node Position : A
Node Serial : 487c804a-dd23-49cc-bcc1-f8e7123dc0b3
Rackable Unit Model : NX-2000
Rackable Unit Serial : 2
Service Vm Address : 10.x.y.166
Svm Id :
Svm Version : ServiceVM-1.23_Ubuntu
Cluster Id :
Hypervisor Address : 10.14.23.161
Ip : fe80::20c:29ff:fe4d:5273%eth0
Ipmi Address : 192.x.y.119
Node Position : D
Node Serial : ac77f7cf-a248-46af-9d56-023392978bd9
Rackable Unit Model : NX-2000
Rackable Unit Serial : 2
Service Vm Address : 10.x.y.169
Svm Id :
Svm Version : ServiceVM-1.23_Ubuntu
Cluster Id :
Hypervisor Address : 10.x.y.160
Ip : fe80::20c:29ff:fea3:2642%eth0
Ipmi Address : 192.x.y.118
Node Position : C
Node Serial : b9237d59-4979-4829-9a83-dfaa64dd4b5c
Rackable Unit Model : NX-2000
Rackable Unit Serial : 2
Service Vm Address : 10.x.y.168
Svm Id :
Svm Version : ServiceVM-1.23_Ubuntu
Cluster Id :
Hypervisor Address : 10.x.y.159
Ip : fe80::20c:29ff:fee9:a13c%eth0
Ipmi Address : 192.x.y.117
Node Position : B
Node Serial : 6c31971d-fe7b-43e5-979f-2953a48a9d62
Rackable Unit Model : NX-2000
Rackable Unit Serial : 2
Service Vm Address : 10.x.y.167
Svm Id :
Svm Version : ServiceVM-1.23_Ubuntu
ncli cluster add-node node-serial=487c804a-dd23-49cc-bcc1-f8e7123dc0b3;ncli cluster add-node node-serial=6c31971d-fe7b-43e5-979f-2953a48a9d62;ncli cluster add-node node-serial=b9237d59-4979-4829-9a83-dfaa64dd4b5c;ncli cluster add-node node-serial=ac77f7cf-a248-46af-9d56-023392978bd9
Node added successfully
Node added successfully
Node added successfully
Node added successfully
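
With more nodes, the add-node calls can also be scripted straight from the discovery output instead of pasting each serial by hand. A rough sketch, assuming the "Node Serial : <uuid>" lines shown above:

for serial in $(ncli cluster discover-nodes | awk '/Node Serial/ {print $NF}'); do
ncli cluster add-node node-serial=$serial
done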
 ncli host list|grep "Service VM Address"
Service VM Address : 10.x.y.49
Service VM Address : 10.x.y.48
Service VM Address : 10.x.y.50
Service VM Address : 10.x.y.51
Service VM Address : 10.x.y.166
Service VM Address : 10.x.y.167
Service VM Address : 10.x.y.168
Service VM Address : 10.x.y.169
cluster status|grep CVM

CVM: 10.x.y.166 Up
CVM: 10.x.y.167 Up
CVM: 10.x.y.168 Up
CVM: 10.x.y.169 Up
CVM: 10.x.y.48 Up
CVM: 10.x.y.49 Up
CVM: 10.x.y.50 Up
CVM: 10.x.y.51 Up, ZeusLeader
2012-10-21 20:38:10 INFO cluster:906 Success!

nodetool -h localhost ring (nodes are added one by one in the Limbo state and then become Normal) - wait for all nodes to become Normal.
Address Status State Load Owns Token
pVOFDgRpkq7rwjiZf0A7PdlGDLSswKByL8RZOTKcrHowOfT5FYbhPvy7PJvJ
10.x.y.51 Up Normal 5.2 GB 20.00% 8zNLWFUeWeHJqvTxC9Fwc0CeIXGI5Xx7LnDjM2prxJR7YmfBrU1GnbaHPDnJ
10.x.y.49 Up Normal 4.02 GB 20.00% E1bIw6wcpRQ0XIGqRXkkN1Y5Af0b9ShinS36jxJxH9r56yZMqJxPztsE3Jiz
10.x.y.50 Up Normal 2.96 GB 20.00% TWCak3rlqTeO315iAG3asx0QNlPXfLkiqZswbC91t5TrLz1hsdBRDRCSR2OK
10.x.y.166 Up Limbo 774.37 MB 20.00% eZvBW6nzS9dTKtTMw6HVJ5RVNmeijP0UI2l8OyI76MYQLPsPcOjVoJzLcndo
10.x.y.48 Up Normal 4.35 GB 20.00% pVOFDgRpkq7rwjiZf0A7PdlGDLSswKByL8RZOTKcrHowOfT5FYbhPvy7PJvJ
nodetool -h localhost ring -- all 8 nodes in Normal state
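
Rather than re-running the command by hand, a small loop can poll the ring until no node is left in Limbo (a sketch; the state names match the output above):

while nodetool -h localhost ring | grep -q Limbo; do
echo "waiting for nodes to leave Limbo..."
sleep 60
done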
 
Connect to the Nutanix console
(https://CVM_IP). Edit the Storage Pool - add the SATA drives (+ until 5) and the PCIe drive (+ until 1) on the new nodes.
Edit the container and add the datastore on the 4 new nodes.

Add the 4 new nodes to vCenter.
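
Before (or right after) adding the hosts to vCenter, it is worth confirming on each new ESXi host that the NFS datastore is mounted. A quick sketch using the hypervisor addresses from the discovery output above:

for host in 10.x.y.158 10.x.y.159 10.x.y.160 10.x.y.161; do
ssh root@$host esxcfg-nas -l      # the Nutanix container should show as "mounted available"
done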

Friday, June 1, 2012

vCenter Disconnects due to APD or Firewall

When ESXi 5.0 disconnects temporarily from vCenter, more often than not it is related to storage issues; other times it can be the ESXi 5.0 firewall.

Storage Issues: (APD)

Symptoms:

vCenter disconnects and errors in the vmkernel log:

2011-07-30T14:47:41.361Z cpu1:2642)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.60a98000572d54724a34642d71325763" - failed to issue command due to Not found (APD), try again...

To prevent APD during Storage Array maintenance:

1. If you have a maintenance window, in vCenter -> ESXi host -> Configuration -> Storage, unmount the datastore (Datastores tab) and then detach the device (Devices tab). (You need the VMs to be powered off, no heartbeat datastore on it, and there are quite a few other prerequisites; an esxcli sketch for this step is shown after this list.)

2. The following command can also be executed to prevent APD if you have too many devices to unmount (depending on SCSI timeout values, it is still better to follow step 1).

To enable it without requiring downtime during storage array maintenance, execute (this may avoid having to unmount/detach datastores/devices):
# esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD

Note: This setting might prevent new storage devices from being discovered, so use it only during maintenance and revert it afterwards.
To check the current value, execute:
# esxcfg-advcfg -g /VMFS3/FailVolumeOpenIfAPD
To revert it after the maintenance, execute:
# esxcfg-advcfg -s 0 /VMFS3/FailVolumeOpenIfAPD
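
For reference, the unmount/detach from step 1 can also be done from the ESXi command line. A rough sketch for ESXi 5.x (the device ID from the vmkernel log above is used as an example); it does not remove the need to meet the prerequisites listed in step 1:
# esxcli storage filesystem unmount -l <datastore_label>
# esxcli storage core device set -d naa.60a98000572d54724a34642d71325763 --state=off
After the maintenance, set the device back with --state=on and rescan (esxcli storage core adapter rescan --all).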

Firewall Issues Causing vCenter Disconnects:
Symptoms (other than the vCenter disconnects themselves):
 Vpxa.log (on ESXi host):
Stolen/full sync required message:
"2012-02-02T18:32:49.941Z [6101DB90 info 'Default'
opID=HB-host-56@2627-b61e8cd4-e4] [VpxaMoService::GetChangesInt]
Forcing a full host synclastSentMasterGen = 2627 MasterGenNo from vpxd
= 2618
2012-02-02T18:32:49.941Z [6101DB90 verbose 'Default'
opID=HB-host-56@2627-b61e8cd4-e4] [VpxaMoService::GetChangesInt] Vpxa
restarted or stolen by other server. Start a full sync"
Difficulty translating between host and vpxd:
2012-01-24T19:26:05.705Z [FFDACB90 warning 'Default']
[FetchQuickStats] GetTranslators -- host to vpxd translation is empty.
Dropping results
2012-01-24T19:26:05.706Z [FFDACB90 warning 'Default']
[AddEntityMetric] GetTranslators -- host to vpxd translation is empty.
Dropping results
2012-01-24T19:26:05.706Z [FFDACB90 warning 'Default']
[AddEntityMetric] GetTranslators -- host to vpxd translation is empty.
Dropping results
2012-01-24T19:26:05.706Z [FFDACB90 warning 'Default']
[AddEntityMetric] GetTranslators -- host to vpxd translation is empty.
Dropping results
2012-01-24T19:26:05.706Z [FFDACB90 warning 'Default']
[AddEntityMetric] GetTranslators -- host to vpxd translation is empty.
Dropping results
 Vpxd.log (on vCenter Server):
 
Timeouts/Failed to respond and Host Sync failures:
2012-01-24T18:50:15.015-08:00 [00808 error 'HttpConnectionPool']
[ConnectComplete] Connect error A connection attempt failed because
the connected party did not properly respond after a period of time,
or established connection failed because connected host has failed to respond.
After trying a few options, our team was able to avoid the vCenter disconnects with:
esxcli network firewall load   (if the firewall module is unloaded you cannot enable HA, so we have to load it first)
esxcli network firewall set --enabled false
esxcli network firewall set --default-action true
esxcli network firewall get
 rc.local: (to persist between reboots)

1) echo -e "# disable firewall service\nesxcli network firewall load\nesxcli network firewall set --enabled false\nesxcli network firewall set --default-action true\n# disable firewall service" >> /etc/rc.local
2) Run auto-backup.sh so the change is saved and survives a reboot (see the check below).
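
To confirm the change took, you can check the end of the file, force the configuration backup, and re-read the firewall state. A sketch for ESXi 5.0 (on 5.1 and later the persistent file is /etc/rc.local.d/local.sh instead):
# tail -5 /etc/rc.local
# /sbin/auto-backup.sh
# esxcli network firewall get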

Addendum:
We had simple and dirty scripts to monitor the drops, as the vCenter disconnects were intermittent, recovered after a couple of minutes, and we could not sit in front of the computer all day.

a. cat check_vcenter_disconnect_hep.sh  ( ./check_vcenter_disconnect_hep.sh >> public_html/disconnect.txt )
#!/bin/sh
# Every 30 minutes, run the hep1-hep4 expect scripts (one per ESXi host)
# and keep only the log lines from the current UTC hour.
while true
do
dddd=`date -u +%Y-%m-%dT%H`    # current UTC date and hour, matches the log timestamp prefix
echo $dddd
for seq in `seq 1 4`
do
echo "hep$seq"
~/scripts/hep$seq | grep $dddd
done
sleep 1800
done
b. cat hep1 (we had hep1-hep4, one expect script per ESXi host)
#!/usr/bin/expect
# Log in to the ESXi host and grep its logs for the disconnect symptoms above
spawn ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@172.x.y.z
expect "word:"
send "esxi12345\r"
expect "#"
#send "rm  /var/run/log/vpxa.\[1-9\]\r"
#expect "#"
send "gunzip  /var/run/log/vpxa*.gz\r"
expect "#"
send "egrep stolen /var/run/log/vpxa*\r"
expect "#"
send "egrep -i dropping /var/run/log/vpx*\r"
expect "#"
send "egrep -i performance /var/log/vmkernel.log\r"
expect "#"
send "exit\r"
expect eof


Useful Webpages:
 http://www.virtualizationpractice.com/all-paths-down-16250/
PDLs/APDs http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004684

Monday, May 21, 2012

Creating Full clones on Nutanix via NFS VAAI

Aim: to create 320 VMs on a Nutanix NFS datastore and power on all 320 VMs.
Guest VM size - Windows 7, 20 GB HDD on NFS, 2 GB memory, with VMware Tools installed.
Number of ESX hosts - 4 ESX hosts (80 VMs per node).
Storage - the same ESX servers (no additional hardware other than the Arista switch interconnecting them) - compute and storage convergence.

Script help and source: http://www.vmdev.info/?p=202, Tabrez and Steve Poitras

vCenter before running the script (except for the clone master on the local datastore, there are no VMs other than the Nutanix Controller VMs):



 On Nutanix: - Create storage pool, container and NFS datastore - from a clean cluster.
 a. Create Storage Pool


b. Create container
c. Create NFS datastore




ESXi now sees the datastore:


esxcfg-nas -l
NTNX-ctr1 is /ctr1 from 192.168.5.2 mounted available
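
Full-clone offload over NFS also relies on the vendor NFS VAAI plugin being installed on each ESXi host. A quick, generic sanity check (the exact VIB name varies by plugin and version, so this is only a sketch):

esxcli software vib list | grep -i vaai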



Script to create full clones with thick vdisks from 1Win7-clone on the NFS datastore (1Win7-clone has VMware Tools installed and the Windows 7 power plan adjusted so it does not go into standby mode):


Connect-VIServer 10.2.8.59 -User administrator -Password ntnx
1. $vm = Get-VM 1Win7-clone |Get-View 

2. $cloneFolder = $vm.parent 
$cloneSpec = new-object Vmware.Vim.VirtualMachineCloneSpec
$cloneSpec.Location = new-object Vmware.Vim.VirtualMachineRelocateSpec
3. $cloneSpec.Location.DiskMoveType = [Vmware.Vim.VirtualMachineRelocateDiskMoveOptions]::moveAllDiskBackingsAndAllowSharing
4. $cloneSpec.Location.Transform = [Vmware.Vim.VirtualMachineRelocateTransformation]::flat

5. $global:testIterations = 320
for($i=1; $i -le $global:testIterations; $i++){
$cloneName = "Windows7-$i"
$vm.CloneVM( $cloneFolder, $cloneName, $cloneSpec ) }


Explanation:
1. Get-View of our clone master.
2. Place the clone in the same folder as the master; no datastore is set in the relocate spec, so the same datastore is used.
3. moveAllDiskBackingsAndAllowSharing - creates a full clone by copying all disks (but not snapshot metadata), from the root to the child-most disk, except for non-child-most disks previously copied to the target.
4. flat - causes the disks to be created as thick disks.
5. Loops from 1 to 320, creating Windows7-$i with the clone spec defined above.


The following vCenter screenshot shows clone creation with NFS VAAI in progress and 320 VMs being created.








 


Maintenance:
#To Remove VM
Remove-VM Windows7-*

# To Power on
Start-VM Windows7-*

#To start VMs on a specific ESX server:

Get-VMHost <ESX host IP> | Get-VM Windows7-* | where {$_.PowerState -eq "PoweredOff"} | Start-VM -RunAsync -Confirm:$false

Get-VMHost <ESX host IP> | Get-VM Windows7-* | where {$_.PowerState -eq "Suspended"} | Start-VM



#Migrate VM: (DRS should do it when powering on)

$global:testIterations = 80
for($i=1; $i -le $global:testIterations; $i++){
Get-VM -Name Windows7-$i | Move-VM -Destination (Get-VMHost 10.2.8.51) -RunAsync
}

$global:testIterations = 240
for($i=161; $i -le $global:testIterations; $i++){
Get-VM -Name Windows7-$i | Move-VM -Destination (Get-VMHost 10.2.8.53) -RunAsync
}

$global:testIterations = 320
for($i=241; $i -le $global:testIterations; $i++){
Get-VM -Name Windows7-$i | Move-VM -Destination (Get-VMHost 10.2.8.54) -RunAsync
}


# Get the IP from each VM to see if it has booted (VMware Tools must be installed):
Get-VM NTNX* | where {$_.PowerState -eq "PoweredOn"} | %{
write-host $_.Guest.IPAddress[0] }
10.2.8.59
10.2.8.60
10.2.8.55
10.2.8.56
10.2.8.57
10.2.8.58


To get the count:

$global:count = 0
Get-VM Windows7-* | where {$_.PowerState -eq "PoweredOn"} | %{
$ip = $_.Guest.IPAddress[0]
if ($ip) {          # count only VMs that have reported an IP
write-host $ip
$global:count += 1
}}

write-host "Count of IPs is " $global:count

<snippet>
 169.254.127.112

169.254.165.80
169.254.104.109
169.254.11.254
169.254.248.101
169.254.239.204
169.254.186.164
169.254.127.112
169.254.24.136
169.254.123.158
169.254.129.15
169.254.212.87
169.254.47.86
Count of IPs is 320 (monitor until all 320 are up)

Tuesday, May 8, 2012

Bringing Hadoop closer to live data!

Of late, I have been reading about and listening to my colleagues talk about Hadoop, MapReduce, and Twitter's Spouts and Bolts!

The most important part of Hadoop, HDFS with MapReduce, is already being implemented in Nutanix. This allows us to bring live data to the Hadoop cluster in read-only mode, accessing the same vdisks, rather than waiting for a server to dispatch the data nightly. Instead of batch MapReduce, we could run Spouts and Bolts to map-reduce continuously.

If we plan to run Hadoop jobs nightly, we could even have an adaptive, chameleon compute cluster that runs regular workloads (VDI, etc.) during the day and Hadoop at night. VMware has a lot of tools, and PowerCLI commands could achieve this by powering VMs off or shifting resources to the Hadoop VMs.

I am just so excited to work with Nutanix; there is so much more we could do by decoupling from centralized storage and moving to distributed storage.

We are just scratching the surface. My assignment is to read further and step into the future.

Friday, May 4, 2012

ESXi reboot reason:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1019238

Beyond SAN and FCOE

SAN Introduction:



When Ethernet was 10 Mb/s and 100 Mb/s and direct-attached storage was restricting the growth and clustering of servers (Veritas clustering, Sun clustering), a new technology - Fibre Channel - came into play to provide networking of storage arrays and a reliable transport layer. With reliable Fibre Channel and 8b/10b encoding (providing elasticity and additional frames such as IDLE/K28.5), SCSI CDBs were encapsulated in FC frames. Login mechanisms (FLOGI/PLOGI/PRLI/TPLS) and FC zoning/port security/LUN masking provided the access control, while FSPF/RCF/BF/DIA/principal switch election managed multiple SAN/FC switches.
Then came NPIV to reduce the pain of managing multiple domains, which brought the pain of one flapping host causing another host to fail (bad TCAM programming). Even after a few years of SAN administration, only a very select group of engineers understands this complex technology.

FCOE Introduction: 


While this revolution was going on, Gigabit Ethernet went through its own metamorphosis into 1G/10G/40G/100G. Engineers began to question the relevance of FC, since we can have reliable Ethernet via lossless switches with much lower latency and PAUSE frames (to manage congestion). Multiple vendors came up with their own methods of encapsulating FC frames (CEE/DCE). Vendors started building CNA adapters, with little thought given to whether FC or Ethernet should get MSI or MSI-X. The only adapter with all the functionality (SR-IOV/VN-Tag/VIFs) is the Palo adapter (M81KR), and it works only with UCS. Two competing technologies, VEPA and VN-Tag, exist on CNAs and they are not interoperable; a specific FCoE switch is needed to support VEPA or VN-Tag.

 http://infrastructureadventures.com/2010/12/08/io-virtualization-overview-cna-sr-iov-vn-tag-and-vepa/

I feel this is force-fitting FC into GigE to extend the life of FC, so that vendors can sell expensive storage arrays and expensive switches (FCoE/lossless/PFC).
But we don't need storage arrays or SAN switches: with the advent of powerful CPUs, larger hard drives, faster solid state devices, and PCIe storage, you can bring the storage back to the server.
Storage Array:
A storage array is multiple disks formatted in a vendor-specific RAID with a front-end cache, with NFS/FC/CNA adapters connected to an expensive SAN switch through FC cables, while the hosts also need expensive adapters (FC HBAs, GigE NICs, or CNAs).
Most array vendors take six months to a year to certify the latest HDDs, SSDs, adapters, and memory, and have a tedious process for upgrading the storage array.

Nutanix Introduction:

Radical or not so radical an approach: with powerful CPUs, a bigger network pipe (10G/40G), and the advent of faster spindles (local drives), solid state devices, and PCIe storage, Nutanix makes it easier to adopt newer technologies in this area without having to replace your storage array.
Whenever a new CPU, an HDD with a faster spindle, or a new network adapter comes to market, we can easily swap it in and get better performance.

Nutanix depends on standard GigE infrastructure to make this all work, without having to spend on separate storage processors, a storage array, and FC/FCoE infrastructure on top of the GigE infrastructure.

Instead of the hodge-podge approach of migrating the SAN from FC to FCoE with multiple (non-)standards, this approach is future-safe and leapfrogs the current one.


Conclusion:
The big established vendors don't want disruption; they want a slow process of changing one component at a time, so that they have time to reinvent themselves and have their customers buy over-provisioned switches/servers/storage arrays/vendor-specific HDDs that will be end-of-life in a few years. Customers get stuck reprovisioning, upgrading in baby steps, migrating to a new storage array or switch, learning new technologies and spending training dollars, instead of being productive with a few basic technology moving parts.
The best of the FCoE technology should be incorporated in the future: VN-Tag to create VIFs (SR-IOV), lossless (or less-loss) Ethernet, PFC, and SPMA/FPMA are awesome technologies that need to make it into next-gen Ethernet switches, but not VFC/FCoE/FIP.

It is time to build a futuristic datacenter with Nutanix Complete Cluster.