Monday, April 1, 2013

Reclaim space on storage from a VM

When files are deleted within a guest VM running on an NFS datastore mounted on ESXi, only the guest filesystem's inode entries are updated to mark those blocks as reusable. The underlying NFS storage does not free up the space.


The blog linked above talks about reclaiming space via VMware Tools and sdelete (for Windows guests).

On Linux guests, you can do this manually. First check the free space inside the VM:

df -h

If /mnt/local_disk1 shows 32G free, you can zero out roughly 32G (leave a little headroom):

dd if=/dev/zero of=/mnt/local_disk1/zero.out bs=$((4*1024*1024)) count=8000 && rm /mnt/local_disk1/zero.out

This writes about 32 GB of zeros (4 MB blocks x 8000) and then deletes zero.out, so the zeroed blocks can be reclaimed on the storage side.
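If the VM has several of these mounts, the same idea can be looped (a rough sketch; the mount names /mnt/local_disk1..3 and the count are assumptions, so size the count below the free space df -h reports for each mount):

#!/bin/sh
# Zero-fill and delete a temporary file on each mount so the backing
# storage can reclaim the blocks (hypothetical mount names).
for i in 1 2 3; do
    mnt="/mnt/local_disk${i}"
    # ~32 GB of zeros per mount: 4 MB blocks x 8000
    dd if=/dev/zero of="${mnt}/zero.out" bs=$((4*1024*1024)) count=8000
    rm -f "${mnt}/zero.out"
done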

Friday, March 15, 2013

Troubleshooting vCenter disconnects

Recently a customer reported that the ESXi connection to vCenter drops during backups or
during vMotion.

Synopsis:

1. VMware might not support more than 100 connections per vmk port.
2. It is good to separate vMotion onto its own vmknic and VLAN rather than sharing the management port (make sure the vmks are in different subnets).
3. The firewall in ESXi might not be able to handle the traffic.
4. Check ethtool for any network errors or drops (checking for duplicate IPs always helps).
5. The NFS datastore has more than 20,000 files (ls -lR | wc -l).
6. ESXi resources (CPU/memory; there might be other performance issues).
7. Disable HA/DRS to see if the disconnects disappear (related to datastores with
a lot of files).
8. vpxa/vpxd crashes.

Details:

1. How to check the number of connections per vmk?

Quick script:


~ # esxcli network ip connection list | sort -k 4 | awk '{print $4}' | cut -d ":" -f 1 | uniq -c
      1 ------------------
     14 0.0.0.0
    394 10.X.Y.150   <<<< all were stuck in FIN_WAIT_2
     40 127.0.0.1
      7 192.168.5.1
      1 Send
~ # esxcli network ip connection list | sort -k 4
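A related one-liner to count connections by TCP state and spot a FIN_WAIT_2 pile-up (a sketch; the State column is assumed to be column 6 here, and column positions can shift between ESXi versions):

~ # esxcli network ip connection list | awk 'NR>1 {print $6}' | sort | uniq -c | sort -rn
# a large FIN_WAIT_2 count points at the PR 831801 timer issue described below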

2. Separate the vMotion and management vmknic/VLAN.

3. Firewall settings:
~ # esxcli network firewall get
   Default Action: PASS
   Enabled: false
   Loaded: true   (it should be loaded for HA to work)

4. Check the ethtool statistics to find any errors or packet drops caused by the switch, cable, or port:

     "ethtool -S vmnic0 |grep error| grep -v :\ 0"
     "ethtool -S vmnic0 |grep rx_no_buffer_count | grep -v :\ 0"
     "ethtool -S vmnic0 |grep drop"

5. Duplicate IP address: run arp commands from a different node while pinging the ESXi host,
or ssh to the ESXi host and see whether the session keeps disconnecting.

esxcli network ip neighbor list   # see if any MAC entries are flapping
arp -a   # on Linux
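From a Linux box on the same subnet, arping's duplicate-address-detection mode is another quick check (a sketch; replace eth0 and the address with your interface and the ESXi management IP):

# Any reply means some other MAC is already answering for that IP.
arping -D -I eth0 -c 3 10.x.y.150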



6. Check esxcfg-vmknic -l and make sure none of the vmk IPs are duplicated, conflicting, or sharing a subnet with another vmk.
7. Finally, review vpxa.log, vmkernel.log (for APD), and hostd.log on the ESXi host, as well as vpxd.log on the vCenter server; the vmkernel logs also show vpxa crashes.

8. When the host agent is hung, the fix is to restart the management services (services.sh restart).

9. PR 831801: The default value of the FIN_WAIT_2 timer was erroneously set to TCPTV_KEEPINTVL * TCP_KEEPCNT = 75 * 0x400. This discrepancy causes sockets in the FIN_WAIT_2 state to linger far longer than intended, and if many such sockets accumulate they can impact new socket creation.

10. Disable the firewall on ESXi (use these settings exactly, so that HA still works):
 # esxcli network firewall get
   Default Action: DROP or PASS
   Enabled: false
   Loaded: true

11. Increase the vpxa thread count:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2009217

12. If the NFS datastore has more than 20,000 files, vpxa could hang.

13. If there is memory contention, make sure the ESXi host has 2 GB of memory reserved.
 
 



Thursday, March 7, 2013

Heat Map Analysis

Versions Affected: 2.6.4; 3.0.2
Description
The heat_map_printer command is available in 3.0.2 and can be downloaded for a
2.6.4 cluster (ithaca: ~jerome/heat_map_printer.2.6.4).

heat_map_printer should be run on a running, active cluster.

heat_map_printer calculates how many extent groups were accessed in each
tier during the last six hours across all the nodes.

On a 400 GB SSD-PCIe card, 200 GB is used for extent groups (stargate-data);
with replication factor 2, only 100 GB is available per node.
Solution
Output:

Column 1: the access-age bucket (extent groups accessed during the last X seconds).
Column 2: accesses within that bucket
  (between 300 sec and 600 sec, 3930 extent groups were accessed).
Column 3: cumulative accesses
  (from 0 to 600 sec, 15907 = 11977 + 3930 extent groups were accessed).

An extent group is 16 MB in size, so within less than 5 minutes 11977 extent groups
were accessed; the total is (11977 * 16)/1024, roughly 187 GB.
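The same arithmetic can be applied to the whole histogram (a sketch; it assumes the heat_map_printer output was saved to a file, here called heatmap.out, with the per-bucket count in the third column as in the sample further below):

# Convert per-bucket extent-group counts to GB (16 MB per extent group).
awk '/^ *</ {printf "%-8s %8.1f GB\n", $1, $3 * 16 / 1024}' heatmap.out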

The output is split into SSD, HDD, and total histograms. The figures are more
accurate within the last 15 minutes for SSD and after 30 minutes for HDD, because
there is less ILM migration activity in those windows.

The last section lists the VMDK names and how busy each of the VMDKs was, in
descending order.
Histogram for storage tier SSD-PCIe
 <300      : 11977      11977
 <600      : 3930       15907
 <900      : 2457       18364
 <1200     : 747        19111
 <1500     : 1134       20245
 <1800     : 1245       21490
 <2100     : 1445       22935
 <2400     : 935        23870
 <2700     : 988        24858
 <3000     : 1076       25934
 <3300     : 811        26745
 <3600     : 618        27363
Histogram for storage tier DAS-SATA
 <300      : 1082       1082
 <600      : 743        1825
 <900      : 663        2488
 <1200     : 433        2921
 <1500     : 374        3295
 <1800     : 854        4149
 <2100     : 724        4873
 <2400     : 660        5533
 <2700     : 969        6502
 <3000     : 729        7231
 <3300     : 806        8037
 <3600     : 481        8518
Total histogram
 <300      : 13059      13059
 <600      : 4673       17732
 <900      : 3120       20852
 <1200     : 1180       22032
 <1500     : 1508       23540
 <1800     : 2099       25639
 <2100     : 2169       27808
 <2400     : 1595       29403
 <2700     : 1957       31360
 <3000     : 1805       33165
 <3300     : 1617       34782
 <3600     : 1099       35881
VDisk id per egroup access: total egroups(35889)   (the VMDK names below were
anonymized to "Username")
     47854:       1292 (03 pct) fname:Username-flat.vmdk
     22204:       1180 (03 pct) fname:Username-flat.vmdk
     22205:       1130 (03 pct) fname:Username-flat.vmdk
    132938:       1125 (03 pct) fname:Username-flat.vmdk
     78361:       1110 (03 pct) fname:Username-flat.vmdk
    139761:       1089 (03 pct) fname:Username-flat.vmdk
     81750:       1045 (02 pct) fname:Username-flat.vmdk
     80434:       1007 (02 pct) fname:Username-flat.vmdk
     22240:        967 (02 pct) fname:Username-flat.vmdk
    161460:        951 (02 pct) fname:Username-flat.vmdk
     61030:        949 (02 pct) fname:Username-flat.vmdk
    130895:        948 (02 pct) fname:Username-flat.vmdk
     41526:        842 (02 pct) fname:Username-flat.vmdk
    104769:        833 (02 pct) fname:Username-flat.vmdk
    128032:        796 (02 pct) fname:Username-flat.vmdk
     22282:        758 (02 pct) fname:Username-flat.vmdk
     63566:        752 (02 pct) fname:Username-flat.vmdk
    140027:        743 (02 pct) fname:Username-flat.vmdk
    103007:        670 (01 pct) fname:Username-flat.vmdk
     75787:        646 (01 pct) fname:Username-flat.vmdk
     22246:        642 (01 pct) fname:Username-flat.vmdk
    135960:        640 (01 pct) fname:Username-flat.vmdk
    155611:        565 (01 pct) fname:Username-flat.vmdk
    127394:        564 (01 pct) fname:Username-flat.vmdk
     59341:        540 (01 pct) fname:Username-flat.vmdk
      2535:        518 (01 pct) fname:Username-flat.vmdk
     30385:        509 (01 pct) fname:Username-flat.vmdk
     82923:        498 (01 pct) fname:Username-flat.vmdk
    161143:        443 (01 pct) fname:Username-flat.vmdk
     50827:        424 (01 pct) fname:Username-flat.vmdk
    162668:        413 (01 pct) fname:Username-flat.vmdk
     39376:        411 (01 pct) fname:Username-flat.vmdk
     33416:        396 (01 pct) fname:Username-flat.vmdk
     29321:        393 (01 pct) fname:Username-flat.vmdk
     31597:        355 (00 pct) fname:Username-flat.vmdk
     41560:        355 (00 pct) fname:Username-flat.vmdk
     58304:        302 (00 pct) fname:Username-flat.vmdk
    138675:        300 (00 pct) fname:Username-flat.vmdk
   1138125:        284 (00 pct) fname:Username-flat.vmdk
     38653:        282 (00 pct) fname:Username-flat.vmdk
    106381:        273 (00 pct) fname:Username-flat.vmdk
    131231:        267 (00 pct) fname:Username-flat.vmdk
     54950:        261 (00 pct) fname:Username-flat.vmdk
     46777:        261 (00 pct) fname:Username-flat.vmdk
    139497:        246 (00 pct) fname:Username-flat.vmdk
    124993:        237 (00 pct) fname:Username-flat.vmdk
     79091:        236 (00 pct) fname:Username-flat.vmdk
     49942:        225 (00 pct) fname:Username-flat.vmdk
    151952:        225 (00 pct) fname:Username-flat.vmdk
   1222704:        223 (00 pct) fname:Username-flat.vmdk
     22018:        220 (00 pct) fname:Username-flat.vmdk
     65750:        219 (00 pct) fname:Username-flat.vmdk
    108415:        211 (00 pct) fname:Username-flat.vmdk
    112976:        208 (00 pct) fname:Username-000001-delta.vmdk
    137102:        208 (00 pct) fname:Username-flat.vmdk
    131914:        206 (00 pct) fname:Username-flat.vmdk
     46847:        205 (00 pct) fname:Username-flat.vmdk
     22284:        200 (00 pct) fname:Username-flat.vmdk
    103593:        199 (00 pct) fname:Username-flat.vmdk
     22251:        199 (00 pct) fname:Username-flat.vmdk
    105040:        190 (00 pct) fname:Username-flat.vmdk
     48304:        187 (00 pct) fname:Username-flat.vmdk
    114661:        186 (00 pct) fname:Username-flat.vmdk
     49179:        186 (00 pct) fname:Username-flat.vmdk
    125596:        184 (00 pct) fname:Username-flat.vmdk
     81201:        183 (00 pct) fname:Username-flat.vmdk
    108229:        181 (00 pct) fname:Username-flat.vmdk
     83459:        181 (00 pct) fname:Username-flat.vmdk
     74649:        169 (00 pct) fname:Username-flat.vmdk
    112339:        165 (00 pct) fname:Username-flat.vmdk
     32009:        156 (00 pct) fname:Username-flat.vmdk
     38308:        155 (00 pct) fname:Username-flat.vmdk
     37874:        154 (00 pct) fname:Username-flat.vmdk
    113781:        152 (00 pct) fname:Username-flat.vmdk
     73465:        150 (00 pct) fname:Username-flat.vmdk
     60248:        150 (00 pct) fname:Username-flat.vmdk
     73301:        148 (00 pct) fname:Username-flat.vmdk
    345401:        145 (00 pct) fname:Username-flat.vmdk
     50071:        136 (00 pct) fname:Username-flat.vmdk
     37870:        133 (00 pct) fname:Username-flat.vmdk
     58256:        125 (00 pct) fname:Username-flat.vmdk
     61625:        123 (00 pct) fname:Username-flat.vmdk
    570923:         93 (00 pct) fname:Username-flat.vmdk
     14826:         53 (00 pct) fname:Username-flat.vmdk
    930800:          4 (00 pct) fname:vmware.log
    930512:          2 (00 pct) fname:vmware.log
      4221:          1 (00 pct) fname:protectedlist_
    133299:          1 (00 pct) fname:vmware.log
The perl one-liner used to anonymize the VMDK names above:
 perl -ne 's/$1/Username/  if /fname:(.*?)-/;print' heatmap.20130307095821

Tuesday, December 11, 2012

ESXtop:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1008205

SIOC:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1019687

Dynamic Add node and IPV6

Nutanix IPV6 requirements



Nutanix software version 2.6 requires IPv6 link-local addresses
a. to discover Nutanix nodes when configuring them for the first time,
b. to discover nodes during dynamic add node,
c. to reconfigure IP addresses.
Most switches support the IPv6 neighbour discovery protocol over IPv6 link-local addresses even if IPv6 is not enabled on the routers. This post explains link-local addresses, how to verify that the switch passes them, and how to verify them on the Controller VM. Please note that all the Controller VMs should be connected to the
same broadcast domain, so that the IPv6 link-local addresses are reachable.




Verify IPv6 connectivity from the Controller VM

nutanix@NTNX-Ctrl-VM-1-NTNX:172.16.8.84:~$ ifconfig eth0|grep inet6

inet6 addr: fe80::20c:29ff:fef2:cb25/64

nutanix@NTNX-Ctrl-VM-2-NTNX:172.16.8.85:~$ ifconfig eth0|grep inet6

inet6 addr: fe80::20c:29ff:feb0:3e61/64

nutanix@NTNX-Ctrl-VM-1-NTNX:172.16.8.84:~$ ping6 -I eth0 fe80::20c:29ff:feb0:3e61

PING fe80::20c:29ff:feb0:3e61(fe80::20c:29ff:feb0:3e61) from fe80::20c:29ff:fef2:cb25 eth0: 56 data bytes

64 bytes from fe80::20c:29ff:feb0:3e61: icmp_seq=1 ttl=64 time=18.0 ms


64 bytes from fe80::20c:29ff:feb0:3e61: icmp_seq=2 ttl=64 time=0.212 ms


64 bytes from fe80::20c:29ff:feb0:3e61: icmp_seq=3 ttl=64 time=0.180 ms



Dynamic ADD NODE:



Make sure the Rackable Unit Serial is different from the existing nodes'. Newer factory-installed
systems do have different Rackable Unit Serial numbers.

If discovery shows fewer nodes than you expect to add to the existing cluster,
they are probably not connected to the same switch.
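A quick way to confirm the new nodes share the broadcast domain is the same link-local check shown above (a sketch; substitute the fe80:: address that ifconfig or discover-nodes reports for the node in question):

ifconfig eth0 | grep inet6                        # link-local address of this CVM
ping6 -I eth0 fe80::20c:29ff:feab:7822            # link-local address of a node to be added (from discover-nodes)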


From a CVM in the existing cluster:
ncli cluster discover-nodes
Cluster Id :
Hypervisor Address : 10.x.y.158
Ip : fe80::20c:29ff:feab:7822%eth0
Ipmi Address : 192.168.2.116
Node Position : A
Node Serial : 487c804a-dd23-49cc-bcc1-f8e7123dc0b3
Rackable Unit Model : NX-2000
Rackable Unit Serial : 2
Service Vm Address : 10.x.y.166
Svm Id :
Svm Version : ServiceVM-1.23_Ubuntu
Cluster Id :
Hypervisor Address : 10.14.23.161
Ip : fe80::20c:29ff:fe4d:5273%eth0
Ipmi Address : 192.x.y.119
Node Position : D
Node Serial : ac77f7cf-a248-46af-9d56-023392978bd9
Rackable Unit Model : NX-2000
Rackable Unit Serial : 2
Service Vm Address : 10.x.y.169
Svm Id :
Svm Version : ServiceVM-1.23_Ubuntu
Cluster Id :
Hypervisor Address : 10.x.y.160
Ip : fe80::20c:29ff:fea3:2642%eth0
Ipmi Address : 192.x.y.118
Node Position : C
Node Serial : b9237d59-4979-4829-9a83-dfaa64dd4b5c
Rackable Unit Model : NX-2000
Rackable Unit Serial : 2
Service Vm Address : 10.x.y.168
Svm Id :
Svm Version : ServiceVM-1.23_Ubuntu
Cluster Id :
Hypervisor Address : 10.x.y.159
Ip : fe80::20c:29ff:fee9:a13c%eth0
Ipmi Address : 192.x.y117
Node Position : B
Node Serial : 6c31971d-fe7b-43e5-979f-2953a48a9d62
Rackable Unit Model : NX-2000
Rackable Unit Serial : 2
Service Vm Address : 10.x.y.167
Svm Id :
Svm Version : ServiceVM-1.23_Ubuntu
ncli cluster add-node node-serial=487c804a-dd23-49cc-bcc1-f8e7123dc0b3;ncli cluster add-node node-serial=6c31971d-fe7b-43e5-979f-2953a48a9d62;ncli cluster add-node node-serial=b9237d59-4979-4829-9a83-dfaa64dd4b5c;ncli cluster add-node node-serial=ac77f7cf-a248-46af-9d56-023392978bd9
Node added successfully
Node added successfully
Node added successfully
Node added successfully
 ncli host list|grep "Service VM Address"
Service VM Address : 10.x.y.49
Service VM Address : 10.x.y.48
Service VM Address : 10.x.y.50
Service VM Address : 10.x.y.51
Service VM Address : 10.x.y.166
Service VM Address : 10.x.y.167
Service VM Address : 10.x.y.168
Service VM Address : 10.x.y.169
cluster status|grep CVM

CVM: 10.x.y.166 Up
CVM: 10.x.y.167 Up
CVM: 10.x.y.168 Up
CVM: 10.x.y.169 Up
CVM: 10.x.y.48 Up
CVM: 10.x.y.49 Up
CVM: 10.x.y.50 Up
CVM: 10.x.y.51 Up, ZeusLeader
2012-10-21 20:38:10 INFO cluster:906 Success!

nodetool -h localhost ring   (nodes are added one by one in the Limbo state and then become Normal; wait for all nodes to become Normal)
Address Status State Load Owns Token
pVOFDgRpkq7rwjiZf0A7PdlGDLSswKByL8RZOTKcrHowOfT5FYbhPvy7PJvJ
10.x.y3.51 Up Normal 5.2 GB 20.00% 8zNLWFUeWeHJqvTxC9Fwc0CeIXGI5Xx7LnDjM2prxJR7YmfBrU1GnbaHPDnJ
10.x.y.49 Up Normal 4.02 GB 20.00% E1bIw6wcpRQ0XIGqRXkkN1Y5Af0b9ShinS36jxJxH9r56yZMqJxPztsE3Jiz
10.x.y.50 Up Normal 2.96 GB 20.00% TWCak3rlqTeO315iAG3asx0QNlPXfLkiqZswbC91t5TrLz1hsdBRDRCSR2OK
10.x.y.166 Up Limbo 774.37 MB 20.00% eZvBW6nzS9dTKtTMw6HVJ5RVNmeijP0UI2l8OyI76MYQLPsPcOjVoJzLcndo
10.x.y.48 Up Normal 4.35 GB 20.00% pVOFDgRpkq7rwjiZf0A7PdlGDLSswKByL8RZOTKcrHowOfT5FYbhPvy7PJvJ
nodetool -h localhost ring   # all 8 nodes are now in Normal mode
 
Connect to the Nutanix console (https://CVM_IP). Edit the storage pool: add the SATA drives (+ until 5) and PCIe (+ until 1) on the new nodes.
Edit the container and add the datastore on the 4 new nodes.

Add the 4 new nodes to vCenter.

Friday, June 1, 2012

vCenter Disconnects due to APD or Firewall

When ESXi 5.0 disconnects temporarily from vCenter, more often than not it is related to storage issues; other times it can be the ESXi 5.0 firewall.

Storage Issues: (APD)

Symptoms:

vCenter disconnects and errors in the VMkernel log:

2011-07-30T14:47:41.361Z cpu1:2642)WARNING: NMP: nmpDeviceAttemptFailover:658:Retry world failover device "naa.60a98000572d54724a34642d71325763" - failed to issue command due to Not found (APD), try again...

To prevent APD during Storage Array maintenance:

1. If you have a maintenance window, in vCenter -> ESXi host -> Configuration -> Storage, unmount the datastore (Datastores tab) and then detach the device (Devices tab). (The VMs need to be powered off, the datastore must not be an HA heartbeat datastore, and there are quite a few other prerequisites.)

2. The following commands can also be executed to prevent APD if you have too many devices to unmount individually. (Depending on SCSI timeout values, it is still better to follow step 1.)

To enable it without requiring downtime during storage array maintenance (this might avoid having to unmount/detach the datastores/devices), execute:
# esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD

Note: this option might prevent new storage devices from being discovered, so use it only during the maintenance window and revert it afterwards.
To check the value of this option, execute:
# esxcfg-advcfg -g /VMFS3/FailVolumeOpenIfAPD
Revert it back after the maintenance:
# esxcfg-advcfg -s 0 /VMFS3/FailVolumeOpenIfAPD
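Put together, a minimal maintenance-window sequence (a sketch using only the commands above):

# Before the storage array maintenance: fail volume opens instead of entering APD.
esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD
esxcfg-advcfg -g /VMFS3/FailVolumeOpenIfAPD    # confirm the value is 1

# ... perform the storage array maintenance ...

# After the maintenance: revert so new storage devices are discovered normally.
esxcfg-advcfg -s 0 /VMFS3/FailVolumeOpenIfAPD
esxcfg-advcfg -g /VMFS3/FailVolumeOpenIfAPD    # confirm the value is back to 0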

Firewall Issues Causing vCenter Disconnects:
Symptoms (other than the vCenter disconnects themselves):
vpxa.log (on the ESXi host):
Stolen/full sync required message:
"2012-02-02T18:32:49.941Z [6101DB90 info 'Default'
opID=HB-host-56@2627-b61e8cd4-e4] [VpxaMoService::GetChangesInt]
Forcing a full host synclastSentMasterGen = 2627 MasterGenNo from vpxd
= 2618
2012-02-02T18:32:49.941Z [6101DB90 verbose 'Default'
opID=HB-host-56@2627-b61e8cd4-e4] [VpxaMoService::GetChangesInt] Vpxa
restarted or stolen by other server. Start a full sync"
Difficulty translating between host and vpxd:
2012-01-24T19:26:05.705Z [FFDACB90 warning 'Default']
[FetchQuickStats] GetTranslators -- host to vpxd translation is empty.
Dropping results
2012-01-24T19:26:05.706Z [FFDACB90 warning 'Default']
[AddEntityMetric] GetTranslators -- host to vpxd translation is empty.
Dropping results
2012-01-24T19:26:05.706Z [FFDACB90 warning 'Default']
[AddEntityMetric] GetTranslators -- host to vpxd translation is empty.
Dropping results
2012-01-24T19:26:05.706Z [FFDACB90 warning 'Default']
[AddEntityMetric] GetTranslators -- host to vpxd translation is empty.
Dropping results
2012-01-24T19:26:05.706Z [FFDACB90 warning 'Default']
[AddEntityMetric] GetTranslators -- host to vpxd translation is empty.
 Vpxd.log (on vCenter Server):
 
Timeouts/failed-to-respond and host sync failures:
2012-01-24T18:50:15.015-08:00 [00808 error 'HttpConnectionPool']
[ConnectComplete] Connect error A connection attempt failed because
the connected party did not properly respond after a period of time,
or established connection failed because connected host has failed to respond.
After trying a few options, our team was able to avoid the vCenter disconnects with:
esxcli network firewall load                 # if it is unloaded you cannot enable HA, so it has to stay loaded
esxcli network firewall set --enabled false
esxcli network firewall set --default-action true
esxcli network firewall get
rc.local (to persist the settings between reboots):

1) echo -e "# disable firewall service\nesxcli network firewall load\nesxcli network firewall set --enabled false\nesxcli network firewall set --default-action true\n# disable firewall service" >> /etc/rc.local
2) Run auto-backup.sh so the change is included in the ESXi configuration backup.

Addendum:
We had some simple, quick-and-dirty scripts to monitor for the drops, since the vCenter disconnects were intermittent, would recover after a couple of minutes, and we could not sit in front of the computer all day.

a. cat check_vcenter_disconnect_hep.sh   (run as: ./check_vcenter_disconnect_hep.sh >> public_html/disconnect.txt)
#!/bin/sh
# Every 30 minutes, run the hep1..hep4 expect scripts (below) and grep their
# output for entries from the current UTC hour.
while true
do
dddd=`date -u +%Y-%m-%dT%H`
echo $dddd
for seq in `seq 1 4`
do
echo "hep$seq"
~/scripts/hep$seq |grep $dddd
done
sleep 1800
done
b. cat hep1   (we had hep1 through hep4)
#!/usr/bin/expect
# ssh to the ESXi host and grep its vpxa/vmkernel logs for disconnect symptoms.
spawn ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@172.x.y.z
expect "word:"
send "esxi12345\r"
expect "#"
#send "rm  /var/run/log/vpxa.\[1-9\]\r"
#expect "#"
send "gunzip  /var/run/log/vpxa*.gz\r"
expect "#"
send "egrep stolen /var/run/log/vpxa*\r"
expect "#"
send "egrep -i dropping /var/run/log/vpx*\r"
expect "#"
send "egrep -i performance /var/log/vmkernel.log\r"
expect "#"
send "exit\r"


Useful Webpages:
 http://www.virtualizationpractice.com/all-paths-down-16250/
PDLs/APDs http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004684

Monday, May 21, 2012

Creating Full clones on Nutanix via NFS VAAI

Aim: create 320 VMs on a Nutanix NFS datastore and power on all 320 VMs.
Guest VM size: Windows 7, 20 GB HDD on NFS, 2 GB memory, with VMware Tools installed.
Number of ESX hosts: 4 (80 VMs per node).
Storage: the same ESX servers (no additional hardware other than the Arista switch interconnecting these
ESX servers) - compute and storage convergence.

Script help and source: http://www.vmdev.info/?p=202, Tabrez and Steve Poitras

vCenter before running the script (other than the clone master on the local datastore, there are no VMs except the Nutanix Controller VMs):



On Nutanix, starting from a clean cluster, create the storage pool, container, and NFS datastore:
a. Create the storage pool


b. Create the container
c. Create the NFS datastore




ESXi now sees the datastore:


esxcfg-nas -l
NTNX-ctr1 is /ctr1 from 192.168.5.2 mounted available



Script to create full clones (thick vdisks) from 1Win7-clone on the NFS datastore. (1Win7-clone has VMware Tools installed and the Windows 7 power settings disabled so that it does not go into standby mode.)


Connect-VIServer 10.2.8.59 -User administrator -Password ntnx
1. $vm = Get-VM 1Win7-clone |Get-View 

2. $cloneFolder = $vm.parent 
$cloneSpec = new-object Vmware.Vim.VirtualMachineCloneSpec
$cloneSpec.Location = new-object Vmware.Vim.VirtualMachineRelocateSpec
3. $cloneSpec.Location.DiskMoveType = [Vmware.Vim.VirtualMachineRelocateDiskMoveOptions]::moveAllDiskBackingsAndAllowSharing
4. $cloneSpec.Location.Transform = [Vmware.Vim.VirtualMachineRelocateTransformation]::flat

5. $global:testIterations = 320
for($i=1; $i -le $global:testIterations; $i++){
$cloneName = "Windows7-$i"
$vm.CloneVM( $cloneFolder, $cloneName, $cloneSpec ) }


Explanation:
1. Get-View of our clone master.
2. Use the same folder (and datastore) as the clone master.
3. moveAllDiskBackingsAndAllowSharing - creates a full clone by copying all disks (but not snapshot metadata), from the root to the child-most disk, except for non-child-most disks previously copied to the target.
4. flat - causes the disks to be created as thick disks.
5. Loops from 1 to 320, creating Windows7-$i with the clone spec defined above.


The following vCenter screenshot shows clone creation with NFS VAAI in progress and the 320 VMs being created:








 


Maintenance:
#To Remove VM
Remove-VM Windows7-*

# To Power on
Start-VM Windows7-*

#To start VMs on a specific ESX server:

Get-VMHost <ip> | Get-VM Windows7-* | where {$_.'PowerState' -eq "PoweredOff"} | Start-VM -RunAsync -Confirm:$false

Get-VMHost <ip> | Get-VM Windows7-* | where {$_.'PowerState' -eq "Suspended"} | Start-VM



#Migrate VM: (DRS should do it when powering on)

$global:testIterations = 80
for($i=1; $i -le $global:testIterations; $i++){
Get-VM -Name Windows7-$i | Move-VM -Destination (Get-VMHost 10.2.8.51)  -RunAsync
}

$global:testIterations = 240
for($i=161; $i -le $global:testIterations; $i++){
Get-VM -Name Windows7-$i | Move-VM -Destination (Get-VMHost 10.2.8.53)  -RunAsync
}

$global:testIterations = 320
for($i=241; $i -le $global:testIterations; $i++){
Get-VM -Name Windows7-$i | Move-VM -Destination (Get-VMHost 10.2.8.54)  -RunAsync
}


# Get the IP from each VM to see if it has booted (VMware Tools must be installed):
Get-VM NTNX* |where {$_.'PowerState' -eq "PoweredOn"}| %{
write-host $_.Guest.IPAddress[0]}
10.2.8.59
10.2.8.60
10.2.8.55
10.2.8.56
10.2.8.57
10.2.8.58


To get the count of powered-on VMs that have reported an IP:

$global:count = 0
Get-VM Windows7-* | where {$_.'PowerState' -eq "PoweredOn"} | %{
  $ip = $_.Guest.IPAddress[0]
  if ($ip) {    # skip VMs whose tools have not reported an IP yet
    write-host $ip
    $global:count += 1
  }
}

write-host "Count of IPs is" $global:count

<snippet>
 169.254.127.112

169.254.165.80
169.254.104.109
169.254.11.254
169.254.248.101
169.254.239.204
169.254.186.164
169.254.127.112
169.254.24.136
169.254.123.158
169.254.129.15
169.254.212.87
169.254.47.86
Count of VMs with an IP is 320   (keep monitoring until all 320 are up)