Monday, May 21, 2012

Creating Full clones on Nutanix via NFS VAAI

Aim: create 320 VMs on a Nutanix NFS datastore and power all 320 VMs on.
Guest VM size - Windows 7, 20 GB HDD on NFS, 2 GB memory, with VMware Tools installed.
Number of ESX hosts - 4 ESX hosts (80 VMs per node)
Storage - the same ESX servers (no additional hardware other than the Arista switch interconnecting them) - compute and storage convergence.

Script help and source: http://www.vmdev.info/?p=202, Tabrez and Steve Poitras

vCenter before running the script (except for the clone master on the local datastore, there are no VMs other than the Nutanix controller VMs):



On Nutanix, starting from a clean cluster, create the storage pool, container, and NFS datastore:
 a. Create Storage Pool


b. Create container
c. Create NFS datastore
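
Step c can also be done from PowerCLI instead of the Nutanix UI. A minimal sketch, assuming the controller exports the container as /ctr1 at 192.168.5.2 and the datastore is to be named NTNX-ctr1 (matching the esxcfg-nas output below):

# Hypothetical PowerCLI equivalent of step c: mount the exported container
# as an NFS datastore on every ESX host in the cluster.
Connect-VIServer 10.2.8.59 -User administrator -Password ntnx
foreach ($esx in Get-VMHost) {
    New-Datastore -VMHost $esx -Nfs -Name "NTNX-ctr1" -NfsHost 192.168.5.2 -Path "/ctr1"
}
# Confirm vCenter now sees the new datastore
Get-Datastore -Name "NTNX-ctr1"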




ESXi now sees the datastore:


esxcfg-nas -l
NTNX-ctr1 is /ctr1 from 192.168.5.2 mounted available



Script to create full clones with thick vdisks from 1Win7-Clone on the NFS datastore (1Win7-Clone has VMware Tools installed and the Windows 7 power plan set to never sleep, so the guest does not go into standby):


Connect-VIServer 10.2.8.59 -User administrator -Password ntnx

# 1. Get-View of the clone master
$vm = Get-VM 1Win7-Clone | Get-View

# 2. Place the clones in the same folder/datastore as the master
$cloneFolder = $vm.Parent
$cloneSpec = New-Object VMware.Vim.VirtualMachineCloneSpec
$cloneSpec.Location = New-Object VMware.Vim.VirtualMachineRelocateSpec

# 3. Allow sharing of disk backings (full copy, offloaded via NFS VAAI)
$cloneSpec.Location.DiskMoveType = [VMware.Vim.VirtualMachineRelocateDiskMoveOptions]::moveAllDiskBackingsAndAllowSharing

# 4. flat - create the copies as thick disks
$cloneSpec.Location.Transform = [VMware.Vim.VirtualMachineRelocateTransformation]::flat

# 5. Create Windows7-1 through Windows7-320 with the clone spec defined above
$global:testIterations = 320
for ($i = 1; $i -le $global:testIterations; $i++) {
    $cloneName = "Windows7-$i"
    $vm.CloneVM( $cloneFolder, $cloneName, $cloneSpec )
}


Explanation:
1. Get-View of our clone master.
2. Use the same folder/datastore as the clone master.
3. moveAllDiskBackingsAndAllowSharing - creates a full clone by copying all disks (but not snapshot metadata), from the root to the child-most disk, except for non-child-most disks previously copied to the target.
4. flat - causes the copies to be created as thick disks.
5. Loop from 1 to 320, creating Windows7-$i with the clone spec defined above.
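
While the loop runs, progress can also be watched from a second PowerCLI session. A rough sketch, assuming the clone operations show up in vCenter as CloneVM tasks and the clones are named Windows7-*:

# Count how many clones exist so far and list any clone tasks still running.
write-host "Clones created:" (Get-VM -Name "Windows7-*" | Measure-Object).Count
Get-Task | where {$_.Name -match "CloneVM" -and $_.State -eq "Running"} |
    Select-Object Name, State, PercentComplete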


The following vCenter screenshot shows clone creation with NFS VAAI in progress and the 320 VMs being created:








 


Maintenance:
#To Remove VM
Remove-VM Windows7-*

# To Power on
Start-VM Windows7-*

# To start VMs on a specific ESX server:

Get-VMHost <esx-host-ip> | Get-VM Windows7-* | where {$_.PowerState -eq "PoweredOff"} | Start-VM -RunAsync -Confirm:$false

Get-VMHost <esx-host-ip> | Get-VM Windows7-* | where {$_.PowerState -eq "Suspended"} | Start-VM
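
Powering on all 320 at once can create a boot storm. A minimal sketch that staggers the power-ons (the batch size of 40 and the 60-second pause are arbitrary assumptions):

# Hypothetical throttled power-on: start the VMs in batches to avoid a boot storm.
$batchSize = 40
$vms = @(Get-VM Windows7-* | where {$_.PowerState -eq "PoweredOff"})
for ($i = 0; $i -lt $vms.Count; $i += $batchSize) {
    $end = [Math]::Min($i + $batchSize, $vms.Count) - 1
    $vms[$i..$end] | Start-VM -RunAsync -Confirm:$false
    Start-Sleep -Seconds 60
}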



# Migrate VMs (DRS should do this when powering on):

$global:testIterations = 80
for ($i = 1; $i -le $global:testIterations; $i++) {
    Get-VM -Name Windows7-$i | Move-VM -Destination (Get-VMHost 10.2.8.51) -RunAsync
}

$global:testIterations = 240
for ($i = 161; $i -le $global:testIterations; $i++) {
    Get-VM -Name Windows7-$i | Move-VM -Destination (Get-VMHost 10.2.8.53) -RunAsync
}

$global:testIterations = 320
for ($i = 241; $i -le $global:testIterations; $i++) {
    Get-VM -Name Windows7-$i | Move-VM -Destination (Get-VMHost 10.2.8.54) -RunAsync
}
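
The loops above can also be collapsed into a single range-to-host map; a sketch using only the host IPs and VM ranges shown above:

# Hypothetical rewrite of the loops above as one range-to-host map.
$placement = @{
    "10.2.8.51" = 1..80
    "10.2.8.53" = 161..240
    "10.2.8.54" = 241..320
}
foreach ($hostIp in $placement.Keys) {
    $destination = Get-VMHost $hostIp
    foreach ($i in $placement[$hostIp]) {
        Get-VM -Name "Windows7-$i" | Move-VM -Destination $destination -RunAsync
    }
}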


# Get the IP from each VM to check whether it has booted (VMware Tools must be installed):
Get-VM NTNX* |where {$_.'PowerState' -eq "PoweredOn"}| %{
write-host $_.Guest.IPAddress[0]}
10.2.8.59
10.2.8.60
10.2.8.55
10.2.8.56
10.2.8.57
10.2.8.58


To get a count of VMs that have an IP address:

$global:count = 0
Get-VM Windows7-* | where {$_.PowerState -eq "PoweredOn"} | %{
    $ip = $_.Guest.IPAddress[0]
    if ($ip) {
        write-host $ip
        $global:count += 1
    }
}

write-host  "Count of IPs is " $global:count

<snippet>
 169.254.127.112

169.254.165.80
169.254.104.109
169.254.11.254
169.254.248.101
169.254.239.204
169.254.186.164
169.254.127.112
169.254.24.136
169.254.123.158
169.254.129.15
169.254.212.87
169.254.47.86
Count of VMs with an IP is 320 (monitor until all 320 are up).
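
To keep monitoring until all 320 guests report an address, a rough polling sketch built on the same check:

# Hypothetical polling loop: re-run the IP count every minute until all 320 report one.
do {
    $up = @(Get-VM Windows7-* | where {$_.PowerState -eq "PoweredOn" -and $_.Guest.IPAddress[0]}).Count
    write-host "VMs with an IP:" $up
    Start-Sleep -Seconds 60
} while ($up -lt 320)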

Tuesday, May 8, 2012

Bringing Hadoop closer to live data!

Of late, I have been reading about and listening to my colleagues talk about Hadoop, MapReduce, and Twitter's spouts and bolts!

The most important part of Hadoop, HDFS with MapReduce, is already being implemented on Nutanix. This lets us bring live data to the Hadoop cluster in read-only mode, accessing the same vdisks, rather than waiting for a server to dispatch the data nightly. Instead of batch MapReduce, we could run spouts and bolts to map-reduce continuously.

If we plan to run Hadoop jobs nightly, we could even have an adaptive, chameleon compute cluster that runs regular workloads (VDI, etc.) during the day and Hadoop at night. VMware has plenty of tooling, and PowerCLI commands could achieve this by powering VMs off or shifting resources to the Hadoop VMs, as sketched below.
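
A minimal sketch of that idea, with purely hypothetical VM names (VDI-* for the desktops, Hadoop-* for the Hadoop nodes):

# Hypothetical night-shift script: shut down the VDI pool, then power on the Hadoop nodes.
Connect-VIServer vcenter-ip -User administrator -Password ntnx
Get-VM VDI-* | where {$_.PowerState -eq "PoweredOn"} | Shutdown-VMGuest -Confirm:$false
Start-Sleep -Seconds 300   # give the guests time to shut down cleanly
Get-VM Hadoop-* | where {$_.PowerState -eq "PoweredOff"} | Start-VM -RunAsync -Confirm:$false
# A morning job would do the reverse.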

I am just so excited to work with Nutanix; there is so much more we can do by decoupling from centralized storage and moving to distributed storage.

We are just scratching the surface. My assignment is to read further and step into the future.

Friday, May 4, 2012

ESXi reboot reason:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1019238

Beyond SAN and FCOE

SAN Introduction:



When Ethernet ran at 10 and 100 Mb/s and direct-attached storage was restricting the growth and clustering of servers (Veritas clustering, Sun clustering), a new technology, Fibre Channel, came into play to network storage arrays and provide a reliable transport layer. With reliable Fibre Channel and 8b/10b encoding (providing elasticity and extra ordered sets such as IDLE/K28.5), SCSI CDBs were encapsulated in FC frames. Login mechanisms (FLOGI/PLOGI/PRLI/TPLS) together with FC zoning, port security, and LUN masking provided the access control, while FSPF/RCF/BF/DIA/principal switch election managed multiple SAN/FC switches.
Then came NPIV to reduce the pain of managing multiple domains, which brought the pain of one host flapping causing another host to fail (bad TCAM programming). Even after a few years of SAN administration, only a very select group of engineers understands this complex technology.

FCOE Introduction: 


While this revolution was going on, Gigabit Ethernet went through its own metamorphosis into 1G/10G/40G/100G. Engineers began questioning the relevance of FC, since we can have reliable GigE via lossless switches with much lower latency and PAUSE frames (to manage congestion). Multiple vendors came up with their own methods of encapsulating FC frames (CEE/DCE). Vendors started building CNA adapters, with little thought given to whether FC or Ethernet should get MSI or MSI-X. The only adapter with all of the functionality (SR-IOV/VNTAG/VIFs) is the Palo adapter (M81KR), and it works only with UCS. Two technologies, VEPA and VNTAG, compete on CNAs and are not interoperable; a specific FCoE switch is needed to support VEPA or VNTAG.

 http://infrastructureadventures.com/2010/12/08/io-virtualization-overview-cna-sr-iov-vn-tag-and-vepa/

I feel this is force-fitting FC into GigE to extend the life of FC, so that vendors can keep selling expensive storage arrays and expensive switches (FCoE/lossless/PFC).
But we don't need storage arrays or SAN switches: with the advent of powerful CPUs, larger hard drives, faster solid state devices, and PCIe storage, you can bring storage back to the server.
Storage Array:
A storage array is multiple disks formatted in a vendor-specific RAID, fronted by cache, with NFS/FC/CNA adapters connected through FC cables to an expensive SAN switch; the hosts also need expensive adapters (FC HBAs, GigE NICs, or CNAs).
Most array vendors take six months to a year to certify the latest HDDs, SSDs, adapters, and memory, and upgrading the storage array is a tedious process.

Nutanix Introduction:

Radical or not-so-radical approach: now that we have powerful CPUs, bigger network pipes (10G/40G), faster local spindles, solid state devices, and PCIe storage, Nutanix makes it easy to adopt newer technologies in this area without having to replace your storage array.
Whenever a new CPU, a faster-spindle HDD, or a new network adapter comes to market, we can simply swap it in and get better performance.

Nutanix relies on standard GigE infrastructure to make all of this work, without having to spend on separate storage processors, a storage array, and FC/FCoE infrastructure alongside the GigE infrastructure.

Instead of the hodgepodge approach of migrating a SAN from FC to FCoE with multiple (non)standards, this approach is future-safe and leapfrogs the current one.


Conclusion:
The big incumbent vendors don't want disruption; they want the slow process of changing one component at a time, so they have time to reinvent and have their customers buy over-provisioned switches, servers, storage arrays, and vendor-specific HDDs that will be end-of-life in a few years. Customers are left re-provisioning, upgrading in baby steps, migrating to new storage arrays and switches, learning new technologies, and spending training dollars, instead of being productive with a few basic moving parts.
The best of FCoE technology should still be carried forward:
VNTAG for creating VIFs (SR-IOV), lossless (or less-lossy) Ethernet, PFC, and SPMA/FPMA are great technologies that need to make it into next-generation Ethernet switches, but not VFC/FCoE/FIP.

It is time to build a futuristic datacenter with Nutanix Complete Cluster.