Friday, May 4, 2012

Beyond SAN and FCOE

SAN Introduction:

When Ethernet ran at 10 and 100 Mb/s and Direct Attached Storage was restricting the growth and clustering of servers (Veritas Clustering/Sun Clustering), a new technology, Fibre Channel, came into play to provide networking of storage arrays over a reliable transport layer. With reliable Fibre Channel and 8b/10b encoding (providing elasticity and special characters such as K28.5 for IDLE), SCSI CDBs were encapsulated in FC frames. Login mechanisms (FLOGI/PLOGI/PRLI/TPLS) together with FC zoning, port security, and LUN masking provided access control, while FSPF/RCF/BF/DIA/Principal Switch election managed fabrics of multiple SAN/FC switches.
Then came NPIV to reduce the pain of managing multiple domains, but it brought a new pain: one host flapping could cause another host to fail (bad TCAM programming). Even after years of SAN administration, only a select group of engineers understands this complex technology.
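The layered access control described above boils down to two filters applied in sequence: zoning decides which initiator and target ports may see each other, and LUN masking then restricts which LUNs a permitted initiator can address. A toy Python sketch (the WWPNs and names below are hypothetical; real fabrics enforce zoning in switch hardware and masking on the array controller):

```python
# Toy model of FC zoning + LUN masking (illustration only).

# Zoning: named sets of WWPNs allowed to communicate with each other.
zones = {
    "zone_db": {"10:00:00:00:c9:aa:aa:aa", "50:06:01:60:bb:bb:bb:bb"},
}

# LUN masking on the array: initiator WWPN -> LUNs it may address.
lun_masks = {
    "10:00:00:00:c9:aa:aa:aa": {0, 1},
}

def can_access(initiator_wwpn, target_wwpn, lun):
    """An initiator reaches a LUN only if zoning AND masking both allow it."""
    zoned = any(initiator_wwpn in z and target_wwpn in z
                for z in zones.values())
    masked_in = lun in lun_masks.get(initiator_wwpn, set())
    return zoned and masked_in

print(can_access("10:00:00:00:c9:aa:aa:aa", "50:06:01:60:bb:bb:bb:bb", 0))  # True
print(can_access("10:00:00:00:c9:aa:aa:aa", "50:06:01:60:bb:bb:bb:bb", 7))  # False
```

The point is that access control lives in two different boxes (switch and array), which is part of why SAN administration stays complex.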

FCOE Introduction: 

While this revolution was going on, Ethernet went through its own metamorphosis into 1G/10G/40G/100G. Engineers began questioning the relevance of FC, since reliable GigE could be built with lossless switches, much lower latency, and PAUSE frames to manage congestion. Multiple vendors came up with their own methods of encapsulating FC frames (CEE/DCE). Vendors started building CNA adapters, with little thought given to whether FC or Ethernet should get MSI or MSI-X. The only adapter with all the functionality (SR-IOV/VNTag/VIFs) is the Palo adapter (M81KR), and it works only with UCS. Two technologies, VEPA and VNTag, compete on CNAs, and they are not interoperable: a specific FCoE switch is needed to support VEPA or VNTag.
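As a rough sketch of what that encapsulation looks like on the wire: an FCoE frame is simply a full FC frame carried inside an Ethernet frame with EtherType 0x8906, with SOF/EOF delimiter code points replacing the FC ordered sets. A simplified Python sketch (omits the trailing FCS, minimum-size padding, and the FIP control protocol entirely):

```python
import struct

FCOE_ETHERTYPE = 0x8906   # FCoE EtherType per T11 FC-BB-5
SOF_I3, EOF_T = 0x2E, 0x42  # example SOF/EOF code points

def fcoe_encapsulate(dst_mac, src_mac, fc_frame):
    """Wrap a raw FC frame (header + payload + CRC) in an FCoE Ethernet frame.
    Simplified: no FCS, no padding to the Ethernet minimum frame size."""
    eth_hdr = dst_mac + src_mac + struct.pack("!H", FCOE_ETHERTYPE)
    fcoe_hdr = struct.pack("!13xB", SOF_I3)  # version + reserved bits, then SOF
    fcoe_trl = struct.pack("!B3x", EOF_T)    # EOF, then reserved bits
    return eth_hdr + fcoe_hdr + fc_frame + fcoe_trl

frame = fcoe_encapsulate(b"\x0e\xfc\x00\x01\x02\x03",
                         b"\x00\x25\xb5\x00\x00\x01",
                         b"\x00" * 28)
```

Note that the whole FC frame, CRC and all, rides as opaque payload: the Ethernet layer gains no visibility into FC, which is why lossless (PFC) behavior has to be bolted on underneath.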

I feel this is force-fitting FC into GigE to extend the life of FC, so that vendors can keep selling expensive storage arrays and expensive switches (FCoE/lossless/PFC).
But with the advent of powerful CPUs, larger hard drives, faster solid-state devices, and PCIe storage that bring storage back into the server, we no longer need storage arrays or SAN switches.
Storage Array:
A storage array is a set of disks formatted with vendor-specific RAID, fronted by cache, with NFS/FC/CNA adapters connected through FC cables to an expensive SAN switch; the hosts, in turn, need their own expensive adapters (FC HBAs, GigE NICs, or CNAs).
Most array vendors take six months to a year to certify the latest HDDs, SSDs, adapters, and memory, and have a tedious process for upgrading the storage array.
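The "vendor-specific RAID" mentioned above is, at its core, simple math. RAID 5, for instance, stripes data across disks and stores an XOR parity block, so any one failed disk can be rebuilt from the survivors. A minimal sketch of that idea:

```python
from functools import reduce

def xor_parity(blocks):
    """XOR byte-wise across same-sized blocks: the core of RAID 5 parity.
    A lost block is recovered by XOR-ing the surviving blocks with parity."""
    return bytes(reduce(lambda a, b: a ^ b, cols) for cols in zip(*blocks))

data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]   # three data "disks"
parity = xor_parity(data)

# Simulate losing the second disk and rebuilding it:
recovered = xor_parity([data[0], data[2], parity])
assert recovered == data[1]
```

The vendor-specific part is everything around this: stripe geometry, caching, rebuild scheduling, and the certification cycles described above.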

Nutanix Introduction:

Radical or not so radical: with powerful CPUs, bigger network pipes (10G/40G), and the advent of faster spindles (local drives), solid-state devices, and PCIe storage, Nutanix makes it easy to adopt newer technologies in this area without having to replace a storage array.
Whenever a new CPU, a faster-spindle HDD, or a new network adapter comes to market, we can simply swap it in and get better performance.

Nutanix depends on standard GigE infrastructure to make all of this work, without having to spend on separate storage processors, storage arrays, and FC/FCoE infrastructure on top of the GigE infrastructure.

Instead of the hodgepodge approach of migrating a SAN from FC to FCoE across multiple (non)standards, this approach is future-safe and leapfrogs the current one.

The big incumbent vendors don't want disruption; they prefer a slow process of changing one component at a time, so that they have time to reinvent and their customers keep buying over-provisioned switches/servers/storage arrays/vendor-specific HDDs that will be end-of-life in a few years. Customers get stuck reprovisioning, upgrading in baby steps, migrating to new storage arrays and switches, learning new technologies, and spending training dollars, instead of being productive with a few basic technology moving parts.
The best of FCoE technology should be carried forward into the future:
VNTag for creating VIFs (SR-IOV), lossless (or "less-loss") Ethernet, PFC, SPMA, and FPMA are awesome technologies that need to make it into next-generation Ethernet switches, but not VFC, FCoE, or FIP.
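FPMA, for example, is a genuinely elegant piece of the standard worth keeping: the fabric builds a node's MAC address by concatenating a 24-bit FC-MAP prefix (default 0x0EFC00) with the node's 24-bit FC_ID, so the MAC is derived rather than administered. A quick sketch:

```python
FC_MAP_DEFAULT = 0x0EFC00  # default FC-MAP prefix

def fpma_mac(fc_id, fc_map=FC_MAP_DEFAULT):
    """Fabric-Provided MAC Address: FC-MAP (upper 24 bits) || FC_ID (lower 24)."""
    mac = (fc_map << 24) | (fc_id & 0xFFFFFF)
    return ":".join(f"{(mac >> s) & 0xff:02x}" for s in range(40, -8, -8))

print(fpma_mac(0x010203))  # 0e:fc:00:01:02:03
```

Because the FC_ID is embedded in the MAC, the fabric can map Ethernet addresses back to fabric addresses with no extra lookup state.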

It is time to build a futuristic datacenter with Nutanix Complete Cluster.