Storage Wars–HCI edition

Permalink til innebygd bilde

There is alot of fuzz these days around hyperconverged, software defined storage etc.. especially since VMware announced VSAN 6.2 earlier this week, that trigge alot of good old brawling on social media. Since VMware was clearly stating that they are the marked leader in the HCI marked, if that is true or not I don’t know. So therefore I decided to write this post, just to clear up the confusion on what HCI actually is and what the different vendors are delivering in terms of features and how their  architecture differentiates. Just hopefully someone is as confused as I was in the beginning..

Now after a while now I’ve been working for this quite some time now, so in this post I have decided to focus on 4 different vendors in terms of features and what their architecture looks like.

  • VMware
  • Nutanix
  • Microsoft

PS: Things changes, features get updated, if something is wrong or missing let me know!

The term hyper-converged actually comes from the term converged infrastructure, where vendors started to provide a pre-configured bundle of software and hardware into a single chassis. This was to try minimize compability issues that we would have within the traditional way we did infrastructure, and of course make it easy to setup a new fabric.  So within hyperconverged we integrate these components even further so that they cannot be broken down into seperate components. So by using software-defined storage it allows us to deliver high-performance, highly available storage capacity to our infrastructure without the need of particular/special hardware. So instead of having the traditional three-tier architecture, which was the common case in the converged systems. We have servers where we combine the compute and storage, then we have software on the top which aggreagates the storage between mulitple nodes to create a cluster.

So in conclusion of this part, you cannot get hyperconverged without using some sort of software-defined storage solution.

Now back to the vendors. We have of course Microsoft and VMware which are still doing a tug o war with their relases, but their software-defined storage option has one thing in common. It is in the kernel. Now VMware was the first of the two to release a fully hyperconverged solution and as of today they released version 6.2 which added alot of new features. On the other hand Microsoft is playing this safe, and with Windows Server 2016 they are releasing a new version of Storage Spaces which now has an hyperconverged deployment option. Now belive it or not, Microsoft has had alot of success with the Storage Spaces feature, since it has been a pretty cheap setup and included that with some large needed improvements to the SMB protocol. So therefore let us focus on how VSAN 6.2 and Windows Server 2016 Storage Spaces Direct which both have “in-kernel” ways of delivery HCI.

VMware VSAN 6.2

Deployment types: Hybrid (Using SSD and spinning disks) or All-flash
Protocol support: Uses its own proprietary protocol within the cluster
License required: License either Hybrid or All-Flash
Supported workloads: Virtual machine storage
Hypervisor support: ESXi
Minimum nodes in a cluster: 2 (With a third node as witness)https://blogs.vmware.com/virtualblocks/2015/09/11/vmware-virtual-san-robo-edition/)
Hardware supported: VSAN Ready Nodes, EVO:RAIL and Build Your Own based on the HCL –> http://www.vmware.com/resources/compatibility/search.php?deviceCategory=vsan
Disk requirements: Atleast one SSD and one HDD
Deduplication support: Yes, starting from 6.2 near-line (only within an all flash array only
Compression support: Yes, starting from 6.2 near-line (only within an all flash array only)
Resilliency factor: Resiliency,  Fault Tolerance Method (FTM) Raid-1 Mirroring. Raid 5/6 are in the 6.2 release
Disk scrubbing: Yes, as of 6.2 release.
Storage QoS: Yes, as of 6.2 release. (Based upon a 32KB block size ratio) can be attached to virtual machines or datastores.
Read Cache: 0,4% of Host memory is used for read cache, where the VMs are located.
Data Locality: Sort of, it does not do client-side local read cache.
Network infrastructure needed: 1Gb or 10Gb ethernet network. (10GB only for all-flash) multicast enabled
Maximum number of nodes: 64 nodes pr cluster

Things that are important to remember is that VMware VSAN stores data within an object. So for instance if we are to create a virtual machine on a Virtual SAN datastore, VSAN would create an object for each virtual disk, snapshot and so on. It also creates a container object that stores all the metadata files of the virtual machine. So the availability factor can be configured pr object.  These objects are stored on one or multiple magnetic disks and hosts, and VSAN can access these objects remotely both read and write wise. VSAN does not have the concept of a pure data locality model like others do, a machine can be running on one host but the objects be stored on another, this gives a consistent performance if we for instance were to migrate a virtual machine from one host to another. VSAN has the ability to read for multiple mirror copies at the same time to distribute the IO equally.

Also VSAN has the concept of stripe width, since in many cases we may need to stripe and object across multiple disks. the largest component size in VSAN is 255 GB, so if we have an VMDK which is 1 TB, VSAN needs to stripe that VDMK file out to 4 components. The maximum strip width is 12. The SSD within VSAN act as an read cache and for a write buffer.

image

Windows Server 2016 Storage Spaces Direct*

*Still only in Tech Preview 4

Deployment types: Hybrid (Using SSD and spnning disks) or All-flash
Protocol support: SMB 3
License required: Windows Server 2016 Datacenter
Supported workloads: Virtual machine storage, SQL database, General purpose fileserver support
Hypervisor support: Hyper-V
Hardware supported: Storage Spaces HCL (Not published yet for Windows Server 2016)
Deduplication support: Yes but still only limited support workloads (VDI etc)
Compression support: No
Minimum nodes in a cluster: 2 *And using some form a witness to maintain quorom)
Resilliency factor: Two way mirror, three way mirror and Dual Parity
Disk scrubbing: Yes, part of chkdisk
Storage QoS: Yes, can be attached to virtual machines or shares
Read Cache: CSV Read cache (Which is part of the RAM on the host) also depending on deployment type. In Hybrid mode, SSD is READ & WRITE cache, therefore SSD is not used for persistent storage.
Data Locality: No
Network infrastructure needed: RDMA enabled network adapters, including iWARP and RoCE
Maximum number of nodes: 12 nodes pr cluster as of TP4
You can read more about under the hood about Storage Spaces Direct here –> http://blogs.technet.com/b/clausjor/archive/2015/11/19/storage-spaces-direct-under-the-hood-with-the-software-storage-bus.aspx
Hardware info: http://blogs.technet.com/b/clausjor/archive/2015/11/23/hardware-options-for-evaluating-storage-spaces-direct-in-technical-preview-4.aspx

Important thing to remember here is that we have an CSV volume which is created on top of a SMB file share. Using Storage Spaces Direct, Microsoft leverages mulitple features of the SMB 3 protocol using SMB Direct and SMB Multichannel. Another thing to think about is that since there is no form for data locality here, Microsoft is dependant on using RDMA based technology to with low-overhead read and write data from another host in the network. With has much less overhead then TCP based networks. Unlike VMware, Microsoft uses extents to spread data across nodes these are by default on 1 GB each.

image

Now in terms of difference between these two, well first of its the way the manage reads and writes of their objects. VMware has a distributed read cache, while on the other hand Microsoft requires RDMA but allows to read/write with very low overhead and latency from different hosts. Microsoft does not have any virtual machine policies that define how resillient the virtual machine is, but this is placed on the share (which is virtual disk) which defines what type of redundacy level it is. Now there are still things that are still not documentet on the Storage Spaces Direct solution.

So let us take a closer look at Nutanix.

Nutanix

Deployment types: Hybrid (Using SSD and spinning disks) or All-flash
Protocol support: SMB 3, NFS, iSCSI
Editions: http://www.nutanix.com/products/software-editions/
Supported workloads: Virtual machine storage, general purpose file service* (Tech Preview n
Hypervisor support: ESX, Hyper-V, Acropolis (Cent OS KVM custom build)
Hardware supported: Nutanix uses Supermicro general purpose hardware for their own models, but they have an OEM deal with Dell and Lenovo
Deduplication support: Yes, both Inline and post clusted based
Compression support: Yes both Inline and post process
Resilliency factor: RF2, RF3 and Erasure-Coding
Storage QoS: No, equal share
Read Cache: Unified Cache (Consists of RAM and SSD from the CVM)
Data Locality: Yes, read and writes are aimed at running on the local host which the compute resources is running on.
Network infrastructure needed: 1Gb or 10Gb ethernet network. (10GB only for all-flash)
Maximum number of nodes: ?? (Not sure if there are any max numbers here.

The objects on Nutanix are broken down to vDisks, which are composed of multiple extents.

Source: http://nutanixbible.com/

Unlike Microsoft, Nutanix operates with an extent size of 1MB, and the IO path is in most cases locally on the physical host.

image

When a virtual machine running on a virtualization platform does and write operations it will write to a part of the SSD on the physical machine called the OpLog (Depending on the resilliency factor, OpLog will then replicate the data to other node Oplog to achive the replication factor that is defined in the cluster. Reads are served from the Unified Cache which consists of RAM and SSD from the Controller VM which it runs on. If the data is not available on the cache it can get it from the extent store, or from another node in the cluster.

Source: http://nutanixbible.com/

Now all three vendors all have different ways to achive this. In case of Vmware and Microsoft which both have their solution in-kernel, Microsoft focused on using RDMA based technology to allow for low latency, high bandwidth backbone (which might work in the advantage, when doing alot of reads from other hosts in the network (when the traffic is becoming in-balanced)

Now VMware and Nutanix on the otherhand only require regular ethernet 1/10 GB network. Now Nutanix uses data locality and with new hardware becoming faster and faster that might work in their advantage since the internal data buses on a host can then generate more & more troughput, the issues that might occur which VMware published in their VSAN whitepaper on why they didn’t create VSAN with data locality in mind is when doing alot of vMotion which would then require alot of data to be moved between the different hosts to maintain data locality again.

So what is the right call? don’t know, but boy these are fun times to be in IT!

NB: Thanks to Christian Mohn for some clarity on VSAN! (vNinja.net)

#microsoft, #nutanix, #storage-spaces-direct, #vmware, #vsan