Software-defined storage options for Hyper-V

As I see Hyper-V gaining more and more traction, I also see that we need better storage solutions around it. Microsoft has Storage Spaces, which arrived in Windows Server 2012 alongside features like deduplication. The problem with the deduplication feature is that it was mostly aimed at VDI environments (for Hyper-V) and not traditional servers, and it was limited to one thread; in Windows Server 2016 this is expanded with support for backup workloads. Storage Spaces was also enhanced with tiering in 2012 R2, which gives us the ability to add SSD disks and move data between tiers in a Storage Spaces setup, and also to use a write-back cache for random writes.

In the upcoming Windows Server 2016, we know that we will have the option to use Storage Spaces Direct (meaning locally attached disks on the server nodes pooled across the cluster, just like VSAN and so on), which can act either as a Scale-Out File Server cluster or as a hyper-converged solution combining the SMB and Hyper-V roles on the same nodes. This gives an architectural advantage, since it allows us to scale much more simply (up to the number of nodes supported, which is set to 32).

Microsoft also introduced the SMB 3.0 protocol, which allows for scale-out communications with features such as:

  • SMB Multichannel (which allows the use of multiple network connections at the same time)
  • SMB Direct (which gives low-latency connections over RDMA)
  • Support for SQL Server and Hyper-V over SMB

So SMB is good for fault tolerance and high throughput, and with RDMA it gives us low-latency connections. But it is still limited by the disks and controllers behind the SMB file servers, and SMB over regular network cards is still TCP (which has about 5-8% overhead if not configured properly), which in most cases will perform slower than virtual machines stored locally on individual hosts. So what about other options, and using memory as a tier?

Here are some numbers to chew on (from Jeff Dean) about speed, where memory is part of the equation.

L1 cache reference                             0.5 ns
Branch mispredict                              5 ns
L2 cache reference                             7 ns
Mutex lock/unlock                            100 ns (25)
Main memory reference                        100 ns
Compress 1K bytes with Zippy              10,000 ns (3,000)
Send 2K bytes over 1 Gbps network         20,000 ns
Read 1 MB sequentially from memory       250,000 ns
Round trip within same datacenter        500,000 ns
Disk seek                             10,000,000 ns
Read 1 MB sequentially from network   10,000,000 ns
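To put those numbers in perspective, here is a small sketch that turns a few rows of the table into ratios. The figures are copied from the list above; the dictionary and the `ratio` helper are just names I made up for illustration.

```python
# Latency figures copied from the table above, in nanoseconds.
LATENCY_NS = {
    "main memory reference": 100,
    "read 1 MB sequentially from memory": 250_000,
    "round trip within same datacenter": 500_000,
    "disk seek": 10_000_000,
    "read 1 MB sequentially from network": 10_000_000,
}


def ratio(slow: str, fast: str) -> float:
    """Return how many times slower `slow` is than `fast`."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


# A disk seek compared with a single main memory reference.
print(ratio("disk seek", "main memory reference"))

# Reading 1 MB over the network compared with reading it from memory.
print(ratio("read 1 MB sequentially from network",
            "read 1 MB sequentially from memory"))
```

Going by these figures, a disk seek costs as much as 100,000 memory references, and reading 1 MB over the network is about 40 times slower than reading it from memory, which is why memory as a tier is so interesting.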

Microsoft also introduced something called CSV cache (available from Windows Server 2012), which allows us to allocate system memory as a write-through cache. The CSV cache provides caching of read-only unbuffered I/O, which in essence makes it work well with Hyper-V clusters and Scale-Out File Servers using CSV.
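To illustrate the general idea of a write-through cache that only speeds up reads, here is a toy sketch. This is not how CSV cache is implemented; the class name, the dictionary standing in for the disk, and the LRU eviction policy are all my own assumptions for the sake of the example.

```python
from collections import OrderedDict


class WriteThroughCache:
    """Toy cache: reads are served from memory when possible,
    writes always go straight to the backing store."""

    def __init__(self, backing: dict, capacity: int):
        self.backing = backing      # stands in for the disk (assumption)
        self.capacity = capacity    # max number of cached blocks
        self.cache = OrderedDict()  # block number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)  # mark as most recently used
            self.hits += 1
            return self.cache[block]
        self.misses += 1
        data = self.backing[block]         # slow path: go to the disk
        self.cache[block] = data
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return data

    def write(self, block, data):
        self.backing[block] = data         # write-through: disk first
        if block in self.cache:
            self.cache[block] = data       # keep the cached copy consistent


disk = {0: b"a", 1: b"b", 2: b"c"}
cache = WriteThroughCache(disk, capacity=2)
cache.read(0)          # miss, fetched from disk
cache.read(0)          # hit, served from memory
cache.write(0, b"z")   # lands on disk immediately
print(disk[0], cache.read(0), cache.hits, cache.misses)
```

The point of the sketch is that writes are never held back in memory, so nothing is lost if the node goes down, while repeated reads of the same blocks stop hitting the disk.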

The problem with CSV cache is that it does not work with:

  • Tiered Storage Space with heat map tracking enabled
  • Deduplicated files using in-box Windows Server Data Deduplication feature (Note:  Data will instead be cached by the dedup cache) 
  • ReFS volume with integrity streams enabled (Note:  NTFS is the recommended file system as the backend for virtual machine VHDs in production deployments)

This means that we cannot get the best of both worlds, where we could combine memory, SSD, and HDD in the same storage pool.

Another thing is that Microsoft does not offer inline dedup for storage traffic; their dedup engine runs as a background task (post-process).
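To make the post-process idea concrete, here is a minimal sketch of a volume where writes land in full and a separate optimization pass deduplicates them later. It is only an illustration: the `Volume` class, the fixed 4 KB blocks, and the use of SHA-256 are my assumptions, not a description of how the Windows dedup engine chunks or hashes data.

```python
import hashlib


class Volume:
    """Toy volume with post-process deduplication."""

    def __init__(self):
        self.raw = {}     # block number -> data not yet optimized
        self.refs = {}    # block number -> chunk hash, after optimization
        self.chunks = {}  # chunk hash -> single stored copy of the data

    def write(self, block_no: int, data: bytes) -> None:
        # Post-process: the full data is stored as-is, no hashing on the
        # write path. (Cleanup of unreferenced chunks is left out.)
        self.refs.pop(block_no, None)
        self.raw[block_no] = data

    def optimize(self) -> None:
        # Background job: hash each raw block and keep one copy per hash.
        for block_no, data in list(self.raw.items()):
            digest = hashlib.sha256(data).hexdigest()
            self.chunks.setdefault(digest, data)
            self.refs[block_no] = digest
            del self.raw[block_no]

    def read(self, block_no: int) -> bytes:
        if block_no in self.raw:
            return self.raw[block_no]
        return self.chunks[self.refs[block_no]]

    def stored_bytes(self) -> int:
        return (sum(len(d) for d in self.raw.values())
                + sum(len(d) for d in self.chunks.values()))


vol = Volume()
vol.write(0, b"x" * 4096)
vol.write(1, b"x" * 4096)   # duplicate of block 0
vol.write(2, b"y" * 4096)
print(vol.stored_bytes())   # full size until the background job runs
vol.optimize()
print(vol.stored_bytes())   # duplicates collapsed after the fact
```

An inline engine would do the hashing inside `write` instead, so the duplicate never reaches the disk in the first place. With post-process, the capacity has to be there up front and the savings only show up after the job has run.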

With Windows Server 2016, I'm seeing that Microsoft is moving towards a feature set which gives their customers the basics of what they need in the software-defined storage space:

  • Hyper-converged (Storage Spaces Direct)
  • Tiering capabilities
  • Enhanced deduplication
  • High throughput on SMB
  • Low cost

So for those who require more performance, features and so on for Hyper-V, what options are there?

For VMware there is already a long list of vendors that deliver storage optimization / SDS / HCI solutions:

  • Atlantis
  • PernixData
  • Nexenta
  • Nutanix
  • SimpliVity
  • VSAN
  • DataCore

Both Atlantis and SimpliVity have stated that they will support Hyper-V “soon”. Atlantis does have support for Hyper-V in their ILIO product, but not in USX.

As of now, only Nutanix and DataCore have full support for Hyper-V and SMB 3.0. Both of them offer more flexibility in terms of features, and better performance through the use of memory as a tier, which is just one of the basics. Tune in as I explore these features throughout the next blog posts and show how they differ from the built-in features from Microsoft.

NOTE: The vendors in the list are the ones I know about. I didn't do a very thorough check, so if someone knows about anyone else, please let me know.

#datacore, #hyper-v-storage-spaces, #nutanix