Software defined storage and delivering performance

I had no idea what kind of title I should use for this post, since this is more about to talk about different solutions which I find interesting for the time beeing.

The last couple of years have shown a huge growth in both converged solutions and software defined X solutions (Where the X can stand for different types of hardware layers, such as Storage, Networking etc)

With this huge growth, there are alot of new “player in the field” which are in this space, this post is more to show some of these new players and what their capabilities are, and most importantly where they fit in. Now I work mostly with Citrix/Microsoft products and as such there is often a discussion of VDI(meaning stateless/persistent/rdsh/remote app functionality)

and a couple of years ago when deploying a VDI solution you needed to have a clustered virtual infrastructure running on a SAN, and the VMs where constricted to the troughput of the SAN.

Now traditional SAN’s mostly run with spindel drives since they are cheap, and has huge storage spaces. For instace a PS6110E Array http://www.dell.com/us/business/p/equallogic-ps6110e/pd Has the ability to house up to 24x 3,5” 7,200 RPM disks.

Which can then be upwards to 96TB of data. Now if you think about it, regular spindel disks have about roughly 120 IOPS (Depending on buffers, latency and spindels) and we should have a kind of RAID set running on the array for redundancy across disks as well. Using 24x drivers with RAID 6 and double parity (not really a good example but just to prove a point) gives us a total IOPS of 2380, which is lower then my SSD drive in my laptop. Now of course most arrays come with buffers and caches in different forms and flavors so my calculation is not 100% accurate. Another issue with using a regular SAN deployment is that you are dependant on having a solid networking infrastructure and if you have some latency there as well it affects the speed of the virtual machines. So in summary

* regular SAN’s are built for storage space and not for speed
* SAN’s also in most cases need their own backend networking infrastructure

And based upon these two “issues” many new companies have their starting grounds. One thing I need to cover first is that both Microsoft and VMware have both created their own way to deal with these issues. First Microsoft has created a solution with Storage Spaces with SMB 3.0. Storage Spaces is a kind of software raid solution running on top of the operating system and with features such as deduplication and storage tiering which allows data to be moved from fast SSD’s to regular HDD depending on if the data is hot or not. Storage spaces can either be using JBOD SAS or internal disks depending on the setup you want.  And with using SMB 3.0 we have features such as multichannel, RDMA. Both of these solutions makes it easier for us to build our own “SAN” using our regular networking infrastructure. But note that this still requires we have a solid networking infrastructure, but this allows us to create a low cost SAN with a solid performance.

Vmware has choosen a different approach with the VSAN technology. Instead of having the storage layer on the “other” side of the network, they built the storage layer right into the hypervisor.

Meaning that the storage layer is on the physical machine running the hypervisor meaning that we don’t have to think about the network for the virtual machines performance (even thou it is important to have a good networking infrastructure for the VM’s to replicate across different hosts for availability)

Now with VSAN, you need to fullfill some requirements in order to get started, since this solution runs locally on each server you need for instance to have a SSD drive for just the caching part of it, you can read more about the requirements here –> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2058424

So its fun to see that for one,
* Microsoft still has the storage layer outside of the host but dramatically improves the networking protocol and add storage features on the file server.
* VMware moves the storage layer ontop of the hypervisor to move the data closer to the compute roles.

Now based on these ideas there are multiple different vendors which in essence bases their solution on the same.

First of we have Atlantis ILIO http://www.atlantiscomputing.com/products/, which is a virtual applicance which runs on top of the hypervisor. Now I’ve written about Atlantis before  https://msandbu.wordpress.com/2013/05/02/atlantis-ilio-2/ but in essence what it does is create a RAM disk on each host, and has the ability to use the SAN for persistent data (of course after the data has been compressed and deduped leaving a very small footprint) Now this solution allows virtual machines to run completely in RAM meaning that each VM has access to huge amounts of IOPS. So Atlantis also runs ontop of each hypervisor so it runs to close to the compute layer as possible and is not dependant on having high-end SAN infrastructure for persistence.

Atlantis has also recently released a new product called USX which is a more software-defined storage solution which allows to create pools of storage containing both local drives and or SAN/NAS (and not just a place to dump persistent data for VDI)

Secondly we have Nutanix, which unlike the others is not a regular software based approach, they deliver a hardware+software platform http://www.nutanix.com/the-nutanix-solution/architecture/#nav which has a kind of lego based approach, where you buy a node and compute and storage are locally and you can add more nodes to scale upwards. With Nutanix there are controller VM’s running on each node which are used for redundancy and availability. So in essence Nutanix have a solution which resembles alot of VSAN since you have the storage locally to the hypervisor and you have logic which is used for redundancy/availability.

And we also have PernixData which has their FVP product, which caches and accelerates reads & writes to the backend storage. Writes and reads are stored on the aggregated cache (which consists of either a flash drive such as Fusion-IO or SSD drives locally on each node) which allows IO traffic to be removed from the backend SAN.

image

 

Now there are also a bunch of other vendors, which I will cover in time. Gunnar Berger from Gartner also made a blogpost, showing the cost of VDI on different storage vendors http://blogs.gartner.com/gunnar-berger/the-real-cost-of-vdi-storage/ But most importantly this post is to give a bit awareness of some of the different products and vendors out there which allows you to think differently. You don’t always need to invest in a new SAN or buy expensive hardware to get the performance needed. There is a bunch of cool products out there just waiting for a test-drive Smilefjes

#atlantis, #dell-dvs, #nexenta, #nutanix, #pernix-data