So for the last couple of days I've been digging into some of the data availability features in the Nutanix platform. With Nutanix we can set up snapshot schedules locally on the cluster. We can also define snapshots to a remote site, which can be either a physical cluster or the cloud (AWS supports backup and DR, while Azure supports backup).
Now there are some important rules to remember. Nutanix can take local snapshots, but these cannot be shipped out of the cluster. If we want to live by the 3-2-1 rule (3 copies, 2 media and 1 off-site), we should still leverage some sort of backup solution.
Now before I go ahead and describe the setup I want to share some observations. If you are using AHV on Nutanix, Commvault is the only vendor that supports direct backup of AHV as of now. If you are using Hyper-V or VMware ESXi I would highly recommend Veeam! Another observation applies if you are using Hyper-V on Nutanix. By default Nutanix exposes an SMB 3.0 share to Hyper-V, and the problem with SMB 3.0 is that Microsoft themselves do not have a built-in changed block tracking (CBT) mechanism. Veeam has built a filter driver that fixes this for Microsoft-based SMB shares, but it does not work on non-Microsoft SMB shares like the one Nutanix uses. Therefore you will not get the CBT feature when backing up from SMB shares on Nutanix (which will in all likelihood increase your backup window). It is important to remember, though, that this IS going to be fixed in Windows Server 2016. Still, it is embarrassing that Microsoft didn't have this feature in 2012 R2.
Now back to the Nutanix part. There are some concepts that you need to be aware of in Nutanix. First off, they use the term protection domain, which is a grouping of one or more virtual machines. Within a protection domain we can have one or more consistency groups, each of which contains one or more virtual machines. After defining a protection domain we can create one or more schedules, which can be pointed at the local cluster and at a remote site.
This Visio drawing should summarize what I just wrote a lot better.
Now there are some things we need to be aware of. A protection domain can as of now only contain 50 virtual machines, and a virtual machine can only be part of one protection domain. By default, the machines in a consistency group are snapshotted crash-consistently (meaning that all data is captured at exactly the same time). This is the most common backup technique in use today and is sufficient for applications that do not rely on a database.
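To make the two rules above concrete, here is a minimal sketch of how they could be checked before building a configuration. Nothing in it is a Nutanix API: the `ProtectionDomain` class, the `validate` function and the dictionary layout are hypothetical names of my own, and the limit of 50 is simply the number quoted above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Limit quoted in this post; treat it as an assumption for other releases.
MAX_VMS_PER_PROTECTION_DOMAIN = 50


@dataclass
class ProtectionDomain:
    """Hypothetical model: a name plus consistency groups holding VM names."""
    name: str
    consistency_groups: dict[str, list[str]] = field(default_factory=dict)

    def vms(self) -> list[str]:
        # Flatten all consistency groups into one list of VM names.
        return [vm for group in self.consistency_groups.values() for vm in group]


def validate(domains: list[ProtectionDomain]) -> list[str]:
    """Return human-readable rule violations (an empty list means OK)."""
    problems = []
    owner = {}  # VM name -> name of the protection domain that claimed it first
    for pd in domains:
        vms = pd.vms()
        if len(vms) > MAX_VMS_PER_PROTECTION_DOMAIN:
            problems.append(
                f"{pd.name}: {len(vms)} VMs exceeds limit of "
                f"{MAX_VMS_PER_PROTECTION_DOMAIN}"
            )
        for vm in vms:
            if vm in owner and owner[vm] != pd.name:
                problems.append(f"{vm}: in both {owner[vm]} and {pd.name}")
            owner.setdefault(vm, pd.name)
    return problems


if __name__ == "__main__":
    pd_sql = ProtectionDomain("pd-sql", {"cg1": ["sql01"], "cg2": ["web01"]})
    pd_web = ProtectionDomain("pd-web", {"cg1": ["web01"]})
    for problem in validate([pd_sql, pd_web]):
        print(problem)
```

Running a check like this against a planned layout is cheap, and it catches the "same VM in two protection domains" mistake before you start clicking around in the console.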
Now the consistency group is a subset of the entities in a protection domain. These entities are treated as a single group and backed up collectively in one snapshot. Think about this in terms of the example I have above: when Nutanix takes a snapshot, from a storage perspective it runs through the groups in serial order
Snapshot (Consistency Group 1), then
Snapshot (Consistency Group 2), then
Snapshot (Consistency Group 3)
and then the schedule determines how many retention points we should keep stored locally.
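The serial order above can be sketched in a few lines. This is only an illustration of the ordering, not Nutanix code: `snapshot_protection_domain` is a made-up function, and the counter stands in for a real point-in-time. The idea is that all VMs in one consistency group share a single point in time, and the groups are handled one after another.

```python
from itertools import count


def snapshot_protection_domain(consistency_groups, clock=None):
    """Snapshot each consistency group in turn.

    consistency_groups: dict of group name -> list of VM names (hypothetical).
    clock: iterator yielding point-in-time markers; a simple counter by default.
    """
    if clock is None:
        clock = count(1)  # stand-in for a real timestamp source
    taken = []
    # Serial order: one group after another, in the order they were defined.
    for group_name, vms in consistency_groups.items():
        point_in_time = next(clock)
        # Every VM in the group is captured at the same point in time.
        taken.append((group_name, point_in_time, tuple(vms)))
    return taken


if __name__ == "__main__":
    groups = {"CG1": ["vm1", "vm2"], "CG2": ["vm3"], "CG3": ["vm4"]}
    for name, when, vms in snapshot_protection_domain(groups):
        print(name, when, vms)
```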
NOTE: Having multiple virtual machines in a consistency group is only supported for VMware ESXi and AHV. It is not supported for Hyper-V; in that case each virtual machine needs to be in its own consistency group.
Now the next option we have when defining snapshots is application-consistent snapshots, which is the common case when dealing with SQL databases, Exchange and so on.
Dealing with application-consistent snapshots also involves working with the VSS providers inside the guest VMs. In that case Nutanix will communicate with the guest VM tools through the hypervisor and then trigger a shadow copy.
This option is only available for ESXi and AHV, not Hyper-V. If you want application-consistent backups you should install the Nutanix guest tools on the VMs. For ESXi, if the Nutanix guest tools are not installed it will leverage the VMware guest tools. For AHV deployments it requires the Nutanix guest tools, or else you will not get an application-consistent snapshot.
Also, if you want to have this feature enabled for a consistency group, that consistency group can only contain one virtual machine.
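The prerequisites above boil down to a small decision function. The sketch below is my own summary of the rules in this post, not something taken from Nutanix documentation or tooling; the function name and its parameters are hypothetical.

```python
def can_take_app_consistent_snapshot(
    hypervisor: str,
    nutanix_guest_tools: bool,
    vmware_tools: bool = False,
    vms_in_group: int = 1,
) -> bool:
    """Return True if the rules described in this post allow an
    application-consistent snapshot for a VM."""
    # The consistency group may only contain a single VM for this feature.
    if vms_in_group != 1:
        return False
    hv = hypervisor.lower()
    if hv == "ahv":
        # AHV requires the Nutanix guest tools.
        return nutanix_guest_tools
    if hv == "esxi":
        # ESXi falls back to the VMware guest tools if NGT is missing.
        return nutanix_guest_tools or vmware_tools
    # Hyper-V (and anything else) is not supported for this option.
    return False


if __name__ == "__main__":
    print(can_take_app_consistent_snapshot("AHV", nutanix_guest_tools=True))
    print(can_take_app_consistent_snapshot("ESXi", False, vmware_tools=True))
    print(can_take_app_consistent_snapshot("Hyper-V", nutanix_guest_tools=True))
```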
Another thing you need to think about is that we need to enable application-consistent snapshots on the schedule as well. This is because in some cases we might not want an application-consistent backup, since the SQL database or other solution might be too busy to be able to trigger a VSS copy. So if we enable application-consistent snapshots on the consistency group but not on the schedule we define, the end result is going to be a non-application-consistent snapshot.
This table shows the end results depending on what we choose.
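As a rough sketch of the rule described above: the snapshot is only application-consistent when the option is enabled on both the consistency group and the schedule. The text only spells out the "group yes, schedule no" case; treating every other mismatch as crash-consistent is my assumption, and `snapshot_consistency` is a hypothetical helper, not a Nutanix call.

```python
def snapshot_consistency(group_app_consistent: bool,
                         schedule_app_consistent: bool) -> str:
    """Return the kind of snapshot we would expect to end up with."""
    if group_app_consistent and schedule_app_consistent:
        return "application-consistent"
    # Assumption: any other combination falls back to the default,
    # which is a crash-consistent snapshot.
    return "crash-consistent"


if __name__ == "__main__":
    # Print all four combinations of the two settings.
    for on_group in (True, False):
        for on_schedule in (True, False):
            result = snapshot_consistency(on_group, on_schedule)
            print(f"group={on_group!s:5} schedule={on_schedule!s:5} -> {result}")
```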
Now when we define retention points in the schedule, we define how long a snapshot should be retained. If we have a schedule that takes a snapshot each night at 02:00 and we store two retention points locally, it means we have the option to restore a virtual machine back two days. When the schedule is triggered the third time, the first snapshot is removed from the cluster.
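The retention behaviour in that example is easy to simulate. The sketch below just models "keep the newest N snapshots" with a bounded queue; `run_schedule` is a hypothetical name and the start date is an arbitrary value picked for the example.

```python
from collections import deque
from datetime import datetime, timedelta


def run_schedule(first_run: datetime, runs: int, retention_points: int):
    """Simulate a nightly schedule and return the snapshots still retained."""
    # A deque with maxlen drops the oldest entry when a new one is appended.
    retained = deque(maxlen=retention_points)
    for night in range(runs):
        retained.append(first_run + timedelta(days=night))
    return list(retained)


if __name__ == "__main__":
    start = datetime(2016, 1, 1, 2, 0)  # arbitrary first run at 02:00
    for snapshot_time in run_schedule(start, runs=3, retention_points=2):
        print(snapshot_time.isoformat())
```

With two retention points and three runs, only the snapshots from the second and third night remain, which matches the description above.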
Another important thing to remember is that a snapshot is still attached to the original vDisk. When a snapshot is taken, the base vDisk is marked immutable and a new vDisk is created as read/write; the base one is then only used for reads.
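Here is a toy model of that behaviour, only to illustrate the idea of an immutable base with a writable child on top. It is a conceptual sketch and says nothing about how Nutanix actually stores data or metadata; the `VDisk` class and `take_snapshot` function are hypothetical.

```python
class VDisk:
    """Toy vDisk: a dict of blocks plus an optional parent to read through."""

    def __init__(self, parent=None):
        self.parent = parent
        self.blocks = {}
        self.immutable = False

    def write(self, block_no, data):
        if self.immutable:
            raise PermissionError("vDisk is immutable (read-only)")
        self.blocks[block_no] = data

    def read(self, block_no):
        # Look in this vDisk first, then walk up the chain of parents.
        disk = self
        while disk is not None:
            if block_no in disk.blocks:
                return disk.blocks[block_no]
            disk = disk.parent
        return None


def take_snapshot(active):
    """Freeze the active vDisk and return a new read/write child."""
    active.immutable = True
    return VDisk(parent=active)


if __name__ == "__main__":
    base = VDisk()
    base.write(0, "original")
    active = take_snapshot(base)  # base is now the snapshot
    active.write(0, "changed")    # new writes land in the child only
    print(base.read(0), active.read(0))
```

The snapshot itself costs almost nothing at creation time in this model, because no data is copied; the child simply reads through to the frozen base for anything it has not overwritten.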
It is also worth remembering that with the local snapshot feature we can leverage file-level restore using the Nutanix guest tools, see https://msandbu.wordpress.com/2015/10/22/enabling-file-level-restore-on-nutanix-in-nos-4-5-1/