Putting ThinWire and Framehawk to the test!

Framehawk and Thinwire – It’s all about the numbers

Recently Mikael (@mikael_modin) and I attended a Citrix User Group conference in Norway, where Mikael held a session on when and when not to use Framehawk. You can read his entire blog post here: http://bit.ly/1PV3104. I have also previously covered some details regarding Framehawk from a networking perspective.

The main point in Mikael's presentation was that although Framehawk is tremendously better in situations with packet loss, Thinwire Advanced will often be "enough", or even more useful, when there is only latency involved. This is because of its lower use of CPU, RAM and, most of all, bandwidth.
Another thing he pointed out was that Framehawk needs "a lot" of bandwidth to be at its best.
The recommendation for Thinwire is a minimum of 1.5 Mbps + 150 kbps per user, while the recommendation for Framehawk is a minimum of 4-5 Mbps + 150 kbps per user.
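
To make these recommendations a bit more concrete, here is a minimal Python sketch that turns them into a rough link-sizing estimate. It assumes the per-user figure simply adds linearly on top of the base minimum, which is our reading of the recommendation rather than an official Citrix formula. The function name and the choice of 5 Mbps as the Framehawk base (the upper end of the 4-5 Mbps range) are only for illustration.

```python
# Rough link sizing from the recommendations above.
# Assumption: required bandwidth = base minimum + per-user figure * users.

BASE_MBPS = {"thinwire": 1.5, "framehawk": 5.0}  # framehawk: upper end of 4-5 Mbps
PER_USER_KBPS = 150


def recommended_mbps(protocol: str, users: int) -> float:
    """Return the recommended minimum bandwidth in Mbps for a number of users."""
    if users < 0:
        raise ValueError("users must be non-negative")
    return BASE_MBPS[protocol] + users * PER_USER_KBPS / 1000


for users in (1, 10, 50):
    tw = recommended_mbps("thinwire", users)
    fh = recommended_mbps("framehawk", users)
    print(f"{users:>3} users: Thinwire {tw:.2f} Mbps, Framehawk {fh:.2f} Mbps")
```

The per-user part is identical for both protocols, so the difference between them stays constant at the gap between the base minimums no matter how many users share the link.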

There are a lot of names in use when it comes to Thinwire. Although we can see Thinwire as one protocol, there are different versions of it.
Thinwire is all about compressing data before sending it. The methods for this are:

· Legacy Thinwire (pre-Windows 8 / Server 2012 R2)

· Thinwire Compatibility Mode (new with FP3, also known as Thinwire +, for Windows 8 / Server 2012 R2 and later). This version takes advantage of how newer operating systems construct the graphics.
For more info, read the following blog post written by Muhammad Dawood: http://bit.ly/WEnSDN

· Thinwire Advanced (uses H.264 to compress the data)

For a more detailed overview of when to use each technology, you can refer to the following table:

clip_image002

When we came back home we decided to take a closer look at the impact Thinwire and Framehawk have on CPU, RAM and bandwidth, and we found some very interesting data.

Our tests include the following user workload:

· Log in and wait 1 minute for uberAgent to gather data and for the session to be up and ready.

· Open a PDF file and scroll up and down for 1 minute. (The PDF is stored locally on the VM to exclude network I/O.)

· Open the webpage www.vg.no, a Norwegian newspaper site that contains a lot of different objects and heavy graphics, and scroll up and down for 1 minute.

· Open Microsoft Word and type randomly for 1 minute.

· Last but not least, our favorite: play the Avengers trailer in fullscreen in Chrome for its full duration of 2 minutes.

This allows us to see how much bandwidth, CPU and RAM usage each workload generates with each of the different protocols.

To collect and analyze the data we used the following tools:

· Splunk – Uberagent (Get info we didn’t even think was possible!)

· Netbalancer (Show bandwidth, set packet loss, define bandwidth limits and define latency)

· Citrix Director

– Displaystatus (to verify the protocol status)

NOTE: There may be slight variations from test to test, since this is not an automated test but was run by hand as a typical end-user experience. These variations were so minor that we can conclude that the numbers are within +/-5%.

We had two Windows 10 VDI machines running the latest release of XenDesktop 7.6 FP3 during the testing phase.

· MCS1002 is for the test02 user, which is not using Framehawk

· MCS1003 is for the test01 user, which has Framehawk enabled through policies

· Use of the codec was deactivated through policy to ensure that Thinwire was used

The internet connection is a solid 100 Mbps, and the average latency to the Citrix environment is about 10-20 ms.

The sample video at https://www.youtube.com/watch?v=F89eQPd7shs shows how the tests are run. This allows us to analyze the sample data from Netbalancer more closely as well.

Some notes so far: some Framehawk sessions get stuck on the Netscaler, where existing connections are not dropped correctly. This is visible in the Netscaler GUI under Gateway -> DTLS sessions.

After we changed the TCP profiles on the Netscaler, we were unable to use Framehawk.
We then needed to reconfigure the DTLS and certificate settings on the vServer and set up a new connection, after which Framehawk worked again as expected.

So after the initial run, we can note the following from the Netbalancer data:

We begin by looking at how Framehawk handles bandwidth.

We can see that over the total session, which lasted about 7 minutes, Framehawk used about 240 MB of bandwidth to deliver the graphics.
However, it was the PDF and webpage parts of the test that really pushed it in terms of bandwidth, not the YouTube trailer.

clip_image003

Thinwire, on the other hand, used only 47 MB of bandwidth, and as we would expect, more data was used when showing the trailer than in the PDF and webpage sections.

clip_image004
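
As a back-of-the-envelope check, the session totals can be converted into an average bitrate. The sketch below assumes the totals are in megabytes (10^6 bytes) and that both sessions lasted roughly 7 minutes (the duration is approximate, and we only noted it for the Framehawk run), so treat the results as ballpark figures.

```python
def average_mbps(total_megabytes: float, duration_seconds: float) -> float:
    """Average bitrate in Mbps for a given amount of data and duration."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return total_megabytes * 8 / duration_seconds


SESSION_SECONDS = 7 * 60  # approximate session length (assumption)

framehawk_avg = average_mbps(240, SESSION_SECONDS)
thinwire_avg = average_mbps(47, SESSION_SECONDS)
print(f"Framehawk: ~{framehawk_avg:.1f} Mbps, Thinwire: ~{thinwire_avg:.1f} Mbps")
```

Under these assumptions Framehawk averages roughly 4.6 Mbps and Thinwire roughly 0.9 Mbps over the whole session. These are averages, so the peaks during the PDF and webpage parts will have been higher.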

Using Splunk we are able to get a closer look at the Framehawk numbers.
CPU usage for the VDA was close to 16% on average.

clip_image005

With Thinwire, the CPU usage was only 6% on average.

clip_image006

The highest CPU usage also came from Framehawk, which was close to 50% at one point.

clip_image007

Thinwire, on the other hand, peaked at only 18%.

clip_image008

We can conclude that Framehawk uses many more CPU cycles in order to process the bandwidth, but from our testing we could see that the PDF part, which generated a lot more traffic, allowed for a much smoother experience, not just when scrolling the document but also when zooming in.

We can also see that Framehawk uses a bit more RAM than Thinwire does; the maximum was about 400 MB.

clip_image009

Thinwire peaked at about 300 MB.

clip_image010

So this was the initial test, which shows that Thinwire uses less bandwidth, less memory and less CPU, but also that Framehawk delivers a better user experience in applications like the PDF reader. Now let us see how they fare when packet loss is taken into account.

2% Packet loss
Framehawk

We started by testing Framehawk at 2% packet loss.
Looking at the bandwidth test, we could see that it uses about 16 MB less bandwidth with the packet loss. It is still the PDF and webpage that consume the most resources, and the total is now down to 224 MB.

The maximum CPU usage peaked at 45%.

The average CPU usage was 19%.

RAM usage increased slightly, by 4 MB.

clip_image011

clip_image012

clip_image013

clip_image014


ThinWire

Now here comes the interesting part: using Thinwire at 2% packet loss (up and down) triggers a lot of TCP retransmissions because of the packet drops.

clip_image015

(Remember that this is using an optimized Netscaler.) We can see that Thinwire uses only 12 MB of bandwidth! This is because of the TCP retransmissions: the connection is never able to ramp up and send enough data before the next packet loss occurs.
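
One way to get a feel for why TCP throughput collapses under packet loss is the well-known Mathis approximation, which bounds the steady-state throughput of a single TCP flow at roughly MSS / (RTT * sqrt(p)) times a constant of about 1.22. We did not use this model in our tests; the sketch below is only an illustration, and the MSS of 1460 bytes and the RTT of 15 ms (the middle of our 10-20 ms latency) are assumptions.

```python
import math


def mathis_throughput_mbps(mss_bytes: float, rtt_seconds: float, loss_rate: float) -> float:
    """Approximate upper bound on TCP throughput (Mathis et al. model), in Mbps.

    loss_rate is a fraction, e.g. 0.02 for 2% packet loss.
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    c = math.sqrt(3 / 2)  # constant from the model, about 1.22
    bits_per_second = (mss_bytes * 8 / rtt_seconds) * (c / math.sqrt(loss_rate))
    return bits_per_second / 1e6


MSS_BYTES = 1460      # assumed typical Ethernet MSS
RTT_SECONDS = 0.015   # assumed, middle of the 10-20 ms range

for loss in (0.02, 0.05, 0.10):
    ceiling = mathis_throughput_mbps(MSS_BYTES, RTT_SECONDS, loss)
    print(f"{loss:.0%} loss: ceiling of about {ceiling:.1f} Mbps")
```

Note that this is a theoretical ceiling for one bulk TCP flow. An interactive session with bursty traffic and retransmission timeouts can sit well below it.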

So with Thinwire and 2% packet loss, the bandwidth usage dropped by about 35 MB compared to the reference test, from 47 MB to 12 MB in total for the session.

The maximum CPU usage was also 50% lower than in the reference test and showed only 3%.

The average CPU usage was now only 3% (that is, 50% of the reference test).

RAM usage was about 30 MB higher than in the reference test.

clip_image016

clip_image017

clip_image018

clip_image019

5% Packet loss
Framehawk

At 5% packet loss we can see that it uses about 50 MB of extra bandwidth. It is still the PDF and webpage that consume the most resources, but the total is now up to about 300 MB.

From a resource perspective, the maximum CPU usage is almost the same. This may vary from test to test, but it is close to 50%.

Average CPU usage went up by 4% from the initial test, which makes sense since it needs to send more network packets, which uses CPU cycles.

The RAM usage is the same as with 2% packet loss.

clip_image020

clip_image021

clip_image023

clip_image024

5% Packet loss
ThinWire

With 5% packet loss, the bandwidth usage of Thinwire is slightly lower again, now at 11 MB.

This can also be seen in the CPU usage of the protocol: because of the packet loss, the VDA does not get to send as many packets, and hence the maximum CPU usage is lower and stops at 7%.

Average CPU usage is now just under 3%.

RAM usage, however, is a bit higher at 330 MB.

clip_image025

clip_image026

clip_image027

clip_image028

End-user perspective
From an end-user perspective we can safely say that Framehawk delivered a much better experience. When we tried to follow the test minute by minute, the Thinwire test took about 40 seconds longer, simply because of the delay before a mouse click registered, and because things like zooming into a PDF file took so long that the whole test took longer to complete.

Winner: Framehawk!

10% Packet loss
Framehawk

clip_image029

With 10% packet loss, we could see that the bandwidth usage went down a bit. This might again be because the packet loss was so high that it was unable to process all the data, and hence the total bandwidth usage was lower than it was at 5%. With the decrease in bandwidth, we can also see the CPU usage go down.

The maximum CPU usage was about the same, at 47%.

The average CPU usage was 19%.

The RAM usage is the same, at 404 MB.

clip_image030

clip_image031

clip_image032

10% Packet loss
ThinWire

With 10% packet loss, Thinwire was down to 6 MB, and the CPU usage reflected this by only using 4% at peak and 1.6% on average.
RAM usage was still about the same as earlier and peaked at 326 MB.

clip_image033

clip_image034

clip_image035

clip_image036

End-user perspective
What we noticed here is that most of the graphics-intensive tests became unresponsive and that the ICA connection froze. The only thing that was really workable was using Word. Opening the PDF, the webpage and YouTube became so unresponsive that it was not really workable.

Winner: Framehawk!

CPU Stats on Framehawk and Thinwire
NOTE: We took multiple samples of the CPU statistics on the Netscaler, so these screenshots represent the average numbers we saw.
What we can see is that Framehawk, which uses more bandwidth, also increases the CPU usage on the packet engines. From an idle state the Netscaler uses about 0-1.5% CPU, which can be seen here:

clip_image037

NOTE: This is a VPX 1000 with 2 vCPUs (so we have only 1 packet engine). When starting an ICA proxy session with the defaults over Thinwire and running the workload that generates the most bandwidth (PDF scrolling and zooming), the packet engine CPU rises to just under 1%.

clip_image038

So it is a minor increase, which is expected since Thinwire uses a small amount of bandwidth. Framehawk, on the other hand, uses about 4% of the packet engine CPU. Note again that this was while we kept working with the PDF document.
We can conclude that using Framehawk puts a lot more strain on the Netscaler packet engine, and therefore we cannot have as many users on the Netscaler.

clip_image039

RDP usage:
We also wanted to test RDP under different scenarios. We had some issues extracting CPU and memory usage, since RDP uses DWM and MSTSC, which can appear as a sub-process of svchost.
We therefore skipped that part and only focused on the bandwidth usage and end-user experience.

First we started out with a test where we had no limitations in the form of latency and packet loss. (This was using regular RDP against a Windows 10 machine with TCP/UDP.)

The initial test shows, as we expected, that RDP uses 53 MB of bandwidth.

clip_image041

We also noticed that during the YouTube part the progressive rendering engine kicked in to ensure optimal delivery, but the graphics were OK.

RDP, 2% Packet loss

With 2% packet loss the bandwidth usage was basically halved, to 26 MB.

clip_image043

Keystrokes and some operations were a bit delayed, but still workable. On the other hand, the progressive rendering engine during the YouTube part made it nearly impossible to see what was actually happening, even though audio worked fine.

RDP 5% Packet loss

RDP used about 17 MB of bandwidth. PDF scrolling and zooming caused a huge delay in how the end user could work. Surfing the webpage, which has a huge amount of graphics, froze for a couple of seconds. YouTube itself did not work very well.

clip_image045

We can conclude that RDP uses more bandwidth than Thinwire under normal circumstances, but it does not deal very well with packet loss.
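
To put the bandwidth numbers side by side, the sketch below collects the totals reported in this post and expresses each as a percentage of that protocol's reference run. The Framehawk figure at 10% packet loss is left out because we only noted that it was lower than at 5%. The dictionary layout and the helper name are just for illustration.

```python
# Total bandwidth per test run in MB, as reported in this post.
# Keys are packet loss in percent; 0 is the reference run.
TOTAL_MB = {
    "Framehawk": {0: 240, 2: 224, 5: 300},
    "Thinwire": {0: 47, 2: 12, 5: 11, 10: 6},
    "RDP": {0: 53, 2: 26, 5: 17},
}


def relative_to_reference(runs: dict) -> dict:
    """Express each run as a percentage of the 0% packet loss reference run."""
    reference = runs[0]
    return {loss: round(100 * mb / reference, 1) for loss, mb in runs.items()}


for protocol, runs in TOTAL_MB.items():
    print(protocol, relative_to_reference(runs))
```

Keep in mind that a lower total under packet loss is not a saving: for the TCP-based protocols it mostly reflects data that never made it to the screen in time.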

So what does all this data tell us?
We can clearly see that Framehawk and Thinwire each have their own use cases.
While Thinwire is the preferred method of delivering graphics, even with high latency, as soon as we experience packet loss of 3% or higher, Framehawk will definitely give a better user experience. Just remember to keep an eye on the resource usage on the VDI.
This is especially important when using it with XenApp, since a spike in CPU usage will have a great impact on the users who are logged on and will decrease the number of users you can have on each server.
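
The rule of thumb above can be written down as a small helper. This is only a sketch of our conclusion, not a Citrix-provided policy: the 3% threshold comes from our tests, and the function name is made up for illustration.

```python
FRAMEHAWK_LOSS_THRESHOLD = 3.0  # percent packet loss, from our conclusion


def suggest_protocol(packet_loss_percent: float) -> str:
    """Suggest a display protocol based on measured packet loss.

    Latency is deliberately not an input: per our conclusion, Thinwire
    stays the preferred choice even with high latency.
    """
    if packet_loss_percent < 0 or packet_loss_percent > 100:
        raise ValueError("packet loss must be between 0 and 100")
    if packet_loss_percent >= FRAMEHAWK_LOSS_THRESHOLD:
        return "Framehawk"
    return "Thinwire"


for loss in (0, 2, 3, 5, 10):
    print(f"{loss}% packet loss -> {suggest_protocol(loss)}")
```

A real decision should also weigh the available bandwidth and the CPU headroom on the VDA and the Netscaler, which this helper ignores.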
