So I have been spending a couple of days now troubleshooting an environment where external users “sometimes” had issues connecting (completely at random) and were getting the 1110 and 1030 error messages. Wyse thin clients “sometimes” had issues as well. The problem only showed up on specific hosts. So this is a post about things to check.
First off, this was a Hyper-V environment running PVS, and the Netscaler was in another environment. There were two DDCs, both with Storefront installed. For some reason users had issues connecting to the environment. As in most cases, I double-checked the STA settings on the Storefront and the Netscaler, and didn’t notice any error messages on the Storefront. The next thing I noticed was that the Hyper-V hosts had old HP network drivers, which in most cases have issues with VMQ. This would explain the sudden drop of an existing connection for users who were already inside a session, so we installed the latest NIC drivers and verified that VMQ was working as intended. At that point I concluded the case was resolved.
But the next day there were still users having issues with the 1110 and 1030 errors. After some more troubleshooting I noticed that there was a hosts file entry (on one of the hosts) that conflicted with the FQDN of the Storefront server, so when the Wyse clients were fetching the PNclient settings they were redirected in a loop, which meant that they were never able to get the config.
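A quick way to spot this kind of conflict is to scan the hosts file for the FQDN in question. The following is only a minimal sketch in Python, not something I used in this case: the function name, the sample entries and the `storefront.example.local` name are all made up for illustration.

```python
def find_hosts_overrides(hosts_text, fqdn):
    """Return (ip, line_number) pairs for hosts-file lines that map fqdn."""
    wanted = fqdn.lower().rstrip(".")
    hits = []
    for number, raw in enumerate(hosts_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        # Hostnames are case-insensitive, so compare in lower case.
        if any(name.lower().rstrip(".") == wanted for name in names):
            hits.append((ip, number))
    return hits


if __name__ == "__main__":
    # Hypothetical hosts-file content; on Windows the real file is
    # C:\Windows\System32\drivers\etc\hosts.
    sample = (
        "127.0.0.1   localhost\n"
        "10.0.0.15   storefront.example.local   # stale entry\n"
    )
    for ip, number in find_hosts_overrides(sample, "storefront.example.local"):
        print(f"line {number}: {ip} overrides DNS for this name")
```

Any hit here is worth comparing against what DNS actually returns for the same name.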
I also noticed that this meant callback to the Netscaler was not working from that particular controller. The only problem was that even though it was easy to fix, it would not resolve the 1110 and 1030 issues we were experiencing.
Now, what I noticed was that the 1030 and 1110 errors appeared at random, so I did a check while connecting through the Netscaler to make sure that it was actually communicating with the correct VDA servers. Then I saw it, in the DNS records.
By default the Netscaler caches the DNS records for any VDA for 20 minutes, and by default the STA ticket will respond with hostnames. So when an external user was trying to gain access, the ICA file would contain which VDA agent to contact, and the Netscaler would then ask DNS for the IP address of that VDA agent. For some reason the Netscaler got two addresses per VDA agent from DNS. This would explain why external users got that error message at random: of the two IP addresses registered for each host, only one was active and the other was not.
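If you want to check a list of VDA hostnames for this symptom, something like the sketch below can help. It is only an illustration in Python: it asks the resolver of the machine it runs on (which also honours the local hosts file), not the Netscaler's own DNS cache, and the function names and VDA hostnames are hypothetical.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # name did not resolve at all
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def find_multi_address_hosts(hostnames, resolver=resolve_ipv4):
    """Map every hostname that resolves to more than one address to its addresses."""
    suspects = {}
    for name in hostnames:
        addresses = resolver(name)
        if len(addresses) > 1:
            suspects[name] = addresses
    return suspects


if __name__ == "__main__":
    # Hypothetical VDA names; replace with the hosts from your own site.
    vdas = ["vda01.example.local", "vda02.example.local"]
    for name, addresses in find_multi_address_hosts(vdas).items():
        print(f"{name} has {len(addresses)} addresses: {', '.join(addresses)}")
```

A VDA that shows up in the output has more than one A record, and one of them may point at a NIC that is no longer active.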
So why were the PVS servers registered with two IP addresses? Well, that was the easy part. Notice I mentioned that this was running on Hyper-V?
Setting up PVS on Hyper-V requires two NICs on each server: a legacy NIC for PXE boot and a synthetic NIC for the real traffic. With PVS 7.1, Citrix will “switch” from the legacy to the synthetic NIC after the OS has finished booting and the synthetic NIC is up and running. The only issue was that the legacy NIC was able to update DNS before “being disabled”.
So by updating the image, removing this feature and setting the Desktop Service to delay its start, I made sure that registration to the DDC was working as intended and that there were no bad records in DNS.