My last post was about the clever app Latch and how to create a simple plugin to integrate it with the Windows logon. It was good fun to create, write and share, but one issue arose during testing and was discussed in the comments section: the slow server responses I was seeing didn’t look right, or at least they weren’t shared by other users.
To check whether the problem was on my side, I booted up again the virtual machine I had downloaded from modern.ie to develop and test the Latch plugin and tried to recreate the issue. My scenario was as follows:
pGina Simulator Tool -> Fiddler -> Internet Connection -> Latch Servers
Fiddler offers a detailed list of the time spent on each step of the request/response communication and, as before, my results were:
Request Count: 1
Bytes Sent: 261 (headers:261; body:0)
Bytes Received: 803 (headers:425; body:378)

ACTUAL PERFORMANCE
--------------
ClientConnected: 00:47:23.719
ClientBeginRequest: 00:47:24.078
GotRequestHeaders: 00:47:24.078
ClientDoneRequest: 00:47:24.078
Determine Gateway: 0ms
DNS Lookup: 0ms
TCP/IP Connect: 0ms
HTTPS Handshake: 0ms
ServerConnected: 00:47:23.875
FiddlerBeginRequest: 00:47:24.078
ServerGotRequest: 00:47:24.078
ServerBeginResponse: 00:47:30.078
GotResponseHeaders: 00:47:30.078
ServerDoneResponse: 00:47:30.078
ClientBeginResponse: 00:47:30.078
ClientDoneResponse: 00:47:30.078

Overall Elapsed: 0:00:06.000
My first impression on seeing that was that the Latch servers were the ones having issues, as the step taking the most time was the one between ServerGotRequest and ServerBeginResponse. Case closed!
Or maybe not. So what was the next step? Trying to reproduce it from another machine and a different internet connection (the previous one was in Central London, now I was trying from sunny Madrid). And oh dear… it was a blast! Response received in less than a second. A solid 0.5 sec. response, every time.
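For anyone wanting to double check that reading of the numbers, here is a minimal Python sketch that subtracts the relevant Fiddler timers. The timestamps are copied from the statistics above; the dictionary and function names are my own and are not part of Fiddler.

```python
from datetime import datetime

# Timestamps copied from the Fiddler statistics above.
FIDDLER_TIMERS = {
    "ClientBeginRequest": "00:47:24.078",
    "ServerGotRequest": "00:47:24.078",
    "ServerBeginResponse": "00:47:30.078",
    "ClientDoneResponse": "00:47:30.078",
}


def seconds_between(timers, start, end):
    """Return the seconds elapsed between two named timers."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(timers[end], fmt) - datetime.strptime(timers[start], fmt)
    return delta.total_seconds()


# Time Fiddler spent waiting for the first byte of the response.
server_wait = seconds_between(FIDDLER_TIMERS, "ServerGotRequest", "ServerBeginResponse")
# Time for the whole request, as seen by the client.
overall = seconds_between(FIDDLER_TIMERS, "ClientBeginRequest", "ClientDoneResponse")

print(f"waiting for the response: {server_wait:.3f}s of {overall:.3f}s overall")
```

The whole six seconds sits in that single gap, which is why the server looked like the obvious suspect.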
So it looks like Fiddler was lying to me (or maybe not Fiddler, but the way Microsoft raises the different events that Fiddler might be listening to when intercepting/analysing the HTTP requests) and I was now on a quest to identify my network problems.
The first thing was to reproduce the issue again, but this time running something other than Fiddler, so I ran Procmon on the physical machine to check the network-related events of the virtual machine:
And with some rules to highlight any network-related activity, to easily identify when the connection was made and when the response was received.
After that it was a matter of digging into the long list of generated events to try to identify the root of the problem. In my case the event corresponding to ServerGotRequest in Fiddler was:
08:59:00.6788252 VirtualBox.exe 9044 TCP Send pato.hitronhub.home:3306 -> ec2-54-72-11-190.eu-west-1.compute.amazonaws.com:https SUCCESS Length: 293, startime: 77060, endtime: 77061, seqnum: 0, connid: 0
And then, for more than five seconds, no more TCP activity, until I got this:
08:59:05.9079370 VirtualBox.exe 9044 TCP Receive pato.hitronhub.home:3306 -> ec2-54-72-11-190.eu-west-1.compute.amazonaws.com:https SUCCESS Length: 837, seqnum: 0, connid: 0
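To put a number on that silence, here is a small Python sketch that subtracts the two Procmon timestamps. They carry seven decimal digits, which `datetime.strptime` does not accept with `%f` (it takes at most six), so this version splits the string by hand and uses `Decimal`. The helper name is my own.

```python
from decimal import Decimal


def procmon_seconds(stamp):
    """Convert a Procmon time of day (HH:MM:SS.fffffff) to seconds since midnight."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + Decimal(seconds)


# Timestamps copied from the two Procmon events above.
sent = procmon_seconds("08:59:00.6788252")      # TCP Send
received = procmon_seconds("08:59:05.9079370")  # TCP Receive

gap = received - sent
print(f"silence between TCP Send and TCP Receive: {gap}s")
```

That is a bit more than 5.2 seconds with no TCP activity at all for the VirtualBox process.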
Now my quest was to identify the culprit taking all those 5 seconds and not honouring the Amazon Cloud servers’ speed. Luckily for us, Process Monitor offers a nice summary of the events happening for each process, so we can easily identify where we should start looking:
As we can clearly see in the picture above, there is a crazy peak of registry activity just before each network action (the request and the response are both represented on the graph), but I was only noticing it on the response, as Fiddler only measured after the request was initiated.
Checking the registry events, I found that the VirtualBox process was searching for different registry keys that didn’t exist. Those keys were related to network interfaces, and checking my network interfaces quickly revealed the problem:
At least four different services were active (before taking that screenshot) on my Internet connection, some of which, like the DNE LightWeight Filter or the Microsoft Network Monitor 3 Driver, I was no longer using (and had forgotten to uninstall). Disabling/deleting them solved my problem and I was now able to enjoy full-speed Latch servers once again.
Actually I found this out quite a while ago, but didn’t have the time to write it up here until now. I posted about it briefly on my Twitter account:
installing three virtualization platforms and 4 different VPNs software on my machine is finally paying off! Crazy network errors everywhere
— Pedro Laguna (@p_laguna) February 20, 2014